Multimodal: Common Datasets
VQA
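Visual Question Answering.
Given an image, answer a related question posed in natural language. Question types include multiple-choice, numeric, and open-ended questions.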
The goal of visual question answering (VQA) (Antol et al., 2015) is to answer a natural language question related to an image. We take the VQA v2.0 dataset (Goyal et al., 2017), which reduces the answer bias compared to VQA v1.0. The dataset contains an average of 5.4 questions per image, and the total number of questions is 1.1M.
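As a quick sanity check on the two figures quoted above, the implied number of images can be estimated by dividing the total question count by the average number of questions per image. This is only a rough sketch: both inputs are rounded values from the text, and the variable names are illustrative, so the result is an approximation rather than an official statistic.

```python
# Figures quoted in the paragraph above (VQA v2.0); both are rounded.
total_questions = 1_100_000   # "1.1M" questions in total
questions_per_image = 5.4     # average number of questions per image

# Implied image count: an estimate only, because the inputs are rounded.
approx_images = total_questions / questions_per_image

print(f"Approximate number of images: {approx_images:,.0f}")
```

The estimate comes out at roughly two hundred thousand images.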
- Example


References
- Paper: VQA
- Official website: visualqa.org