For "yes/no" questions, answers will be evaluated using accuracy.
For open-ended questions, answers will be evaluated using exact match, macro-averaged F1, and BLEU.
The evaluation scripts are available at https://github.com/UCSD-AI4H/PathVQA
The submitted solutions will be ranked by the (macro) average of these metrics.
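The per-question metrics above can be sketched as follows. This is a minimal illustration, not the official scoring code — the scripts in the linked repository are authoritative; macro-averaging over answer classes and BLEU are omitted here, and whitespace tokenization with lowercasing is an assumption.

```python
from collections import Counter

def accuracy(preds, golds):
    # Fraction of predictions that exactly match the gold "yes"/"no" label.
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def exact_match(pred, gold):
    # 1.0 if the normalized answer strings are identical, else 0.0.
    return float(pred.strip().lower() == gold.strip().lower())

def token_f1(pred, gold):
    # Token-overlap F1 between predicted and gold answers
    # (assumes simple whitespace tokenization and lowercasing).
    p, g = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("chronic inflammation", "inflammation")` gives partial credit (precision 0.5, recall 1.0) where `exact_match` would give zero.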