The data is available at https://github.com/UCSD-AI4H/PathVQA

It contains 4,998 pathology images and 32,799 question-answer pairs. Half of the questions are open-ended (why, what, how, where, etc.) and the other half are “yes/no” questions. We provide an official train/validation/test split.
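
As a rough illustration of the two question types, the sketch below labels each question-answer pair by checking whether its answer is "yes" or "no". The record layout (a dict with `image`, `question`, and `answer` keys), the helper names `question_type` and `count_question_types`, and the sample records are all assumptions made for this example; they are not the repository's actual file format or API.

```python
from collections import Counter


def question_type(answer: str) -> str:
    """Label a QA pair as 'yes/no' or 'open-ended' based on its answer."""
    return "yes/no" if answer.strip().lower() in {"yes", "no"} else "open-ended"


def count_question_types(qa_pairs):
    """Count QA pairs per question type; each pair is a dict with an 'answer' key."""
    return Counter(question_type(pair["answer"]) for pair in qa_pairs)


# Hypothetical records for illustration only; not taken from the dataset.
sample_pairs = [
    {"image": "img_0001", "question": "Is necrosis present?", "answer": "Yes"},
    {"image": "img_0001", "question": "What is shown in the image?", "answer": "liver tissue"},
    {"image": "img_0002", "question": "Is the tissue normal?", "answer": "no"},
]

counts = count_question_types(sample_pairs)
print(counts)
```

Running the same count over each split would be one way to check the reported half-and-half balance, once the records are loaded into this shape.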

The details of this dataset are described in this preprint: https://arxiv.org/abs/2003.10286