For all tokens starting with idx_, refer to the vocabulary file for the corresponding word.
All train/valid/test files share the same format and differ only in answer pool size:
<Domain><TAB><QUESTION><TAB><Groundtruth><TAB><Pool>
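As an illustration of the four-field layout above, here is a minimal Python sketch of a line parser. It assumes that the question, ground-truth, and pool fields are each whitespace-separated lists (idx_ tokens for the question, answer labels for the other two); that separator is an assumption, not something this file states. The names PoolRecord and parse_pool_line are hypothetical.

```python
from typing import List, NamedTuple


class PoolRecord(NamedTuple):
    domain: str
    question: List[str]      # idx_ tokens of the question
    ground_truth: List[str]  # answer labels marked as correct
    pool: List[str]          # candidate answer labels


def parse_pool_line(line: str) -> PoolRecord:
    """Split one train/valid/test line into its four tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(fields)}")
    domain, question, ground_truth, pool = fields
    # Assumption: items inside each field are separated by whitespace.
    return PoolRecord(domain, question.split(), ground_truth.split(), pool.split())
```

The length of the pool list should then correspond to the pool size named in the file (100, 500, 1000, or 1500), which is a quick sanity check after loading.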
For InsuranceQA.question.anslabel.*:
<Domain><TAB><QUESTION><TAB><Groundtruth>
For InsuranceQA.label2answer.*:
<Answer Label><TAB><Answer Text>
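The lookup from answer label to answer text, and from idx_ tokens to words, could be sketched as follows. This assumes the vocabulary file uses the same two-column, tab-separated layout as the label2answer files (token, then word); the vocabulary layout is not described here, so treat it as an assumption. The helper names load_tab_mapping and decode_tokens are hypothetical.

```python
from typing import Dict, Iterable


def load_tab_mapping(lines: Iterable[str]) -> Dict[str, str]:
    """Build a key -> value dict from lines of the form <key><TAB><value>."""
    mapping: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        mapping[key] = value
    return mapping


def decode_tokens(text: str, vocab: Dict[str, str]) -> str:
    """Replace each idx_ token with its word; unknown tokens are left as-is."""
    return " ".join(vocab.get(tok, tok) for tok in text.split())
```

Both functions accept any iterable of lines, so an open file handle can be passed directly to load_tab_mapping.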
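Pool statistics for each file: the number of questions whose pool includes ground truth, the number whose pool does not, and recall = included / (included + not included).
InsuranceQA.question.anslabel.token.1000.pool.solr.test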
Include ground truth: 1744 no ground truth: 256 recall: 0.872
InsuranceQA.question.anslabel.token.1000.pool.solr.train
Include ground truth: 11069 no ground truth: 1820 recall: 0.858794320739
InsuranceQA.question.anslabel.token.1000.pool.solr.valid
Include ground truth: 1721 no ground truth: 279 recall: 0.8605
InsuranceQA.question.anslabel.token.100.pool.solr.test
Include ground truth: 1323 no ground truth: 677 recall: 0.6615
InsuranceQA.question.anslabel.token.100.pool.solr.train
Include ground truth: 8311 no ground truth: 4578 recall: 0.644813406781
InsuranceQA.question.anslabel.token.100.pool.solr.valid
Include ground truth: 1278 no ground truth: 722 recall: 0.639
InsuranceQA.question.anslabel.token.1500.pool.solr.test
Include ground truth: 1799 no ground truth: 201 recall: 0.8995
InsuranceQA.question.anslabel.token.1500.pool.solr.train
Include ground truth: 11428 no ground truth: 1461 recall: 0.886647528901
InsuranceQA.question.anslabel.token.1500.pool.solr.valid
Include ground truth: 1776 no ground truth: 224 recall: 0.888
InsuranceQA.question.anslabel.token.500.pool.solr.test
Include ground truth: 1625 no ground truth: 375 recall: 0.8125
InsuranceQA.question.anslabel.token.500.pool.solr.train
Include ground truth: 10391 no ground truth: 2498 recall: 0.806191325937
InsuranceQA.question.anslabel.token.500.pool.solr.valid
Include ground truth: 1592 no ground truth: 408 recall: 0.796
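The recall figures above are consistent with included / (included + not included); for example, 1744 / (1744 + 256) = 0.872 for the 1000-pool test file. A minimal Python sketch of that count follows. It assumes a pool "includes ground truth" when at least one ground-truth label appears in it, which is an interpretation rather than something stated here; pool_recall is a hypothetical name.

```python
from typing import Iterable, Sequence, Tuple


def pool_recall(
    records: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> Tuple[int, int, float]:
    """Return (included, missing, recall) over (ground_truth, pool) pairs."""
    included = 0
    missing = 0
    for ground_truth, pool in records:
        # Assumption: one shared label is enough to count as "included".
        if set(ground_truth) & set(pool):
            included += 1
        else:
            missing += 1
    total = included + missing
    recall = included / total if total else 0.0
    return included, missing, recall


if __name__ == "__main__":
    sample = [(["10"], ["10", "30"]), (["20"], ["30", "40"])]
    print(pool_recall(sample))  # (1, 1, 0.5)
```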