XiangruTang/FactualEval-protocols

Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries

This repository contains the code, data, and crowdsourcing protocol templates described in the paper: Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries (NAACL 2022).

Scripts

calculate.ipynb: calculates the score distributions, Krippendorff's alpha reliability, and SHR reliability.
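For reference, Krippendorff's alpha for nominal ratings can be sketched in plain Python as below. This is an illustrative implementation, not the notebook's actual code; the function name and the unit-by-annotator data layout are assumptions.

```python
from collections import Counter

def krippendorff_alpha(ratings):
    """Krippendorff's alpha for nominal data.

    ratings: list of units, each a list of category labels
    assigned by the annotators (None marks a missing rating).
    """
    # Build the coincidence matrix: each ordered pair of values
    # within a unit contributes 1/(m - 1), m = ratings in that unit.
    o = Counter()
    for unit in ratings:
        vals = [v for v in unit if v is not None]
        m = len(vals)
        if m < 2:
            continue  # units with fewer than two ratings are unpairable
        for i, a in enumerate(vals):
            for j, b in enumerate(vals):
                if i != j:
                    o[(a, b)] += 1.0 / (m - 1)

    n_c = Counter()  # marginal totals per category
    for (a, _), w in o.items():
        n_c[a] += w
    n = sum(n_c.values())  # total number of pairable values

    # Observed vs. expected disagreement (nominal distance: 0 or 1).
    d_o = sum(w for (a, b), w in o.items() if a != b) / n
    d_e = sum(n_c[a] * n_c[b]
              for a in n_c for b in n_c if a != b) / (n * (n - 1))
    return 1.0 - d_o / d_e
```

Perfect agreement yields alpha = 1.0, chance-level agreement yields roughly 0, and systematic disagreement goes negative.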

Data

We release our evaluation templates and annotations to promote future work on factual consistency evaluation. The repository includes the annotations for CNN&DM data and for XSUM data, along with the evaluation templates.

Model

The code for BART, ProphetNet, PEGASUS, and BERTSUM is based on Fairseq(-py). Our pretrained models are provided for both CNN&DM and XSUM.

Citation

If you use our code in your research, please cite our work:

@inproceedings{tang2022investigating,
   title={Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries},
   author={Tang, Xiangru and Fabbri, Alexander R and Mao, Ziming and Adams, Griffin and Wang, Borui and Li, Haoran and Mehdad, Yashar and Radev, Dragomir},
   booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
   year={2022}
}
