CFEVER

A Fact Extraction and VERification dataset in Traditional Chinese

About

CFEVER is a Chinese Fact Extraction and VERification dataset. Inspired by the FEVER dataset (Thorne et al., 2018), we provide class labels (Supports, Refutes, or Not Enough Information) and evidence for each claim in the CFEVER dataset. CFEVER is currently the largest Chinese fact-checking dataset, and it can serve as a benchmark for Chinese fact verification and retrieval-augmented generation (RAG).

For more details about CFEVER, please refer to our AAAI-24 paper:

Dataset

Please visit our GitHub repository to download the dataset:

Submission

We do not release the ground-truth labels for the test set. To evaluate your model, please follow the submission instructions in our GitHub repository.

Citation

If you use CFEVER in your research, please cite our paper:

@inproceedings{lin2023CFEVER,
  title={CFEVER: A Chinese Fact Extraction and VERification Dataset},
  author={Lin, Ying-Jia and Lin, Chun-Yi and Yeh, Chia-Jen and Li, Yi-Ting and Hu, Yun-Yu and Hsu, Chih-Hao and Lee, Mei-Feng and Kao, Hung-Yu},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  pages={},
  year={2024},
  month={Feb.}
}
Leaderboard
Rank | Date | Model | Affiliation | Accuracy (%) | FEVER Score (%)
1 | Feb 20, 2024 | BEVERS (DeHaven and Scott, 2023; implemented in Lin et al., 2023) | University of Southern California & University of Nebraska-Lincoln | 69.73 | 64.80
2 | Feb 20, 2024 | Our baseline (Lin et al., 2023) | National Cheng Kung University | 61.17 | 52.47
3 | Apr 24, 2024 | Kernel Graph Attention Network (KGAT) | Beijing University of Posts and Telecommunications | 52.17 | 45.83
4 | Feb 20, 2024 | GPT-4 (3-shot) (Lin et al., 2023) | Microsoft & OpenAI | 48.40 | NA
5 | Feb 20, 2024 | GPT-4 (zero-shot) (Lin et al., 2023) | Microsoft & OpenAI | 47.23 | NA
6 | Feb 20, 2024 | GPT-3.5 (3-shot) (Lin et al., 2023) | Microsoft & OpenAI | 44.20 | NA
7 | Feb 20, 2024 | GPT-3.5 (zero-shot) (Lin et al., 2023) | Microsoft & OpenAI | 43.17 | NA
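For reference, the FEVER Score above is the strict metric introduced with the original FEVER task (Thorne et al., 2018): a claim counts as correct only if the predicted label is right and, for Supports/Refutes claims, at least one complete gold evidence set appears among the top retrieved evidence. The sketch below illustrates this logic; the field names and the top-5 evidence cutoff are illustrative assumptions, not the official CFEVER evaluation script or schema.

```python
# Illustrative sketch of the strict FEVER score (Thorne et al., 2018).
# NOTE: field names ("predicted_label", "gold_evidence", ...) and the
# max_evidence=5 cutoff are assumptions, not the official CFEVER format.

def fever_score(predictions, max_evidence=5):
    """Fraction of claims with a correct label AND, for verifiable
    claims, a complete gold evidence set inside the top-k retrieved."""
    strict_hits = 0
    for pred in predictions:
        # The label must match first; otherwise the claim scores zero.
        if pred["predicted_label"] != pred["gold_label"]:
            continue
        # "Not Enough Information" claims need no evidence check.
        if pred["gold_label"] == "NOT ENOUGH INFO":
            strict_hits += 1
            continue
        retrieved = set(pred["predicted_evidence"][:max_evidence])
        # Each gold group is one evidence set that is sufficient on its own;
        # any one group fully covered by the retrieved set counts as a hit.
        if any(set(group) <= retrieved for group in pred["gold_evidence"]):
            strict_hits += 1
    return strict_hits / len(predictions)
```

Plain label accuracy, by contrast, ignores the evidence check entirely, which is why the Accuracy column is always at least as high as the FEVER Score.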