CFEVER

A Fact Extraction and VERification dataset in Traditional Chinese

About

CFEVER is a Chinese Fact Extraction and VERification dataset. Inspired by the FEVER dataset (Thorne et al., 2018), we provide class labels (Supports, Refutes, or Not Enough Information) and evidence for each claim in the CFEVER dataset. CFEVER is currently the largest Chinese fact-checking dataset, and it can serve as a benchmark for Chinese fact verification and retrieval-augmented generation (RAG).

For more details about CFEVER, please refer to our AAAI-24 paper:

Dataset

Please visit our GitHub repository to download the dataset:

Submission

We do not release the ground-truth labels for the test set. To evaluate your model, please follow the submission instructions in our GitHub repository.

Citation

If you use CFEVER in your research, please cite our paper:

@inproceedings{lin2023CFEVER,
  title={CFEVER: A Chinese Fact Extraction and VERification Dataset},
  author={Lin, Ying-Jia and Lin, Chun-Yi and Yeh, Chia-Jen and Li, Yi-Ting and Hu, Yun-Yu and Hsu, Chih-Hao and Lee, Mei-Feng and Kao, Hung-Yu},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  pages={},
  year={2024},
  month={Feb.}
}
Leaderboard
Rank | Date | Model | Affiliation | Accuracy (%) | FEVER Score (%)
1 | Feb 20, 2024 | BEVERS (DeHaven and Scott, 2023; implemented in Lin et al., 2023) | University of Southern California & University of Nebraska-Lincoln | 69.73 | 64.80
2 | Feb 20, 2024 | Our baseline (Lin et al., 2023) | National Cheng Kung University | 61.17 | 52.47
3 | Apr 24, 2024 | Kernel Graph Attention Network (KGAT) | Beijing University of Posts and Telecommunications | 52.17 | 45.83
4 | Feb 20, 2024 | GPT-4 (3-shot) (Lin et al., 2023) | Microsoft & OpenAI | 48.40 | NA
5 | Feb 20, 2024 | GPT-4 (zero-shot) (Lin et al., 2023) | Microsoft & OpenAI | 47.23 | NA
6 | Feb 20, 2024 | GPT-3.5 (3-shot) (Lin et al., 2023) | Microsoft & OpenAI | 44.20 | NA
7 | Feb 20, 2024 | GPT-3.5 (zero-shot) (Lin et al., 2023) | Microsoft & OpenAI | 43.17 | NA
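For reference, the FEVER Score above is the strict metric introduced with the original FEVER task (Thorne et al., 2018): a claim counts as correct only if the predicted label is right and, for Supports/Refutes claims, at least one complete gold evidence set appears among the top retrieved evidence. The sketch below illustrates this logic; the field names and the top-5 evidence cutoff are illustrative assumptions, not the official CFEVER evaluation script or schema.

```python
# Illustrative sketch of the strict FEVER score (Thorne et al., 2018).
# NOTE: field names ("predicted_label", "gold_evidence", ...) and the
# max_evidence=5 cutoff are assumptions, not the official CFEVER format.

def fever_score(predictions, max_evidence=5):
    """Fraction of claims with a correct label AND, for verifiable
    claims, a complete gold evidence set inside the top-k retrieved."""
    strict_hits = 0
    for pred in predictions:
        # The label must match first; otherwise the claim scores zero.
        if pred["predicted_label"] != pred["gold_label"]:
            continue
        # "Not Enough Information" claims need no evidence check.
        if pred["gold_label"] == "NOT ENOUGH INFO":
            strict_hits += 1
            continue
        retrieved = set(pred["predicted_evidence"][:max_evidence])
        # Each gold group is one evidence set that is sufficient on its own;
        # any one group fully covered by the retrieved set counts as a hit.
        if any(set(group) <= retrieved for group in pred["gold_evidence"]):
            strict_hits += 1
    return strict_hits / len(predictions)
```

Plain label accuracy, by contrast, ignores the evidence check entirely, which is why the Accuracy column is always at least as high as the FEVER Score.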