NarrativeQA benchmark

NarrativeQA is an English-language dataset of stories and corresponding questions designed to test reading comprehension, especially on long documents. The paper proposes two tasks, "summaries only" and "stories only", depending on whether the human-generated summary or the full story text is used to answer the question.
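The difference between the two tasks comes down to which text the model is allowed to read. A minimal sketch of that choice, assuming each example is a plain dictionary with hypothetical `summary`, `story`, and `question` keys (the field names and the task labels `summaries_only` / `stories_only` are illustrative, not the dataset's actual schema):

```python
from typing import Dict


def build_context(example: Dict[str, str], task: str) -> str:
    """Return the text the model may read under the given task setting."""
    if task == "summaries_only":
        # Answer from the human-generated summary.
        return example["summary"]
    if task == "stories_only":
        # Answer from the full story text.
        return example["story"]
    raise ValueError(f"unknown task: {task!r}")


# Hypothetical example record; the real dataset uses its own field layout.
example = {
    "summary": "A short human-written summary of the plot.",
    "story": "The full text of the book or movie script.",
    "question": "Who is the main character?",
}

context = build_context(example, "summaries_only")
prompt = f"{context}\n\nQuestion: {example['question']}"
```

The "stories only" setting is the harder one in practice, since the full story is usually far longer than what fits in a single model input.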

Performance

1. Leaderboard from SOTA

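| Paper | Year | Model | Model Details (R: retriever, G: generator) | NDCG@10 | Recall@5 | EM |
| --- | --- | --- | --- | --- | --- | --- |
| XXX | 2024 | INSTRUCTRAG | R: DPR, G: ChatGPT-4oMINI | - | - | 71.6 |
| | | | R: DPR, G: Llama-3-Ins-70B | - | - | 70.8 |
| | | | R: DPR, G: Llama-3-Ins-8B | - | - | 65.0 |
| | | Baseline1 | R: ❌, G: Llama-3-Ins-8B | - | - | 39.2 |
| | | Baseline2 | R: ❌, G: Llama-3-Ins-70B | - | - | 54.2 |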
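The leaderboard columns can be illustrated with a short sketch of each metric. It assumes SQuAD-style answer normalization for EM and binary relevance for NDCG@k and Recall@k; the papers listed may compute these differently, so treat the function names and definitions below as illustrative only.

```python
import math
import re
import string
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> float:
    """1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return float(any(pred == normalize(ref) for ref in references))


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Sequence[str], k: int) -> float:
    """Fraction of relevant passages found in the top k (ids assumed unique)."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: Sequence[str], k: int) -> float:
    """NDCG@k with binary relevance (gain 1 for relevant, 0 otherwise)."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```

A corpus-level score is then the mean of the per-question values, typically reported as a percentage as in the EM column above.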

2. LLM-based Methods (Reproducible)