[Question]: How to get meetingbank_test_3qa_pairs_summary_formated.json? #170

Open
mzf666 opened this issue Jul 24, 2024 · 2 comments

mzf666 commented Jul 24, 2024

Describe the issue

When I try to run the script experiments/llmlingua2/evaluation/scripts/compress.sh, it seems that the code for constructing ../../../results/meetingbank_short/origin/meetingbank_test_3qa_pairs_summary_formated.json is missing. Similarly, I cannot find the construction code for ../../../results/longbench/origin/longbench_test_single_doc_qa_formated.json, ../../../results/zero_scrolls/origin/zero_scrolls_validation.json, or ../../../results/gsm8k/origin/gsm8k_cot_example_all_in_one.json.

May I know how to construct these JSON-formatted data files? Thanks for your consideration!

mzf666 added the "question" label (Further information is requested) on Jul 24, 2024
pzs19 (Contributor) commented Jul 30, 2024

Hi, @mzf666, thank you for raising the question.

We have provided meetingbank_test_3qa_pairs_summary_formated.json on Hugging Face. For LongBench, you can refer to the format_data scripts and the LongBench repo.


cornzz commented Aug 22, 2024

@mzf666 I figured out how to get the MeetingBank dataset into the appropriate format for compress.sh:

import json
import os

from datasets import load_dataset

# Input file that compress.sh expects for the MeetingBank run
out_path = "results/meetingbank_short/origin/meetingbank_test_3qa_pairs_summary_formated.json"

os.makedirs(os.path.dirname(out_path), exist_ok=True)
if not os.path.exists(out_path):
    # Download the test split from the Hugging Face Hub and dump it as a JSON list
    meeting_bank_comp = load_dataset("microsoft/MeetingBank-QA-Summary", split="test")
    with open(out_path, "w") as f:
        json.dump(meeting_bank_comp.to_list(), f)
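
To confirm the file was written as expected before running compress.sh, a quick sanity check along these lines may help. This is only a sketch and not part of the LLMLingua repo: `check_json_list` is a hypothetical helper name, and it assumes nothing beyond the file being a non-empty JSON list of objects, which is what `to_list()` followed by `json.dump` produces. It does not verify which fields compress.sh actually reads.

```python
import json
import os


def check_json_list(path):
    """Return the record count if `path` holds a non-empty JSON list of objects."""
    with open(path) as f:
        data = json.load(f)
    if not isinstance(data, list) or not data:
        raise ValueError(f"{path}: expected a non-empty JSON list")
    if not all(isinstance(rec, dict) for rec in data):
        raise ValueError(f"{path}: expected every record to be a JSON object")
    return len(data)


if __name__ == "__main__":
    # Same relative path as in the snippet above; adjust to your working directory.
    path = "results/meetingbank_short/origin/meetingbank_test_3qa_pairs_summary_formated.json"
    if os.path.exists(path):
        print(check_json_list(path), "records")
    else:
        print("file not found:", path)
```

If the check raises, the dump most likely ran from a different working directory or was interrupted; deleting the partial file and re-running the snippet above should regenerate it.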
