Code for paper "Towards More Robust Retrieval-Augmented Generation: Evaluating RAG Under Adversarial Poisoning Attacks"
conda create -n eval_PoisonRaG python=3.10
conda activate eval_PoisonRaG
pip install beir openai google-generativeai
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install --upgrade charset-normalizer
pip3 install "fschat[model_worker,webui]"
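As an optional sanity check (not part of the repo), the short Python snippet below just confirms that the main dependencies import correctly and that CUDA is visible to torch:

# optional sanity check -- verifies the installs above, not part of the paper's pipeline
import torch
import beir
import openai
import google.generativeai as genai
import fastchat

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("beir, openai, google-generativeai, and fschat all imported successfully")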
Please download the data from Google Drive here.
The initial data can be found in the results/adv_and_guiding_contexts directory, where guiding_contexts are generated by prompting GPT-4, while adv_contexts come from the original PoisonedRAG paper.
The results/pre-processed directory contains the preprocessed data, where an additional topk_results entry is added. This entry contains the different contexts (the top-10 untouched contexts, 5 adv contexts, and 5 guiding contexts) together with their similarity scores according to different retrievers.
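As an illustration, here is a minimal sketch of how to peek at one of the pre-processed files; the file name and the exact layout of each entry are assumptions, so adapt it to whatever you find in the directory:

import json

# Hypothetical file name -- substitute any file found under results/pre-processed.
with open("results/pre-processed/hotpotqa-ance.json") as f:
    data = json.load(f)

# Grab one entry, whether the file is a dict keyed by query id or a plain list.
entry = next(iter(data.values())) if isinstance(data, dict) else data[0]
print("fields:", list(entry.keys()))
print("topk_results:", entry["topk_results"])  # contexts plus their similarity scores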
Final RAG outputs from the LLMs will be saved in the results/LLM_output_results directory.
To successfully run preprocess.py, make sure you already have the corresponding data in the results/beir_results directory (provided in the Google Drive). If you want to produce the beir_results yourself, you may use the code here.
python preprocess.py --eval_model_code 'ance' --dataset_name 'hotpotqa'
# choose eval_model_code among ['dpr-multi', 'dpr-single', 'contriever', 'ance', 'contriever-msmarco']
# choose dataset_name among ['hotpotqa', 'nq']
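To preprocess every retriever/dataset combination in one go, a small driver like the sketch below works (a convenience helper, not part of the repo):

import itertools
import subprocess

retrievers = ["dpr-multi", "dpr-single", "contriever", "ance", "contriever-msmarco"]
datasets = ["hotpotqa", "nq"]

# Run preprocess.py once per retriever/dataset pair.
for retriever, dataset in itertools.product(retrievers, datasets):
    subprocess.run(
        ["python", "preprocess.py", "--eval_model_code", retriever, "--dataset_name", dataset],
        check=True,
    )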
python SingleContext.py --model_name 'gpt3.5' \
    --prompt_type 'skeptical' \
    --dataset_name 'nq'
# --model_name: choose from ['gpt3.5', 'gpt4', 'gpt4o', 'claude', 'llama8b', 'llama70b']
# --prompt_type: choose from ['skeptical', 'faithful', 'neutral']
# --dataset_name: choose from ['hotpotqa', 'msmarco', 'nq']
python MixedContext.py --model_name 'gpt3.5' \
    --prompt_type 'skeptical' \
    --dataset_name 'nq'
# --model_name: choose from ['gpt3.5', 'gpt4', 'gpt4o', 'claude', 'llama8b', 'llama70b']
# --prompt_type: choose from ['skeptical', 'faithful', 'neutral']
# --dataset_name: choose from ['hotpotqa', 'msmarco', 'nq']
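If you want to sweep several models and prompt types without retyping the commands, a driver like the sketch below can call MixedContext.py (or SingleContext.py) repeatedly; it is only a convenience wrapper around the commands above:

import subprocess

models = ["gpt3.5", "gpt4", "gpt4o", "claude", "llama8b", "llama70b"]
prompt_types = ["skeptical", "faithful", "neutral"]
dataset = "nq"

# Run MixedContext.py (swap in SingleContext.py for the single-context setting)
# for every model / prompt-type pair on one dataset.
for model in models:
    for prompt in prompt_types:
        subprocess.run(
            ["python", "MixedContext.py",
             "--model_name", model,
             "--prompt_type", prompt,
             "--dataset_name", dataset],
            check=True,
        )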
python top10.py
I might have introduced some bugs while cleaning up the code and data, so please contact me via email if something weird happens or something seems wrong with the data.