Data and code of Information Asymmetry across Language Varieties: A Case Study on Cantonese-Mandarin and Bavarian-German QA
The WILOVA-QA dataset and the data used to generate prompts are compressed as password-protected .zip files to prevent direct leakage. The password for decompressing the files is: wilovaqa
Generate prompts -> Run the LLM generation -> Run the LLM-as-a-judge -> Evaluation
Run python generate_prompts.py <language_id> <source_type> to generate a .pkl file of prompts, which is a dictionary of the form dict[str, dict[str, dict]].
'<language_id>' can be 'deu' (for the Bavarian-German pair, deu-bar) or 'zho' (for the Cantonese-Mandarin pair, cmn-yue).
'<source_type>' can be 'dialectqa' or 'eclektic'. Manually edit the list of settings inside generate_prompts.py to select the desired prompt settings.
Usage example: python generate_prompts.py zho dialectqa
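The generated .pkl file can be inspected like any pickled nested dictionary. A minimal sketch of the dict[str, dict[str, dict]] structure (the key and field names below are illustrative, not the actual keys produced by generate_prompts.py):

```python
import pickle

# Illustrative structure: outer key = prompt setting, middle key =
# question id, inner dict = prompt fields. The actual keys are
# defined by generate_prompts.py.
prompts = {
    "zero_shot": {
        "q001": {"prompt": "...", "language": "yue"},
    },
}

# Round-trip through pickle, as the pipeline does.
with open("prompts_zho_dialectqa.pkl", "wb") as f:
    pickle.dump(prompts, f)

with open("prompts_zho_dialectqa.pkl", "rb") as f:
    loaded = pickle.load(f)

# Walk the nested dictionary, e.g. to count prompts per setting.
for setting, questions in loaded.items():
    print(setting, len(questions))
```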
Run python3 -u dialectqa.py <GPU_id(s)> <path_to_pkl_file_of_prompts> <model_name> <tokenizer_path> <model_path> to run the LLM generation. The results will be saved as a .pkl file, which is a dictionary of the form: dict[str, dict[str, dict]].
<GPU_id(s)> may be a single GPU id for smaller models, or four GPU ids for larger models such as llama3_70b and qwen2.5_72b.
After obtaining the results generated by the LLM, run python3 -u dialectqa.py <GPU_id(s)> <path_to_pkl_file_of_results> <model_name> <tokenizer_path> <model_path> to have another LLM evaluate the generated results. The LLM-as-a-judge evaluation results will be appended to the existing results and saved as a .pkl file, which is a dictionary of the form dict[str, dict[str, dict]].
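The "appended to the existing results" step can be pictured as adding a judge field to each inner result dict, so earlier fields survive and the overall type stays dict[str, dict[str, dict]]. A toy sketch (field names like "generation" and "judge_score" are hypothetical; the actual keys are defined in dialectqa.py):

```python
import pickle

# Hypothetical generation results: setting -> question id -> fields.
results = {
    "zero_shot": {
        "q001": {"prompt": "...", "generation": "model answer"},
    },
}

# The judge pass adds its verdict to each inner dict in place,
# preserving the existing generation fields.
for setting in results.values():
    for entry in setting.values():
        entry["judge_score"] = 1  # e.g. 1 = judged correct, 0 = incorrect

# The combined dictionary is then pickled back to disk.
with open("results_with_judge.pkl", "wb") as f:
    pickle.dump(results, f)
```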
After obtaining the LLM-as-a-judge results, run python evaluation.py <path_to_pkl_file_of_LLM-as-a-judge_results> to evaluate all the results (including metrics other than LLM-as-a-judge). The evaluation scores will be printed to stdout.
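Once the judge scores sit in the nested dictionary, a per-setting aggregate can be read off directly. A toy sketch of that final step, assuming a binary judge_score field (hypothetical name; evaluation.py computes additional metrics beyond this):

```python
# Hypothetical judge results: setting -> question id -> fields.
results = {
    "zero_shot": {
        "q001": {"generation": "...", "judge_score": 1},
        "q002": {"generation": "...", "judge_score": 0},
    },
}

# Average the binary judge scores per prompt setting.
for setting, entries in results.items():
    scores = [e["judge_score"] for e in entries.values()]
    print(f"{setting}: {sum(scores) / len(scores):.3f}")  # zero_shot: 0.500
```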