
Reproduce results #4

Open
adita15 opened this issue Apr 12, 2021 · 14 comments


@adita15 commented Apr 12, 2021

I am trying to reproduce these results and I am quite confused about the structure. Could you provide detailed setup instructions?

@kamalojasv181 (Owner) commented

Can you please point out which part in particular confuses you?

@kamalojasv181 (Owner) commented

Here is a generic workflow:

1. Get the dataset. We have not posted it here due to the policy of the Constraint Shared Task; register with them to obtain it and put it in the Dataset folder. The dataset must have exactly two columns: 1) the data and 2) the labels (we deleted the first row containing the column names and the first column containing a serial number for each tweet).
2. If you want to train the models yourself, make a directory named models and run main_multitask_learning.py or main_bin_classification.py. If you wish to use our models instead, download them into that models folder.
3. You can then write your own script to generate results or use ours, at your convenience.

For anything specific, feel free to ask.
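A minimal sketch of the preprocessing described in step 1, assuming pandas; the raw file name and the column positions are placeholders, since the file you receive from the shared task may be laid out differently.

```python
import pandas as pd

# Load the raw file as distributed by the Constraint Shared Task organisers
# ("raw_constraint_data.csv" is a placeholder name).
raw = pd.read_csv("raw_constraint_data.csv")

# Keep only the tweet text and the labels, assuming the first column holds
# the serial numbers, the second the text, and the third the labels.
prepared = raw.iloc[:, 1:3]

# Write the file without the header row and without an index so the result
# has exactly two columns: the data and the labels.
prepared.to_csv("Dataset/train.csv", header=False, index=False)
```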

@adita15 (Author) commented Apr 13, 2021 via email

@kamalojasv181 (Owner) commented

  1. Nope. For training, use the training data; for validation, use the validation data; and generate the CSV on the test data.
  2. Use 10 epochs for all models (see the sketch after this list).
  3. Pass the one you want to generate results for.
  4. The baseline model is the one described in the workshop organisers' paper (https://arxiv.org/abs/2011.03588). For the auxiliary approach, use the file main_multitask_learning.py. I can see why this might confuse someone; we were naive about the code. For now, use this info, and I will update the repo in a day or two.
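For concreteness, here is a hedged sketch of what points 2 and 3 boil down to, using Hugging Face's Trainer purely as an illustration; the repository's own scripts (main_multitask_learning.py / main_bin_classification.py) may wire this up differently, and the tiny in-memory datasets are stand-ins for the real train and validation CSVs.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "ai4bharat/indic-bert"   # pass whichever model you want results for
tokenizer = AutoTokenizer.from_pretrained(model_name)

def encode(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

# Toy stand-ins for the real train/validation splits read from the CSVs.
train_dataset = Dataset.from_dict({"text": ["sample tweet"], "label": [0]}).map(encode, batched=True)
valid_dataset = Dataset.from_dict({"text": ["another tweet"], "label": [1]}).map(encode, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="models",   # checkpoints land in the models/ directory
    num_train_epochs=10,   # 10 epochs for all models, as stated above
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_dataset, eval_dataset=valid_dataset)
trainer.train()      # train on the training split
trainer.evaluate()   # validate on the validation split
```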

@adita15 (Author) commented Apr 13, 2021 via email

@kamalojasv181 (Owner) commented

  1. Yes, our best results were obtained with ai4bharat/indic-bert using the auxiliary approach.

  2. The binaries we released are already fine-tuned on the workshop dataset. You can either fine-tune the original ai4bharat/indic-bert model on the same dataset and reproduce the models that we have released, or just use our released models to directly generate results on the test set (a minimal inference sketch follows below). There is no point fine-tuning our model on the same dataset again.
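A minimal inference sketch for the second option, assuming the released checkpoint has been downloaded into models/ in a Hugging Face-compatible format; the checkpoint path and the test-file name are placeholders, not the repository's actual naming.

```python
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "models/indic_bert_aux"   # hypothetical path to a released checkpoint
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()   # disable dropout so repeated runs give identical predictions

# test.csv in the same two-column (text, label) format described earlier.
test = pd.read_csv("Dataset/test.csv", header=None, names=["text", "label"])

with torch.no_grad():
    enc = tokenizer(list(test["text"]), truncation=True, padding=True,
                    max_length=128, return_tensors="pt")
    preds = model(**enc).logits.argmax(dim=-1).tolist()

pd.DataFrame({"text": test["text"], "prediction": preds}).to_csv("results.csv", index=False)
```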

@kamalojasv181 (Owner) commented

Anything else? Should I close it?

@adita15 (Author) commented Apr 13, 2021 via email

@kamalojasv181 (Owner) commented

  1. Actually, we did a very sloppy job. We combined the train and valid data (in the CSV) and passed the split parameter accordingly (see the sketch below). For now, please bear with us; I have noted this and will fix it very soon.

  2. Can you please elaborate? Are you talking about the baseline paper?
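A sketch of what "combined train and valid data plus a split parameter" amounts to in practice; the file name and the 0.2 split value are illustrative, not the repository's actual choices.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# "train_and_valid.csv" is a placeholder for the combined CSV mentioned above.
combined = pd.read_csv("Dataset/train_and_valid.csv", header=None,
                       names=["text", "label"])

# The split parameter decides how much of the combined data is held out for
# validation; 0.2 here is illustrative, not the value used in the repository.
train_df, valid_df = train_test_split(combined, test_size=0.2, random_state=42)
```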

@adita15 (Author) commented Apr 13, 2021 via email

@kamalojasv181 (Owner) commented

Ok, our bad again! We actually tried ensembling in the generate-CSV code, which did not work out for us. That is not the baseline implementation but result generation with ensembling; we forgot to delete the code. Thanks for pointing it out.
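For readers unfamiliar with the term, this is the general idea of ensembling being referred to (averaging the logits of several fine-tuned models before taking the argmax); it is a generic sketch, not the leftover code in the repository.

```python
import torch

def ensemble_predict(models, encoded_batch):
    """Average the logits of several fine-tuned models, then take the argmax."""
    with torch.no_grad():
        logits = torch.stack([model(**encoded_batch).logits for model in models])
    return logits.mean(dim=0).argmax(dim=-1)
```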

@adita15 (Author) commented Apr 13, 2021 via email

@adita15 (Author) commented Apr 14, 2021

I tried fine-tuning using your script. I am still not able to reproduce the results; the F1 scores lag by about 2 points for all tasks.

@siddjags commented Apr 14, 2021

I am also facing a similar issue. Are we supposed to train the model with batch size 16? The current version of the code uses batch_size=8. Also, the pre-trained models do not give identical results when running generate_csv.py. Could you please help me with this?
FYI, I am trying to reproduce the results for AUX Indic-BERT. Here are the results obtained after running main_multitask_learning.py to train/fine-tune the model.

[Screenshot of the resulting scores: Screen Shot 2021-04-14 at 1.20.34 AM]
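For reference, a hedged sketch of the two settings in question: the DataLoader batch size (the scripts currently use 8; the question is whether 16 was intended) and explicit seeding, which, together with calling model.eval(), is what usually makes repeated runs of a generation script give identical numbers. All names here are placeholders, not the repository's actual code.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def set_seed(seed: int = 42):
    """Seed every RNG the training/generation code might touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(42)

# Toy stand-in for the tokenized dataset the scripts actually build.
dataset = TensorDataset(torch.zeros(32, 128, dtype=torch.long))
loader = DataLoader(dataset, batch_size=16, shuffle=False)  # 16 vs. the scripts' current 8
```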
