Load edited GRACE and WISE for evaluation offline #469
Is there a way to load a saved WISE or GRACE model without creating a WISE/GRACE object? I am performing a separate evaluation of the edited model, so ideally I would be able to load it outside of the EasyEdit repo.

I understand (from here) that there is a `load_path` hparam for loading a WISE model, but loading the WISE object in my separate evaluation repo is difficult due to dependency issues. For the other methods, I have been using `save_pretrained` and `from_pretrained` to save the edited model and load it for evaluation, but I understand that GRACE and WISE have special parameters.

I tried to use `torch.save` and `load_state_dict` for GRACE as the original paper's authors did, but the edits are not showing up as I expect. I think the edited version of the model is not being saved/loaded properly.

Any help you have to offer would be immensely appreciated! And thank you for putting together such a great framework for model editing.

Comments
It's quite challenging to load the WISE or GRACE models without creating a WISE/GRACE object, as both models keep memory parameters in an external structure that does not inherit from HuggingFace's `PreTrainedModel`, so `save_pretrained`/`from_pretrained` cannot capture them.
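Schematically (all class and attribute names here are illustrative, not EasyEdit's actual code), the problem looks like this:

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

class SideMemoryWrapper(nn.Module):       # a plain nn.Module, not a PreTrainedModel
    def __init__(self, base):
        super().__init__()
        self.model = base                        # HF model: save_pretrained covers this
        self.side_memory = nn.Linear(768, 768)   # edit weights live outside the HF model

base = AutoModelForCausalLM.from_pretrained("gpt2")
wrapped = SideMemoryWrapper(base)
wrapped.model.save_pretrained("out/")  # persists only the base weights;
                                       # wrapped.side_memory is silently dropped
```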
Hi, do you have any further questions?
I tried to perform the generation by creating the WISE/GRACE object, but it doesn't seem like the edits are there. Here is how I did it:
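A reconstructed sketch of what I ran (the hparams path, prompts, and targets are placeholders, not my exact script):

```python
from easyeditor import BaseEditor, WISEHyperParams
from transformers import AutoTokenizer

hparams = WISEHyperParams.from_hparams("./hparams/WISE/llama-7b.yaml")  # placeholder path
editor = BaseEditor.from_hparams(hparams)
metrics, edited_model, _ = editor.edit(
    prompts=["Who designed the Eiffel Tower?"],  # placeholder edit
    target_new=["Gustave Eiffel"],
    # plus whatever method-specific arguments the config needs
)

# then generate directly from the editor's model:
tok = AutoTokenizer.from_pretrained(hparams.model_name)
inputs = tok("Who designed the Eiffel Tower?", return_tensors="pt")
output_ids = editor.model.generate(**inputs, max_new_tokens=20)
print(tok.decode(output_ids[0]))
```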
Are you saying that I cannot use `editor.model` directly, and can only use the WISE object's generation method and compute metrics from that output? Since your framework can calculate metrics on GRACE and WISE, is there a way to recreate those objects, generate outputs just as the EasyEdit metrics do, and compute other metrics on that output?
I'm also having a similar problem with AdaLoRA: the edits aren't showing up properly, and the model just outputs gibberish. I am using `save_pretrained` and `from_pretrained` to save and load the model. Does the same issue apply here, where there are external parameters that are not being loaded properly?
You can add a call to WISE's save function after editing to dump the side-memory weights, and then load them back before evaluation.
AdaLoRA might be experiencing overfitting. Have you tried reducing the learning rate? Currently, we haven't observed large amounts of garbled output when running other methods.
My evaluation of AdaLoRA returns identical (poor) metrics across all the datasets I have tried, so I suspected an issue with loading the model for evaluation rather than with the method itself. Otherwise, it would be strange (though possible) that models with differing metrics in your framework produce identical metrics in my evaluation. Am I right to be using `save_pretrained` and `from_pretrained` to save and load the model? I am not sure whether this method has extra parameters like WISE and GRACE.
Saving the model with `save_pretrained` should be fine for AdaLoRA. Additionally, as I explained in the previous response, GRACE does not have an offline caching interface; you can implement offline model saving by following the example of WISE's save function.
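As a rough shape (the checkpoint keys and the `codebook_state` attribute below are hypothetical, not actual EasyEdit attributes; `edited_model` is the GRACE-wrapped model returned by editing):

```python
import torch

# hypothetical offline save for GRACE, mirroring the spirit of WISE's save
ckpt = {
    "base_state": edited_model.model.state_dict(),  # underlying HF weights
    "grace_state": edited_model.codebook_state,     # hypothetical: GRACE keys/values/deferral radii
}
torch.save(ckpt, "grace_offline.pt")

# matching load in the evaluation repo:
ckpt = torch.load("grace_offline.pt")
model.load_state_dict(ckpt["base_state"])
# then re-attach ckpt["grace_state"] to a rebuilt GRACE wrapper
```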
We have conducted numerous experiments with AdaLoRA, and its evaluation metrics are completely consistent with the other methods.
Hi, do you have any further questions?
Yes, I was referring to AdaLoRA when asking about `save_pretrained`. I understand that WISE and GRACE require a special implementation. For AdaLoRA, I am running:
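Something like the following (a sketch; the paths and the `tokenizer` variable are placeholders, not my exact script):

```python
# save the edited model and tokenizer with the standard HF API
edited_model.save_pretrained("./adalora_edited")
tokenizer.save_pretrained("./adalora_edited")

# ...and in the separate evaluation repo:
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./adalora_edited")
tokenizer = AutoTokenizer.from_pretrained("./adalora_edited")
```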
But the results look strange: several experiments that produced different metrics in your framework all came out with identical metrics in my separate evaluation on a different task, so I wanted to confirm my approach was correct. Thank you for clarifying that! For GRACE, I saw that the original code saves the model this way, which is why I used the implementation I mentioned earlier, where I saved:
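Roughly (a sketch; the attribute layout of the GRACE wrapper is my assumption):

```python
import torch

# save the full state_dict of the edited model, as in the original GRACE repo
torch.save(edited_model.model.state_dict(), "grace_edited.pt")
```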
and loaded:
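Again a sketch (`model_name` stands for the base checkpoint I edited):

```python
import torch
from transformers import AutoModelForCausalLM

# reload the same base checkpoint, then overwrite its weights
model = AutoModelForCausalLM.from_pretrained(model_name)
model.load_state_dict(torch.load("grace_edited.pt"))
```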
However, the model doesn't seem to load correctly: the loaded model doesn't appear to be edited. I will look into which additional parameters need to be saved for GRACE. For WISE, I made a slight change to your suggestion here, since the editor object doesn't have a load method; I loaded a WISE object instead using:
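Approximately this (the import path and constructor signature are my best guess, not verified against the repo):

```python
from easyeditor.models.wise import WISE  # assumed module path

# build the wrapper around the base model, then restore the saved side memory
wise_model = WISE(hparams, base_model, device)  # assumed constructor
wise_model.load(hparams.load_path)              # load the saved side-memory weights
```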
The results still don't seem to reflect the edits, but there was a slight change this time. I will try to reproduce the metrics from the originally saved model on the loaded model to verify. If I misunderstood any of your suggestions, please let me know; otherwise, I will investigate further.
I also think that your use of LoRA is correct. For GRACE, I believe your offline saving method is incorrect: the code there only saves the base model's `state_dict`, not GRACE's external codebook parameters. For WISE, as you said, "there was a slight change this time," which suggests that the saving of WISE should be OK. As for the other benchmarks, if the metrics/output do not change significantly, that might be normal. Perhaps this is a flaw of WISE that you have discovered!
Thank you for the clarification! I will investigate further and see if I can resolve this. |
For WISE, I am seeing a difference in the logits, but not in the generated outputs, between the originally edited model and the reloaded one.
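Roughly how I am comparing the two (names are placeholders):

```python
import torch

with torch.no_grad():
    # teacher-forced logits DO differ between the saved and reloaded model...
    logits = loaded_wise_model(**inputs).logits
    # ...but free-running generation is unchanged (looks unedited)
    output_ids = loaded_wise_model.generate(**inputs, max_new_tokens=30)
```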
"generate" might be problematic because the token generation process is not necessarily classified as "side memory." Additionally, "wise" can currently only operate under the condition of teacher forcing. sry, I have no way to fix this issue. |