
Load edited GRACE and WISE for evaluation offline #469

Open
shariqahn opened this issue Jan 5, 2025 · 15 comments
Labels
question Further information is requested

Comments

@shariqahn

shariqahn commented Jan 5, 2025

Is there a way to load a saved WISE or GRACE model without creating a WISE/GRACE object? I am performing a separate evaluation of the edited model, so ideally I would be able to load the model outside of the EasyEdit repo.

I understand (from here) that there is a load_path hparam to load a WISE model, but loading the WISE object in my separate evaluation repo is difficult due to dependency issues. For other methods, I have been using save_pretrained and from_pretrained to save the edited model and load it for evaluation respectively, but I understand that GRACE and WISE have special parameters.

I tried to use torch.save and load_state_dict for GRACE, as the original paper authors did, but the edits are not showing up as I expect. I think the edited version of the model is not being saved or loaded properly.

Any help you have to offer would be immensely appreciated! And thank you for putting together such a great framework for model editing.

@zxlzr added the question label on Jan 5, 2025
@pengzju
Collaborator

pengzju commented Jan 7, 2025

It's quite challenging to load the WISE or GRACE models without creating a WISE/GRACE object, as both methods keep memory parameters in an external structure that does not inherit from HuggingFace's PreTrainedModel. I apologize, but the scenario you mentioned is not currently feasible. It would be best to separate the generation and evaluation processes within the evaluation module. At least WISE supports offline generation.
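
For illustration, here is a minimal sketch of that split (this is not an EasyEdit API: dump_generations, the prompt list, and the output file are placeholders, and edited_model is assumed to be the object returned by editor.edit()):

import json
import torch

def dump_generations(edited_model, tokenizer, prompts, out_file, device="cuda"):
    # Generate inside the EasyEdit environment and persist the raw outputs
    # so a separate repo can compute metrics on them offline.
    records = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(device)
        with torch.no_grad():
            out = edited_model.generate(**inputs, max_new_tokens=64, do_sample=False)
        records.append({"prompt": prompt,
                        "output": tokenizer.decode(out[0], skip_special_tokens=True)})
    with open(out_file, "w") as f:
        json.dump(records, f, indent=2)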

@zxlzr
Contributor

zxlzr commented Jan 7, 2025

Hi, do you have any further questions?

@shariqahn
Author

shariqahn commented Jan 7, 2025

I tried to perform the generation by creating the WISE/GRACE object, but it doesn't seem like the edits are there. Here is how I did it:

if ("WISE" in cfg.model_path):
    hparams = WISEHyperParams.from_hparams('../EasyEdit/hparams/WISE/eval.yaml')
    hparams.load_path = os.path.join(cfg.model_path, "wise.pt")
    editor = BaseEditor.from_hparams(hparams)
    model = editor.model
elif ('GRACE' in cfg.model_path):
    path = <llama path>
    model = AutoModelForCausalLM.from_pretrained(path, config=config, use_flash_attention_2=False, torch_dtype=torch.float16, trust_remote_code = True, device_map=device_map)
    checkpoint = os.path.join(cfg.model_path, "model.pt")
    state_dict = torch.load(checkpoint, map_location='cpu')
    model.load_state_dict(state_dict, False)
else:
    model = AutoModelForCausalLM.from_pretrained(cfg.model_path, config=config, use_flash_attention_2=False, torch_dtype=torch.float16, trust_remote_code = True, device_map=device_map)

Are you saying that I cannot use editor.model directly, and that my only option is to call the WISE object's generation method and compute metrics on that output?

Since we can calculate metrics on GRACE and WISE within your framework, is there a way to recreate those objects, generate outputs just like we do for the EasyEdit metrics, and calculate other metrics on that output?

@shariqahn
Author

I'm also having a similar problem with AdaLoRA where the edits aren't showing up properly; it just outputs gibberish. I am using save_pretrained and from_pretrained to save and load the model. Does the same issue apply here, where there are external parameters that are not being loaded properly?

@pengzju
Collaborator

pengzju commented Jan 9, 2025

You can add editor.load(hparams.load_path) after editor = BaseEditor.from_hparams(hparams). This should allow WISE to run normally. However, GRACE currently doesn't have a save_pt interface, and the original official implementation doesn't provide it either.

@pengzju
Collaborator

pengzju commented Jan 9, 2025

AdaLoRA might be experiencing overfitting. Have you tried reducing the learning rate? Currently, we haven't observed a large amount of garbled output when running other methods.

@shariqahn
Author

shariqahn commented Jan 9, 2025

My AdaLoRA evaluation returns identical, poor metrics for every dataset I have tried, so I thought the issue was with how the model is loaded for evaluation rather than with the method itself. Otherwise it would be strange (though possible) that models with differing metrics in your framework all produced identical metrics in my evaluation.

Am I right to be using save_pretrained and from_pretrained to save and load the model? I am not sure whether AdaLoRA has extra parameters like WISE and GRACE do.

@pengzju
Collaborator

pengzju commented Jan 10, 2025

Saving the model using save_pretrained is incorrect here because WISE adds parameter structures outside the base network. WISE has implemented an offline save function, which you can use: EasyEdit/easyeditor/models/wise/WISE.py at main · zjunlp/EasyEdit.

Additionally, I have already explained in the previous response that GRACE does not have an offline caching interface. You can implement offline model saving by following the example of WISE's save function.
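
As a purely hypothetical sketch of what a GRACE equivalent could look like (the adapter attribute names keys, values, and epsilons below are assumptions, not EasyEdit's actual fields; check easyeditor/models/grace/ for the real names):

import torch

def save_grace_memory(grace_model, layer_name, path):
    # Persist GRACE's external codebook separately from the backbone,
    # mirroring the spirit of WISE's save function.
    adapter = eval(f"grace_model.model.{layer_name}")  # the wrapped layer
    torch.save({"keys": adapter.keys,          # cached activation keys (assumed name)
                "values": adapter.values,      # learned value vectors (assumed name)
                "epsilons": adapter.epsilons}, # deferral radii (assumed name)
               path)

def load_grace_memory(grace_model, layer_name, path):
    # Rebuild the codebook on a freshly wrapped model before evaluating.
    state = torch.load(path, map_location="cpu")
    adapter = eval(f"grace_model.model.{layer_name}")
    adapter.keys = state["keys"]
    adapter.values = state["values"]
    adapter.epsilons = state["epsilons"]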

@pengzju
Collaborator

pengzju commented Jan 10, 2025

We have conducted numerous experiments with AdaLoRA, and the evaluation metrics are completely consistent with other methods. The peft_model is returned in the edit interface (code), and both saving and loading can be achieved through Hugging Face's default interfaces (save_pretrained and from_pretrained).
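
One caveat worth checking on your side: save_pretrained on a PEFT-wrapped model typically writes only the adapter weights, so reloading with a plain AutoModelForCausalLM.from_pretrained would silently drop the edit. A minimal sketch of the round trip (paths are placeholders):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Save: on a PeftModel this writes the adapter weights and adapter config.
edited_model.save_pretrained("./adalora_ckpt")

# Load: rebuild the base model, then reattach the saved adapter.
base = AutoModelForCausalLM.from_pretrained("<llama path>")
model = PeftModel.from_pretrained(base, "./adalora_ckpt")
model = model.merge_and_unload()  # optional; check that your peft version supports merging AdaLoRA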

@zxlzr
Contributor

zxlzr commented Jan 13, 2025

Hi, do you have any further questions?

@shariqahn
Author

shariqahn commented Jan 13, 2025

Yes, I was referring to AdaLoRA when asking about using save_pretrained. I understand that WISE and GRACE require a special implementation. For AdaLoRA, I am running

metrics, edited_model, _ = editor.edit(
    ...
)

edited_model.save_pretrained(model_save_dir)

but the results look strange: several experiments that produced different metrics in your framework all got identical metrics in my separate evaluation on a different task. So I just wanted to confirm that my approach was correct. Thank you for clarifying that!

For GRACE, I saw that the original code saves the model this way, which is why I used the implementation I mentioned earlier. I saved with:

checkpoint = os.path.join(model_save_dir, "model.pt")
torch.save(edited_model.model.state_dict(), checkpoint)

and loaded:

model = AutoModelForCausalLM.from_pretrained(path, config=config,
                                             use_flash_attention_2=False,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True,
                                             device_map=device_map)
checkpoint = os.path.join(cfg.model_path, "model.pt")
state_dict = torch.load(checkpoint, map_location='cpu')
model.load_state_dict(state_dict, strict=False)

However, the model didn't seem to load correctly because the loaded model doesn't seem edited. I will look into additional parameters that are necessary to save for GRACE.

For WISE, I made a slight change to your suggestion here

You can add editor.load(hparams.load_path) after editor = BaseEditor.from_hparams(hparams). This should allow WISE to run normally. However, GRACE currently doesn't have a save_pt interface, and the original official implementation doesn't provide it either.

since the editor object doesn't have a load method. I loaded a WISE object instead using:

editor = BaseEditor.from_hparams(hparams)
model = WISE(model=editor.model, config=hparams, device=editor.model.device)
model.load(hparams.load_path)

The results still don't seem to include the edits, but there was a slight change this time. I will try to reproduce the metrics reported for the original saved model on the loaded model to verify.

If I misunderstood something in your suggestions, please let me know. Otherwise, I will investigate further.

@pengzju
Collaborator

pengzju commented Jan 18, 2025

I also think that your use of LoRA is correct.

For GRACE, I believe that your offline saving method is incorrect. The code here only saves the state_dict, which preserves just the backbone parameters. The keys and values of GRACE are not part of the model architecture; they are external memory and need to be saved separately. You can refer to WISE's saving method, but since GRACE is not my work, you will need to explore this part yourself.
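
One quick diagnostic, since load_state_dict(strict=False) silently ignores mismatches but returns them:

# load_state_dict returns (missing_keys, unexpected_keys) when strict=False;
# if the GRACE memory had been in the checkpoint, it would show up here.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing[:10])
print("unexpected keys:", unexpected[:10])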

For WISE, as you said, “there was a slight change this time,” which suggests that WISE's saving is working. As for the other benchmarks, if the metrics/output do not change significantly, that might be normal. Perhaps this is a flaw of WISE that you have discovered!

@shariqahn
Author

Thank you for the clarification! I will investigate further and see if I can resolve this.

@shariqahn
Author

shariqahn commented Jan 22, 2025

For WISE, I am seeing a difference in the logits but not in the generated outputs. I see a TODO: generation comment above the WISE generate method. Is this not implemented yet? Or is there something additional I need to do to handle generation with WISE? I am running:

out = model.generate(inputs.input_ids,
                     attention_mask=inputs.attention_mask,
                     max_length=cfg.generation.max_length,
                     max_new_tokens=cfg.generation.max_new_tokens,
                     do_sample=False,
                     use_cache=True,
                     pad_token_id=left_pad_tokenizer.eos_token_id)

@pengzju
Collaborator

pengzju commented Jan 23, 2025

def generate(self, *args, **kwargs):
    # Reset the adapter's key_id before delegating to the underlying
    # model's generate.
    setattr(eval(f"self.model.{self.layer}"), "key_id", -1)
    return self.model.generate(*args, **kwargs)

"generate" might be problematic because the token generation process is not necessarily classified as "side memory." Additionally, "wise" can currently only operate under the condition of teacher forcing.

sry, I have no way to fix this issue.
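
If teacher forcing fits your metrics, one workaround is to score gold continuations instead of free-running generation. A minimal sketch (this assumes the edited model's forward returns HuggingFace-style outputs with .logits; the WISE wrapper may differ):

import torch
import torch.nn.functional as F

def teacher_forced_logprob(model, tokenizer, prompt, target, device="cuda"):
    # Total log-probability of the gold continuation under the model,
    # computed in a single teacher-forced forward pass.
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    target_ids = tokenizer(target, add_special_tokens=False,
                           return_tensors="pt").input_ids.to(device)
    input_ids = torch.cat([prompt_ids, target_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # logits at position i predict token i+1, so slice the rows that
    # predict each target token.
    tgt_logits = logits[0, prompt_ids.size(-1) - 1 : -1]
    logps = F.log_softmax(tgt_logits, dim=-1)
    return logps.gather(1, target_ids[0].unsqueeze(-1)).sum().item()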
