Hello, thank you very much for your open-source contribution. I have a question. When I ran the second stage of instruction fine-tuning with your project, training hung with no error output whenever I passed zero3.json or zero2.json as the DeepSpeed config; it simply could not proceed. With zero3_offload.json, however, training ran normally. Could you explain the reason for this? Have you verified this path with your open-source project?
My training machine has 8x A800 (80 GB) GPUs, with deepspeed 0.12.6 and transformers 4.31.0.
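For reference, the key difference between the two configs is usually just the offload section. A minimal sketch of what a ZeRO-3 offload config typically looks like is below; the field names follow the public DeepSpeed documentation, but the exact values in this project's zero3_offload.json may differ:

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  },
  "bf16": { "enabled": "auto" },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```

With offloading disabled (plain zero3.json), optimizer state and parameters stay on the GPUs, so a hang without an OOM error can also point to a collective-communication stall rather than memory pressure.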
Hello, I also ran into a problem during fine-tuning. When fine-tuning with llava_v1_5_mix665k.json, the model raises an error on QA samples that have no image. Have you encountered the same problem, and how should it be solved?
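For context, one common workaround for mixed image/text-only data is to give text-only samples a dummy image tensor in the collator so every batch element has the same modalities. A minimal sketch is below; the function and key names are assumptions for illustration, not this project's actual code, and the dummy shape must match your vision tower's input size:

```python
import torch

def collate_mixed_batch(samples):
    # Hypothetical collator: samples is a list of dicts with
    # "input_ids" (1-D LongTensor) and an optional "image" tensor.
    # (3, 336, 336) matches CLIP ViT-L/336px, the vision tower
    # commonly used by LLaVA-1.5; adjust for your model.
    dummy_image = torch.zeros(3, 336, 336)
    images = [
        s["image"] if s.get("image") is not None else dummy_image
        for s in samples
    ]
    input_ids = torch.nn.utils.rnn.pad_sequence(
        [s["input_ids"] for s in samples],
        batch_first=True,
        padding_value=0,  # replace with your tokenizer's pad_token_id
    )
    return {"input_ids": input_ids, "images": torch.stack(images)}
```

An alternative design is to group text-only and image samples into separate batches, which avoids wasting compute on dummy images at the cost of a custom sampler.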