
Gemma Model Storing and Loading after Fine tuning #67

Open

Danish202Gupta

Hi there, I encountered a strange bug when trying to load the gemma-2b model using KerasNLP.

My fine-tuning code is the following:

```python
def fine_tune(self, X, y):
    data = generate_training_prompts(X, y)

    # Enable LoRA fine-tuning
    self.model.backbone.enable_lora(rank=self.config['lora_rank'])

    # Reduce the input sequence length to limit memory usage
    self.model.preprocessor.sequence_length = self.config['tokenization_max_length']

    # Use AdamW (a common optimizer for transformer models)
    optimizer = keras.optimizers.AdamW(
        learning_rate=self.config['learning_rate'],
        weight_decay=self.config['weight_decay'],
    )

    # Exclude layernorm and bias terms from decay
    optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

    self.model.compile(
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=optimizer,
        weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
        sampler=self.config['sampler'],
    )

    self.model.fit(data, epochs=self.config['epochs'], batch_size=self.config['batch_size'])

    # Define the directory name
    fine_tuned_dir_name = f'fine_tuned_{self.config["basemodel"]}_{datetime.now().strftime("%Y%m%d_%H%M%S")}'
    fine_tuned_dir_path = os.path.join('models', fine_tuned_dir_name)

    # Create the directory if it doesn't exist
    os.makedirs(fine_tuned_dir_path, exist_ok=True)

    # Save the model in the directory under a specific name
    # (note: despite the filename, .save() writes the full model, not just weights)
    weights_file_path = os.path.join(fine_tuned_dir_path, 'weights.keras')
    self.model.save(weights_file_path)

    # Save model configuration within the same directory
    model_config = create_model_config(self.config, np.unique(y).tolist())
    config_filename = os.path.join(fine_tuned_dir_path, 'model_config.json')
    with open(config_filename, 'w') as json_file:
        json.dump(model_config, json_file, indent=4)

    # Push model weights and config to wandb
    # Note: you may need to adjust this depending on how wandb expects files to be saved
    wandb.save(os.path.join(fine_tuned_dir_path, '*'))
```

The training completes as expected in Keras. However, when I try to load the model using the weights.keras file created by the script above, I get two unexpected behaviors; see the loading script below:

```python
import keras

loaded_model = keras.saving.load_model(
    "/data/host-category-classification/nlp/classification/Gemma/models"
    "/fine_tuned_gemma-2b_20240229_151158/weights.keras"
)

print(loaded_model.summary())
```

First, I observed that each call to the loading process generates an unknown set of files that occupy roughly 10 GB of disk space indefinitely. In addition, loading takes far longer than the gemma.load_preset method (I haven't timed it exactly, but it should not take more than 10 minutes). Do you have any suggestions? There seems to be no documentation in either KerasNLP or TensorFlow on model storage and loading for Gemma-related models.
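One workaround worth trying (a sketch under assumptions, not a verified fix for the Gemma classes): save only the weights with `save_weights`, then rebuild the architecture in code and restore them with `load_weights`, so loading never has to deserialize a full `.keras` archive. With plain Keras and a hypothetical toy architecture the pattern looks like:

```python
import os
import tempfile

import numpy as np
import keras

# Hypothetical toy architecture; for Gemma the equivalent rebuild step would
# presumably be keras_nlp.models.GemmaCausalLM.from_preset(...) followed by
# backbone.enable_lora(...) before restoring the fine-tuned weights.
def build_model():
    return keras.Sequential([
        keras.layers.Input(shape=(4,)),
        keras.layers.Dense(2),
    ])

model = build_model()

# Keras 3 requires weights-only files to end in ".weights.h5"
weights_path = os.path.join(tempfile.mkdtemp(), "model.weights.h5")
model.save_weights(weights_path)  # weights only; no architecture or optimizer state

restored = build_model()          # rebuild an identical architecture in code
restored.load_weights(weights_path)

x = np.ones((1, 4), dtype="float32")
assert np.allclose(model.predict(x, verbose=0), restored.predict(x, verbose=0))
```

The weights file stays small relative to a full archive because it contains only variable values; whether this also avoids the temporary-file accumulation seen with `load_model` on the Gemma checkpoint is an assumption to be tested.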

@github-actions github-actions bot added the Gemma label Mar 6, 2024