Inconsistent device in regressor.py #4

Open
lucasresck opened this issue Dec 12, 2024 · 0 comments

Dear authors,

Thank you for releasing the fast_l1 code together with datamodels.

While running the linear regression step of datamodels, I ran into an issue with tensors not being on the same device.

After running

python -m datamodels.regression.compute_datamodels \
    -C regression_config.yaml \
    --data.data_path "$tmp_dir/reg_data.beton" \
    --cfg.out_dir "$tmp_dir/reg_results"

I would get an error similar to

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 221, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

or

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 341, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

This happens because, at lines 221 and 341 of regressor.py, CPU tensors are indexed/sliced with tensors that live on the GPU, in this case idx and still_opt_outer:

a_prev[:, :num_keep].copy_(a_table[idx, :num_keep],
                           non_blocking=True)

inds_to_swap = inds_to_swap[still_opt_outer[inds_to_swap]]
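
For reference, the device mismatch can be reproduced in isolation with a few lines of PyTorch (a minimal sketch; the tensor names here are made up and unrelated to fast_l1):

    import torch

    table = torch.zeros(10, 5)                 # indexed tensor on CPU
    idx = torch.tensor([0, 2], device='cuda')  # index tensor on GPU

    # Raises: RuntimeError: indices should be either on cpu or on the
    # same device as the indexed tensor (cpu)
    table[idx]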

idx and still_opt_outer are both on the GPU because weight and train_loader in datamodels/datamodels/regression/compute_datamodels.py are on the GPU when train_saga is called:

        regressor.train_saga(weight,
                             bias,
                             train_loader,
                             val_loader,
                             lr=lr,
                             start_lams=max_lam,
                             update_bias=(use_bias > 0),
                             lam_decay=np.exp(np.log(eps)/k),
                             num_lambdas=k,
                             early_stop_freq=early_stop_freq,
                             early_stop_eps=early_stop_eps,
                             logdir=str(log_path))
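
A possible workaround (just a sketch, not tested against the full pipeline) would be to move the index tensors back to the device of the tensors they index before slicing, along these lines:

    # Assuming a_table and inds_to_swap live on CPU while idx and
    # still_opt_outer live on the GPU, as described above:
    a_prev[:, :num_keep].copy_(a_table[idx.cpu(), :num_keep],
                               non_blocking=True)

    inds_to_swap = inds_to_swap[still_opt_outer[inds_to_swap].cpu()]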