Hello
I work on self-supervised learning for depth estimation. The only difference from supervised learning is that the loss is more complicated to compute: instead of comparing against ground truth, I back-project the image pixels into 3D space and then project them into the other camera. I tried MobileStereoNet and it works fine, but RAFT-Stereo learns nothing when training starts from scratch.
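To make the back-projection and projection step concrete, here is a minimal sketch for a single pixel. It assumes a rectified pair with shared pinhole intrinsics and a purely horizontal baseline; the function names and the numeric values of `fx`, `fy`, `cx`, `cy` and `baseline` are hypothetical and chosen only for illustration, not taken from my training code.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Project a 3D point in the camera frame back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def reproject_left_to_right(u, v, depth, baseline, fx, fy, cx, cy):
    """Map a left-image pixel to the right image of a rectified pair.

    Assumes both cameras share intrinsics and the right camera sits
    `baseline` metres along +x of the left camera, so a point expressed
    in the right frame has its x coordinate reduced by `baseline`.
    """
    x, y, z = backproject(u, v, depth, fx, fy, cx, cy)
    return project((x - baseline, y, z), fx, fy, cx, cy)


# Hypothetical intrinsics and baseline, chosen only for illustration.
fx = fy = 500.0
cx, cy = 320.0, 240.0
baseline = 0.1

u_r, v_r = reproject_left_to_right(400.0, 260.0, 5.0, baseline, fx, fy, cx, cy)

# For a rectified pair the horizontal shift is the disparity fx * baseline / depth.
disparity = 400.0 - u_r
```

The self-supervised loss then compares the left-image value at `(u, v)` with the right image sampled at `(u_r, v_r)`, so the predicted depth (or disparity) is the only quantity being optimised.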
Not sure whether you have read my earlier message. I found that fine-tuning needs a much lower learning rate than the one you proposed; with that I get qualitatively reasonable results. The question now is how to get training from scratch to work.
One difference from RAFT-Stereo is that my training loss is not a sum but an average over the "error map". Is that a problem?
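To show what the sum-versus-average choice changes, here is a small sketch using an L1 error map on made-up numbers. The helper names and the sample values are hypothetical. Averaging divides every gradient by the number of pixels, so with plain SGD it would behave like a learning rate that is smaller by that factor; optimizers that normalise gradient magnitude (Adam-style) should be much less sensitive to it.

```python
def l1_error_map(pred, target):
    """Per-pixel absolute error between prediction and target."""
    return [abs(p - t) for p, t in zip(pred, target)]


def grad_sum_loss(pred, target):
    """Gradient of sum(|pred - target|) with respect to each prediction."""
    return [(p > t) - (p < t) for p, t in zip(pred, target)]


def grad_mean_loss(pred, target):
    """Gradient of mean(|pred - target|): the sum gradient divided by N."""
    n = len(pred)
    return [g / n for g in grad_sum_loss(pred, target)]


# Made-up values standing in for a flattened error map.
pred = [1.0, 3.0, 2.5, 0.0]
target = [2.0, 1.0, 2.0, 4.0]

errors = l1_error_map(pred, target)
loss_sum = sum(errors)
loss_mean = loss_sum / len(errors)

g_sum = grad_sum_loss(pred, target)
g_mean = grad_mean_loss(pred, target)
```

Both losses point in the same direction; only the gradient scale differs, by a factor of `len(pred)`.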