Evaluation details of the paper. #4

Open
LuPaoPao opened this issue Dec 18, 2024 · 4 comments

Comments

@LuPaoPao

Thank you for your outstanding contribution. The paper says that the evaluation protocol follows papers [4], [41], and [66]; however, Waymo was not evaluated in those papers. I'd like to compare with your approach, but I can't find any more details. On which scenes are your methods trained and tested? Was T+5 evaluated on all frames (from the 5th to the 200th)? Looking forward to your reply with more details of the experimental setup. Thank you for your help!

@yifanlu0227
Collaborator

Hi, we use the official validation set and take exactly the 100-th frame as the input (T=100), then render the 100-th, 105-th, and 110-th frames for evaluation. We render images at the original resolution for SCube and all baseline methods.

Since not all scenes in the official validation set are static, we further apply dynamic-object masks for the 105-th/110-th frames: we project the dynamic bounding boxes from both the input frame (100-th) and the target frame (105-th/110-th) into the image to mask out possible dynamics.
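For anyone reimplementing this protocol, here is a minimal sketch of what such a dynamic-object mask could look like. This is not the repo's actual code: the function names, the pinhole camera convention, and the conservative rectangle-fill approximation of each projected box are all assumptions. It projects the dynamic 3D boxes into the image, unions the masks from the input and target frames, and computes the metric on static pixels only.

```python
import numpy as np

def boxes_to_mask(boxes_3d, K, T_world_to_cam, img_hw):
    """Rasterize projected 3D box corners into a binary image mask.

    boxes_3d: (N, 8, 3) world-space corners of dynamic bounding boxes.
    K: (3, 3) camera intrinsics; T_world_to_cam: (4, 4) extrinsics.
    """
    H, W = img_hw
    mask = np.zeros((H, W), dtype=bool)
    for corners in boxes_3d:
        pts = np.concatenate([corners, np.ones((8, 1))], axis=1)  # homogeneous
        cam = (T_world_to_cam @ pts.T).T[:, :3]
        cam = cam[cam[:, 2] > 0]  # keep corners in front of the camera
        if len(cam) == 0:
            continue
        uv = (K @ cam.T).T
        uv = uv[:, :2] / uv[:, 2:3]
        # conservative approximation: fill the 2D bounding rectangle
        u0, v0 = np.floor(uv.min(axis=0)).astype(int)
        u1, v1 = np.ceil(uv.max(axis=0)).astype(int)
        mask[max(v0, 0):min(v1, H), max(u0, 0):min(u1, W)] = True
    return mask

def masked_psnr(pred, gt, mask_input, mask_target):
    """PSNR over the union of masks from the input (T=100) and target
    (T=105/110) frames, so only static pixels are scored."""
    valid = ~(mask_input | mask_target)
    mse = np.mean((pred[valid] - gt[valid]) ** 2)
    return 10 * np.log10(1.0 / mse)  # images assumed in [0, 1]
```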

@LuPaoPao
Author

LuPaoPao commented Dec 18, 2024

Thank you for your response. One last question: do you train on the full Waymo training set? Looking forward to your reply, thank you very much!

@yifanlu0227
Collaborator

yifanlu0227 commented Dec 18, 2024

It depends on the training stage:

  • For the geometry reconstruction stage, we use both static and dynamic scenes with good voxels. Bad clips (e.g. those with little ego movement, so few voxels get accumulated) are removed from the full training set.
  • For the appearance reconstruction stage, since we need future timestamps for supervision, we restrict the training set to static scenes with good voxels only.

You can find the difference in the corresponding configuration YAML files; the filtering logic is sketched below.
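To make the stage-dependent selection concrete, here is a hedged sketch of what such filtering might look like. The metadata field names (`is_static`, `ego_travel_m`, `num_voxels`) and the thresholds are illustrative assumptions, not keys or values from the SCube configs.

```python
def select_scenes(scenes, stage, min_ego_travel_m=10.0, min_voxels=50_000):
    """Illustrative per-stage scene filtering over a list of metadata dicts."""
    selected = []
    for s in scenes:
        # "good voxels": drop clips with little ego movement, where too few
        # voxels get accumulated from the LiDAR sweeps.
        if s["ego_travel_m"] < min_ego_travel_m or s["num_voxels"] < min_voxels:
            continue
        # The appearance stage needs future frames of the same static scene
        # for supervision, so dynamic scenes are excluded there.
        if stage == "appearance" and not s["is_static"]:
            continue
        selected.append(s)
    return selected
```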

@LuPaoPao
Author

Thank you for your reply, and kudos for the huge amount of work! You solved my problem!
