Questions about correct order and paths when running the full pipeline #2
Hi @baurst, thank you for your interest! Here are some step-by-step suggestions to check the reproduction:
I've uploaded our processed data to: https://www.dropbox.com/s/vpibaeu1yx1kpeg/kittisf.zip?dl=0.
Here are our outputs:
This step is also deterministic and should produce exactly the same results.
I've uploaded our downsampled data to: https://www.dropbox.com/s/r2lq98afy61u6de/kittisf_downsampled.zip?dl=0.
Here are our outputs for the OA-ICP algorithm:
Besides, you can also take a look at your segmentation results after round 1:
In this step, due to the non-deterministic PyTorch operators used in training, you may not reproduce exactly the same values even with a fixed random seed. However, the segmentation and scene flow improvement results should be close to ours. The scene flow improvement (Object-Aware ICP flow) results on the training split are especially important, as the segmentation in round 2 depends on them. (By the way, the "Original flow" in the outputs of OA-ICP should be exactly the same as ours; otherwise the previous downsampling step is not correct.)
If you have reproduced the same results in steps 1-3 and similar results in step 4, your results in this step should be close to those reported in Table 3 of the paper. Once the training on KITTI-SF is fixed, testing on KITTI-Det and SemanticKITTI is also expected to work fine. Hope this helps!
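The distinction made above between deterministic steps (which should match exactly) and training-dependent steps (which should only be close) can be turned into a small check. This is a minimal sketch, not part of the repository: the metric names, the placeholder numbers, and the 5% relative tolerance are all assumptions chosen for illustration.

```python
import math


def compare_metrics(ours, reference, deterministic, rel_tol=0.05):
    """Return the names of metrics that differ more than allowed.

    Deterministic steps must match exactly; training-dependent steps only
    need to agree within rel_tol (an assumed, adjustable threshold).
    """
    mismatched = []
    for name, ref_value in reference.items():
        value = ours.get(name)
        if value is None:
            # Metric missing from our run counts as a mismatch.
            mismatched.append(name)
        elif deterministic:
            if value != ref_value:
                mismatched.append(name)
        elif not math.isclose(value, ref_value, rel_tol=rel_tol):
            mismatched.append(name)
    return mismatched


# Placeholder values for illustration only, not results from the paper.
reference = {"flow_error": 0.100, "flow_accuracy": 0.800}
reproduced = {"flow_error": 0.102, "flow_accuracy": 0.790}

strict_mismatches = compare_metrics(reproduced, reference, deterministic=True)
loose_mismatches = compare_metrics(reproduced, reference, deterministic=False)
print("exact comparison mismatches:", strict_mismatches)
print("tolerant comparison mismatches:", loose_mismatches)
```

Using the exact comparison for the preprocessing and downsampling outputs and the tolerant one for anything produced after training mirrors the expectations stated above.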
Thank you so much for taking the time to investigate this and for uploading your results. The resulting data and metrics up to and including step 2 are pretty much identical to the output that I am getting. For step 3 I got different results from you (compared I am now rerunning the pipeline and will report back with new results, but I am reasonably confident that this could have been the issue.
Hi, thank you very much for all your help, it is very much appreciated! After retraining the whole thing, I got the following results:
Round 1:
SF - Train:
SF - Val:
Segmentation - Val:
So this all looks good; for Round 1 the segmentation result is even better than the one you reported in the post above.
Round 2:
Here it gets a bit weird to me:
SF - Train:
Is it expected that the original flow is much better than the Weighted Kabsch flow and Object-Aware ICP flow? I think this contradicts your statement:
SF - Val:
Here the difference is not that big.
Segmentation Train:
Results after training using
That looks very good!
Segmentation Val:
This is 10% less than what you reported in the paper, indicating I must have made a mistake somewhere. Thanks again for your help! Just to be sure, I ran the experiment pipeline like this. Am I missing something critical?
Hi @baurst, thanks for your feedback!
1. Experiment pipeline
In your reported results, the SF-train and SF-val of Round 2 are not needed. In round 1, we train the segmentation and improve the scene flow; in round 2, we only train the segmentation (with the improved flow) and report it as the final segmentation result.
You are expected to get only two trained models: In your pipeline, what confuses me is
The
2. Testing in Round 2
If you have fixed the experiment pipeline, there might be another reason for the reproduction failure. As you can see, in testing we load the model with the lowest validation loss: Lines 80 to 83 in 3afbf55
However, this may not select the model with the best performance. I have occasionally run into such cases before, as shown below: A quick solution is to load the model from the final epoch: Lines 76 to 79 in 3afbf55
Or you can save models from different epochs during training and select among them: Lines 212 to 217 in 3afbf55
The training log on TensorBoard can also help you debug. I've reproduced the experiment with different random seeds, and we always get a ~50 F1 score and a ~40 PQ score:
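The three checkpoint-selection options discussed above (lowest validation loss, final epoch, best among several saved epochs) can be sketched as follows. This is only an illustration: the `Checkpoint` record, the `select_checkpoint` helper, the strategy names, and the numbers are hypothetical and do not correspond to the repository's actual code or to measured results.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    epoch: int
    val_loss: float
    val_f1: float  # hypothetical validation metric recorded per saved epoch


def select_checkpoint(checkpoints, strategy):
    """Pick one checkpoint according to the named strategy."""
    if not checkpoints:
        raise ValueError("no checkpoints to select from")
    if strategy == "lowest_val_loss":
        return min(checkpoints, key=lambda c: c.val_loss)
    if strategy == "final_epoch":
        return max(checkpoints, key=lambda c: c.epoch)
    if strategy == "best_val_metric":
        return max(checkpoints, key=lambda c: c.val_f1)
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up training history in which the lowest loss and the best metric disagree.
history = [
    Checkpoint(epoch=10, val_loss=0.42, val_f1=0.41),
    Checkpoint(epoch=20, val_loss=0.35, val_f1=0.44),
    Checkpoint(epoch=30, val_loss=0.37, val_f1=0.50),
]

for strategy in ("lowest_val_loss", "final_epoch", "best_val_metric"):
    chosen = select_checkpoint(history, strategy)
    print(f"{strategy}: epoch {chosen.epoch}")
```

In this made-up history the lowest validation loss occurs at an earlier epoch than the best metric, which is the kind of mismatch described above.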
Thank you very much for your help and detailed explanation! I will delete the intermediate results and try again. :) I did not know that the experiment has TensorBoard support! I could never find any TensorBoard logs, so I assumed that no TensorBoard logging was active. It turns out the summaries were not written because the log_dir did not exist on my machine, so no TensorBoard files could be written. I created PR #3, which creates the log_dir prior to running so that others can have the TensorBoard logs as well.
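The general idea of creating the log directory before anything writes to it can be sketched with the standard library. This is not the content of PR #3; the `ensure_log_dir` helper and the directory names are hypothetical, and a temporary directory stands in for the real save path.

```python
import tempfile
from pathlib import Path


def ensure_log_dir(log_dir):
    """Create log_dir (and any missing parents) if it does not exist yet."""
    path = Path(log_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrate on a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    log_dir = ensure_log_dir(Path(root) / "experiment" / "log")
    created = log_dir.is_dir()
    # Calling it again is harmless because of exist_ok=True.
    ensure_log_dir(log_dir)

print("log directory created:", created)
```

Calling such a helper once before the summary writer is constructed avoids depending on the directory having been created by an earlier step.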
Hi,
thank you for publishing the code to your very interesting paper!
Could you please take a look at the steps I took to try to reproduce the results in the paper? Clearly I must be doing something wrong, but I cannot figure it out, since there are a lot of steps involved. Thank you very much in advance for taking a look. Your help is very much appreciated!
Here is how I adapted the experiment (mainly data and save paths) to my machine:
config/flow/kittisf/kittisf_unsup.yaml
config/seg/kittidet/kittisf_unsup.yaml
config/seg/kittisf/kittisf_sup.yaml
config/seg/kittisf/kittisf_unsup.yaml
config/seg/kittisf/kittisf_unsup_woinv.yaml
config/seg/semantickitti/kittisf_unsup.yaml
After this, I did the following steps:
For the last command I am getting:
AveragePrecision@50: 0.3241964006222572
PanopticQuality@50: 0.2567730165763252 F1-score@50: 0.35737439222042144 Prec@50: 0.26614363307181654 Recall@50: 0.5437731196054254
{'per_scan_iou_avg': 0.5634193836152553, 'per_scan_iou_std': 0.020407961700111627, 'per_scan_ri_avg': 0.6674587628245354, 'per_scan_ri_std': 0.00429959088563919}
I am getting:
AveragePrecision@50: 0.13945170257439435
PanopticQuality@50: 0.1318724309223011 F1-score@50: 0.19702186647587533 Prec@50: 0.13796774698606545 Recall@50: 0.3444609491048393
{'per_scan_iou_avg': 0.45250289306404357, 'per_scan_iou_std': 0.0, 'per_scan_ri_avg': 0.4861106249785733, 'per_scan_ri_std': 0.0}
AveragePrecision@50: 0.10315215577576131
PanopticQuality@50: 0.0989709766834506 F1-score@50: 0.15591615175838772 Prec@50: 0.10372148859543817 Recall@50: 0.3138531283601174
{'per_scan_iou_avg': 0.4351089967498311, 'per_scan_iou_std': 0.0, 'per_scan_ri_avg': 0.4129963953279687, 'per_scan_ri_std': 0.0}
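As a quick sanity check on the numbers above, the reported F1-score@50 values can be recomputed from the reported Prec@50 and Recall@50 using the standard definition of F1 as the harmonic mean of precision and recall. This sketch assumes the evaluation script uses that standard definition; the helper name is hypothetical and the values are copied from the three outputs above.

```python
import math


def f1_from_precision_recall(precision, recall):
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Prec@50, Recall@50, reported F1-score@50) copied from the outputs above.
reported = [
    (0.26614363307181654, 0.5437731196054254, 0.35737439222042144),
    (0.13796774698606545, 0.3444609491048393, 0.19702186647587533),
    (0.10372148859543817, 0.3138531283601174, 0.15591615175838772),
]

consistent = [
    math.isclose(f1_from_precision_recall(p, r), f1, abs_tol=1e-6)
    for p, r, f1 in reported
]
print("F1 consistent with precision/recall:", consistent)
```

If the recomputed and reported values agree, the printed metrics are at least internally consistent, so the gap to the paper more likely comes from the trained model than from the metric printout.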
Am I doing something fundamentally wrong? Thanks again for taking a look!