Issue with Incomplete Evaluation on re10k Dataset and Lower-than-Expected Results #52
Comments
Hi, the number of test samples (6,474) is correct, and this is aligned with previous methods (i.e., some samples are skipped). Could you double-check whether the pre-trained model is loaded correctly, since there is an additional …
Thank you for your response! I have double-checked, and it doesn't seem to be an issue with the pre-trained model failing to load. The PSNR results on the RE10K dataset are only about 1 point lower than those reported in the paper, which suggests the model is running as expected. Moreover, the results on the DL3DV dataset are consistent with those in the paper, further indicating that the weights are loaded correctly. Let me know if you have any other thoughts!
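For reference, one way to sanity-check the checkpoint (a sketch only; the `state_dict` key layout is my assumption about how the released .pth file is saved) is to inspect its keys and shapes directly:

```python
import torch

# Load the released checkpoint on CPU and inspect its contents.
# Assumption: the .pth file is either a raw state dict or a dict
# wrapping one under a "state_dict" key; adjust if the layout differs.
ckpt = torch.load(
    "/pretrained/depthsplat-gs-large-re10k-256x256-288d9b26.pth",
    map_location="cpu",
)
state_dict = ckpt.get("state_dict", ckpt)

print(f"{len(state_dict)} parameter tensors in checkpoint")
# Print a few names/shapes to compare against the large-model config
# (vitl monodepth backbone, 128-channel feature upsampler, etc.).
for name in list(state_dict)[:8]:
    print(name, tuple(state_dict[name].shape))
```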
Hi, could you reproduce the results with the small and base models?
I reproduced the results with the small and base models. The evaluation metrics are as follows:
Hi, it turns out that all the numbers on re10k are slightly worse than our results. |
Dear authors,
Thank you for your work. I have a question regarding evaluation on the re10k dataset: my test results are somewhat lower than those reported in the paper, and evaluation processed only 6,474 samples rather than the full test set (7,286 samples). Upon inspecting the code, I noticed that some examples are skipped due to the following condition:
```python
try:
    context_indices, target_indices = self.view_sampler.sample(
        scene,
        extrinsics,
        intrinsics,
    )
except ValueError:
    # Skip because the example doesn't have enough frames.
    continue
```
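To see exactly which examples are dropped, one option (a sketch of a local modification, not code from the repository) is to log the scene ID in the `except` branch, which would show whether the 812 missing examples all come from these skips:

```python
try:
    context_indices, target_indices = self.view_sampler.sample(
        scene,
        extrinsics,
        intrinsics,
    )
except ValueError:
    # Hypothetical diagnostic: record each skipped scene so the
    # 6,474 vs. 7,286 gap can be attributed to this branch.
    print(f"Skipping scene {scene}: view sampler found too few frames")
    continue
```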
I used the processed versions of the dataset from pixelSplat, which I downloaded from the following link:
http://schadenfreude.csail.mit.edu:8000/
Interestingly, my results on the dl3dv dataset were as expected, but the issue only occurs when testing on re10k. Could you help me identify the cause of this discrepancy?
Below is my test script:
```bash
# evaluate on re10k
CUDA_VISIBLE_DEVICES=0 python -m src.main +experiment=re10k \
  dataset.test_chunk_interval=1 \
  model.encoder.num_scales=2 \
  model.encoder.upsample_factor=2 \
  model.encoder.lowest_feature_resolution=4 \
  model.encoder.monodepth_vit_type=vitl \
  model.encoder.gaussian_regressor_channels=64 \
  model.encoder.color_large_unet=true \
  model.encoder.feature_upsampler_channels=128 \
  checkpointing.pretrained_model=/pretrained/depthsplat-gs-large-re10k-256x256-288d9b26.pth \
  mode=test \
  dataset/view_sampler=evaluation \
  test.compute_scores=true \
  wandb.mode=disabled \
  test.save_image=false \
  test.save_depth=true \
  test.save_depth_concat_img=true \
  output_dir=output/depthsplat-depth-large-re10k_train
```
Thank you!