About 4K resolution dataset training #7

Open
LaFeuilleMorte opened this issue Oct 14, 2024 · 1 comment

Comments

@LaFeuilleMorte

Hi, sorry for taking up your precious time. I have recently been working on reconstruction from 4K-resolution aerial images. In my experiments, training at 2K gives quite good results, but with 4K input the loss is very hard to bring down to a satisfactory value, and the metrics (PSNR, LPIPS, SSIM) are much worse than when training at 2K. To solve this problem, I tried the following measures:

  1. Use multi-scale training (resolution_scale = [1.0, 2.0, 4.0, 8.0]): helps a little, but the loss still does not converge to a satisfactory value.
  2. Use a smaller densify_grad (densify_grad = 0.001): metrics improve slightly, but more floater artifacts appear.
  3. Increase feat_dim (feat_dim = 64): no improvement, and rendering speed suffers.
  4. Increase levels (levels = 15): helps a little.
  5. Increase base_layer (base_layer = 13): helps a little.

Each of the above measures helps only a little; none of them significantly improves performance at 4K resolution.
I have several questions:

  1. Do 4K images contain so many high-frequency features that they reach the limit of what the Gaussian kernels can fit? If so, even with more Gaussians applied to the scene, there would be a ceiling on the representational capacity of the Gaussian model.

  2. In the aerial dataset, the cameras are generally very far from the objects, and the LOD levels are calculated from Dmax / Dmin, where Dmax and Dmin are the maximum and minimum depth between the camera centers and the object points. I wonder whether it would be better to calculate the LOD levels from Dmax / (Dmax - Dmin). In a scene where the cameras are very far away, Dmax / Dmin results in a very small number of levels. For an aerial scene, when we zoom in close to the ground, the LOD level may remain unchanged over a large range of heights. I'm not sure whether this has a bad impact on quality.
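To make question 2 concrete, here is a minimal sketch comparing the two ratios. It assumes the number of levels is obtained as `round(log_fork(ratio)) + 1`; that formula, the `fork` parameter, and the function names are assumptions for illustration and should be checked against the repository code. The depth values are hypothetical.

```python
import math


def levels_from_ratio(d_max: float, d_min: float, fork: int = 2) -> int:
    """Level count from d_max / d_min (assumed form: round(log_fork(ratio)) + 1)."""
    ratio = d_max / d_min
    return round(math.log2(ratio) / math.log2(fork)) + 1


def levels_from_range(d_max: float, d_min: float, fork: int = 2) -> int:
    """Variant proposed in question 2: use d_max / (d_max - d_min) instead."""
    ratio = d_max / (d_max - d_min)
    return round(math.log2(ratio) / math.log2(fork)) + 1


# Hypothetical aerial capture: every camera is far from the scene.
aerial = (1200.0, 800.0)
# Hypothetical ground-level capture: some points are very close to the cameras.
ground = (100.0, 1.0)

for name, (d_max, d_min) in (("aerial", aerial), ("ground", ground)):
    print(name, levels_from_ratio(d_max, d_min), levels_from_range(d_max, d_min))
```

Under these assumptions the aerial example gets 2 levels from Dmax / Dmin and 3 from Dmax / (Dmax - Dmin), while the ground-level example gets 8 and 1 respectively. So the proposed ratio adds levels for distant cameras but collapses when Dmin is small, which suggests it would need to be scene-dependent rather than a drop-in replacement.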

In conclusion, I suspect the poor quality with 4K input is largely due to the limited representational capacity under the current default hyperparameters. If you have any advice, please kindly reply to this issue. Thank you very much.
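A rough way to quantify the capacity concern is to look at the world-space footprint of one pixel. The sketch below uses a plain pinhole camera model; the flight height, field of view, and function name are hypothetical, and the idea that the required number of Gaussians scales with the number of pixel footprints per unit area is an assumption, not a measured result.

```python
import math


def pixel_footprint(depth: float, image_width_px: int, horizontal_fov_deg: float) -> float:
    """Approximate world-space width covered by one pixel at a given depth (pinhole model)."""
    focal_px = (image_width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)
    return depth / focal_px


depth = 300.0  # hypothetical distance from camera to ground, in metres
fov = 70.0     # hypothetical horizontal field of view, in degrees

fp_2k = pixel_footprint(depth, 1920, fov)
fp_4k = pixel_footprint(depth, 3840, fov)

# Doubling the image width halves the footprint, so the same ground area
# is covered by four times as many pixels.
area_factor = (fp_2k / fp_4k) ** 2
print(f"2K footprint: {fp_2k:.4f} m, 4K footprint: {fp_4k:.4f} m, area factor: {area_factor:.1f}")
```

If primitives must be roughly pixel-sized to reproduce the finest detail, going from 2K to 4K would call for about four times as many Gaussians on visible surfaces, which is consistent with the suspicion that the default settings run out of capacity.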

Yours,
Best Regards

@tongji-rkr
Contributor

I am glad to see your recognition of our work. I think you are right: the current densification operation is probably not aggressive enough, and I suggest you refer to the densification strategy of Hierarchical-GS. There are more policies that we have not yet added to the current release. We do not have time at the moment, but we will roll out a more complete version as soon as possible, probably in about a month.
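As an illustration of what a more aggressive densification criterion could look like, the sketch below contrasts thresholding the mean of a Gaussian's per-view screen-space gradient norms with thresholding their maximum. That Hierarchical-GS uses a max-based criterion is stated here from memory and should be treated as an assumption to verify against that project; the function name and gradient values are hypothetical.

```python
from typing import Sequence


def should_densify(view_grads: Sequence[float], threshold: float, use_max: bool = False) -> bool:
    """Decide whether to densify one Gaussian from its per-view gradient norms.

    use_max=False thresholds the mean over views; use_max=True thresholds the
    largest single-view value, which triggers densification more readily.
    """
    if not view_grads:
        return False
    stat = max(view_grads) if use_max else sum(view_grads) / len(view_grads)
    return stat > threshold


# Hypothetical Gaussian seen in four views, with a large gradient in only one.
grads = [0.00005, 0.00005, 0.00005, 0.0005]
threshold = 0.0002

print("mean criterion:", should_densify(grads, threshold))
print("max criterion: ", should_densify(grads, threshold, use_max=True))
```

With these values the mean stays below the threshold while the maximum exceeds it, so only the max-based rule splits the Gaussian. This kind of rule favors detail that is visible in few views, at the cost of more primitives.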
