Hi, sorry for taking up your time. I'm currently working on a reconstruction from 4K-resolution aerial images. In my experiments I get quite good results when training at 2K, but when I use 4K as the input resolution the loss is very hard to bring down to a satisfactory value, and the metrics (PSNR, LPIPS, SSIM) are much worse than when training at 2K. To solve this problem, I tried the following measures:
Use multi-scale training (resolution_scale=[1.0, 2.0, 4.0, 8.0]): helps a little, but the loss still does not converge to a satisfactory value.
Use a smaller densify_grad (densify_grad=0.001): metrics improve slightly, but there are more floater artifacts.
Increase feat_dim (feat_dim=64): no improvement, and it hurts rendering speed.
Increase levels (levels=15): helps a little.
Increase base_layer (base_layer=13): helps a little.
All of the above measures help only a little; none of them significantly improves performance at 4K resolution.
I have several questions:
Do the 4K images contain so many high-frequency features that they reach the limit of what the Gaussian kernel can fit? In other words, is there a ceiling on the representation capability of the Gaussian model, even if we use more Gaussians to fit the scene?
In the aerial dataset, the cameras are generally very far from the objects, and the LOD levels are calculated from Dmax/Dmin, where Dmax and Dmin are the maximum and minimum depths between the camera centers and the object points. I wonder whether it would be better to calculate the LOD levels from Dmax/(Dmax - Dmin). In a scene where the cameras are all very far away, Dmax/Dmin is close to 1 and results in a very small number of levels. For an aerial scene, when we zoom in toward the ground, the LOD level may then remain unchanged over a large range of heights. I'm not sure whether this has a bad impact on quality.
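To make the second question concrete, here is a minimal sketch comparing the two ratios. It assumes the number of levels is computed as round(log_fork(ratio)) + 1, which is my reading and not necessarily the exact formula in the code; `lod_levels`, `fork`, and the depth values are hypothetical names and numbers for illustration only.

```python
import math

def lod_levels(ratio: float, fork: int = 2) -> int:
    """Number of LOD levels for a depth ratio (assumed formula, see above)."""
    return int(round(math.log(ratio, fork))) + 1

# Hypothetical aerial capture: every camera is 400-500 m from the scene points.
d_min, d_max = 400.0, 500.0

current = lod_levels(d_max / d_min)             # ratio 1.25
proposed = lod_levels(d_max / (d_max - d_min))  # ratio 5.0

print(f"Dmax/Dmin          -> {current} level(s)")
print(f"Dmax/(Dmax - Dmin) -> {proposed} level(s)")
```

With these numbers the current ratio collapses to a single level while the proposed ratio gives three. Note that the proposed ratio is undefined when Dmax equals Dmin, so it would need a guard in practice.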
In conclusion, I guess the poor quality with 4K input is largely due to the limited representation capability under the current default hyperparameters. If you have any advice, please kindly reply to this issue. Thank you very much.
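As a rough back-of-the-envelope check on the capacity argument, the sketch below estimates the ground footprint of one pixel at 2K and at 4K for the same camera. It assumes a simple pinhole model where footprint = depth / focal length in pixels; the focal length and flying height are hypothetical values, not taken from my dataset.

```python
def pixel_footprint(depth_m: float, focal_px: float) -> float:
    """Ground size (in metres) covered by one pixel under a pinhole model."""
    return depth_m / focal_px

depth = 450.0            # hypothetical camera-to-ground distance in metres
focal_2k = 1500.0        # hypothetical focal length in pixels at 2K
focal_4k = 2 * focal_2k  # doubling the image resolution doubles the focal length in pixels

fp_2k = pixel_footprint(depth, focal_2k)
fp_4k = pixel_footprint(depth, focal_4k)

# Halving the footprint in both image axes means about 4x more detail per unit area.
area_ratio = (fp_2k / fp_4k) ** 2

print(f"2K footprint: {fp_2k:.3f} m, 4K footprint: {fp_4k:.3f} m, detail per area: x{area_ratio:.1f}")
```

Under these assumptions, a surface that is well covered at 2K would need on the order of four times as many (and smaller) Gaussians at 4K, which may be more than the default densification settings produce.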
Yours,
Best Regards
I am glad to see your recognition of our work. I think you are right: my guess is that the current densification operation is not aggressive enough, and I suggest you refer to the densification strategy of Hierarchical-GS. There are more policies that we haven't added to the current release. We don't have time at the moment, but we'll roll out a more complete version as soon as possible, probably within a month.