GenS is an end-to-end generalizable neural surface reconstruction method that uses a multi-scale volume to reconstruct globally smooth surfaces and recover more high-frequency details. It leverages multi-scale feature-metric consistency to impose multi-view consistency in the more robust feature space, and uses the more accurate geometry obtained from dense inputs to teach the model with sparse inputs. Details are described in our paper:
GenS: Generalizable Neural Surface Reconstruction from Multi-View Images
Rui Peng, Xiaodong Gu, Luyang Tang, Shihe Shen, Fanqi Yu, Ronggang Wang
NeurIPS 2023 (arxiv | OpenReview)
Formula correction of Eq. (7) in our paper:
We apologize for this mistake. If you find any bugs in our code, please feel free to open an issue.
conda create -n gens python=3.10.9
conda activate gens
pip install -r requirements.txt
We train our model only on the DTU dataset. We adopt the full-resolution ground-truth depth maps (just for testing) and RGB images, and use the camera parameters preprocessed by CasMVSNet or MVSNet. To prepare the dataset, simply follow the instructions here from UniMVSNet. We generate pseudo depth maps using the model trained with dense inputs to supervise the model with sparse inputs; you can download them from here. The final data structure looks like this:
dtu_training
├── Cameras
│   ├── 00000000_cam.txt
│   ├── ...
│   └── pair.txt
├── Depths_raw
├── Pseudo_depths
└── Rectified_raw
Rectified_raw contains the full-resolution RGB images provided in DTU. We use the same training and testing split as SparseNeuS; please refer to here for more details.
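Before training, it can help to confirm that the folder matches the layout listed above. The following is a minimal sketch, not part of the repository: the helper name `missing_dtu_entries` is hypothetical, and it assumes that `pair.txt` sits inside `Cameras`, as in the MVSNet-style DTU release.

```python
import tempfile
from pathlib import Path

# Directory names taken from the layout listed above.
EXPECTED_DIRS = ("Cameras", "Depths_raw", "Pseudo_depths", "Rectified_raw")


def missing_dtu_entries(root):
    """Return the expected entries that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    # Assumption: pair.txt lives inside Cameras.
    if not (root / "Cameras" / "pair.txt").is_file():
        missing.append("Cameras/pair.txt")
    return missing


# Demo on a throwaway directory so the snippet runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "dtu_training"
    for name in ("Cameras", "Depths_raw"):
        (demo_root / name).mkdir(parents=True)
    print(missing_dtu_entries(demo_root))
```

To check a real dataset, call `missing_dtu_entries` on your own dataset path; an empty list means every listed entry was found.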
For testing, you can download the testing data prepared by SparseNeuS here, which contains object masks for cleaning the mesh. Put it in <your DTU_TEST path>. For quantitative evaluation, you need to download the ground-truth points from the DTU website and put them in <your GT_POINTS path>.
Download BlendedMVS for evaluation. The data structure looks like this:
blendedmvs
├── 5a0271884e62597cdee0d0eb
│   ├── blended_images
│   ├── cams
│   │   ├── 00000000_cam.txt
│   │   ├── ...
│   │   └── pair.txt
├── 5a3ca9cb270f0e3f14d0eddb
├── ...
Download our pretrained model and put it in <your CKPT path>.
| CKPT | Train Res | Train View | Test Res | Test View | Mean Cham. Dist.↓ |
| gens | 480X640 | 5 (4src) | 480X640 | 3 (2src) | 1.34 |
You can also download our precomputed DTU points through direct inference here or after a fast fine-tuning here.
We define all settings, such as the model structure and testing parameters, in the configuration file. We use the ./confs/gens.conf file for training and testing. You first need to set the values that depend on your own environment, such as <your dataset path> and <your output save path>. You can keep our default testing configuration and model structure. Once everything is ready, start testing via:
bash ./scripts/run.sh --mode val --resume <your CKPT path>
This will predict all scenes in the test split at view index 23 by default. To get results at other views (e.g., view 43), change ref_view under the val_dataset namespace in the configuration file. You can also specify a scene list under the val_dataset namespace to test only selected scenes, e.g., scene=[scan24,scan55].
Optionally, you can add the --clean_mesh option to generate the filtered mesh. Note, however, that the mask used by --clean_mesh comes from MVSNet and is not the object mask used during the quantitative evaluation.
Before evaluation, you need to clean the meshes with the correct mask:
python evaluation/clean_meshes.py --root_dir <your DTU_TEST path> --out_dir <your output save path>/meshes
Then run the quantitative evaluation:
python evaluation/dtu_eval.py --dataset_dir <your GT_POINTS path> --out_dir <your output save path>
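For intuition about the Chamfer distance reported in the results table, the sketch below computes a simple symmetric Chamfer distance between two tiny point lists. It is a toy illustration only: the function names are hypothetical, and it does not reproduce the protocol of evaluation/dtu_eval.py, whose details are not described in this document.

```python
import math


def mean_nearest_distance(src, dst):
    """Mean distance from each point in `src` to its nearest point in `dst`."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(pred, gt):
    """Average of the pred->gt and gt->pred mean nearest-neighbour distances."""
    return 0.5 * (mean_nearest_distance(pred, gt) + mean_nearest_distance(gt, pred))


# Toy point sets (made-up values, purely for illustration).
pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
print(chamfer_distance(pred_points, gt_points))
```

The brute-force nearest-neighbour search is quadratic in the number of points, so this form is only suitable for small examples.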
Pay attention to the mesh filenames expected by evaluation/clean_meshes.py; by default we use scan24_epoch0.ply for scene 24. In our paper, we test at the low resolution of 480X640; you can get better performance by testing at a higher resolution.
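Since the evaluation depends on the mesh filenames, a quick pre-check can save a failed run. This is a minimal sketch with hypothetical helper names; it assumes the scan24_epoch0.ply pattern generalizes to `scan<ID>_epoch<N>.ply` and that the meshes sit in a single directory.

```python
import tempfile
from pathlib import Path


def expected_mesh_name(scan_id, epoch=0):
    """Build a filename following the scan24_epoch0.ply pattern (assumed to generalize)."""
    return f"scan{scan_id}_epoch{epoch}.ply"


def missing_meshes(mesh_dir, scan_ids, epoch=0):
    """List the expected mesh filenames that do not exist in `mesh_dir`."""
    mesh_dir = Path(mesh_dir)
    names = [expected_mesh_name(s, epoch) for s in scan_ids]
    return [name for name in names if not (mesh_dir / name).is_file()]


# Demo on a throwaway directory containing only the mesh for scan 24.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / expected_mesh_name(24)).write_text("")
    print(missing_meshes(tmp, [24, 55]))
```

If your meshes were saved at a different epoch, pass that value as `epoch`, or rename the files to match what evaluation/clean_meshes.py expects.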
Similar to the evaluation on DTU, you can test on the BlendedMVS dataset by running ./scripts/run.sh with the configuration file changed to confs/gens_bmvs.conf, or you can run the Python command directly:
python main.py --conf confs/gens_bmvs.conf --mode val --resume <your CKPT path> --clean_mesh
Here, we recommend adding the --clean_mesh option. You can change or add testing scenes by changing scene. Note that camera poses in BlendedMVS vary greatly, so you need to make sure that the bounding box fits the object you want to reconstruct as closely as possible, e.g., by adjusting factor and num_interval.
Similarly, you first need to set the values in the confs/gens.conf file and then run:
bash ./scripts/run.sh --mode train
By default, we employ the DistributedDataParallel mode to train our model on 2 GPUs.
We use the confs/gens_finetune.conf file to configure fine-tuning on the DTU dataset. For convenience, the scripts/finetune.sh script fine-tunes all testing scenes at both views 23 and 43:
bash ./scripts/finetune.sh --resume <your CKPT path>
You can change the scene and view directly through the --scene and --ref_view options, or by modifying the configuration file.
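To fine-tune only a few scene and view combinations, the command lines can be generated programmatically. The sketch below only builds and prints command strings; it does not run them. It assumes that main.py accepts --scene and --ref_view with a single value each and that `--mode finetune` applies to the DTU configuration as it does for BlendedMVS. The helper name and the value formats (`scan24`, `23`) are illustrative assumptions.

```python
import shlex
from itertools import product


def finetune_commands(ckpt, scenes, ref_views, conf="confs/gens_finetune.conf"):
    """Build one command string per (scene, view) pair (hypothetical helper)."""
    commands = []
    for scene, view in product(scenes, ref_views):
        args = [
            "python", "main.py",
            "--conf", conf,
            "--mode", "finetune",
            "--resume", ckpt,
            # Assumption: each flag takes a single value in this form.
            "--scene", scene,
            "--ref_view", str(view),
        ]
        commands.append(shlex.join(args))
    return commands


for command in finetune_commands("<your CKPT path>", ["scan24", "scan55"], [23, 43]):
    print(command)
```

Printing the commands first makes it easy to inspect them before running any of them in a shell.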
We use the confs/gens_bmvs_finetune.conf file to configure fine-tuning on BlendedMVS. As before, first make sure the bounding box is compact enough, and then run:
python main.py --conf confs/gens_bmvs_finetune.conf --mode finetune --resume <your CKPT path>
Note that we save the optimized volume and implicit surface network after fine-tuning. If you want to resume from a fine-tuned model, you need to add the --load_vol option to distinguish it from an ordinary checkpoint.
If you find our work useful in your research, please consider citing our paper:
@inproceedings{peng2023gens,
title={GenS: Generalizable Neural Surface Reconstruction from Multi-View Images},
author={Peng, Rui and Gu, Xiaodong and Tang, Luyang and Shen, Shihe and Yu, Fanqi and Wang, Ronggang},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS)},
year={2023}
}
Thanks to NeuS, SparseNeuS and Geo-NeuS.