BANMo

Changelog

11/21: Remove eikonal loss to align with paper results, #36
08/09: Fix eikonal loss that regularizes surface (resulting in smoother mesh).
06/18: Add a colab demo for novel view synthesis.
04/11: Replace matching loss with feature rendering loss; Fix bugs in LBS; Stabilize optimization.
03/20: Add mesh color option (canonical mapping vs radiance) during surface extraction. See --ce_color flag.
02/23: Improve NVS with fourier light code, improve uncertainty MLP, add long schedule, minor speed up.
02/17: Add adaptation to a new video, optimization with known root poses, and pose code visualization.
02/15: Add motion-retargeting, quantitative evaluation and synthetic data generation/eval.

Install

Build with conda

We provide two versions.

[A. torch1.10+cu113 (1.4x faster on V100)]

# clone repo
git clone git@github.com:facebookresearch/banmo.git --recursive
cd banmo
# install conda env
conda env create -f misc/banmo-cu113.yml
conda activate banmo-cu113
# install pytorch3d (takes minutes), kmeans-pytorch
pip install -e third_party/pytorch3d
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html

[B. torch1.7+cu110]

# clone repo
git clone git@github.com:facebookresearch/banmo.git --recursive
cd banmo
# install conda env
conda env create -f misc/banmo.yml
conda activate banmo
# install kmeans-pytorch
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html

Data

We provide two ways to obtain data. The easiest way is to download and unzip the pre-processed data as follows.

[Download pre-processed data]

We provide pre-processed data for the cat and human sequences. Download the pre-processed rgb/mask/flow/densepose images as follows:

# (~8G for each)
bash misc/processed/download.sh cat-pikachiu
bash misc/processed/download.sh human-cap

[Download raw videos]

Download raw videos to ./raw/ folder

bash misc/vid/download.sh cat-pikachiu
bash misc/vid/download.sh human-cap
bash misc/vid/download.sh dog-tetres
bash misc/vid/download.sh cat-coco

To use your own videos, or pre-process raw videos into banmo format, please follow the instructions here.

PoseNet weights

Download pre-trained PoseNet weights for human and quadrupeds

mkdir -p mesh_material/posenet && cd "$_"
wget $(cat ../../misc/posenet.txt); cd ../../

Demo

This example shows how to reconstruct a cat from 11 videos and a human from 10 videos. For more examples, see here.

Hardware/time for running the demo

The short schedule takes 4 hours on 2 V100 GPUs (+SSD storage). To reach higher quality, the full schedule takes 12 hours. We provide a script that uses gradient accumulation to support experiments on fewer GPUs or on GPUs with less memory.

Setting good hyper-parameters for videos of various lengths

When optimizing videos of different lengths, we found it useful to scale the batch size with the number of frames. A rule of thumb is to set "num gpus" x "batch size" x "accu steps" ~= num frames. This means that more video frames need more GPU memory but the same optimization time.
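
The rule of thumb above can be turned into a small calculation. The sketch below is only an illustration: the function name `accumulation_steps` is hypothetical and not part of the repository, and rounding up (so that the product is at least the number of frames) is an assumption, since the rule only asks for approximate equality.

```python
import math


def accumulation_steps(num_frames: int, num_gpus: int, batch_size: int) -> int:
    """Return the smallest number of gradient-accumulation steps such that
    num_gpus * batch_size * accu_steps >= num_frames.

    Hypothetical helper; rounding up is an assumption, the rule of thumb
    only requires the product to be roughly equal to num_frames.
    """
    if min(num_frames, num_gpus, batch_size) <= 0:
        raise ValueError("all arguments must be positive")
    return max(1, math.ceil(num_frames / (num_gpus * batch_size)))


# Made-up example: 600 frames, 2 GPUs, batch size 128 per GPU.
# 600 / (2 * 128) = 2.34..., so 3 accumulation steps are needed.
print(accumulation_steps(600, 2, 128))
```

With fewer GPUs or a smaller per-GPU batch size, the returned step count grows, which is how gradient accumulation trades memory for extra passes per update.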

Try pre-optimized models

We provide pre-optimized models and scripts to run novel view synthesis and mesh extraction (results saved at tmp/*all.mp4). Also see this Colab for NVS.

# download pre-optimized models
mkdir -p tmp && cd "$_"
wget https://www.dropbox.com/s/qzwuqxp0mzdot6c/cat-pikachiu.npy
wget https://www.dropbox.com/s/dnob0r8zzjbn28a/cat-pikachiu.pth
wget https://www.dropbox.com/s/p74aaeusprbve1z/opts.log # flags used at opt time
cd ../

seqname=cat-pikachiu
# render novel views
bash scripts/render_nvs.sh 0 $seqname tmp/cat-pikachiu.pth 5 0
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: path to the weights
# argv[4]: video id used for pose traj
# argv[5]: video id used for root traj

# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 $seqname tmp/cat-pikachiu.pth \
        "0 5" 64
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video id separated by space
# argv[5]: resolution of running marching cubes (use 256 to get higher-res mesh)
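
To get a feel for the cost of the marching cubes resolution, the sketch below compares the number of grid samples at 64 and 256. It assumes the field is evaluated on a cubic grid of resolution^3 points, which is typical for marching cubes but is not confirmed by this README; `grid_points` is a hypothetical name.

```python
def grid_points(resolution: int) -> int:
    """Number of samples in a cubic grid with `resolution` points per axis.

    Assumption: marching cubes is run on a resolution^3 grid.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return resolution ** 3


# Going from 64 to 256 multiplies the number of evaluated points by 4^3.
ratio = grid_points(256) // grid_points(64)
print(ratio)  # 64
```

Under that assumption, a resolution of 64 is a quick preview setting, while 256 evaluates 64 times as many points.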

1. Optimization

[cat-pikachiu]

seqname=cat-pikachiu
# To speed up data loading, we store images as lines of pixels.
# This only needs to be run once per sequence; the converted data are stored.
python preprocess/img2lines.py --seqname $seqname

# Optimization
bash scripts/template.sh 0,1 $seqname 10001 "no" "no"
# argv[1]: gpu ids separated by comma
# argv[2]: sequence name
# argv[3]: port for distributed training
# argv[4]: use_human, pass "" for human cse, "no" for quadruped cse
# argv[5]: use_symm, pass "" to force x-symmetric shape

# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 $seqname logdir/$seqname-e120-b256-ft2/params_latest.pth \
        "0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video id separated by space
# argv[5]: resolution of running marching cubes (256 by default)

cat-pikachiu-.0.-all.mp4

[human-cap]

seqname=adult7
python preprocess/img2lines.py --seqname $seqname
bash scripts/template.sh 0,1 $seqname 10001 "" ""
bash scripts/render_mgpu.sh 0 $seqname logdir/$seqname-e120-b256-ft2/params_latest.pth \
        "0 1 2 3 4 5 6 7 8 9" 256

adult7-.8.-all.mp4
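
The two scripts/template.sh invocations above differ only in their last two arguments, where an empty string enables an option and "no" disables it. The sketch below assembles the argument list from booleans to make that convention explicit. `template_command` is a hypothetical helper, not part of the repository, and reading "no" as "symmetry off" is inferred from the cat example rather than stated in the argument notes.

```python
from typing import List, Sequence


def template_command(
    gpu_ids: Sequence[int],
    seqname: str,
    port: int = 10001,
    use_human: bool = False,
    use_symm: bool = False,
) -> List[str]:
    """Build the argument list for scripts/template.sh (hypothetical helper).

    The script expects "" to turn an option on and "no" to turn it off.
    """
    gpus = ",".join(str(g) for g in gpu_ids)
    human = "" if use_human else "no"
    symm = "" if use_symm else "no"
    return ["bash", "scripts/template.sh", gpus, seqname, str(port), human, symm]


# Mirrors the cat and human examples above.
print(template_command([0, 1], "cat-pikachiu"))
print(template_command([0, 1], "adult7", use_human=True, use_symm=True))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root; keeping it as a list preserves the empty-string arguments, which would be lost if the command were joined into one unquoted string.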

2. Visualization tools

[Tensorboard]

# You may need to set up ssh tunneling to view the tensorboard monitor locally.
screen -dmS "tensorboard" bash -c "tensorboard --logdir=logdir --bind_all"

[Root pose, rest mesh, bones]

To draw root pose trajectories (+rest shape) over epochs

# logdir
logdir=logdir/$seqname-e120-b256-init/
# first_idx and last_idx specify which frames are drawn
python scripts/visualize/render_root.py --testdir $logdir --first_idx 0 --last_idx 120

Find the output at $logdir/mesh-cam.gif. During optimization, the rest mesh and bones at each epoch are saved at $logdir/*rest.obj.

pose-20.mp4

[Correspondence/pose code]

To visualize 2d-2d and 2d-3d matchings of the latest epoch weights

# 2d matches between frame 0 and 100 via 2d->feature matching->3d->geometric warp->2d
bash scripts/render_match.sh $logdir/params_latest.pth "0 100" "--render_size 128"

Outputs are saved under tmp/:
- 2d-2d matches: tmp/match_%03d.jpg
- 2d-3d feature matches of frame 0: tmp/match_line_pred.obj
- 2d-3d geometric warps of frame 0: tmp/match_line_exp.obj
- near plane of frame 0: tmp/match_plane.obj
- pose code visualization: tmp/code.mp4

pose-code.mp4
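
After running scripts/render_match.sh, it can be handy to confirm that every output named above was written. The sketch below is a convenience check only: `missing_match_outputs` is a hypothetical helper, not part of the repository, and it tests only that the files exist, not that their contents are valid.

```python
from pathlib import Path
from typing import List

# File names taken from the output description above.
EXPECTED_OUTPUTS = [
    "match_line_pred.obj",
    "match_line_exp.obj",
    "match_plane.obj",
    "code.mp4",
]


def missing_match_outputs(out_dir: str = "tmp") -> List[str]:
    """Return the expected render_match outputs that are absent from out_dir."""
    root = Path(out_dir)
    missing = [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]
    # 2d-2d matches follow the match_%03d.jpg pattern, e.g. match_000.jpg.
    if not any(root.glob("match_[0-9][0-9][0-9].jpg")):
        missing.append("match_%03d.jpg")
    return missing
```

Calling `missing_match_outputs("tmp")` from the repository root returns an empty list when all outputs are present, and otherwise lists the names that were not found.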

[Render novel views]

Render novel views at the canonical camera coordinate

bash scripts/render_nvs.sh 0 $seqname logdir/$seqname-e120-b256-ft2/params_latest.pth 5 0
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: path to the weights
# argv[4]: video id used for pose traj
# argv[5]: video id used for root traj

Results will be saved at logdir/$seqname-e120-b256-ft2/nvs*.mp4.

nvs-pikachiu.mp4

[Render canonical view over iterations]

Render depth and color of the canonical view over optimization iterations

bash scripts/nvs_iter.sh 0 logdir/$seqname-e120-b256-init/
# argv[1]: gpu id
# argv[2]: path to the logdir

Results will be saved at logdir/$seqname-e120-b256-init/vis-iter*.mp4.

cat-pikachiu-vis-iter-iter-dph.mp4

cat-pikachiu-vis-iter-iter-rgb.mp4

Common install issues

Q: pyrender reports ImportError: Library "GLU" not found.
- Install the library with sudo apt install freeglut3-dev
Q: ffmpeg reports libopenh264.so.5 not found.
- Reinstall ffmpeg in conda with conda install -c conda-forge ffmpeg

Note on arguments

use --use_human for human reconstruction, otherwise it assumes quadruped animals
use --full_mesh to disable visibility check at mesh extraction time
use --noce_color at mesh extraction time to assign radiance instead of canonical mapping as vertex colors.
use --queryfw at mesh extraction time to extract forward articulated meshes, which only needs to run marching cubes once.
use --use_cc to keep the largest connected component of the rest mesh in order to set the object bounds and near-far plane (turned on by default). Turn it off with --nouse_cc for disconnected objects such as hands.
use --debug to print out the rough time each component takes.

Acknowledgement

Volume rendering code is borrowed from Nerf_pl. Flow estimation code is adapted from VCN-robust. Other external repos:

Detectron2 (modified)
SoftRas (modified, for synthetic data generation)
Chamfer3D (for evaluation)

License

code: CC-BY-NC 4.0. See the LICENSE file.
dataset
- CC0: cat-pikachiu, cat-coco, dog-tetres, human-cap
- Pexels free license: penguin
- Turbosquid license: hands, eagle
  - the final dataset is modified from those 3D assets.
- AMA comes without a license
- We thank the artists for sharing their videos and 3D assets.
