Update docs

YanWenKun committed Dec 19, 2024
1 parent a8f730c commit d3791f6
Showing 10 changed files with 402 additions and 248 deletions.
124 changes: 117 additions & 7 deletions comfy3d-pt25/README.adoc
@@ -6,12 +6,12 @@ https://hub.docker.com/r/yanwk/comfyui-boot/tags?name=comfy3d-pt25[View on <Dock

NOTE: **Currently Not Working on WSL2**.

* By default, only install ComfyUI, AGL Translation, ComfyUI-3D-Pack, and any custom nodes required by example workflows.
* ComfyUI-Manager will be installed but disabled by default.
* After the download is complete, the script will attempt to rebuild dependencies for the 3D-Pack.
** This process could take about 10 minutes.
** If the rebuild is unnecessary (some workflows such as TripoSR can run directly), add an empty file named `.build-complete` to the storage folder (similar to the `.download-complete` file).
** The build process automatically targets the local GPU, so configuration is usually unnecessary. If you run into issues, try setting the env var `TORCH_CUDA_ARCH_LIST` (see the table below).
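
If you already know the prebuilt dependencies work for your GPU, the marker can be created up front (a minimal sketch; `storage` here is an assumption matching the host folder mounted into the container as `/root`):

[source,sh]
----
# Create the marker that tells the startup script to skip the 3D-Pack rebuild.
# Assumption: "storage" is the host folder you mount as /root in the container.
mkdir -p storage
touch storage/.build-complete
----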
## Version Info

@@ -63,4 +63,114 @@ podman run -it --rm \
----


[[env-vars]]
## Environment Variables Reference

[cols="2,2,3"]
|===
|Variable|Example Value|Memo

|HTTP_PROXY +
HTTPS_PROXY
|http://localhost:1081 +
http://localhost:1081
|Set HTTP proxy.

|PIP_INDEX_URL
|'https://pypi.org/simple'
|Set mirror site for Python Package Index.

|HF_ENDPOINT
|'https://huggingface.co'
|Set mirror site for HuggingFace Hub.

|HF_TOKEN
|'hf_your_token'
|Set HuggingFace Access Token.
https://huggingface.co/settings/tokens[More]

|HF_HUB_ENABLE_HF_TRANSFER
|1
|Enable HuggingFace Hub experimental high-speed file transfers.
Only makes sense if you have a >1000 Mbps and VERY STABLE connection (e.g. a cloud server).
https://huggingface.co/docs/huggingface_hub/hf_transfer[More]

|TORCH_CUDA_ARCH_LIST
|7.5 +
or +
'5.2+PTX;6.0;6.1+PTX;7.5;8.0;8.6;8.9+PTX'
|Build target for PyTorch and its extensions.
For most users, you only need to set one build target for your GPU.
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/[More]

|CMAKE_ARGS
|(Default) +
'-DBUILD_opencv_world=ON -DWITH_CUDA=ON -DCUDA_FAST_MATH=ON -DWITH_CUBLAS=ON -DWITH_NVCUVID=ON'
|Build options for CMAKE for projects using CUDA.

|===
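

Putting the table together, a one-off run that overrides a few of these variables might look like the following (a sketch only; the values are the defaults from the table, and the port and mounts are assumptions mirroring the run command earlier in this README):

[source,sh]
----
# Hedged example: override build/download variables via -e flags.
# Values are illustrative; adjust them to your network and GPU.
podman run -it --rm \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  -p 8188:8188 \
  -v "$(pwd)"/storage:/root \
  -e TORCH_CUDA_ARCH_LIST="8.6" \
  -e PIP_INDEX_URL="https://pypi.org/simple" \
  -e HF_ENDPOINT="https://huggingface.co" \
  docker.io/yanwk/comfyui-boot:comfy3d-pt25
----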


[[trellis-demo]]
## Additional: Running the TRELLIS Demo Using This Image

https://github.com/microsoft/TRELLIS[TRELLIS]
officially provides a Gradio demo that can generate orbit videos and `.glb` models from images.
This image has almost all the necessary dependencies, so you can easily run the demo. The execution script is provided below.

* Note: requires more than 16 GB of VRAM.
* `ATTN_BACKEND` Parameter Selection
** `flash-attn` is suitable for Ampere architecture (30 series/A100) and later GPUs.
** `xformers` has better compatibility.
* `SPCONV_ALGO` Parameter Selection
** `native` starts faster and is suitable for single runs.
** `auto` will have better performance, but will take some time for benchmarking at the beginning.

.1. Run the Container
[source,sh]
----
mkdir -p storage
podman run -it \
--name trellis-demo \
--device nvidia.com/gpu=all \
--security-opt label=disable \
-p 7860:7860 \
-v "$(pwd)"/storage:/root \
-e ATTN_BACKEND="flash-attn" \
-e SPCONV_ALGO="native" \
-e GRADIO_SERVER_NAME="0.0.0.0" \
-e PIP_USER=true \
-e PIP_ROOT_USER_ACTION=ignore \
-e PYTHONPYCACHEPREFIX="/root/.cache/pycache" \
docker.io/yanwk/comfyui-boot:comfy3d-pt25 \
/bin/fish
----

.2. Run the Commands
[source,sh]
----
export PATH="$PATH:/root/.local/bin"
# Run the compilation script, takes about 10 minutes.
bash /runner-scripts/build-deps.sh
# Install dependencies
pip install gradio==4.44.1 gradio_litmodel3d==0.0.1
# Download the model
huggingface-cli download JeffreyXiang/TRELLIS-image-large
# Download and run TRELLIS demo
git clone --depth=1 --recurse-submodules \
https://github.com/microsoft/TRELLIS.git \
/root/TRELLIS
cd /root/TRELLIS
python3 app.py
----

NOTE: You may safely ignore the message "matrix-client 0.4.0 requires urllib3~=1.21, but you have urllib3 2.2.3 which is incompatible." Only ComfyUI-Manager's share feature uses the outdated `matrix-client`, so the conflict is irrelevant here.
120 changes: 117 additions & 3 deletions comfy3d-pt25/README.zh.adoc
@@ -9,8 +9,9 @@ https://hub.docker.com/r/yanwk/comfyui-boot/tags?name=comfy3d-pt25[View on <Docker H
* By default, installs only ComfyUI, AGL Translation, ComfyUI-3D-Pack, and the nodes required by the 3D-Pack example workflows.
* ComfyUI-Manager will be downloaded and installed, but disabled by default to keep it from auto-updating dependencies.
* On first start, after the download step completes, it will try to rebuild the dependencies required by 3D-Pack.
** To skip this step, add an empty file named `.build-complete` to the home/storage folder (similar to `.download-complete`). Some workflows (such as TripoSR) run fine without the rebuild; the step exists to ensure compatibility.
** It takes about 10 minutes. If it takes much longer (say, over half an hour), consider skipping the build and trying to run directly.
** By default it compiles only for the local GPU, and parameters usually need no manual tuning. If you hit problems, set the env vars `TORCH_CUDA_ARCH_LIST` and `CMAKE_ARGS` manually (see the table below).
## Version Info

@@ -67,4 +68,117 @@ podman run -it --rm \
----


[[env-vars]]
## Environment Variables Reference

[cols="2,2,3"]
|===
|Variable|Example Value|Memo

|HTTP_PROXY +
HTTPS_PROXY
|http://localhost:1081 +
http://localhost:1081
|Set HTTP proxy.

|PIP_INDEX_URL
|'https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple'
|Set mirror site for PyPI.

|HF_ENDPOINT
|'https://hf-mirror.com'
|Set mirror site for HuggingFace Hub.

|HF_TOKEN
|'hf_your_token'
|Set HuggingFace
https://huggingface.co/settings/tokens[Access Token].

|HF_HUB_ENABLE_HF_TRANSFER
|1
|Enable HuggingFace Hub experimental high-speed transfers; only makes sense with a >1000 Mbps and very stable connection (e.g. a cloud server).
https://huggingface.co/docs/huggingface_hub/hf_transfer[Docs]

|TORCH_CUDA_ARCH_LIST
|7.5 +
or +
'5.2+PTX;6.0;6.1+PTX;7.5;8.0;8.6;8.9+PTX'
|Build target for PyTorch and its extensions.
For most users, setting a single target for your own GPU is enough.
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/[Reference]

|CMAKE_ARGS
|'-DBUILD_opencv_world=ON -DWITH_CUDA=ON -DCUDA_FAST_MATH=ON -DWITH_CUBLAS=ON -DWITH_NVCUVID=ON'
|CMake build options for projects using CUDA; the script sets this default, which usually needs no adjustment.

|===


[[trellis-demo]]
## Extra: Running the Official TRELLIS Demo with This Image

https://github.com/microsoft/TRELLIS[TRELLIS]
officially ships a Gradio demo that generates orbit videos and `.glb` models from one or more images.
This image has nearly all the required dependencies, so the demo is easy to run; the scripts are provided below.

* Note: requires more than 16 GB of VRAM.
* `ATTN_BACKEND` options:
** `flash-attn` suits Ampere (30 series/A100) and later GPUs.
** `xformers` has better compatibility.
* `SPCONV_ALGO` options:
** `native` starts faster and suits single runs.
** `auto` performs better, but spends some time benchmarking at startup.

.1. Run the Container
[source,sh]
----
mkdir -p storage
# Note: proxy/mirror settings are configured here
# Modify them as needed
podman run -it \
--name trellis-demo \
--device nvidia.com/gpu=all \
--security-opt label=disable \
-p 7860:7860 \
-v "$(pwd)"/storage:/root \
-e ATTN_BACKEND="flash-attn" \
-e SPCONV_ALGO="native" \
-e GRADIO_SERVER_NAME="0.0.0.0" \
-e PIP_USER=true \
-e PIP_ROOT_USER_ACTION=ignore \
-e PYTHONPYCACHEPREFIX="/root/.cache/pycache" \
-e PIP_INDEX_URL="https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple" \
-e HF_ENDPOINT="https://hf-mirror.com" \
docker.io/yanwk/comfyui-boot:comfy3d-pt25 \
/bin/fish
----

.2. Run the Commands
[source,sh]
----
export PATH="$PATH:/root/.local/bin"
# Run the build script once; takes about 10 minutes
bash /runner-scripts/build-deps.sh
# Install dependencies
pip install gradio==4.44.1 gradio_litmodel3d==0.0.1
# Download the model
huggingface-cli download JeffreyXiang/TRELLIS-image-large
# Download and run the TRELLIS demo
git clone --depth=1 --recurse-submodules \
https://github.com/microsoft/TRELLIS.git \
/root/TRELLIS
cd /root/TRELLIS
python3 app.py
----

NOTE: If you see "matrix-client 0.4.0 requires urllib3~=1.21, but you have urllib3 2.2.3 which is incompatible.", just ignore it. Only ComfyUI-Manager's share feature uses the outdated `matrix-client` component, so it has no impact here.
7 changes: 4 additions & 3 deletions comfy3d-pt25/runner-scripts/build-deps.sh
@@ -10,10 +10,11 @@ cd /root

if [ -z "${CMAKE_ARGS}" ]; then
export CMAKE_ARGS='-DBUILD_opencv_world=ON -DWITH_CUDA=ON -DCUDA_FAST_MATH=ON -DWITH_CUBLAS=ON -DWITH_NVCUVID=ON'
echo "[INFO] CMAKE_ARGS not set, setting to ${CMAKE_ARGS}"
fi ;

# Compile PyTorch3D
# Put it first because it takes longest time.
pip install --force-reinstall \
"git+https://github.com/facebookresearch/pytorch3d.git"

@@ -57,7 +58,7 @@ pip install --force-reinstall \
/tmp/build/mip-splatting/submodules/diff-gaussian-rasterization/

# (Optional) Compile Flash Attention for Ampere and later GPUs.
# "MAX_JOBS" limits Ninja jobs to avoid OOM.
# If have >96GB RAM, just remove MAX_JOBS line.
export MAX_JOBS=4
pip install flash-attn --no-build-isolation
89 changes: 85 additions & 4 deletions cu124-megapak/README.adoc
@@ -66,7 +66,88 @@ touch storage/.download-complete
----



[[cli-args]]
## CLI_ARGS Reference

[%autowidth,cols=2]
|===
|args |description

|--lowvram
|If your GPU only has 4GB VRAM.

|--novram
|If __--lowvram__ still runs out of memory.

|--cpu
|Run on CPU. It's pretty slow.

|--use-pytorch-cross-attention
|If you don't want to use xFormers. May perform well on WSL2, but is significantly slower on Linux hosts.

|--preview-method taesd
|Enable higher-quality previews with TAESD. ComfyUI-Manager will override this (settings available in the Manager UI).

|--front-end-version Comfy-Org/ComfyUI_frontend@latest
|Use the most up-to-date frontend version.

|--fast
|Enable experimental optimizations.
Currently the only optimization is float8_e4m3fn matrix multiplication on
4000/ADA series Nvidia cards or later.
Might break things/lower quality.
See the
https://github.com/comfyanonymous/ComfyUI/commit/9953f22fce0ba899da0676a0b374e5d1f72bf259[commit].
|===

More `CLI_ARGS` available at
https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/cli_args.py[ComfyUI].
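
The flags above are passed to the container through the `CLI_ARGS` environment variable. A sketch only; the port and mount are assumptions following the usual run commands in these READMEs:

[source,sh]
----
# Hedged example: run the cu124-megapak image with extra ComfyUI flags.
# Adjust the flags, port, and mount to your setup.
podman run -it --rm \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  -p 8188:8188 \
  -v "$(pwd)"/storage:/root \
  -e CLI_ARGS="--lowvram --preview-method taesd" \
  docker.io/yanwk/comfyui-boot:cu124-megapak
----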


[[env-vars]]
## Environment Variables Reference

[cols="2,2,3"]
|===
|Variable|Example Value|Memo

|HTTP_PROXY +
HTTPS_PROXY
|http://localhost:1081 +
http://localhost:1081
|Set HTTP proxy.

|PIP_INDEX_URL
|'https://pypi.org/simple'
|Set mirror site for Python Package Index.

|HF_ENDPOINT
|'https://huggingface.co'
|Set mirror site for HuggingFace Hub.

|HF_TOKEN
|'hf_your_token'
|Set HuggingFace Access Token.
https://huggingface.co/settings/tokens[More]

|HF_HUB_ENABLE_HF_TRANSFER
|1
|Enable HuggingFace Hub experimental high-speed file transfers.
Only makes sense if you have a >1000 Mbps and VERY STABLE connection (e.g. a cloud server).
https://huggingface.co/docs/huggingface_hub/hf_transfer[More]

|TORCH_CUDA_ARCH_LIST
|7.5 +
or +
'5.2+PTX;6.0;6.1+PTX;7.5;8.0;8.6;8.9+PTX'
|Build target for PyTorch and its extensions.
For most users, no setup is needed as it will be automatically selected on Linux.
When needed, set just one build target for your GPU.
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/[More]

|CMAKE_ARGS
|(Default) +
'-DBUILD_opencv_world=ON -DWITH_CUDA=ON -DCUDA_FAST_MATH=ON -DWITH_CUBLAS=ON -DWITH_NVCUVID=ON'
|Build options for CMAKE for projects using CUDA.

|===