3DGRUT on A100: Build your own 3D Scene Reconstruction
Based on NVIDIA 3DGRUT (https://github.com/nv-tlabs/3dgrut) · SIGGRAPH Asia 2024 / CVPR 2025
1. What is 3DGRUT?
Big Picture ⬇️
3DGRUT is NVIDIA's official open-source codebase implementing two groundbreaking 3D scene reconstruction and rendering methods:
3DGRT — 3D Gaussian Ray Tracing (SIGGRAPH Asia 2024, Journal Track)
3DGUT — 3D Gaussian Unscented Transform (CVPR 2025, Oral)
Both methods build on top of the classic 3D Gaussian Splatting (3DGS) paradigm — representing a scene as millions of tiny 3D Gaussian "blobs" — but they push the technology far beyond what rasterization-based 3DGS can do.
3DGRT vs. Classic 3DGS
| Feature | Classic 3DGS | 3DGRT |
| --- | --- | --- |
| Rendering method | Rasterization (tile-based sort) | Ray tracing via GPU RT cores |
| Shadows & reflections | ❌ No | ✅ Yes (secondary rays) |
| Distorted cameras | ❌ Limited | ✅ Full support |
| Rolling shutter | ❌ No | ✅ Yes |
| Speed vs. 3DGS | Faster | Slightly slower, but far richer |
| Hardware requirement | Any CUDA GPU | NVIDIA RT-core GPU (Turing+) |
How 3DGRT Works Under the Hood
Unlike classic 3DGS, which rasterizes Gaussians by projecting them onto screen-space tiles and sorting them front to back, 3DGRT traces rays through the scene. It:
1. Wraps each Gaussian particle in a bounding mesh primitive.
2. Inserts all bounding meshes into an OptiX BVH (Bounding Volume Hierarchy).
3. For each pixel, casts a ray and traverses the BVH, which costs roughly O(log n) per ray for n primitives.
4. Shades batches of intersected Gaussians in depth order.
5. Optionally fires secondary rays from surface hits for reflections, shadows, and refractions.
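To make the shading and depth-ordering steps above more concrete, here is a minimal sketch that traces one ray through a few Gaussians. It is a simplification and not the 3DGRUT implementation: it loops over every Gaussian instead of traversing an OptiX BVH, it assumes axis-aligned (diagonal-covariance) Gaussians, and the names `ray_gaussian_hit` and `trace_ray` are invented for this example. Each Gaussian is evaluated at the point along the ray where its density peaks, and the hits are then composited front to back.

```python
import math


def ray_gaussian_hit(origin, direction, mean, scale, opacity):
    """Return (t_max, alpha) for one axis-aligned Gaussian along a ray.

    t_max is the ray parameter where the Gaussian's density peaks;
    alpha is the opacity scaled by the density at that point.
    """
    # Whiten the ray: shift by the mean and divide by the per-axis std-dev,
    # so the Gaussian becomes a unit sphere centred at the origin.
    o = [(origin[i] - mean[i]) / scale[i] for i in range(3)]
    d = [direction[i] / scale[i] for i in range(3)]
    dd = sum(x * x for x in d)
    t_max = -sum(o[i] * d[i] for i in range(3)) / dd
    closest = [o[i] + t_max * d[i] for i in range(3)]
    dist2 = sum(x * x for x in closest)
    return t_max, opacity * math.exp(-0.5 * dist2)


def trace_ray(origin, direction, gaussians, min_alpha=1.0 / 255.0):
    """Composite all Gaussians hit by the ray, nearest first."""
    hits = []
    for g in gaussians:  # brute force; a real tracer would query a BVH here
        t, alpha = ray_gaussian_hit(
            origin, direction, g["mean"], g["scale"], g["opacity"]
        )
        if t > 0.0 and alpha >= min_alpha:  # skip hits behind the camera
            hits.append((t, alpha, g["color"]))
    hits.sort(key=lambda h: h[0])  # depth order along the ray

    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for _, alpha, rgb in hits:
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-3:  # early exit once the ray is saturated
            break
    return color, transmittance
```

A half-transparent red Gaussian in front of an opaque blue one, both centred on the ray, should blend to an even mix of the two colors, with no light left to pass through.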
What is 3DGUT?
3DGUT (CVPR 2025 Oral) solves a different problem: making Gaussian Splatting work with highly distorted cameras — fish-eye lenses, rolling-shutter sensors, and time-dependent camera models common in robotics and autonomous driving. It uses the Unscented Transform to propagate Gaussian distributions through non-linear camera projections, enabling a hybrid rasterizer that's both fast and accurate for distorted optics.
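The Unscented Transform can be illustrated in a few lines. The sketch below is a teaching example under stated assumptions, not 3DGUT's code: it uses an axis-aligned 3D Gaussian (so the covariance square root is just the per-axis standard deviations), a single `kappa` spread parameter with the same weights for mean and covariance, and an idealized equidistant fisheye model. The function names are hypothetical. The idea is to push a small set of sigma points through the non-linear projection and re-estimate a 2D mean and covariance from the results, instead of linearizing the projection.

```python
import math


def sigma_points(mean, std, kappa=0.0):
    """Return 2n+1 sigma points and weights for an axis-aligned Gaussian."""
    n = len(mean)
    spread = math.sqrt(n + kappa)
    points = [list(mean)]
    weights = [kappa / (n + kappa)]
    for i in range(n):
        for sign in (1.0, -1.0):
            p = list(mean)
            p[i] += sign * spread * std[i]
            points.append(p)
            weights.append(0.5 / (n + kappa))
    return points, weights


def fisheye_project(p, focal=500.0):
    """Equidistant fisheye: image radius is focal * angle from the optical axis."""
    x, y, z = p
    rho = math.hypot(x, y)
    if rho < 1e-12:
        return (0.0, 0.0)
    r = focal * math.atan2(rho, z)
    return (r * x / rho, r * y / rho)


def unscented_project(mean, std, project, kappa=0.0):
    """Propagate a 3D Gaussian through `project`; return 2D mean and covariance."""
    points, weights = sigma_points(mean, std, kappa)
    proj = [project(p) for p in points]
    mu = [sum(w * q[k] for w, q in zip(weights, proj)) for k in range(2)]
    cov = [
        [
            sum(w * (q[a] - mu[a]) * (q[b] - mu[b]) for w, q in zip(weights, proj))
            for b in range(2)
        ]
        for a in range(2)
    ]
    return mu, cov
```

For example, `unscented_project((0.0, 0.0, 5.0), (0.1, 0.1, 0.1), fisheye_project)` returns the 2D footprint of a Gaussian sitting on the optical axis. Because only the `project` callable changes, the same routine works for any camera model that can map a 3D point to a pixel.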
Quick rule of thumb:
Standard perspective cameras + want reflections/shadows → use 3DGRT
Distorted cameras (fish-eye, rolling shutter) or need max speed → use 3DGUT
2. Spin up a Yotta Labs Pod
Recommended GPU
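3DGRT gets hardware-accelerated ray traversal only on NVIDIA GPUs with RT cores (RTX-class parts, Turing architecture or newer). 3DGUT is rasterization-based and also runs well on data-center GPUs. My recommendation:
| GPU | VRAM | Notes |
| --- | --- | --- |
| H100 SXM5 80GB | 80 GB | Best all-around for training and 3DGUT; no RT cores, so 3DGRT traversal is not hardware-accelerated |
| A100 80GB | 80 GB | Great for training; no RT cores |
| RTX 5090 | 32 GB | Smaller scenes, fastest RT throughput |
⚠️ Important: 3DGRT relies on RT cores for fast ray traversal. Without them it falls back to software ray tracing, which is ~10× slower. Data-center GPUs such as the A100 and H100 have no RT cores, so for 3DGRT prefer an RTX-class GPU (Turing or newer).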
1. Log in to the Yotta Labs Console.
2. Navigate to Compute → Pods and click Deploy.
3. Select RTX 5090 as the GPU.
4. Under Pod Template, choose pytorch. Set System Volume to at least 150 GB.
5. Click Deploy and wait for the Pod to reach the Running state.
3. Install & Build 3DGRUT
Download and Install the OptiX SDK
3DGRUT requires the NVIDIA OptiX SDK for ray tracing. The SDK is a self-extracting shell script that can be installed entirely in your home directory — no sudo required.
1. Log in to your NVIDIA Developer account.
2. Go to the OptiX Legacy Downloads page.
3. Download OptiX SDK 8.0.0 for Linux 64-bit.
4. Transfer the .sh file to your VM (via scp, wget with a direct link, etc.).
Once the file is on your VM:
Set the environment variable so the build system can find it:
4. Prepare Your Scene Data
3DGRUT trains from a set of posed images — photos of your scene from multiple angles, along with camera intrinsics and extrinsics. The standard input is COLMAP, but NeRF-Synthetic JSON is also supported.
Expected Directory Structure (COLMAP)
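A COLMAP scene is conventionally laid out as an `images/` folder next to `sparse/0/`, which holds the `cameras`, `images`, and `points3D` files in either `.bin` or `.txt` form. Assuming 3DGRUT's COLMAP loader expects that conventional layout (worth confirming against the repository), a small pre-flight check might look like the sketch below. The function name `missing_colmap_entries` is made up for this example.

```python
from pathlib import Path


def missing_colmap_entries(scene_dir):
    """Return the conventional COLMAP entries that are absent from scene_dir."""
    root = Path(scene_dir)
    missing = []
    if not (root / "images").is_dir():
        missing.append("images/")
    for stem in ("cameras", "images", "points3D"):
        # COLMAP writes either binary or text models; accept both.
        found = any(
            (root / "sparse" / "0" / f"{stem}{ext}").is_file()
            for ext in (".bin", ".txt")
        )
        if not found:
            missing.append(f"sparse/0/{stem}.(bin|txt)")
    return missing
```

An empty list means the scene folder has everything the conventional layout requires. Running this before training can catch a misplaced `sparse/` folder early, before a long job has started.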
5. Train the Model
Training uses Hydra for configuration. There are separate configs for 3DGRT and 3DGUT. The full training loop runs for 30,000 iterations and covers several distinct phases.
Understanding the Training Phases
| Phase | Iterations | What happens |
| --- | --- | --- |
| Warmup | 0 – 500 | Low learning rate, coarse geometry |
| Densification | 500 – 15,000 | Clone under-reconstructed Gaussians, split large ones every 100 iters |
| Opacity reset | 15,000 – 25,000 | Periodically zero out low-opacity Gaussians to prune floaters |
| Final refinement | 25,000 – 30,000 | Fine-tune colors and covariances, no more densification |
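When reading training logs, it can help to map an iteration number back to its phase. The sketch below simply encodes the phase boundaries listed above. Treating each range as half-open (start inclusive, end exclusive) is an assumption, and the helper names are hypothetical rather than part of 3DGRUT.

```python
PHASES = (
    ("Warmup", 0, 500),
    ("Densification", 500, 15_000),
    ("Opacity reset", 15_000, 25_000),
    ("Final refinement", 25_000, 30_000),
)


def phase_for_iteration(iteration):
    """Return the phase name for an iteration in [0, 30000]."""
    last_name, _, last_end = PHASES[-1]
    if iteration == last_end:  # count the final step as part of the last phase
        return last_name
    for name, start, end in PHASES:
        if start <= iteration < end:
            return name
    raise ValueError(f"iteration {iteration} is outside the training schedule")


def is_densify_step(iteration, interval=100):
    """True on the every-100-iterations clone/split steps of densification."""
    return (
        phase_for_iteration(iteration) == "Densification"
        and iteration % interval == 0
    )
```

For instance, `phase_for_iteration(12_000)` falls in the densification range, so a jump in Gaussian count or memory at that point in a log is expected rather than a sign of trouble.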
During densification, the Gaussian count grows from ~50,000 (sparse SfM seed) to several million. Expect GPU memory usage to rise during this phase.
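A back-of-the-envelope estimate shows why memory climbs. The sketch below assumes the standard 3DGS parameterization (3 position, 4 rotation, 3 scale, 1 opacity, plus RGB spherical-harmonic coefficients), float32 storage, and four copies of every parameter during training (the value, its gradient, and two Adam moments). 3DGRUT's actual layout may differ, so treat the numbers as an order-of-magnitude guide only.

```python
def gaussian_param_count(sh_degree=3):
    """Floats per Gaussian under the standard 3DGS parameterization."""
    sh_coeffs = 3 * (sh_degree + 1) ** 2  # RGB spherical harmonics
    return 3 + 4 + 3 + 1 + sh_coeffs  # position, rotation, scale, opacity, SH


def training_memory_bytes(num_gaussians, sh_degree=3, bytes_per_float=4, copies=4):
    """Parameter-related memory: values, gradients, and two Adam moments."""
    return num_gaussians * gaussian_param_count(sh_degree) * bytes_per_float * copies


if __name__ == "__main__":
    for count in (50_000, 3_000_000):
        gib = training_memory_bytes(count) / 2**30
        print(f"{count:>9,} Gaussians: about {gib:.2f} GiB of parameter state")
```

Under these assumptions, parameter state grows from tens of megabytes at the SfM seed to a few gigabytes at three million Gaussians. That figure excludes image buffers, acceleration structures, and intermediate activations.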
Training will produce:
outputs/room_3dgut/<experiment_name>/ckpt_last.pt: final checkpoint
outputs/room_3dgut/<experiment_name>/ours_7000/ckpt_7000.pt: intermediate checkpoint at 7000 iterations
outputs/room_3dgut/<experiment_name>/ours_30000/ckpt_30000.pt: checkpoint at 30000 iterations
outputs/room_3dgut/<experiment_name>/metrics.json: evaluation metrics (PSNR, SSIM, LPIPS)
outputs/room_3dgut/<experiment_name>/parsed.yaml: full resolved config
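If a run is interrupted before `ckpt_last.pt` is written, the newest numbered checkpoint is the one to resume from or view. The helper below is a small sketch that relies only on the file naming shown above. `latest_checkpoint` is a hypothetical name, not a 3DGRUT utility. It compares iteration numbers numerically, because a plain string sort would rank `ckpt_7000.pt` after `ckpt_30000.pt`.

```python
import re
from pathlib import Path

CKPT_RE = re.compile(r"ckpt_(\d+)\.pt$")


def latest_checkpoint(run_dir):
    """Return ckpt_last.pt if present, else the highest-iteration checkpoint."""
    run = Path(run_dir)
    last = run / "ckpt_last.pt"
    if last.is_file():
        return last
    numbered = []
    for path in run.glob("ours_*/ckpt_*.pt"):
        match = CKPT_RE.search(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    if not numbered:
        return None  # nothing has been saved yet
    return max(numbered, key=lambda item: item[0])[1]
```

Pass it the experiment directory, for example `latest_checkpoint("outputs/room_3dgut/<experiment_name>")` with your own experiment name substituted in.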
Example Training Results

6. View the Training Results
The Viser GUI provides a web-based interactive 3D viewer accessible via your browser. This is the best option for remote servers.
1. Install viser:
2. Launch the viewer with your pre-trained checkpoint:
You should see:
3. On your local machine, set up SSH port forwarding:
4. Open in your browser:
5. Navigate the scene:
On startup you may see a black screen. This is normal. Use your mouse to navigate:
Left-click drag — Rotate
Right-click drag — Pan
Scroll wheel — Zoom
Navigate to the training camera views to see the reconstructed scene.

Keep an eye on the 3DGRUT GitHub for updates — NVIDIA's team ships improvements regularly. Happy training! 🎉