Wenqing Cui1, Zhenyu Li1,†, Mykola Lavreniuk2,†, Jian Shi1, Ramzi Idoughi1, Xiangjun Tang1, Peter Wonka1.
1KAUST, 2Space Research Institute NASU-SSAU
†Equal contribution
- 2026-02-27: Initial release.
- 2026-02-20: Accepted to CVPR 2026.
Requirements: Python ≥ 3.10, CUDA 12.4
conda create -n urgt python=3.10 -y
conda activate urgt
pip install -r requirements.txt

Download checkpoints from HuggingFace:
https://huggingface.co/Kingslanding/Any-Resolution-Any-Geometry/tree/main
| Checkpoint | Training data | Recommended use |
|---|---|---|
| `ckpt_best.pth` | U4K dataset | U4K benchmark evaluation |
| `ckpt_promask_best.pth` | U4K dataset with PRO model masks | Zero-shot evaluation |
Place the downloaded checkpoints under work_dir/ckpts/:
work_dir/
└── ckpts/
    ├── ckpt_best.pth
    └── ckpt_promask_best.pth
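
Before running inference, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library; the helper name `missing_checkpoints` is hypothetical and not part of the repository, and the file names are taken from the checkpoint table.

```python
from pathlib import Path

# File names taken from the checkpoint table in this README.
EXPECTED = ("ckpt_best.pth", "ckpt_promask_best.pth")


def missing_checkpoints(ckpt_dir="work_dir/ckpts", expected=EXPECTED):
    """Return the expected checkpoint names that are absent from ckpt_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that they are complete or loadable.
    """
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```

An empty list means both files are present under `work_dir/ckpts/`.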
You can also download directly from the command line:
mkdir -p work_dir/ckpts
huggingface-cli download Kingslanding/Any-Resolution-Any-Geometry \
ckpt_best.pth ckpt_promask_best.pth \
--local-dir work_dir/ckpts

The inference script runs the full URGT pipeline on a single image:
- Depth Anything v2 → coarse relative depth
- Metric3D v2 → coarse surface normals
- URGT refiner → high-resolution refined depth + normals
python tools/infer.py \
--image path/to/image.jpg \
--checkpoint work_dir/ckpts/ckpt_best.pth \
--output-dir ./output

If you already have coarse depth/normal maps (.npy), pass them directly to skip the first two steps (Depth Anything v2 and Metric3D v2):
python tools/infer.py \
--image path/to/image.jpg \
--checkpoint work_dir/ckpts/ckpt_best.pth \
--coarse-depth path/to/coarse_depth.npy \
--coarse-normal path/to/coarse_normal.npy \
--output-dir ./output

For example, to run on work_dir/examples/lab_8k.jpg and also save the coarse intermediates:

python tools/infer.py \
--image work_dir/examples/lab_8k.jpg \
--checkpoint work_dir/ckpts/ckpt_best.pth \
--output-dir work_dir/examples/output \
--save-intermediates

| Argument | Default | Description |
|---|---|---|
| `--image` | (required) | Path to input RGB image (JPG/PNG) |
| `--checkpoint` | (required) | Path to URGT checkpoint (.pth) |
| `--output-dir` | same dir as image | Directory to save results |
| `--save-intermediates` | off | Also save coarse depth/normal visualisations |
| `--dav2-encoder` | `vitl` | Depth Anything v2 encoder: vits / vitb / vitl / vitg |
| `--coarse-depth` | `None` | Pre-computed coarse depth (.npy, shape [H, W]); skips DAv2 |
| `--metric3d-model` | `ViT-Small` | Metric3D v2 variant: ViT-Small / ViT-Large / ViT-giant2 |
| `--coarse-normal` | `None` | Pre-computed coarse normal (.npy, shape [H, W, 3]); skips Metric3D |
| `--patch-split` | `8 8` | Patch grid N_H N_W; image is resized to be divisible by these values |
| `--min-depth` | `0.001` | Minimum depth value in metres |
| `--max-depth` | `80.0` | Maximum depth value in metres |
| `--device` | auto | `cuda` or `cpu` (defaults to CUDA when available) |
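
When supplying your own files to `--coarse-depth` and `--coarse-normal`, a quick shape check can catch layout mistakes early. The sketch below only verifies the shapes listed in the table ([H, W] and [H, W, 3]); the function name `check_coarse_inputs` is hypothetical, and any further requirement of `tools/infer.py` (dtype, value range, matching resolution) is not covered because this README does not state it.

```python
import numpy as np


def check_coarse_inputs(depth_path, normal_path):
    """Load coarse maps and verify the shapes given in the argument table.

    Hypothetical helper for illustration. Returns (depth_shape, normal_shape)
    or raises ValueError when a shape does not match.
    """
    depth = np.load(depth_path)
    normal = np.load(normal_path)

    # --coarse-depth is documented as a 2-D array of shape [H, W].
    if depth.ndim != 2:
        raise ValueError(f"coarse depth must be [H, W], got {depth.shape}")

    # --coarse-normal is documented as a 3-D array of shape [H, W, 3].
    if normal.ndim != 3 or normal.shape[2] != 3:
        raise ValueError(f"coarse normal must be [H, W, 3], got {normal.shape}")

    return depth.shape, normal.shape
```

For example, `check_coarse_inputs("path/to/coarse_depth.npy", "path/to/coarse_normal.npy")` can be called before launching the inference command.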
For an input named image.jpg, the following files are written to --output-dir:
| File | Description |
|---|---|
| `image_depth_pred.png` | Colour-mapped refined depth |
| `image_depth_pred.npy` | Raw refined depth array, shape [H, W] |
| `image_normal_pred.png` | Colour-mapped refined surface normals |
| `image_normal_pred.npy` | Raw refined normal array, shape [H, W, 3], range [-1, 1] |
| `image_coarse_depth.png` | (with `--save-intermediates`) Colour-mapped coarse depth |
| `image_coarse_depth.npy` | (with `--save-intermediates`) Raw coarse depth array |
| `image_coarse_normal.png` | (with `--save-intermediates`) Colour-mapped coarse normals |
| `image_coarse_normal.npy` | (with `--save-intermediates`) Raw coarse normal array |
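
The raw `.npy` outputs can be loaded with NumPy for downstream use. The following is a minimal sketch, assuming the file naming shown in the table (`<stem>_depth_pred.npy`, `<stem>_normal_pred.npy`); the function names `load_predictions` and `summarize` are hypothetical and not part of the repository. Whether the normals are exactly unit length is not stated here, so the sketch only reports their mean length.

```python
from pathlib import Path

import numpy as np


def load_predictions(output_dir, stem):
    """Load refined depth [H, W] and normals [H, W, 3] for one input image.

    `stem` is the input file name without extension, e.g. "image" for image.jpg.
    """
    out = Path(output_dir)
    depth = np.load(out / f"{stem}_depth_pred.npy")
    normal = np.load(out / f"{stem}_normal_pred.npy")
    return depth, normal


def summarize(depth, normal):
    """Return basic statistics that are handy for a quick sanity check."""
    lengths = np.linalg.norm(normal, axis=-1)  # per-pixel normal length
    return {
        "depth_shape": depth.shape,
        "depth_min": float(depth.min()),
        "depth_max": float(depth.max()),
        "normal_shape": normal.shape,
        "mean_normal_length": float(lengths.mean()),
    }


if __name__ == "__main__":
    # Paths follow the examples in this README; adjust to your own run.
    depth, normal = load_predictions("./output", "image")
    print(summarize(depth, normal))
```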
If you find our work useful for your research, please consider citing the paper:
@inproceedings{cui2026resolutiongeometrymultiviewmultipatch,
title={Any Resolution Any Geometry: From Multi-View To Multi-Patch},
author={Cui, Wenqing and Li, Zhenyu and Lavreniuk, Mykola and Shi, Jian and Idoughi, Ramzi and Tang, Xiangjun and Wonka, Peter},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}