arXiv:2511.20853

MODEST

Multi-Optics Depth-of-Field Stereo Dataset

Nisarg K. Trivedi, Vinayak A. Belludi, Li-Yun Wang, Pardis Taghavi, Dante Lok

MODEST is the first high-resolution (5472 × 3648 px) stereo DSLR dataset, with 18,000 images that systematically vary focal length and aperture across complex real scenes. It was captured with two identical Canon 6D camera assemblies at 10 focal lengths (28-70 mm) and 5 apertures (f/2.8-f/22), giving 50 optical configurations across 9 diverse scenes.

Dataset Comparison

Stereo Dataset and Depth-model Evaluation

Dataset Capture Setup Real/Syn Resolution Focal Var. Aperture Var. Light Var. Depth Range Calib.
KITTI ('12) Stereo RGB + LiDAR Real 1242 × 375 0.5m - 80m
NYU Depth V2 ('12) Mono RGB-D (Kinect v1) Real 640 × 480 0.5m - 10m
TUM RGB-D ('12) Mono RGB-D (Kinect v1) Real 640 × 480 0.5m - 5m
vKITTI ('16) Virtual stereo (Unity) Synthetic 1242 × 375 0.5m - 100m
ScanNet ('17) Mono RGB-D (iPad) Real 640-1296 × 480-968 0.2m - 10m
Matterport3D ('17) Multi-cam (360° RGB-D) Real 1280 × 1024 0.2m - 20m
iBims-1 ('18) Mono RGB (DSLR) Real 640-1500 × 480-1000 0.5m - 10m
VOID ('20) Mono RGB-D (RealSense) Real 640 × 480 50m
DIML outdoor ('20) Stereo RGB (ZED) Real 1920 × 1080 0.5m - 100m
HAMMER ('22) Multi-cam (RGB-P + ToF) Real 1280 × 720 10m
ScanNet++ ('24) Mono RGB-D (iPad) Real 1920 × 1440 0.2m - 20m
MODEST ('25) Stereo RGB (Canon 6D) Real 5472 × 3648 0.5m - 10m

Multi-view Stereo

Dataset Capture Setup Real/Syn Resolution Focal Var. Aperture Var. Light Var. Depth Range Calib.
ETH3D ('17) Multi-cam (DSLR + LiDAR) Real 2048 × 1536 0.5m - 50m
Replica ('19) Mono RGB (DSLR + LiDAR) Synthetic 1080 × 1080 0.1m - 10m
DDAD ('20) Multi-cam (shutter + LiDAR) Real 1600 × 900 0.5m - 100m
MODEST ('25) Stereo RGB (Canon 6D) Real 5472 × 3648 0.5m - 10m

3D Static and Dynamic Scene Generation

Dataset Capture Setup Real/Syn Resolution Focal Var. Aperture Var. Light Var. Depth Range Calib.
Sintel ('12) Synthetic (3D film) Synthetic 1024 × 436 0m - 80m
DIODE ('19) Mono RGB-D (FARO Focus) Real 1024 × 768 0.5m - 350m
MODEST ('25) Stereo RGB (Canon 6D) Real 5472 × 3648 0.5m - 10m

Shallow Depth-of-Field Rendering

Dataset Capture Setup Real/Syn Resolution Focal Var. Aperture Var. Light Var. Depth Range Calib.
DPDD ('19) Mono RGB Real 1680 × 1120 0.3m - 10m
BLB ('22) Synthetic (Blender) Synthetic 1920 × 1080 0.5m - 10m
VABD ('24) Mono RGB Real 1536 × 1024 N/A
MODEST ('25) Stereo RGB (Canon 6D) Real 5472 × 3648 0.5m - 10m

Optical Illusions

Dataset Capture Setup Real/Syn Resolution Focal Var. Aperture Var. Light Var. Depth Range Calib.
3D-Visual-Illusion ('25) Stereo RGB + LiDAR Real + Syn 1080 × 1920 N/A N/A 0.5m - 50m
MODEST ('25) Stereo RGB (Canon 6D) Real 5472 × 3648 0.5m - 10m

Visualizations & Results

Dataset Overview

Dataset Samples

Depth Estimation Results

Aperture Analysis

Analysis of different aperture values and their impact on depth of field
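
To make the aperture effect concrete, the sketch below evaluates the standard thin-lens depth-of-field approximation for the five MODEST apertures. It is an illustration only, not the analysis behind the figures: the function name `dof_limits`, the 0.03 mm circle of confusion (a common full-frame convention), the 50 mm focal length, and the 2 m subject distance are all assumptions chosen for the example.

```python
import math

def dof_limits(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Near and far limits of acceptable sharpness, in millimetres.

    Thin-lens approximation. coc_mm is an assumed circle of confusion,
    not a value taken from the dataset.
    """
    f2 = focal_mm ** 2
    blur = f_number * coc_mm * (subject_mm - focal_mm)
    near = subject_mm * f2 / (f2 + blur)
    # Beyond the hyperfocal distance the far limit extends to infinity.
    far = subject_mm * f2 / (f2 - blur) if f2 > blur else math.inf
    return near, far

# Assumed example: 50 mm lens focused at 2 m, swept over the MODEST apertures.
for n in (2.8, 5.0, 9.0, 16.0, 22.0):
    near, far = dof_limits(50.0, n, 2000.0)
    print(f"f/{n:g}: sharp from {near / 1000:.2f} m to {far / 1000:.2f} m")
```

Under these assumptions the sharp zone widens steadily from f/2.8 to f/22, which is the trend the aperture sweep is designed to expose.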

Bokeh & Defocus Blur Analysis

Deblurring Results

Visual Illusions

Challenging visual illusions included in the dataset to test algorithm robustness

3D Scene Reconstruction - COLMAP & OpenMVS

3D Gaussian Splatting

Novel view synthesis using 3D Gaussian Splatting on MODEST dataset

Download Dataset

Dataset Structure

MODEST/
├── Global_calibration_set/
│   ├── EOS6D_A_Left/
│   │   └── fl_<focal_length>/
│   │       ├── calibration/
│   │       │   └── rectified/
│   │       └── inference/
│   ├── EOS6D_B_Right/
│   │   └── fl_<focal_length>/
│   │       ├── calibration/
│   │       │   └── rectified/
│   │       └── inference/
│   └── stereocal_rectified_calibration_<focal_length>/
│
├── Scene<id>/
│   ├── EOS6D_A_<Left|Right>/
│   │   └── fl_<focal_length>/
│   │       ├── calibration/
│   │       │   └── rectified/
│   │       └── inference/
│   │           ├── F<aperture>/
│   │           └── rectified/
│   │
│   └── EOS6D_B_<Left|Right>/
│       └── fl_<focal_length>/
│           ├── calibration/
│           │   └── rectified/
│           └── inference/
│               ├── F<aperture>/
│               └── rectified/
│
└── ...

Notes

  • <focal_length> ∈ {28mm, 32mm, 36mm, 40mm, 45mm, 50mm, 55mm, 60mm, 65mm, 70mm}
  • <aperture> ∈ {F2.8, F5.0, F9.0, F16.0, F22.0}
  • Scene<id> identifies one of the captured scenes; every scene is captured under the same set of optical configurations
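
The layout above can be walked programmatically. The following is a minimal sketch, assuming the aperture folders are named exactly as the values in the notes (for example `F2.8`) and that focal-length folders take the form `fl_28mm`. The helper `inference_dirs` and the scene name `Scene1` are hypothetical and used only for illustration.

```python
from itertools import product
from pathlib import PurePosixPath

FOCAL_LENGTHS = ["28mm", "32mm", "36mm", "40mm", "45mm",
                 "50mm", "55mm", "60mm", "65mm", "70mm"]
APERTURES = ["F2.8", "F5.0", "F9.0", "F16.0", "F22.0"]

def inference_dirs(root, scene, camera):
    """List the per-aperture inference folders for one scene and camera.

    Folder naming is an assumption read off the tree above.
    """
    base = PurePosixPath(root) / scene / camera
    return [base / f"fl_{fl}" / "inference" / ap
            for fl, ap in product(FOCAL_LENGTHS, APERTURES)]

# "Scene1" is a hypothetical scene id.
dirs = inference_dirs("MODEST", "Scene1", "EOS6D_A_Left")
print(len(dirs))   # 50
print(dirs[0])     # MODEST/Scene1/EOS6D_A_Left/fl_28mm/inference/F2.8
```

`PurePosixPath` keeps the example independent of the local filesystem; swap in `Path` and check `exists()` when working against a real download.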

Dataset Statistics

  • Total Images: 18,000
  • Resolution: 5472 × 3648 px
  • Scenes: 9 indoor environments
  • Focal Lengths: 10 (28mm - 70mm)
  • Apertures: 5 (f/2.8 - f/22)
  • Camera: Canon 6D DSLR
  • Depth Range: 0.5m - 10m
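
As a quick consistency check on these figures, the arithmetic below relates the configuration count to the image total. It assumes the 18,000 images are spread evenly over scenes, cameras, and configurations, and that the total excludes the calibration sets. Neither assumption is stated on this page, so the per-configuration figure is an implied average, not a documented number.

```python
focal_lengths = 10
apertures = 5
scenes = 9
cameras = 2          # left and right Canon 6D bodies
total_images = 18_000

configs = focal_lengths * apertures            # optical configurations
slots = configs * scenes * cameras             # (scene, camera, config) combinations
images_per_slot = total_images / slots         # implied average under the assumptions above

print(configs, slots, images_per_slot)         # 50 900 20.0
```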