Learning for 3D Vision

Course Description

Any autonomous agent we develop must perceive and act in a 3D world. The ability to infer, model, and utilize 3D representations is therefore of central importance in AI, with applications ranging from robotic manipulation and self-driving to virtual reality and image manipulation. While 3D understanding has been a longstanding goal in computer vision, it has seen several impressive advances due to rapid recent progress in (deep) learning techniques. The goal of this course is to explore this confluence of 3D Vision and Learning-based methods. In particular, this course will cover topics including:

• Explicit, Implicit, and Neural 3D Representations
• Differentiable Rendering
• Single-view 3D Prediction: Objects, Scenes, and Humans
• Neural Rendering
• Multi-view 3D Inference: Radiance Fields, Multi-plane Images, Implicit Surfaces, etc.
• Generative 3D Models
• Shape Abstraction
• Mesh and Point cloud processing

Format and Prerequisites

The course will be lecture-based, and grades will primarily be determined by assignments and a final project. The course requires good coding skills and an understanding of the basics of Computer Vision (e.g., image formation, ray optics) and Machine Learning (e.g., optimization, neural networks).

Course Staff

Please use the course Piazza page for all communication with course staff. See this office hour schedule for in-person meetings.

Course Instructor

Shubham Tulsiani

Teaching Assistants

George Wei

Hanzhe Hu

Jianren Wang

Related Courses

If you found this course useful, you may also be interested in the following related courses:

Learning for 3D Vision by Angjoo Kanazawa, UC Berkeley
3D Vision by Derek Hoiem, UIUC
Physics-based Rendering by Ioannis (Yannis) Gkioulekas, CMU
Machine Learning for Inverse Graphics by Vincent Sitzmann, MIT
Geometry-based Methods in Vision, CMU
Machine Learning meets Geometry by Hao Su, UCSD

Previous offerings: Spring 2025, Spring 2024, Spring 2023, Spring 2022