Research Scientist

Meta

Biography

I’m a Research Scientist at the Codec Avatars Lab, Meta. I received my Ph.D. from the Australian National University, where I was supervised by Hongdong Li and Yasuyuki Matsushita, and my B.Eng. degree from Shanghai Jiao Tong University.

My research focuses on reconstructing photorealistic humans in 3D, creating the next generation of digital avatars.

If you are interested in interning at Meta with me, please send an email with your resume and research interests.

Interests
  • Photorealistic humans in 3D
  • Next-generation digital avatars

Experience

Research Scientist
Codec Avatars Lab, Meta
Jul 2023 – Present · Pittsburgh, United States
Research Intern
Tencent
Feb 2023 – May 2023 · Canberra, Australia
Research Scientist Intern
Reality Labs Research, Meta
Jun 2022 – Dec 2022 · Pittsburgh, United States

News

  • One paper (Oral) is accepted by ICCV 2025.

  • Two papers are accepted by SIGGRAPH 2025.

  • Three papers are accepted by CVPR 2025.

  • One paper is accepted by SIGGRAPH Asia 2024.

  • One paper is accepted by CVPR 2024.

  • Two papers are accepted by CVPR 2023.

Recent Publications

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars
A universal prior model, HairCUP, explicitly disentangles hair and face components to enable flexible hairstyle swapping and the creation of high-fidelity 3D head avatars from only a few images.
Relightable Full-body Gaussian Codec Avatars
The first drivable, full-body avatar that can be realistically relighted is introduced, employing a new method to manage complex lighting effects on an articulated body.
3DGH: 3D Head Generation with Composable Hair and Face
A novel generative model, 3DGH, creates a wide variety of 3D heads by freely composing different hair and face components.
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images
Personalized and animatable 3D avatars are reconstructed with a fast, feed-forward method from just a few images, removing the need for per-subject optimization.
LUCAS: Layered Universal Codec Avatars
High-fidelity, real-time 3D avatars efficient enough for mobile devices are created using a layered model that separates the hair and face.
Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior
Authentic, animatable 3D avatars are generated from challenging videos captured “in the wild” by leveraging a universal prior model.
URAvatar: Universal Relightable Gaussian Codec Avatars
We present URAvatar, a high-fidelity universal prior for relightable avatars that lets you create your own avatar (URAvatar, “Your Avatar”) from a phone scan.
Relightable Gaussian Codec Avatars
We build high-fidelity relightable & animatable head avatars with 3D-consistent sub-millimeter details such as hair strands and pores on dynamic face sequences.
MEGANE: Morphable Eyeglass and Avatar Network

We propose a 3D compositional morphable model of eyeglasses that accurately incorporates high-fidelity geometric and photometric interaction effects.

We employ a hybrid representation that combines surface geometry and a volumetric representation to enable modification of geometry, lens insertion and frame deformation.

Our model is relightable under point lights and natural illumination, and can synthesize cast shadows between the face and glasses.

In-the-wild Inverse Rendering with a Flashlight

We propose a practical photometric solution for in-the-wild inverse rendering under unknown ambient lighting.

It recovers scene geometry and reflectance using only multi-view images captured by a smartphone.

The key idea is to exploit the smartphone’s built-in flashlight as a minimally controlled light source, and to decompose images into two photometric components: a static appearance corresponding to the ambient flux, plus a dynamic reflection induced by the flashlight.
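The two-component decomposition can be illustrated with a toy example. This is a minimal numerical sketch under an additive image-formation assumption, not the paper's actual pipeline:

```python
import numpy as np

# Sketch (not the paper's implementation): assuming the flashlight adds light
# on top of the ambient illumination, the flash-induced reflection is the
# difference between a flash-on and a flash-off capture.
ambient = np.array([[0.4, 0.6], [0.5, 0.3]])  # static appearance (ambient flux)
flash = np.array([[0.1, 0.2], [0.05, 0.15]])  # dynamic reflection from the flashlight

img_no_flash = ambient
img_flash_on = ambient + flash  # additive image-formation assumption

# Differencing the two captures isolates the flash-induced component.
flash_component = img_flash_on - img_no_flash
assert np.allclose(flash_component, flash)
```

In practice the dynamic component is what carries the controlled photometric cues, since the flashlight's position and intensity are approximately known relative to the camera.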

Self-calibrating Photometric Stereo by Neural Inverse Rendering

Introduced a self-supervised neural network for the uncalibrated photometric stereo problem.

The object’s surface shape and the light sources are jointly estimated by the neural network in an unsupervised manner.
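For context, the classical *calibrated* setting that this work relaxes can be written in a few lines. The sketch below is an illustrative toy example assuming known light directions and a Lambertian surface, not the paper's method (which estimates the lights jointly with the shape):

```python
import numpy as np

# Calibrated Lambertian photometric stereo baseline: with known light
# directions, the albedo-scaled normal is recovered per pixel by linear
# least squares.
n_true = np.array([0.2, -0.3, 0.93])
n_true = n_true / np.linalg.norm(n_true)  # ground-truth unit surface normal
rho = 0.7                                 # Lambertian albedo

# Five known light directions, all front-facing for this normal (no shadows).
L = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.87],
              [0.0, 0.5, 0.87],
              [-0.5, 0.0, 0.87],
              [0.0, -0.5, 0.87]])
L = L / np.linalg.norm(L, axis=1, keepdims=True)

# Lambertian image formation per pixel: i = rho * max(0, n . l)
i_obs = rho * np.clip(L @ n_true, 0.0, None)

# With b = rho * n as the unknown, the model is linear: solve L @ b = i.
b, *_ = np.linalg.lstsq(L, i_obs, rcond=None)
rho_est = np.linalg.norm(b)
n_est = b / rho_est
assert np.allclose(n_est, n_true, atol=1e-6)
assert np.isclose(rho_est, rho, atol=1e-6)
```

In the uncalibrated problem the matrix `L` is unknown, which introduces a bas-relief-type ambiguity; resolving it jointly with the shape is what the self-supervised network addresses.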

Neural Reflectance for Shape Recovery with Shadow Handling

Formulated shape estimation and material estimation in a self-supervised framework, explicitly predicting shadows to mitigate estimation errors.

Achieved state-of-the-art performance in surface normal estimation while being an order of magnitude faster than previous methods.

The proposed neural representation of reflectance also delivers higher quality in the object relighting task than prior works.

Neural Plenoptic Sampling: Learning Light-field from Thousands of Imaginary Eyes

Proposed a neural representation for the plenoptic function, which describes light rays observed from any given position in every viewing direction.

Proposed a proxy depth reconstruction and a color-blending network to achieve a good approximation of the complete plenoptic function.

The generated results are of high quality, with better PSNR than previous methods, and the proposed method is also more than 10 times faster than prior works in both training and testing.

Lighting, Reflectance and Geometry Estimation from 360° Panoramic Stereo
Learning to Minify Photometric Stereo

Dramatically decreases the demands of the photometric stereo problem by reducing the number of input images.

Automatically learns the critical and informative illuminations required at input.

A Frequency Domain Neural Network for Fast Image Super-resolution

A frequency domain neural network for image super-resolution.

Employs the convolution theorem to cast convolutions in the spatial domain as products in the frequency domain.

The network is very computationally efficient at test time, one to two orders of magnitude faster than previous works.
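The underlying identity can be sanity-checked numerically. This is a generic 1-D illustration of the convolution theorem, not code from the paper:

```python
import numpy as np

# Convolution theorem check: circular convolution in the spatial domain
# equals an element-wise product in the frequency domain (1-D for brevity).
N = 16
rng = np.random.default_rng(0)
x = rng.normal(size=N)  # signal
k = rng.normal(size=N)  # kernel, padded to the signal length

# Spatial domain: circular convolution computed directly from its definition.
conv_spatial = np.array([sum(x[(n - m) % N] * k[m] for m in range(N))
                         for n in range(N)])

# Frequency domain: FFT both, multiply point-wise, inverse FFT.
conv_freq = np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)).real

assert np.allclose(conv_spatial, conv_freq)
```

The speedup comes from replacing sliding-window convolutions with point-wise products, at the cost of an FFT and inverse FFT per feature map.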

Stereo Super-resolution via a Deep Convolutional Network

A deep network for image super-resolution that takes stereo images as input. The network is designed to efficiently combine structural information across large image regions.

By learning the residual image, the network copes better with vanishing gradients and is devoid of gradient clipping operations.
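The residual-learning idea can be sketched in a toy example. This is a generic illustration, not the paper's architecture; `upsample_nearest` and `residual_branch` are hypothetical stand-ins:

```python
import numpy as np

# Residual learning for super-resolution: the network predicts only the
# residual added to an upsampled input, so the identity part of the
# low-res -> high-res mapping requires no learning.
def upsample_nearest(img, factor=2):
    """Hypothetical stand-in for a bicubic or learned upsampler."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def residual_branch(img):
    """Hypothetical residual predictor; a real one would be a deep CNN."""
    return np.zeros_like(img)  # an untrained branch can simply output zeros

low_res = np.arange(4.0).reshape(2, 2)
base = upsample_nearest(low_res)
high_res = base + residual_branch(base)  # output = input + learned residual

# Even with a zero residual the output equals the upsampled input, which is
# why learning only the high-frequency residual eases optimization.
assert np.allclose(high_res, base)
```

Because the residual is near zero over most of the image, its gradients stay well-scaled during training, which is the motivation stated above.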