PIFu: Pixel-Aligned Implicit Function for
High-Resolution Clothed Human Digitization

Shunsuke Saito1,2 *
Zeng Huang1,2 *
Ryota Natsume3 *
Shigeo Morishima3
Angjoo Kanazawa4
Hao Li1,2,5

University of Southern California1
USC Institute for Creative Technologies2
Waseda University3
University of California, Berkeley4

[Code (coming soon)]

Our approach can digitize intricate variations in clothing, such as wrinkled skirts, high heels, and complex hairstyles. Shape and texture can be fully recovered in largely unseen regions such as the back of the subject. Our method can also be extended to multi-view input images.

We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles and clothing, as well as their variations and deformations, can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu can produce high-resolution surfaces including largely unseen regions such as the back of a person. In particular, unlike voxel representations, it is memory efficient, it can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to an arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real-world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms prior work for clothed human digitization from a single image.
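To make the idea of a pixel-aligned implicit function more concrete, the following is a minimal sketch in Python with NumPy, not the released implementation. It queries a single 3D point: the point is projected into the image, a feature vector is sampled bilinearly at that pixel, and a small network maps the feature to an inside/outside probability. Several parts are assumptions made only for illustration: the orthographic camera in `project_orthographic`, the use of the point's depth as an extra network input, the random feature map standing in for a learned image encoder, and the untrained toy network from `make_toy_mlp`. All function names are hypothetical.

```python
import numpy as np


def bilinear_sample(feat, xy):
    """Sample a (C, H, W) feature map at continuous pixel coordinates (x, y)."""
    _, h, w = feat.shape
    x = float(np.clip(xy[0], 0.0, w - 1.0))
    y = float(np.clip(xy[1], 0.0, h - 1.0))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * feat[:, y0, x0] + ax * feat[:, y0, x1]
    bottom = (1 - ax) * feat[:, y1, x0] + ax * feat[:, y1, x1]
    return (1 - ay) * top + ay * bottom


def project_orthographic(point, h, w):
    """Assumed camera: x and y in [-1, 1] map to pixel coordinates, z is the depth."""
    px = (point[0] + 1.0) * 0.5 * (w - 1)
    py = (point[1] + 1.0) * 0.5 * (h - 1)
    return np.array([px, py]), float(point[2])


def make_toy_mlp(in_dim, hidden=16, seed=0):
    """Untrained two-layer network with a sigmoid output (illustration only)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(size=(in_dim, hidden))
    w2 = rng.normal(size=(hidden,))

    def mlp(v):
        hid = np.maximum(v @ w1, 0.0)  # ReLU hidden layer
        return float(1.0 / (1.0 + np.exp(-(hid @ w2))))  # value in [0, 1]

    return mlp


def pixel_aligned_query(feat, point, mlp):
    """Evaluate the implicit function at one 3D point.

    The image feature is taken at the pixel the point projects to, so the
    prediction stays spatially aligned with the input image.
    """
    _, h, w = feat.shape
    xy, depth = project_orthographic(point, h, w)
    local_feature = bilinear_sample(feat, xy)
    return mlp(np.concatenate([local_feature, [depth]]))


# Stand-in for the output of an image encoder: 8 channels on a 32x32 grid.
features = np.random.default_rng(1).normal(size=(8, 32, 32))
occupancy_net = make_toy_mlp(in_dim=8 + 1)
occupancy = pixel_aligned_query(features, np.array([0.1, -0.2, 0.3]), occupancy_net)
```

Because the function is evaluated per point, the surface can be queried at any resolution without storing a dense voxel grid, which is the memory argument made above. In a full pipeline the queries would be run over a grid of points and a surface extracted from the resulting field.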


Saito*, Huang*, Natsume*, Morishima, Kanazawa, Li.

PIFu: Pixel-Aligned Implicit Function for
High-Resolution Clothed Human Digitization.

ICCV 2019.

[pdf]     [Bibtex]

Paper Video

Single-View Reconstruction

Multi-View Reconstruction

Support Arbitrary Number of Views

Single-View Video Reconstruction




Hao Li is affiliated with the University of Southern California, the USC Institute for Creative Technologies, and Pinscreen. This research was conducted at USC and was funded in part by the ONR YIP grant N00014-17-S-FO14, the CONIX Research Center, one of six centers in JUMP, a Semiconductor Research Corporation program sponsored by DARPA, the Andrew and Erna Viterbi Early Career Chair, the U.S. Army Research Laboratory under contract number W911NF-14-D-0005, Adobe, and Sony. This project was not funded by Pinscreen, nor has it been conducted at Pinscreen or by anyone else affiliated with Pinscreen. Shigeo Morishima is supported by the JST ACCEL Grant Number JPMJAC1602, JSPS KAKENHI Grant Number JP17H06101, and the Waseda Research Institute for Science and Engineering. Angjoo Kanazawa is supported by the Berkeley Artificial Intelligence Research sponsors. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. This webpage template was borrowed from colorful folks.