Fanyi Xiao

Email: fyxiao at ucdavis dot edu

I am a Research Scientist working on computer vision at Meta AI. Previously, I worked at Amazon AI on the AWS Rekognition team. Before that, I completed my PhD at the University of California, Davis, advised by Prof. Yong Jae Lee.

During my PhD, I was very fortunate to spend time at Disney Research working with Prof. Leonid Sigal, at NVIDIA Research with Dr. Xiaodong Yang and Dr. Ming-Yu Liu, and at Facebook AI Research (FAIR) with Dr. Christoph Feichtenhofer, Prof. Kristen Grauman, and Prof. Jitendra Malik.

I'm mostly interested in multimodal learning with minimal human supervision as well as video understanding. Here is a talk I gave recently.

We have internship openings to work on a broad range of topics including visual language pretraining for object detection, low-shot and efficient detection, etc. Drop me an email if you're interested!

CV / Scholar / GitHub


3/22 -- We are releasing the EgoObjects dataset, the first large-scale dataset focused on object detection in egocentric video. Check it out!
3/22 -- Our paper on hierarchical pretraining for movie understanding was accepted to CVPR 2022!
12/21 -- I joined Meta AI as a Research Scientist focusing on object and scene understanding for Augmented Reality.
09/20 -- Our paper on adaptive anti-aliasing won the Best Paper Award at BMVC 2020!


Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo
Computer Vision and Pattern Recognition (CVPR), 2022

MoDist: Motion Distillation for Self-supervised Video Representation Learning
Fanyi Xiao, Joseph Tighe, Davide Modolo
Surprising effectiveness of a simple motion prior for video SSL

YolactEdge: Real-time Instance Segmentation on the Edge
Haotian Liu*, Rafael A. Rivera-Soto*, Fanyi Xiao, Yong Jae Lee
IEEE International Conference on Robotics and Automation (ICRA), 2021
[arXiv] [Code] [Talk] [Demo] [Colab Notebook]
Run instance segmentation on your Jetson device

Delving Deeper into Anti-aliasing in ConvNets
Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yong Jae Lee
British Machine Vision Conference (BMVC), 2020
[Project] [Code] [Talk]
Best Paper Award

Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, Christoph Feichtenhofer

YOLACT++: Better Real-time Instance Segmentation
Daniel Bolya*, Chong Zhou*, Fanyi Xiao, Yong Jae Lee
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)
YOLACT++ (v1.2) code released

Identity from here, Pose from there: Self-supervised Disentanglement and Generation of Objects using Unlabeled Videos
Fanyi Xiao, Haotian Liu, Yong Jae Lee
International Conference on Computer Vision (ICCV), 2019

YOLACT: Real-time Instance Segmentation
Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee
International Conference on Computer Vision (ICCV), 2019
oral presentation
code available!

STEP: Spatio-Temporal Progressive Learning for Video Action Detection
Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz
Computer Vision and Pattern Recognition (CVPR), 2019
oral presentation

Video Object Detection with an Aligned Spatial-Temporal Memory
Fanyi Xiao and Yong Jae Lee
European Conference on Computer Vision (ECCV), 2018

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks
Wenjian Hu, Krishna Kumar Singh*, Fanyi Xiao*, Jinyoung Han, Chen-Nee Chuah and Yong Jae Lee
ACM International Conference on Web Search and Data Mining (WSDM), 2018
* equal contribution

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
Fanyi Xiao, Leonid Sigal and Yong Jae Lee
Computer Vision and Pattern Recognition (CVPR), 2017

Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals
Fanyi Xiao and Yong Jae Lee
Computer Vision and Pattern Recognition (CVPR), 2016
spotlight presentation

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
Krishna Singh, Fanyi Xiao and Yong Jae Lee
Computer Vision and Pattern Recognition (CVPR), 2016

Discovering the Spatial Extent of Relative Attributes
Fanyi Xiao and Yong Jae Lee
International Conference on Computer Vision (ICCV), 2015
oral presentation

Efficient Model Evaluation with Bilinear Separation Model
Fanyi Xiao and Martial Hebert
Winter Conference on Applications of Computer Vision (WACV), 2015

Transitive Distance Clustering with K-Means Duality
Zhiding Yu, Chunjing Xu, Deyu Meng, Zhuo Hui, Fanyi Xiao, Wenbo Liu, Jianzhuang Liu
Computer Vision and Pattern Recognition (CVPR), 2014

Physical Querying with Multi-modal Sensing
Iljoo Baek, Taylor Stine, Denver Dash, Fanyi Xiao, Yaser Sheikh, Yair Movshovitz-Attias, Mei Chen, Martial Hebert, and Takeo Kanade
Winter Conference on Applications of Computer Vision (WACV), 2014

Industry Experience

Facebook AI Research (Summer 2019)
Developed an audiovisual network architecture for video understanding

NVIDIA Research (Summer 2017)
Developed a novel method for action detection

Disney Research (Summer 2016)
Developed a novel model for free-form language grounding on images