
I’m Wei Wen (温伟), an AI Research Scientist at Meta. I lead a team building efficient large language models, vision language models, and image and video understanding and generation for AI applications on AR/VR/XR devices and Facebook/Instagram. I have been working on LLMs (e.g. MobileLLM, Llama), on-device GenAI, AI scalability and efficiency, automated generation of AI models, and more. I obtained my Ph.D. from Duke University. Before Meta, I worked on Automated Machine Learning (AutoML), efficient deep neural networks, and distributed deep learning at Google Brain (now part of Google DeepMind), Facebook AI, Microsoft Research and HP Labs.

P. Huber*, E. Chang*, W. Wen*, I. Fedorov*, T. Elgamal, H. Huang, N. Suda, C. Sankar, V. Vogeti, Y. Wang, A. Gladkov, K. S. Tai, A. Elogeel, T. Hefny, V. Chandra, A. Aly, A. Kumar, R. Krishnamoorthi**, A. Sagar**, "MobileLLM-Pro Model Card", 2025 [HuggingFace]
C. Zhao*, E. Chang*, Z. Liu*, C.-J. Chang, W. Wen, C. Lai, R. Cao, Y. Tian, R. Krishnamoorthi, Y. Shi, V. Chandra, "MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes", preprint 2025 [paper]
Wei Wen et al., “Llama Log: a Dataset of Llama Pre-training Log in the Wild”, under internal review, 2024
Z. Zeng, X. Liu, M. Hang, X. Liu, Q. Zhou, C. Yang, Y. Liu, Y. Ruan, L. Chen, Y. Chen, Y. Hao, J. Xu, J. Nie, X. Liu, B. Zhang, W. Wen, S. Yuan, H. Yin, X. Zhang, K. Wang, W.-Y. Chen, Y. Han, H. Li, C. Yang, B. Long, P. S. Yu, H. Tong, J. Yang, "InterFormer: Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction", International Conference on Information and Knowledge Management (CIKM), 2025
Wei Wen, Q. Zhu, W. Chu, W.-Y. Chen, and J. Yang. “CubicML: Automated ML for Distributed ML Systems Co-design with ML Prediction of Performance”, Workshop on ML for Systems, Neural Information Processing Systems (NeurIPS) 2024
Hua Zheng, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Wen-Yen Chen, Wei Wen, "SiGeo: Sub-One-Shot NAS via Geometry of Loss Landscape", ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) 2024. Acceptance Rate: 20%. [paper]
Wei Wen*, Kuang-Hung Liu*, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen, “Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale”, The Web Conference (WWW) 2024. Acceptance Rate: 21.3%. [paper]
- * equal contributors
Hang Yin*, Kuang-Hung Liu*, Mengying Sun, Yuxin Chen, Buyun Zhang, Jiang Liu, Vivek Sehgal, Rudresh Rajnikant Panchal, Eugen Hotaj, Xi Liu, Daifeng Guo, Jamey Zhang, Zhou Wang, Shali Jiang, Huayu Li, Zhengxing Chen, Wen-Yen Chen, Jiyan Yang, Wei Wen †, “AutoML for Large Capacity Modeling of Meta's Ranking Systems”, The Web Conference (WWW) 2024. Acceptance Rate: 21.3%. [paper]
- † Project lead
Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen †, “NASRec: Weight Sharing Neural Architecture Search for Recommender Systems”, International World Wide Web Conference (WWW) 2023. Acceptance Rate: 365/1900=19.2%. [paper]
- †Corresponding author. Intern Manager
Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Yiran Chen, Hai Li, “SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning”, preprint. [paper][code]
Wei Wen, Hanxiao Liu, Hai Li, Yiran Chen, Gabriel Bender, Pieter-Jan Kindermans, “Neural Predictor for Neural Architecture Search”, European Conference on Computer Vision (ECCV) 2020. [Video][slides][paper][supplementary]
Wei Wen, Feng Yan, Yiran Chen, Hai Li, “AutoGrow: Automatic Layer Growing in Deep Convolutional Networks”, SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2020. Acceptance Rate: 216/1279=16.8%. [video][slides][paper][code]
Huanrui Yang, Wei Wen, Hai Li, “DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures”, International Conference on Learning Representations (ICLR) 2020. [paper]
Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong. “Trained Rank Pruning for Efficient Deep Neural Networks”, International Joint Conference on Artificial Intelligence (IJCAI) 2020. [paper]
Nathan Inkawhich, Wei Wen, Yiran Chen, Hai Li. “Feature Space Perturbations Yield More Transferable Adversarial Examples.” The Conference on Computer Vision and Pattern Recognition (CVPR) 2019. [paper]
Sangkug Lym, Esha Choukse, Siavash Zangeneh, Wei Wen, Sujay Sanghavi, and Mattan Erez. “PruneTrain: fast neural network training by dynamic sparse model reconfiguration.” International Conference for High Performance Computing, Networking, Storage and Analysis (SC). 2019. (Best Student Paper Finalist, 3.5%)
Sangkug Lym, Armand Behroozi, Wei Wen, Ge Li, Yongkee Kwon, and Mattan Erez. “Mini-batch Serialization: CNN Training with Inter-layer Data Reuse.” The Conference on Machine Learning and Systems (MLSys). 2019
Jingchi Zhang, Wei Wen, Michael Deisher, Hsin-Pai Cheng, Hai Li, Yiran Chen, “Learning Efficient Sparse Structures in Speech Recognition”, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019. [paper]
Xuyang Guo, Yuanjun Huang, Hsin-pai Cheng, Bing Li, Wei Wen, Siyuan Ma, Hai Li, Yiran Chen, “Exploration of Automatic Mixed-Precision Search for Deep Neural Networks”, IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2019. (Best Paper Nomination)
Wei Wen, Yuxiong He, Samyam Rajbhandari, Minjia Zhang, Wenhan Wang, Fang Liu, Bin Hu, Yiran Chen, Hai Li, “Learning Intrinsic Sparse Structures within Long Short-Term Memory”, the 6th International Conference on Learning Representations (ICLR), 2018. [poster][paper][code]
Hsin-Pai Cheng, Yuanjun Huang, Xuyang Guo, Feng Yan, Yifei Huang, Wei Wen, Hai Li, Yiran Chen, “Differentiable Fine-grained Quantization for Deep Neural Network Compression”, NeurIPS 2018 CDNNRIA Workshop. [paper]
Wei Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning”, the 31st Annual Conference on Neural Information Processing Systems (NeurIPS), 2017. (Oral, 40/3240=1.2%. Available in PyTorch/Caffe2.) [paper][video][slides][code][poster]
Wei Wen, Cong Xu, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Coordinating Filters for Faster Deep Neural Networks”, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. [paper][code][poster]
Yandan Wang, Wei Wen, Linghao Song, Hai Li, “Classification Accuracy Improvement for Neuromorphic Computing Systems with One-level Precision Synapses”, ASP-DAC, 2017. (Best Paper Award, 2/358=0.56%). [paper]
Chunpeng Wu, Wei Wen, Tariq Afzal, Yongmei Zhang, Yiran Chen, Hai Li, “A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation”, CVPR, 2017. [paper]
Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, Pradeep Dubey, “Faster CNNs with Direct Sparse Convolutions and Guided Pruning”, the 5th International Conference on Learning Representations (ICLR), 2017. [paper][code][media]
Yandan Wang, Wei Wen, Beiye Liu, Donald Chiarulli, Hai Li, “Group Scissor: Scaling Neuromorphic Computing Design to Big Neural Networks”, 54th ACM/EDAC/IEEE Design Automation Conference (DAC), 2017. Acceptance Rate: 24%. [paper]
Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Learning Structured Sparsity in Deep Neural Networks”, the 30th Annual Conference on Neural Information Processing Systems (NeurIPS), 2016. Acceptance Rate: 568/2500=22.7%. (Integrated into Intel Nervana) [paper][code][poster]
Wei Wen, Chunpeng Wu, Yandan Wang, Kent Nixon, Qing Wu, Mark Barnell, Hai Li, Yiran Chen, “A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip”, 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016. Acceptance Rate: 152/876=17.4%. (Best Paper Nomination, 16/876=1.83%). [paper]
Wei Wen, Chi-Ruo Wu, Xiaofang Hu, Beiye Liu, Tsung-Yi Ho, Xin Li, Yiran Chen, “An EDA Framework for Large Scale Hybrid Neuromorphic Computing Systems”, 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), 2015. Acceptance Rate: 162/789=20.5%. (Best Paper Nomination, 7/789=0.89%). [paper]
Jongsoo Park, Sheng R. Li, Wei Wen, Hai Li, Yiran Chen, Pradeep Dubey, “Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size”, arXiv 1608.01409, 2016. (in Intel Developer Forum 2016, pages 41-43). [paper][code]

Industrial Experience

Talks and Presentations

University of Illinois Urbana-Champaign, Automated Design of Recommender Systems, 04/2024
Rutgers University, Rutgers EFficient AI (REFAI) Seminar, Efficient Deep Learning - on Automated Design, Distributed Training and Edge Inference, 06/29/2021
Microsoft Research, Speaker, Efficient and Scalable Deep Learning, 10/10/2019
Rice University, Guest Lecture, ELEC 515 Embedded Machine Learning, 10/28/2019.
UC Berkeley, Scientific Computing and Matrix Computations Seminar, “On Matrix Sparsification and Quantization for Efficient and Scalable Deep Learning”, 10/10/2018
Cornell University, AI Seminar, “Efficient and Scalable Deep Learning”, 10/05/2018
NIPS 2017 oral presentation, TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning, 12/6/2017
Alibaba DAMO Academy, “Deep Learning in Cloud-Edge AI Systems”, Sunnyvale, CA, 06/28/2018
“Deep Learning in the Cloud and in the Fog”, [Blog@AI科技评论]
“Deep Learning in Cloud-Edge AI Systems”, [Video in Mandarin @将门创投]
“Lifting Efficiency in Deep Learning – For both Training and Inference”, [Video in Mandarin @机器之心]
“Scalable Event-driven Neuromorphic Learning Machines 3”, Intel Strategic Research Alliances (ISRA) – UC Berkeley, UC Irvine, Univ of Pitt, UCSD, 10/27/2016
“A Predictive Performance Model of Distributed Deep Learning on Heterogeneous Systems”, Final Intern Talk, HP Labs, 08/23/2016
“Variation-Aware Predictive Performance Model for Distributed Deep Learning”, Summer Intern Fair Poster, HP Labs, 08/02/2016
“An Overview of Deep Learning Accelerator”, Seminar, HP Labs, 07/18/2016

Activities

TPC, DAC 2022
Reviewer for NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, TPAMI, IJCV, TNNLS, TCAD, Neurocomputing, TCBB, ICME, TCSVT, etc.
Activity volunteer, Machine Learning for Girls, FEMMES (Females Excelling More in Math, Engineering, and Science) Capstone at Duke University, 02/2018
Conference volunteer, ESWEEK 2016, October 2-7, Pittsburgh, PA, USA, 10/2016

Teaching

TA: CEE 690/ECE 590: Introduction to Deep Learning, Duke University, Fall 2018
TA: STA561/COMPSCI571/ECE682: Probabilistic Machine Learning, Duke University, Spring 2019

Education

Ph.D. in Electrical and Computer Engineering, Duke University, 08/2014-12/2019
- Advisors: Dr. Hai (Helen) Li and Dr. Yiran Chen
- Spent the first 3 years at the University of Pittsburgh, then moved to Duke with my advisors.
M.S. in Electronic and Information Engineering, Beihang University, Beijing, China, 09/2010-01/2013
B.S. in Electronic and Information Engineering, Beihang University, Beijing, China, 09/2006-07/2010

Media

Duke Electrical and Computer Engineering Ph.D. program cover page, “Q&A: Wei Wen. Making deep learning models faster & more efficient,” accessed on February 14, 2020 at https://ece.duke.edu/phd/students/wen.
People of ACM – Yiran Chen, “One of your most cited recent papers is “Learning Structured Sparsity in Deep Neural Networks,” which addresses the importance of enhancing the efficiency of deep neural networks”, accessed on September 24, 2022 at https://www.acm.org/articles/people-of-acm/2022/yiran-chen.

Publications