Dr. Furu Wei is a Partner Research Manager (全球研究合伙人) at Microsoft Research Asia, where he leads and oversees research on Foundation Models (across tasks, languages and modalities), NLP, MT, Speech and Multimodal AI. More recently, he has also been driving mission-focused research on General AI, with an emphasis on fundamental research into the Foundation of A(G)I. Furu received his B.S. and Ph.D. in computer science from Wuhan University in 2004 and 2009, respectively. He was a Staff Researcher at IBM Research - China (IBM CRL) from Jul. 2009 to Nov. 2010, and a Research Assistant at the Department of Computing, The Hong Kong Polytechnic University, from Jan. 2007 to Jun. 2009.


Furu has published over 200 research papers in prestigious conferences and journals in natural language processing and artificial intelligence, including ACL, EMNLP, NAACL, COLING, Computational Linguistics, ICML, NeurIPS, ICLR, SIGIR, KDD, AAAI, and IJCAI. According to Google Scholar, his H-index is 90 with more than 40,000 citations (as of 2024). As a co-author, Furu received the Best Paper Runners Up award at AAAI 2021 and the Best Student Paper award at KDD 2018. Furu served as a Senior Area Chair for ACL 2021, and as an Area Chair for EMNLP 2015, NAACL-HLT 2016, EMNLP 2019, and NeurIPS 2021. He has more than 20 patents filed or granted. The research from Furu and his team has been widely integrated into Microsoft products, including Office (Word, PowerPoint, Outlook and Microsoft Designer), Bing, Microsoft Ads, Azure (Cognitive Services), Dynamics, Windows, and LinkedIn.


Recently, Furu has been driving the agenda of the mission-focused research on advancing A(G)I for humanity, focusing on fundamental research of the Foundation of A(G)I. We are also committed to building the new foundation of A(G)I.

Our research has been pushing #TheBigConvergence of Foundation Models across tasks, languages, and modalities, including UniLM(-2) for language; InfoXLM, XLM-E for multilingual; BEiT(-2) for vision; WavLM, SpeechLM, VALL-E for speech; BEiT-3 for multimodal; Layout(X)LM(-2/3) as the multimodal document foundation model; MetaLM as the general-purpose foundation model; The Evolution of (M)LLMs (Multimodal LLMs): Kosmos-1/2/2.5/G, VALL-E.
Our research on Foundation Architecture has been pushing new architectures for foundation models and A(G)I, focusing on modeling generality and capability, as well as training stability and efficiency, including DeepNet (training stability), Magneto (modeling generality), Multiway Transformers (multimodal modeling), X-MoE (efficiency), and LEX Transformer (better position embedding and length extrapolation); The Revolution of Model Architecture: RetNet, BitNet, LongNet, and TorchScale (Library).
Science of Intelligence: Understanding the principles and theoretical boundary of (artificial general) intelligence, including "Why Can GPT Learn In-Context?".
LLMOps: Research and technology for building AI products with foundation models. We work on general technology for enabling AI capabilities with (M)LLMs, including MiniLLM (LLM Distillation), LLM Accelerator, Structured Prompting, Extensible Prompts, and Promptist. We also develop effective and efficient approaches to deploying large AI models in practice, including MiniLM(-2), xTune, EdgeFormer, and Aggressive Decoding.

Beyond these research achievements, the models are significant parts of Microsoft's own family of large AI (foundation) models, powering language and multimodal tasks and scenarios across Microsoft products. Moreover, our research tops public benchmarks and leaderboards across language, vision, speech, and multimodal tasks, and contributes substantially to the open source community through GitHub and Hugging Face.


Furu Wei was named to the first (2017) MIT Technology Review’s annual list of Innovators Under 35 China (MIT TR35 China) for contributions to natural language processing.


We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on Foundation Models and General AI, NLP, MT, Speech, Document AI and Multimodal AI, please send your resume to fuwei@microsoft.com.


microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

microsoft/torchscale: Neural Architecture for General AI

microsoft/lmops: General technology for enabling AI capabilities w/ (M)LLMs


Our mission-focused research on Advancing A(G)I for humanity | Foundation of A(G)I