Dr. Furu Wei is a Partner Research Manager (首席研究员/全球合伙人) at Microsoft Research Asia, Beijing, China, where he leads the Natural Language Processing group and oversees the team's research on Foundation Models (across tasks, languages and modalities), NLP, MT, Speech and Multimodal AI. He received his B.Sc. and Ph.D. from the Department of Computer Science at Wuhan University in 2004 and 2009, respectively. He was a Staff Researcher at IBM Research - China (IBM CRL) from Jul. 2009 to Nov. 2010, and a Research Assistant at the Department of Computing, The Hong Kong Polytechnic University, from Jan. 2007 to Jun. 2009.


Furu has published over 200 research papers in prestigious conferences and journals in natural language processing and artificial intelligence, including ACL, EMNLP, NAACL, COLING, Computational Linguistics, ICML, NeurIPS, ICLR, SIGIR, KDD, AAAI, and IJCAI. According to Google Scholar, his H-index is 70 with more than 20,000 citations (as of 2022). As a co-author, Furu received the Best Paper Runner-Up award at AAAI 2021 and the Best Student Paper award at KDD 2018. Furu served as a Senior Area Chair for ACL 2021 and as an Area Chair for EMNLP 2015, NAACL-HLT 2016, EMNLP 2019, and NeurIPS 2021. He has more than 20 patents filed or granted. The research from Furu and his team has been widely used in Microsoft products, including Office (Word, PowerPoint, Outlook and Microsoft Designer), Bing, Microsoft Ads, Azure (Cognitive Services), Dynamics, Windows, and LinkedIn.


In addition to research that pushes the state of the art in specific tasks and areas such as NLP, MT, Speech and Multimodal AI, Furu and his team have been pursuing mission-focused research with the long-term vision of advancing artificial general (adaptable & generalizable) intelligence, focusing on the generality, generalizability, and adaptability of AI. Specifically, the team's research on Foundation Models has been pushing large-scale AI and The Big Convergence of large-scale pre-training across tasks, languages, and modalities, including UniLM(-2) for language model pre-training; InfoXLM and XLM-E for multilingual pre-training; BEiT(-2) for vision pre-training; WavLM, SpeechLM, and VALL-E for speech pre-training; BEiT-3 for multimodal pre-training; Layout(X)LM(-2/3) as the first multimodal document foundation model; MetaLM as the general-purpose foundation model; and Multiway Transformers for multimodal modeling and Magneto (Foundation Transformers) for true general-purpose modeling, to name a few. The research has also been pushing fundamental AI, such as the TorchScale initiative, which focuses on fundamental research to improve modeling generality and capability as well as training stability and efficiency for Transformers at any scale, including DeepNet, Magneto (Foundation Transformers), and X-MoE. We also work on fundamental research and technology for building AI products with foundation models. For example, we develop effective and efficient approaches to deploying large AI models in practice, including MiniLM(-2), xTune, EdgeFormer, and Aggressive Decoding. We also have the LMOps initiative, which focuses specifically on general technology for enabling AI capabilities with LLMs and Generative AI models, including Extensible Prompts, Promptist, and Structured Prompting.
Beyond the research achievements, these models are significant parts of Microsoft's own family of large AI (foundation) models, powering language and multimodal tasks and scenarios across Microsoft products. Moreover, our research tops public benchmarks and leaderboards across language (GLUE, XTREME), vision (ADE20k, COCO), speech (SUPERB), and multimodal (NLVR2, VQAv2) tasks, and contributes substantially to the open source community through GitHub and Hugging Face.


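Furu Wei was named to the inaugural (2017) edition of MIT Technology Review’s annual list of Innovators Under 35 China (MIT TR35 China) for contributions to natural language processing. In December 2018, he received the Technology Innovation Figure (Rising Star) award of the China AI Heroes list. In October 2019, the innovations in unified pre-trained language models and machine reading comprehension were selected as a "World Leading Internet Scientific and Technological Achievement" at the 6th World Internet Conference. In December 2020, he was named a "Beijing Model Worker".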


We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on Foundation Models (aka large-scale pre-trained models) and AGI, NLP, MT, Speech, Document AI and Multimodal AI, please send your resume to fuwei@microsoft.com.


microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

microsoft/torchscale: Transformers at (Any) Scale

microsoft/lmops: General technology for enabling AI capabilities w/ LLMs and Generative AI models


Our mission-focused research on Advancing AGI: Adaptable & Generalizable Intelligence