Advancing AI for humanity
Retentive Network: A Successor to Transformer for Large Language Models
Jul 18, 2023
Kosmos-2.5: A Multimodal Literate Model
Sep 20, 2023
Large Language Model for Science: A Study on P vs. NP
Sep 13, 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jul 6, 2023
Kosmos-2: Grounding Multimodal Large Language Models (MLLMs) to the World
Jun 26, 2023
Kosmos-1: A Multimodal Large Language Model (MLLM)
Feb 28, 2023
WavMark: Watermarking for Audio Generation
Aug 24, 2023
VALL-E (X): Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Jan 6, 2023
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
Sep 19, 2023
AdaptLLM: Adapting Large Language Models via Reading Comprehension
Sep 18, 2023
MiniLLM: Knowledge Distillation of Large Language Models
Jun 14, 2023
LongMem: Augmenting Language Models with Long-Term Memory
Jun 12, 2023
LLM Accelerator: Lossless Acceleration of Large Language Models
Apr 11, 2023
A Length-Extrapolatable Transformer
Dec 20, 2022
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta Optimizers
Dec 20, 2022
Promptist: Optimizing Prompts for Text-to-Image Generation
Dec 19, 2022
Structured Prompting: Scaling In-Context Learning to 1,000 Examples
Dec 12, 2022
TorchScale: Transformers at (Any) Scale
Nov 24, 2022
Magneto: A Foundation Transformer
Oct 13, 2022
BEiT-3: A General-Purpose Multimodal Foundation Model
Aug 30, 2022
Language Models are General-Purpose Interfaces
Jun 13, 2022
DeepNet: Scaling Transformers to 1,000 Layers
Mar 1, 2022
BEiT: BERT Pre-Training of Image Transformers
Jun 15, 2021
XLM-E: Efficient Multilingual Language Model Pre-training
Jun 30, 2021
UniLM: Unified Language Model Pre-training
May 8, 2019