Home | Twitter / X | Google Scholar | Semantic Scholar

Hugging Face | Github | About Yao Fu

University of Edinburgh

I am a Ph.D. student at the University of Edinburgh (2020-), working with Professor Mirella Lapata. I finished my M.S. at Columbia University (2018-2020) with Professor John Cunningham and my B.S. at Peking University (2013-2018) with Professor Yansong Feng. Before my Ph.D., I spent a great time visiting Professor Alexander Rush at Cornell Tech (2019-2020).

I study large-scale generative models for human language. My research objective is to make large language models the next-generation computational platform and to build a language-model-based application ecosystem together with the community. I am broadly interested in scaling, long-context modeling, multimodality, reasoning, and efficiency.

Featured Research

arXiv 2024 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis [paper][Twitter/X]

arXiv 2024 | Retrieval Head Mechanistically Explains Long-Context Factuality [code][paper][Twitter/X]

ICML 2024 | Data Engineering for Scaling Language Models to 128K Context [code][paper][Twitter/X]