Yao Fu | Website | Blog

University of Edinburgh | [email protected]

[email protected]

June 29th 2023

Following the great success of ChatGPT, the release of LLaMA on February 24, 2023 reignited interest in instruction tuning. On March 18, Alpaca demonstrated that distilling data from a mature model can turn a smaller model into a decent chatbot, triggering a Cambrian explosion of LLaMA-based models. Yet just three months later, people began to recognize the various problems with training LLaMA on ChatGPT's data. This article reviews the development of LLaMA-based models over the past three months and discusses the next challenges of instruction tuning.

Disclaimer: This article is essentially a quick research memo, edited from the outline of my recent presentation with some cuts and additions. There is still much that the open-source community does not understand about building LLMs. I have tried my best to ensure that everything I reference or discuss is backed by solid evidence rather than rumor; much of the content comes from direct discussions with the original authors of the corresponding papers. Even so, my take may still be very wrong and many issues remain unresolved, so please feel free to comment directly beside the article and participate actively in the discussion. I will keep all comments that point out my errors. The truth becomes clearer with debate.

Table of Contents

1 - The origin

The first three papers:

Comparisons:

Then there's Self-instruct:

Important points about self-instruct: