Who are the true catalysts fueling AI progress?

Contribution to code either directly or indirectly connects all of the contributors above across the overlapping categories. Image created using authorswithcode app

Hint: They are hardly represented in popular top k lists

When a field of study achieves a significant practical breakthrough, it rapidly attracts capital and heightens societal awareness of its implications. AI had such a moment with ChatGPT. However, these breakthroughs often cast a spotlight that overshadows the genuine catalysts who have contributed incrementally over the years. The focus also tends to shift towards those who control the technology’s adoption and application. TIME’s recent list of the top 100 AI influencers exemplifies this form of marginalization, given its sparse representation of the true catalysts.

This article attempts to zero in on the true catalysts by factoring in years of effort by both researchers and practitioners, documented in academic papers and open-source code. While citations gauge a paper’s significance, the number of its implementations and the user feedback on them, evident in GitHub stars, often prove just as valuable, especially as a practical demonstration of an idea’s usefulness.

Identification of key researchers and practitioners advancing AI requires examining all published papers and their respective code implementations. It also requires factoring in the proportional contributions of each author to any given implementation. Such scrutiny offers genuine insight into a contributor’s significance. For instance, while some papers have no accompanying code, others have multiple implementations.
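To make the idea of proportional contribution concrete, here is a minimal sketch in Python. It is an illustration only, not the scoring method used by authorswithcode: it assumes that a repo's stars are split among its authors in proportion to their commit counts, and the repo names, author names, and numbers are hypothetical.

```python
from collections import defaultdict


def author_scores(repos):
    """Split each repo's stars among its authors in proportion to commits.

    Assumption for illustration: commit share is the measure of an
    author's proportional contribution to an implementation.
    """
    scores = defaultdict(float)
    for repo in repos:
        total_commits = sum(repo["commits"].values())
        if total_commits == 0:
            # No recorded commits, so there is nothing to attribute.
            continue
        for author, commit_count in repo["commits"].items():
            scores[author] += repo["stars"] * commit_count / total_commits
    return dict(scores)


# Hypothetical data: two implementations with overlapping authors.
repos = [
    {"name": "impl-a", "stars": 1000, "commits": {"alice": 30, "bob": 10}},
    {"name": "impl-b", "stars": 200, "commits": {"bob": 5}},
]

scores = author_scores(repos)
for author, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{author}: {score:.1f}")
```

In this toy data, an author with a large share of a popular repo can outrank one who is the sole author of a less popular repo. Other weightings, such as lines changed or recency of commits, would be equally plausible choices.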

Another crucial group propelling AI forward consists of educators, whose thought leadership and pedagogical clarity are pivotal in guiding young, emerging researchers and practitioners. Figures like Andrej Karpathy and Andrew Ng are renowned for their contributions and teachings. Karpathy is revered among both researchers and practitioners, and he even sponsors contributors directly on GitHub (which might be a lesser-known fact), so his absence from TIME’s 2023 list of top AI influencers is striking. Ironically, he is among the select few whose contribution to AI spans research, code, and exceptionally clear instruction.

There’s a fourth group that plays a pivotal role in augmenting AI progress: the disseminators. Comprised primarily of researchers and practitioners, they identify, comment on, and share their thoughts on noteworthy research papers, and at times craft demonstrations that implement the ideas described in the papers. Some also create content on platforms like YouTube or write newsletters. The roster of these contributors continues to grow. It includes individuals (in no particular order, sampled from authorswithcode) such as Ahsen Khaliq, Elvis Saravia, Jim Fan, Christopher Olah, Yannic Kilcher, Shital Shah, and Sebastian Ruder.

Both this list and that of the educators/thought leaders were manually curated. We acknowledge the possibility of overlooking some key players and apologize for any inadvertent omissions. We are committed to updating the list upon being notified of any oversights. Also, these categories are not mutually exclusive. For instance, Christopher Olah is also an active researcher who recently published a noteworthy paper on the interpretability of neural networks.

Most importantly, contribution to code, either directly or indirectly (through paper implementation), connects nearly all of them across categories. Collectively, they are the true catalysts fueling AI progress, and their output forms the bedrock upon which all stakeholders rely, whether business executives or even just critics. Any AI company relies on the work of these catalysts, and some of them are the founders (e.g. Demis Hassabis) or key players (e.g. most of OpenAI’s co-founders; see the note at the end of this article) in these companies.

The true catalysts

Here are snapshots of the true catalysts, bucketed into different categories:

  • Top Repo Authors — algorithmically harvested
  • Top Paper Authors — algorithmically harvested
  • Educators and thought leaders — manually curated
  • Disseminators — manually curated
  • Top Repo Authors who have enabled sponsorship on GitHub — algorithmically harvested

Links to the interactive versions of these are provided under the images below.

Top Repo Authors. Interactive version of this page
Top Paper Authors. Interactive version of this page
Educators. Interactive version of this page
Disseminators. Interactive version of this page

What is the objective of creating these lists?

The aim of identifying the true catalysts fueling AI progress is three-fold.

  • First, to address the credit assignment problem, so that credit is correctly attributed, especially from a technical contribution standpoint.
  • Second, to inspire both individuals and businesses to sponsor these authors. It could be a strategic advantage for businesses leveraging machine learning to directly invest in researchers and practitioners who open-source their code.
  • Finally, to create a direct channel for both rewarding and investing in them by encouraging them to activate GitHub sponsorship. Currently, around 2% of the contributors have enabled sponsorship options on GitHub. Initiatives like authorswithcode aim to encourage more to follow suit. As a result, businesses, particularly those deriving significant benefits from open source with permissive licenses, would have a direct channel to both reward the work of these contributors and invest in them.
Top Repo Authors open for sponsorship. Interactive version of this page

Limitations of our approach

There are certain limitations in our approach to identifying the true catalysts using algorithms. For instance, we mainly focus on code implementations that are associated with papers. However, some code implementations are not linked to any paper yet hold immense value. A few repositories by Andrej Karpathy that aren’t tied to any specific paper serve as prime examples. We may also have overlooked certain AI practitioners who contributed to repos not associated with a paper. Despite these limitations, a notable feature of our approach is the algorithmically curated list of top repo and paper contributions, which is updated regularly for interaction and discovery. Additionally, the ability to search, discover, and reference the contributions of researchers, practitioners, educators, thought leaders, and disseminators should hopefully be valuable for anyone chronicling a narrative around these true catalysts of AI progress.

Final thoughts

While figures like Sam Altman undoubtedly play a crucial role in garnering capital for AGI research, the journey to AGI is firmly anchored in the hands of researchers and practitioners, particularly those who open-source their work to accelerate collective progress. Moreover, game-changing breakthroughs need not necessarily emerge from well-funded research groups; instead, they might arise from researchers facing financial constraints, who could benefit from public support. This is especially pertinent for businesses already reaping the rewards of these researchers’ contributions. The claim about the potential of researchers isn’t hyperbolic, though it may be wishful thinking to expect businesses to realize the value of direct investment in them. Returning to the topic of game-changing breakthroughs, consider that a dog, operating on just a few watts of power, exhibits common sense and survival skills. Yet, these traits are largely absent in current machine-learning models, no matter how massive they may be.

Everyone mentioned in the article has a contribution page on authorswithcode, with the exception of Sam Altman. We couldn’t find any paper or repo association for him. This doesn’t mean he didn’t contribute to code or papers; we simply couldn’t locate his contributions. Among the other co-founders of OpenAI, Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, John Schulman, and Wojciech Zaremba have contribution pages on authorswithcode listing their contributions to papers and repos. The exceptions are Jessica Livingston and Pamela Vagata.