🟢 Understanding AI Minds

Last updated on August 7, 2024 by Sander Schulhoff

Takeaways
  • There are many different types of AIs
  • Basics of how LLMs work

Before delving into the rest of the course, it's important to grasp some fundamental concepts about various AIs and their functioning. This foundational knowledge will provide a clearer understanding of the material that follows.

Different AIs

The landscape of artificial intelligence is vast and varied[1], encompassing thousands, if not millions, of distinct models. These models boast a broad spectrum of capabilities and applications. Some are generative, engineered to create outputs such as images, music, text, and even videos. In contrast, others are discriminative, designed to classify or differentiate between various inputs, like an image classifier distinguishing between cats and dogs. This course, however, will concentrate solely on generative AIs.

Among generative AIs, only a select few possess the advanced capabilities that make them particularly useful for prompt engineering. In this course, we will primarily focus on ChatGPT and other Large Language Models (LLMs). The techniques we explore are applicable to most LLMs.

As we venture into the realm of image generation, we'll explore the use of Stable Diffusion and DALL-E.

How Large Language Models Work

Generative text AIs, such as GPT-3 and ChatGPT, are built on a complex type of neural network known as the transformer architecture, which comprises billions of artificial neurons. Here are some key points to understand about how these AIs work:

  1. At their core, these AIs are mathematical functions. Instead of a simple function like f(x) = x^2, think of them as functions with thousands of variables and thousands of possible outputs.
  2. These AIs process sentences by breaking them into units called tokens, which can be words or subwords. For example, the AI might read "I don't like" as "I", "don", "'t", "like". Each token is then converted into a list of numbers for the AI to process.
  3. The AIs generate text by predicting the next token based on the previous ones. For instance, after "I don't like", the AI might predict "apples"[2]. Each new token they generate is influenced by the previous tokens (see the sketch after this list).
  4. Unlike humans, who read from left to right or right to left, these AIs consider all tokens simultaneously.
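
As a concrete illustration of points 2 and 3, here is a minimal sketch that tokenizes a short prompt and then generates text one token at a time. It assumes the `tiktoken` package is available for tokenization, and the "model" is just a hypothetical lookup table standing in for the billions of learned parameters a real LLM would use to score every possible next token.

```python
# A minimal sketch of tokenization and next-token prediction.
# Assumes the `tiktoken` package is installed (pip install tiktoken);
# the "model" here is a toy lookup table, not a real LLM.
import tiktoken

# Point 2: break the prompt into tokens (integers) the model can process.
# Inside a real model, each integer is further mapped to a list of numbers
# (an embedding) before any computation happens.
enc = tiktoken.get_encoding("cl100k_base")
prompt = "I don't like"
token_ids = enc.encode(prompt)
print(token_ids)                              # a short list of integers
print([enc.decode([t]) for t in token_ids])   # the text piece behind each token

# Point 3: generate text by repeatedly predicting the next token.
# A real LLM computes a probability for every token in its vocabulary;
# this hypothetical table hard-codes a single guess per context.
toy_next_token = {
    "I don't like": " apples",
    "I don't like apples": ".",
}

text = prompt
while text in toy_next_token:
    text += toy_next_token[text]              # append the predicted next token
print(text)                                   # -> "I don't like apples."
```

The loop mirrors how real models generate text: each predicted token is appended to the input, and the extended sequence becomes the context for the next prediction.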

It is important to note that terms like "think", "brain", and "neuron" are metaphors used to describe the workings of these AIs. In reality, these models are mathematical functions, not biological entities. They don't "think" in the way humans do; they calculate based on the data they've been trained on.

Conclusion

Understanding the fundamental workings of AI is crucial as we delve deeper into this course. While it's tempting to anthropomorphize AI for easier understanding, it's essential to remember that these models are mathematical functions, not thinking beings. They operate based on data and algorithms, not human cognition. As we continue to explore and debate the nature and capabilities of AI, this foundational knowledge will serve as a guide, helping us navigate the complex and fascinating world of artificial intelligence.

Footnotes

  1. d2l.ai is a good resource for learning about how AI works.

  2. Please note that the authors do, in fact, enjoy apples. They are delicious.
