Commit

add-how-llms-work
ehumph committed Nov 7, 2023
1 parent 823fc32 commit b5eb955
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion 01-intro.Rmd
@@ -72,7 +72,9 @@ We must always be aware of the potential for harm and deliberately take steps to

Humans have been interacting with AI chatbots for years. In fact, Alan Turing is credited with coming up with the concept for chatbots as early as 1950. Chatbots are software-based systems that interact with humans, typically through text or speech inputs rather than code. They mimic some human activity [@wikipedia_chatbot_2023; @abdulla2022chatbots] based on these language inputs. They process the inputs using natural language processing, commonly abbreviated as NLP. NLP is a kind of AI that parses human text or speech to determine structures and patterns and extract meaning. NLP uses large amounts of language data (such as books and websites) to train AI systems to identify these structures and patterns. For example, the AI model might identify whether a sentence is a question or a statement by examining various features in a prompt, such as the inclusion of a question mark or the use of words often used in questions [@wikipedia_natural_2023; @cahn2017chatbot].
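
As a rough illustration of the "features in a prompt" idea above, here is a minimal sketch in Python. It is a hand-written rule, not a trained NLP model: the `QUESTION_WORDS` list and the function names `extract_features` and `looks_like_question` are assumptions made up for this example. A real system would learn which features matter from large amounts of language data.

```python
# Hypothetical list of words that often begin questions (an assumption
# for this sketch, not taken from any NLP library).
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how",
                  "is", "are", "do", "does", "can"}

def extract_features(sentence):
    """Turn a sentence into two simple yes/no features."""
    stripped = sentence.strip()
    words = stripped.lower().split()
    first_word = words[0] if words else ""
    return {
        "ends_with_question_mark": stripped.endswith("?"),
        "starts_with_question_word": first_word in QUESTION_WORDS,
    }

def looks_like_question(sentence):
    """Guess 'question' if either feature is present."""
    features = extract_features(sentence)
    return (features["ends_with_question_mark"]
            or features["starts_with_question_word"])

print(looks_like_question("What time is it"))
print(looks_like_question("The sky is blue."))
```

The point of the sketch is only that text can be reduced to features that a program can examine. Trained models weigh many such signals at once instead of relying on a short fixed rule.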

The methods used for chatbots have evolved over time. Now chatbots often utilize AI methods like [deep learning](https://en.wikipedia.org/wiki/Deep_learning) (which involves multiple layers of abstraction of the input data [@wikipedia_deep_learning_2023]) to extract meaning from the language data [@wikipedia_natural_2023]. Because the resulting models are trained on large quantities of text, they are often called large language models [@wikipedia_large_language_2023].
The methods used for chatbots have evolved over time. Now chatbots often utilize AI methods like [deep learning](https://en.wikipedia.org/wiki/Deep_learning) (which involves multiple layers of abstraction of the input data [@wikipedia_deep_learning_2023]) to extract meaning from the language data [@wikipedia_natural_2023]. Because the resulting models are trained on large quantities of text, they are often called large language models, or LLMs [@wikipedia_large_language_2023].

Although it might _seem_ like LLMs are talking to you when you interact with them, it's important to remember they aren't actually thinking. Instead, LLMs are simply putting together tokens, or parts of words, based on a huge set of numerical relationships between tokens (loosely, a distance matrix) built from the LLM's training data set. Essentially, an LLM's program figures out how frequently (and in what contexts) different words show up together in the training data. For example, the word "example" is often paired with the word "for" in the text for this course. An LLM trained on this course would then be more likely to create the phrase "for example" than the phrase "for apples", as the training data includes multiple instances of the first phrase but only one instance of the second. (To be precise, the LLM would predict tokens such as "ex", "am", and "ple", but we see the result as the word "example".) If you're interested in learning more, check out this excellent [visual article](https://ig.ft.com/generative-ai/) by the Financial Times (we are not affiliated with them).
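
The "for example" versus "for apples" idea can be sketched with a few lines of Python. This is a deliberately simplified illustration, not how a real LLM is built: it counts whole words rather than tokens, it only looks at the single preceding word, and the `training_text` and function names are made up for this example. Real LLMs learn these relationships with neural networks over far longer contexts.

```python
from collections import Counter, defaultdict

def count_next_words(text):
    """For each word, count which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def most_likely_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = following.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Toy "training data" (invented for illustration only).
training_text = (
    "for example we count words . "
    "for example we compare phrases . "
    "for apples we do nothing ."
)

following = count_next_words(training_text)
print(following["for"])                     # counts of words seen after "for"
print(most_likely_next(following, "for"))   # the most frequent follower
```

Because "example" follows "for" twice in the toy text and "apples" only once, the sketch picks "example" as the more likely continuation, which mirrors the frequency argument in the paragraph above.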

Although chatbots have been around for a while, the popularity of OpenAI's ChatGPT and DALL-E programs has sparked a recent surge of interest. These programs are particularly powerful in part because large amounts of computing power were used to train their models on very large datasets [@caldarini2022literature; @cahn2017chatbot]. Large language model AIs can be divided into two categories: those that can be reached using an internet browser, and those that can be reached using an integrated development environment (IDE).
