
What is Generative AI?
Generative AI combines two ideas: generation and Artificial Intelligence (AI). But what do these ideas mean? Simply put, AI means a computer program performs a task that would otherwise require a human. Combining this with generation makes it more exciting, because the program can create original material: the computer produces things on its own rather than returning content it was given in advance.
In this article, we will mainly cover the generation of text, an area called Natural Language Processing (NLP) and one of the most common uses of AI today, and we will break down how the technology works to make it less mysterious.
How does Generative AI Work?
Though recent innovations have skyrocketed its popularity and use, Generative AI is not a new concept. For example, Google Translate launched in 2006 and thus has been around for nearly 20 years.
Even Siri, introduced on the iPhone in 2011, sparked a major sensation at the time and is another example of generative AI.
Where did the buzz come from?
Let’s paint a picture. It’s March of 2023, and you are a high-school student about to take the SAT, a notoriously difficult test. But you are prepared: you have studied for months and are ready to ace it. You take the test and score 1400, a very respectable result that places you in the 90th percentile of test takers. Days later, however, OpenAI, a company in San Francisco, announces GPT-4 and claims that it scored 1410. You have just lost to a robot on a test you spent months studying for. Well, not exactly; the scenario is fictional, but GPT-4 really did score 1410 on the SAT, and with the recent introduction of GPT-5 the possibilities could be endless. Beyond test taking, these models can do a wide variety of other things, such as writing text or generating images.
What is the technology behind ChatGPT?
ChatGPT is based on the principle of language modelling. A language model (LM) takes a sentence, or a sequence of words, and guesses what follows. For example, if the sentence so far is “I want to…” the LM might predict “eat,” “go,” or “learn.” This is a very simple case, but when a model has been trained on billions of lines of text, it gets good at spotting which words naturally follow others.
In the early days of language modelling, models literally counted common patterns of words. Today, most models use neural networks: they don’t just count, they learn. This makes the predictions far more accurate, flexible, and sophisticated.
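The counting approach can be sketched in a few lines. This is a toy illustration, not how ChatGPT actually works: it builds a bigram table (which word follows which) from a tiny made-up corpus, where real models process billions of lines.

```python
from collections import Counter

# A tiny hypothetical corpus; real models train on billions of lines.
text = "i want to eat i want to go i want to learn"
words = text.split()

# Count which word follows which: a bigram model.
bigrams = Counter(zip(words, words[1:]))

def most_likely_next(word):
    # Keep only the pairs that start with the given word,
    # then return the continuation seen most often.
    candidates = {nxt: c for (prev, nxt), c in bigrams.items() if prev == word}
    return max(candidates, key=candidates.get)

print(most_likely_next("want"))  # → "to"
```

With enough counts, this simple table already captures that “to” almost always follows “want” in the corpus; neural networks generalize the same idea to patterns that counting alone cannot capture.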
That’s essentially how ChatGPT and similar tools work: they look at the context and generate the most likely continuation. However, because the system is always making assumptions based on probability, the generation may be inaccurate. Sometimes it produces the most likely answer when you wanted a less likely one. But that is how these models are trained: to produce the most probable result.
How are Language Models Created?
- Collect a very large compilation of text data (a corpus):
  - Wikipedia, books, Stack Overflow
  - Quora, public social media
  - GitHub, Reddit
- Ask the LM to predict the next word in a sentence:
  - Randomly truncate the last part of input sentences
  - Calculate probabilities for the missing words
  - Compare the predictions with the original data set and adjust the model
- Repeat over the whole corpus.
Once the machinery has been built, its job is simply to predict the next word. But how does it actually do that? The model’s neural network calculates the probability of different words, and if it guesses correctly, success! If it’s wrong, it adjusts its parameters, learns from the mistake, and tries again. This cycle repeats: predict, check against the real answer (the original data), adjust, and continue. With enough training, the LM gradually improves until its predictions converge on something useful.
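The predict–check–adjust cycle can be simulated without any real neural network. The sketch below uses a simple perceptron-style update on a hypothetical corpus: when the guess is wrong, the weight for the correct word goes up and the weight for the wrong guess goes down, standing in for the gradient updates a real network would make.

```python
# Toy sketch of the predict–check–adjust training cycle.
corpus = "i want to eat . i want to go . i want to learn".split()
vocab = sorted(set(corpus))

# One weight per (context word, next word) pair, all starting at zero.
weights = {(c, n): 0.0 for c in vocab for n in vocab}

def predict(context):
    # The model's guess: the highest-scoring next word for this context.
    return max(vocab, key=lambda w: weights[(context, w)])

for epoch in range(20):
    for context, target in zip(corpus, corpus[1:]):
        guess = predict(context)
        if guess != target:
            # Wrong guess: nudge the weights toward the real answer.
            weights[(context, target)] += 1.0
            weights[(context, guess)] -= 1.0

print(predict("want"))  # → "to"
```

After a few passes over the corpus, the model reliably predicts “to” after “want”, because that pair appears consistently in the data; ambiguous contexts (several valid continuations) take longer to settle, just as the article describes.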
Are Large Language Models (LLMs) Always Right or Fair?
In short, no. Several factors limit their reliability:
- Unfiltered training data: These models are trained on massive amounts of internet text, so it’s impossible to moderate all of it.
- Bias: Because the models learn from human writing, they inevitably pick up social or historical bias.
- Unwanted behavior: Sometimes they produce false or irrelevant answers that sound convincing.
