How Do Generative Models Work?
How can they speak so well?
Note: This article originally appeared in my Betaflow publication on Medium. I will leave the link in the references section below.
AI is all the rage now, especially with Large Language Models that can talk like humans. Cool, right? But aren’t you just a little curious about how they actually work? The concept is pretty straightforward—not easy, since it takes serious engineering and massive resources, but definitely not too intimidating to understand.
If I’ve sparked your curiosity, let’s dive in!
Note: I might throw in a few simple equations, but don’t worry—understanding them isn’t essential to grasp the main ideas.
What Are Generative Models?
Generative models are systems that estimate the probability of observing some data (Y) given hidden or unknown factors (X). In other words, they learn a probability distribution over the kind of data they are trained on.

That’s a bit abstract, isn’t it? Let’s look at a real-world example:
Language Generative Models
Imagine we have a vocabulary with |V| words and a sentence of length L, with positions indexed as t = 1, 2, ..., L, where w(t) represents the word at position t in the sentence. Almost all language generative models follow this approach:
At each position t, they generate a distribution vector: a list of |V| probabilities, one for each word in the vocabulary, conditioned on the words that came before.
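Written as a simple equation (one of the ones I warned you about), the entry of that vector for a candidate word v is the conditional probability

$$P\big(w(t) = v \mid w(1), w(2), \ldots, w(t-1)\big), \qquad v \in V.$$

All |V| entries are non-negative and sum to 1, and the model "predicts" by favoring the words with the highest probability.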

Here’s a simple example: the goal is to generate the distribution vector for t=4, based on the words at positions t=1, 2, and 3. As humans, we know the correct word is "fast," but we want the machine to learn this on its own.
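Here is a minimal sketch of that step in Python. The vocabulary, the context words, and the score values below are all made up for illustration; in a real model a trained neural network produces the scores, but the final step of turning scores into a distribution vector (the softmax) looks just like this:

```python
import math

# A tiny made-up vocabulary (real models use tens of thousands of words/tokens).
vocab = ["the", "cheetah", "runs", "fast", "slow", "green"]

# Hypothetical context: the words at positions t = 1, 2, 3.
context = ["the", "cheetah", "runs"]

# Hypothetical scores ("logits") assigned to each vocabulary word for
# position t = 4, given the context. In practice a neural network computes these.
logits = [0.2, 0.1, 0.3, 4.0, 1.5, -2.0]

# Softmax turns the scores into a probability distribution vector:
# every entry is positive and the entries sum to 1.
exp_scores = [math.exp(s) for s in logits]
total = sum(exp_scores)
distribution = [s / total for s in exp_scores]

for word, p in zip(vocab, distribution):
    print(f"P({word} | {' '.join(context)}) = {p:.3f}")

# The highest-probability word here is "fast" -- exactly the word we hoped
# the model would learn to predict at position t = 4.
```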
Language models generally learn this in a similar way. Take a look at the following example using a 3-sentence dataset:
To predict the word “fast” at its position, generative models focus on capturing the “context”—the previous words and their distributions. This is a simplified explanation of what happens behind the scenes.
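To make "capturing the context" concrete, here is a sketch of the simplest possible version of this idea: counting how often each word follows a given pair of previous words in a small corpus. The three sentences below are invented stand-ins for the 3-sentence dataset mentioned above, and real language models use neural networks rather than raw counts, but the principle of conditioning on previous words is the same:

```python
from collections import Counter, defaultdict

# A hypothetical 3-sentence dataset (a stand-in for the one described above).
sentences = [
    "the cheetah runs fast",
    "the car goes fast",
    "the cheetah runs fast in the savanna",
]

# Count which word follows each pair of previous words (a simple "trigram" model).
counts = defaultdict(Counter)
for sentence in sentences:
    words = sentence.split()
    for i in range(2, len(words)):
        context = (words[i - 2], words[i - 1])
        counts[context][words[i]] += 1

def next_word_distribution(w1, w2):
    """Turn the counts seen after the context (w1, w2) into a distribution vector."""
    counter = counts[(w1, w2)]
    total = sum(counter.values())
    return {word: c / total for word, c in counter.items()}

# Given the context "cheetah runs", all the probability goes to "fast",
# because that is the only continuation this tiny dataset contains.
print(next_word_distribution("cheetah", "runs"))  # {'fast': 1.0}
```

Real models replace the counting with a neural network that can generalize to contexts it has never seen, but the object they produce at each position is the same: a probability distribution over the vocabulary.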
To dive deeper, explore these topics:
In upcoming articles, I’ll provide more visual explanations of these concepts to help interested readers grasp the key ideas behind them.
The link to the original article on Medium is here.




