Foundations of Large Language Models | Mr. Yousef Younis
Tokens · Training · Context Windows · Hallucinations · Limitations
| Term | Definition |
|---|---|
| LLM | Large Language Model — AI trained on massive text data to predict the next token |
| Token | The unit of text an LLM reads — approximately ¾ of a word, or ~4 characters |
| Training data | The text an LLM learns from (books, websites, code, Wikipedia…) |
| Fine-tuning | Additional training on a smaller dataset to specialise a model for a specific task |
| Context window | The maximum number of tokens an LLM can process at once — its short-term memory |
| Hallucination | When an LLM confidently states false or made-up information |
| Training cutoff | The date after which the model has no knowledge of new events |
| Bias | When a model reflects unfair or skewed patterns from its training data |
| Phrase | Your Estimate | Actual (approx) |
|---|---|---|
| "Hello world" | 2 tokens | |
| "Unbelievable" | 3 tokens (un Β· believ Β· able) | |
| "I love pizza" | 4 tokens | |
| "Artificial intelligence" | 5β6 tokens | |
| Write your own sentence: | | |
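The ~4-characters-per-token rule of thumb from the definitions table can be sketched as a rough estimator. This is illustration only: real tokenizers (e.g. OpenAI's tiktoken) split text by learned subword rules, not character counts, and `estimate_tokens` is a made-up helper name, not a library function.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token heuristic."""
    return max(1, round(len(text) / 4))

# Compare the heuristic against the phrases in the estimation table above.
for phrase in ["Hello world", "Unbelievable", "I love pizza", "Artificial intelligence"]:
    print(f"{phrase!r}: ~{estimate_tokens(phrase)} tokens")
```

The heuristic lands close to the table's estimates, which is exactly why it is taught as a rule of thumb; for exact counts you would run the model's actual tokenizer.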
| # | Cause | Example |
|---|---|---|
| 1 | Pattern matching without understanding | Invents a capital city for a made-up country |
| 2 | Gaps in training data | Guesses about a recent event it was never trained on |
| 3 | Ambiguous prompts | "Tell me about the Paris incident" → it assumes which one |
| Statement | Verdict | Why |
|---|---|---|
| "The Eiffel Tower was built 1887β1889" | β Real | Well-documented historical fact |
| "The Great Wall is visible from space with the naked eye" | β Myth | Widely repeated online β model learned the myth |
| "Einstein won the Nobel Prize for the theory of relativity" | β οΈ Partial | He won it for the photoelectric effect β most dangerous type |
"LLMs lie when they hallucinate"
"LLMs understand language like humans"
"More training data = no hallucinations"
"The AI remembers our past chats"
They predict plausible text β they don't know truth from fiction
Pattern recognition β comprehension
Gaps + ambiguous prompts still cause hallucinations
Context window = short-term only, resets each session
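The point about the context window being short-term memory can be sketched in code: once a conversation grows past the model's token limit, the oldest messages no longer fit and are dropped, so the model "forgets" them. The 4-tokens-per-message and 10-token window below are made-up numbers for illustration; real systems count tokens per message with a tokenizer.

```python
def truncate_to_window(messages, max_tokens, tokens_per_message=4):
    """Keep only the newest messages that fit inside the token budget."""
    kept = []
    total = 0
    # Walk newest-to-oldest, keeping messages while the budget allows.
    for msg in reversed(messages):
        if total + tokens_per_message > max_tokens:
            break
        kept.append(msg)
        total += tokens_per_message
    return list(reversed(kept))

chat = ["msg1", "msg2", "msg3", "msg4"]
print(truncate_to_window(chat, max_tokens=10))  # → ['msg3', 'msg4']
```

This is why "the AI remembers our past chats" is a myth: anything pushed out of the window is simply not part of the model's input anymore.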
1. In one sentence — what is a context window and what happens when it's exceeded?
2. Name one cause of hallucinations and give a real-world example of why it's dangerous.
3. Why might an AI product work better for some groups of people than others?
Break the AI · AI Career Explorer · Exam Prep
Use a real AI tool (ChatGPT, Claude, or Gemini). Try each challenge below. Write what actually happened β be specific.
Trigger a hallucination
Ask about a very niche topic, a made-up person, or a recent event. Did it make something up confidently?
What I asked:
What the AI said:
Was it a hallucination? How do you know?
Test the knowledge cutoff
Ask about something recent. Does it admit it doesn't know, or does it guess confidently?
What I asked:
What the AI said:
Test an ambiguous prompt
Ask something vague like "Tell me about the big event." What assumptions does it make?
What I asked:
What assumptions did the AI make?
Bonus: Find something impressive
What does the AI do really well? Note one thing that genuinely surprised you.
Reference table — use this during class discussion.
| Job Title | What they do | Skills needed | Avg Salary (USD) |
|---|---|---|---|
| AI / ML Engineer | Builds and trains AI models | Python, statistics, linear algebra | $120K–$200K+ |
| Prompt Engineer | Designs instructions that make AI behave correctly | Clear writing, LLM knowledge — no CS degree required | $80K–$130K |
| AI Ethics Analyst | Ensures AI systems are fair, safe, and legal | Law, philosophy, social science | $70K–$120K |
| Data Scientist | Prepares data and makes decisions from AI outputs | Statistics, SQL, Python | $75K–$130K |
| AI Product Designer | Designs how humans interact with AI tools | UX design + understanding AI limits | $85K–$140K |
| AI in Your Field | Domain expert who critically evaluates AI tools | Your expertise + AI literacy | Premium in any field |
Tick each one when you feel confident — Week 3 is the exam.
| Topic | Confident? | If not, review… |
|---|---|---|
| What a token is and why limits matter | ☐ | Week 1 handout §3 |
| The 3-step training process | ☐ | Week 1 handout §2 |
| What fine-tuning does | ☐ | Week 1 slides |
| What a context window is + what happens when exceeded | ☐ | Week 1 handout §4 |
| What hallucinations are (definition) | ☐ | Week 1 handout §5 |
| The 3 causes of hallucinations | ☐ | Week 1 handout §5 |
| Real-world examples of hallucinations | ☐ | Week 1 slides |
| No real-time info / knowledge cutoff | ☐ | Week 1 handout §7 |
| Bias in training data + why it happens | ☐ | Week 1 handout §7 |
| Pattern recognition ≠ understanding | ☐ | Week 1 handout §8 |