Week 2 Β· Student Packet

AI System Design
Guide & Worksheet

Complete the system design today. Start your pitch deck before Week 3.

Team Name: _______________________
Date: _______________________
Part 1 β€” Reference Guide

The Input β†’ Model β†’ Output β†’ Feedback Loop

Every AI system follows this structure. Map your system to it before anything else.

πŸ“₯ Input

What data enters the system? (text, image, sensor, user click…)

βš™οΈ Model

What ML type? What has it been trained on?

πŸ“€ Output

Prediction? Classification? Generated content? Score?

πŸ” Feedback

How does it improve over time?
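One way to sanity-check your design is to write all four stages down explicitly before going further. A minimal sketch in Python, using an invented spam-filter example (every name and description here is illustrative, not a real API):

```python
# Map a hypothetical spam-filter design onto the Input -> Model -> Output -> Feedback loop.
# Substitute your own system's answers for each stage.
system_design = {
    "input":    "raw email text plus sender metadata",
    "model":    "supervised text classifier trained on labeled spam/ham emails",
    "output":   "a spam probability between 0.0 and 1.0",
    "feedback": "user 'report spam' / 'not spam' clicks become new training labels",
}

# If any stage is blank or vague, the design is not finished yet.
for stage in ("input", "model", "output", "feedback"):
    assert system_design.get(stage), f"missing design answer for: {stage}"
    print(f"{stage:>8}: {system_design[stage]}")
```

If you cannot fill in all four values for your own idea, that is the first gap to close.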

Types of Bias to Watch For

Historical Bias: training data reflects past injustices. Example: a hiring model trained on historical resumes underrepresents women in senior roles.

Sampling Bias: certain groups are underrepresented in the training data. Example: facial recognition trained mostly on lighter-skinned faces fails on darker skin tones.

Label Bias: human labelers bring their own assumptions. Example: sentiment labels on text reflect the labeler's culture, not universal sentiment.

Feedback Loop Bias: model outputs shape future training data. Example: a content recommender surfaces more extreme content, users engage more, and the model amplifies it.

Measurement Bias: the metric used doesn't capture the real goal. Example: optimising for clicks instead of user satisfaction produces addictive but unhelpful content.

On Mitigation β€” Be Specific

Saying "we'll use unbiased data" is not a strategy. Good answers include: auditing data for demographic balance, using fairness metrics during evaluation, building in human review for high-stakes decisions, or being transparent about limitations in the UI.
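As a concrete version of "auditing data for demographic balance," here is a toy sketch. The group names, counts, and the 25% flagging threshold are all invented for illustration; real audits pick thresholds appropriate to the domain:

```python
from collections import Counter

# Toy audit: check whether each demographic group is reasonably represented
# in the training data. Records and group labels are made up for illustration.
training_records = (
    ["group_a"] * 700 +   # e.g. lighter-skinned faces
    ["group_b"] * 150 +   # e.g. darker-skinned faces
    ["group_c"] * 150
)

counts = Counter(training_records)
total = sum(counts.values())

for group, n in sorted(counts.items()):
    share = n / total
    flag = "  <-- underrepresented?" if share < 0.25 else ""
    print(f"{group}: {n:4d} records ({share:.0%}){flag}")
```

A five-line audit like this is already more specific than "we'll use unbiased data," and it gives you a number to report on your Data & Bias slide.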

LLM-Specific Concerns

If your system uses a language model (chatbot, content generator, Q&A tool), you must also address LLM-specific concerns such as hallucination (confidently stating false information), prompt injection, and harmful or biased generated content.

Your Pitch Deck Structure

Slide 1 (Hook: The Problem): story, stat, or scenario. Make them feel it.
Slide 2 (The Solution): what your system does from the user's view.
Slide 3 (How It Works): ML type, training process, diagram or mockup.
Slide 4 (Data & Bias): sources, risks, and mitigation.
Slide 5 (Ethics, Limits & Failures): honest critique of your own system.
Slide 6 (Future Work): what's next? What would you build with more resources?

πŸ† How You'll Be Judged

Your team's final pitch will be scored on these 5 criteria. Use this as YOUR self-assessment guide while designing.

🎨 Originality
Is the idea novel and creative? Does it solve a real, specific problem in a unique way?
Score: ___ / 5
🎯 Real-World Usefulness
Does your system solve a genuine problem for real people? Would they actually use it?
Score: ___ / 5
🎀 Clarity of Pitch
Can the audience understand your idea in 5-7 minutes? Is your explanation clear and engaging?
Score: ___ / 5
🧠 AI Knowledge
Do you name the ML type? Can you explain HOW your system works (not just WHAT it does)?
Score: ___ / 5
⚑ Feasibility
Is the scope realistic? Are your data requirements reasonable? Could this actually be built?
Score: ___ / 5

Total: 25 points. BONUS: +2 points if you address bias/ethics proactively (students often skip this, but judges LOVE teams that own their challenges).

πŸ“– Real-World Bias Case Studies

Learn from these real cases of AI bias. These aren't hypothetical β€” they happened, caused harm, and are lessons for your project.

🚨 Case 1: COMPAS Recidivism Algorithm (Criminal Justice)

What happened: A system called COMPAS predicted the likelihood that released prisoners would re-offend. Black defendants were flagged as "high risk" at nearly twice the rate of white defendants.

Root cause: The algorithm was trained on historical crime data, which reflects decades of racial bias in policing and sentencing. More arrests of Black people β†’ model learned "Black = higher risk."

Harm: Black people were kept in prison longer based on a statistical fiction.

Lesson for you: Historical data encodes historical racism. Always ask: "What biases are baked into my training data from society's past?"

πŸ’Ό Case 2: Amazon Hiring Algorithm (Gender Bias)

What happened: Amazon built an AI recruiter to screen job applications for technical positions. The system systematically rejected qualified women.

Root cause: Amazon trained it on 10 years of hiring data. In tech roles, most hires were men (due to historical bias in the field). The AI learned "tech = male." When women applied, they scored lower.

Harm: Women's careers blocked. Amazon's diversity efforts failed. Amazon shut down the tool.

Lesson for you: Representation matters. If your data overrepresents one group, your AI will discriminate against others. Check your data profile BEFORE training.

πŸ₯ Case 3: Medical Risk Algorithm (Healthcare Discrimination)

What happened: A widely-used algorithm in U.S. hospitals predicted which patients needed extra medical care resources. It systematically under-flagged Black patients.

Root cause: The algorithm used healthcare spending as a proxy for illness. Because the healthcare system has historically spent less on Black patients, the AI thought they were less sick when they weren't.

Harm: Black patients were denied needed treatments. They suffered preventable complications.

Lesson for you: Proxies matter. Never use a variable that correlates with a protected group (race, gender) as a proxy for something else. "Spending β‰  illness; skin tone β‰  risk."
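To see why a proxy label discriminates, here is a toy simulation (all numbers are invented): both groups are equally sick, but one group's recorded spending is systematically lower, so a "flag for extra care if spending is high" rule under-flags that group.

```python
import random

random.seed(0)

# Toy patients: equal true illness rates in both groups, but group_b's
# recorded healthcare spending is systematically lower (invented numbers).
patients = []
for group, spend_scale in (("group_a", 1.0), ("group_b", 0.5)):
    for _ in range(1000):
        sick = random.random() < 0.30          # same true illness rate
        spending = (200 if sick else 50) * spend_scale * random.uniform(0.5, 1.5)
        patients.append((group, sick, spending))

THRESHOLD = 100  # "flag for extra care if spending > 100" -- the biased proxy rule

for group in ("group_a", "group_b"):
    sick_patients = [p for p in patients if p[0] == group and p[1]]
    flagged = sum(1 for p in sick_patients if p[2] > THRESHOLD)
    print(f"{group}: {flagged}/{len(sick_patients)} truly sick patients flagged")
```

Even though illness rates are identical by construction, the spending-based rule flags far fewer sick patients in the lower-spending group: the proxy, not the patients, created the disparity.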

πŸ“Έ Case 4: Facial Recognition (Surveillance Bias)

What happened: Facial recognition systems used by police had error rates of 1% for light-skinned faces but 35% for dark-skinned faces.

Root cause: Training data oversampled lighter skin tones. Most popular datasets were from Western sources with predominantly light-skinned faces.

Harm: Black people were wrongly arrested based on AI misidentification. An innocent man in Detroit spent time in jail.

Lesson for you: Representation in training data directly impacts fairness. If your computer vision system uses faces, test it on diverse demographics.
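"Test it on diverse demographics" can be as simple as computing error rates per group instead of one overall number. A sketch with invented test results chosen to mirror the case study:

```python
# Per-group error rates for a hypothetical face-matching model.
# The (group, correct) records below are invented for illustration.
results = (
    [("lighter", True)] * 99 + [("lighter", False)] * 1 +
    [("darker",  True)] * 65 + [("darker",  False)] * 35
)

groups = {}
for group, correct in results:
    ok, total = groups.get(group, (0, 0))
    groups[group] = (ok + correct, total + 1)

for group, (ok, total) in groups.items():
    error = 1 - ok / total
    print(f"{group}: error rate {error:.0%} over {total} test faces")

# A single overall accuracy (here 82%) would hide the 35% error on darker faces.
```

Reporting one aggregate accuracy would have hidden exactly the failure this case study describes; always break your evaluation down by group.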

🎯 What These Cases Teach

Across all four cases the same patterns repeat: historical data encodes historical bias; unbalanced representation in training data produces discriminatory systems; proxy variables smuggle in protected attributes; and aggregate metrics hide group-level failures. Before you pitch, check your own design against each of these.

Part 2 β€” System Design Worksheet

Q1 β€” What Does It Do?

Q2 β€” How Does It Learn?

Q3 β€” What Data Does It Need?

Q4 β€” Where Can It Go Wrong?

Q5 β€” Ethics & Limitations

Who could be harmed if it's wrong?
Is there consent & data privacy?
Could it be misused?
Is there human oversight?

Future Work

Pitch Deck Checklist

Before Week 3, confirm your draft deck has all of these: