Complete the system design today. Start your pitch deck before Week 3.
Every AI system follows this structure. Map your system to it before anything else.
- **Input:** What data enters the system? (text, images, sensor readings, user clicks…)
- **Model:** What type of ML model is it? What has it been trained on?
- **Output:** A prediction? A classification? Generated content? A score?
- **Feedback loop:** How does the system improve over time?
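The four questions above can be filled in as a quick worksheet. Here is a minimal sketch; the `AISystemMap` class and the chatbot example are illustrative, not a required format:

```python
from dataclasses import dataclass

@dataclass
class AISystemMap:
    """One field per component of the standard AI system structure."""
    inputs: str         # what data enters the system
    model: str          # ML type and what it was trained on
    outputs: str        # prediction, classification, generated content, score
    feedback_loop: str  # how the system improves over time

# Hypothetical example: a campus Q&A chatbot
chatbot = AISystemMap(
    inputs="student questions typed as free text",
    model="pretrained language model, fine-tuned on course FAQs",
    outputs="generated answer text plus a confidence score",
    feedback_loop="thumbs-up/down ratings feed the next fine-tuning round",
)
print(chatbot)
```

Filling in all four fields for your own system is the fastest way to spot the piece you haven't designed yet (usually the feedback loop).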
| Bias Type | What it means | Example |
|---|---|---|
| Historical Bias | Training data reflects past injustices | Hiring model trained on historical resumes → underrepresents women in senior roles |
| Sampling Bias | Certain groups underrepresented in training data | Facial recognition trained mostly on lighter-skinned faces fails on darker skin tones |
| Label Bias | Human labelers bring their own assumptions | Sentiment labels on text reflect the labeler's culture, not universal sentiment |
| Feedback Loop Bias | Model outputs shape future training data | A content recommender shows more extreme content → users engage more → model amplifies it |
| Measurement Bias | The metric used doesn't capture the real goal | Optimising for clicks instead of user satisfaction produces addictive but unhelpful content |
Saying "we'll use unbiased data" is not a strategy. Good answers include: auditing data for demographic balance, using fairness metrics during evaluation, building in human review for high-stakes decisions, or being transparent about limitations in the UI.
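"Using fairness metrics during evaluation" can be very concrete: compare the positive-outcome rate across groups. Here is a minimal sketch with made-up data (in a real project you would use a library such as Fairlearn rather than hand-rolling this):

```python
def demographic_parity_gap(predictions, groups):
    """Difference between the highest and lowest positive-prediction
    rate across groups. 0.0 means every group gets positive outcomes
    at the same rate."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: sum(p) / len(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values())

# Toy example: 1 = model says "hire", 0 = "reject"
preds  = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups)
print(f"demographic parity gap: {gap:.2f}")  # 0.75 vs 0.25 -> 0.50
```

A gap this large in a hiring model would be exactly the kind of finding judges want to see you surface and address, not hide.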
If your system uses a language model (chatbot, content generator, Q&A tool), you must also address:
Your team's final pitch will be scored on these 5 criteria. Use this as YOUR self-assessment guide while designing.
Total: 25 points. BONUS: +2 points if you address bias/ethics proactively (students often skip this, but judges LOVE teams that own their challenges).
Learn from these real cases of AI bias. These aren't hypothetical: they happened, caused harm, and hold lessons for your project.
What happened: A system called COMPAS predicted the likelihood that released prisoners would re-offend. Black defendants were flagged as "high risk" at nearly twice the rate of white defendants.
Root cause: The algorithm was trained on historical crime data, which reflects decades of racial bias in policing and sentencing. More arrests of Black people → model learned "Black = higher risk."
Harm: Black people were kept in prison longer based on a statistical fiction.
Lesson for you: Historical data encodes historical racism. Always ask: "What biases are baked into my training data from society's past?"
What happened: Amazon built an AI recruiter to screen job applications for technical positions. The system systematically rejected qualified women.
Root cause: Amazon trained it on 10 years of hiring data. In tech roles, most hires were men (due to historical bias in the field). The AI learned "tech = male." When women applied, they scored lower.
Harm: Qualified women were screened out of technical roles, undermining Amazon's own diversity efforts. Amazon ultimately shut the tool down.
Lesson for you: Representation matters. If your data overrepresents one group, your AI will discriminate against others. Check your data profile BEFORE training.
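Checking your data profile before training can be as simple as counting group proportions. A minimal sketch with made-up numbers (the 80/20 split is hypothetical, echoing the Amazon case):

```python
from collections import Counter

def group_shares(labels):
    """Fraction of the dataset contributed by each group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# Hypothetical applicant pool drawn from 10 years of hiring data
genders = ["M"] * 80 + ["F"] * 20
shares = group_shares(genders)
print(shares)  # {'M': 0.8, 'F': 0.2} -- a red flag before any training
for group, share in shares.items():
    if share < 0.3:
        print(f"warning: group {group!r} is only {share:.0%} of the data")
```

A check like this takes minutes and would have flagged the Amazon dataset before a single model was trained.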
What happened: A widely-used algorithm in U.S. hospitals predicted which patients needed extra medical care resources. It systematically under-flagged Black patients.
Root cause: The algorithm used healthcare spending as a proxy for illness. Because the healthcare system has historically spent less on Black patients, the AI thought they were less sick when they weren't.
Harm: Black patients were denied needed treatments. They suffered preventable complications.
Lesson for you: Proxies matter. Never use a variable that correlates with a protected attribute (race, gender) as a stand-in for the thing you actually want to measure. Spending ≠ illness; skin tone ≠ risk.
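You can screen a candidate feature for this problem by checking how strongly it correlates with a protected attribute. A minimal sketch on made-up data (the variables and the 0.5 threshold are illustrative):

```python
def correlation(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical: 1 = patient belongs to group X; spending in $1000s
group    = [1, 1, 1, 1, 0, 0, 0, 0]
spending = [2, 3, 2, 4, 7, 8, 6, 9]
r = correlation(group, spending)
print(f"correlation(group, spending) = {r:.2f}")
if abs(r) > 0.5:
    print("spending is a risky proxy: it tracks group membership")
```

If a feature correlates this strongly with a protected attribute, using it as a proxy will import that group's disadvantage into your model, exactly as in the hospital case.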
What happened: Commercial facial recognition systems, including some used by police, had error rates under 1% for lighter-skinned faces but up to 35% for darker-skinned faces.
Root cause: Training data oversampled lighter skin tones. Most popular datasets were from Western sources with predominantly light-skinned faces.
Harm: Black people were wrongly arrested based on AI misidentification. An innocent man in Detroit spent time in jail.
Lesson for you: Representation in training data directly impacts fairness. If your computer vision system uses faces, test it on diverse demographics.
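Testing on diverse demographics means computing your error rate separately for each demographic slice instead of reporting one overall number. A minimal sketch with made-up labels:

```python
def error_rate_by_group(y_true, y_pred, groups):
    """Misclassification rate computed separately for each group."""
    errors, totals = {}, {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        errors[group] = errors.get(group, 0) + (truth != pred)
    return {g: errors[g] / totals[g] for g in totals}

# Toy face-matching results for two demographic slices
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
slices = ["light", "light", "light", "light",
          "dark", "dark", "dark", "dark"]
print(error_rate_by_group(y_true, y_pred, slices))
```

A single overall accuracy of 75% here would hide a 0% vs 50% gap between slices, which is precisely how the facial recognition failures stayed invisible for so long.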
Before Week 3, confirm your draft deck has all of these: