7.5. Bias in AI#
Machine learning and AI systems are shaped by human choices, meaning biases (intentional or not) can become embedded in how these technologies are designed, trained, and deployed. These biases can inadvertently lead to inaccurate, unjust, or discriminatory outcomes if they are not properly identified and addressed during the development and deployment of AI systems.
7.5.1. Human Bias in AI Development#
Human bias refers to the unconscious or conscious preferences and assumptions that people bring into the AI development process. Examples are given below.
Problem Framing#
AI systems lack human judgment and strictly optimise for the objectives they are given, without the common sense to grasp the broader context or ethical implications. They may interpret instructions literally and carry them out in ways that are unexpected or misaligned with the designers' real objectives.
Example: an AI system directed to maximise user engagement on a social media platform might discover that emotionally charged or provocative content, such as conspiracy theories or bigoted statements, captures attention most effectively. As a result, the algorithm could end up promoting harmful, misleading, or insensitive content, with no regard for the user’s well-being.
Feature Selection#
When designing AI models, humans make key decisions about which features (input information) the model will use. These choices can unintentionally reflect personal assumptions, societal biases, or cultural norms that are not universally applicable, potentially leading to biased outcomes.
Example: a human designing a hiring algorithm might decide to include only candidates’ formal educational qualifications as input data. This choice could unfairly disadvantage applicants from non-traditional backgrounds who have gained valuable skills and knowledge through work experience rather than formal training.
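As a very small illustration of how this plays out in code, the sketch below keeps only formal qualifications as a model feature; the column names and records are invented placeholders, not part of any real system. Anything not selected as a feature is simply invisible to the model.

```python
import pandas as pd

# Hypothetical applicant records; column names and values are invented for illustration.
applicants = pd.DataFrame({
    "formal_qualification": ["BSc", "None", "MSc"],
    "years_work_experience": [1, 8, 2],
    "hired_previously": [0, 1, 1],
})

# Design choice: train only on formal qualifications.
# Applicants whose strengths lie in work experience cannot be recognised,
# because that information never reaches the model.
features = applicants[["formal_qualification"]]
print(features)
```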
Optimisation Metric Selection#
When designing models, humans must decide which performance metrics to prioritise, such as which classification measure to optimise (see Extension: Further Classification Metrics), and this choice directly influences how the model behaves. When tuning focuses solely on overall accuracy, the model may perform well for the majority group but poorly for smaller or underrepresented populations, ultimately reinforcing existing biases and inequalities.
Example: consider a classification model designed to determine whether a person has COVID or not. When infection rates are very low (around 1%), accuracy becomes a poor measure of performance. A model that always predicts “no COVID” would appear to be 99% accurate but would completely fail to identify anyone who actually has the virus. In such cases, balancing precision (correct positive predictions) and recall (catching all true cases) is far more appropriate for evaluating the model’s effectiveness.
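The sketch below makes these numbers concrete using scikit-learn on simulated data; the 10,000 cases, the 1% prevalence, and the always-negative "model" are invented for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(0)

# Simulate 10,000 test cases where roughly 1% truly have COVID (assumed prevalence).
y_true = (rng.random(10_000) < 0.01).astype(int)

# A naive "model" that always predicts "no COVID".
y_pred = np.zeros_like(y_true)

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.3f}")                    # ~0.99, looks impressive
print(f"Recall:    {recall_score(y_true, y_pred, zero_division=0):.3f}")     # 0.000, misses every real case
print(f"Precision: {precision_score(y_true, y_pred, zero_division=0):.3f}")  # undefined, reported as 0
```

The near-perfect accuracy hides the fact that the model never identifies a single positive case, which is exactly why precision and recall are the more appropriate metrics here.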
7.5.2. Dataset Source Bias#
Dataset bias originates from the data itself. AI systems rely on large datasets to learn patterns and make predictions, but the source and quality of this data greatly affect the outcomes.
Representation Bias#
Representation bias occurs when certain groups are disproportionately included or excluded in a training dataset. If a dataset overrepresents dominant groups (e.g. certain ethnicities, genders, or geographies) and underrepresents minority or marginalised populations, the machine learning model learns patterns that are more accurate for the overrepresented groups.
Example: facial recognition systems trained mostly on lighter-skinned individuals often perform poorly for people with darker skin tones.
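One practical way to surface this kind of bias is to evaluate performance separately for each group rather than reporting a single overall score. The sketch below uses invented predictions for two placeholder groups, A and B (standing in for, say, an overrepresented and an underrepresented population); the values are illustrative, not real benchmark results.

```python
import numpy as np

# Invented labels, predictions, and group membership for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

# A single overall number can look acceptable...
print(f"Overall accuracy: {np.mean(y_true == y_pred):.2f}")   # 0.67

# ...while disaggregating by group reveals who the model actually fails.
for g in np.unique(group):
    mask = group == g
    print(f"Group {g} accuracy: {np.mean(y_true[mask] == y_pred[mask]):.2f}")  # A: 1.00, B: 0.20
```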
Historical Bias#
Historical bias reflects existing societal inequalities, stereotypes, and discriminatory practices. Even when datasets are collected accurately, they can still embed and perpetuate past injustices, because they are built on the decisions, behaviours, and systems of the past. This type of bias is particularly challenging because it isn't introduced through flawed data collection or technical error, but rather through systemic issues already present in society.
Example: AI-generated images often reflect biased stereotypes due to the training data they are built on. When prompted with the word “scientist,” these tools frequently produce images of white men in lab coats, underrepresenting women and people of colour. This is partly because white men have historically been overrepresented in the field, but it also reflects bias in what history has chosen to record and highlight.
Measurement Bias#
Measurement bias occurs when the data used to train a model is inaccurately or inconsistently measured, leading to distorted results. This kind of bias arises when the features (inputs) or labels (outputs) in the dataset don’t accurately represent what they’re intended to capture.
Example: consider an AI system designed to evaluate student admissions into degree programs. Referees often provide subjective assessments of qualities such as an applicant’s aptitude, leadership, communication, or collaboration skills, without consistent evaluation criteria. These judgments can reflect personal biases: for example, a referee might be more inclined to rate male applicants higher on leadership and female applicants higher on collaboration.
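The sketch below simulates this with invented numbers: both groups are constructed to have the same underlying ability, but a hypothetical biased referee systematically rates one group higher, so the recorded labels no longer measure what they are supposed to.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000

# By construction, both groups have the same underlying leadership ability.
true_leadership = rng.normal(70, 10, size=n)
gender = rng.choice(["male", "female"], size=n)

# Hypothetical biased referee: rates male applicants ~5 points higher on
# "leadership", plus some random noise. These ratings become the training labels.
rated_leadership = true_leadership + np.where(gender == "male", 5, 0) + rng.normal(0, 3, size=n)

for g in ["male", "female"]:
    print(f"{g}: true mean {true_leadership[gender == g].mean():.1f}, "
          f"rated mean {rated_leadership[gender == g].mean():.1f}")
```

A model trained on the rated scores would learn the referee's bias as if it were a genuine difference in leadership.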
7.5.3. Mitigating Bias in AI#
Mitigating bias is not a one-time fix but an ongoing process that must be embedded throughout the AI lifecycle. Here are some of the ways in which bias can be mitigated.
Diverse and Inclusive Teams: Having a multidisciplinary team with varied backgrounds can help spot and challenge biases early.
Data Curation: Ensuring datasets are representative and inclusive is a critical step in building ethical AI systems (a small sketch of such a check follows this list).
Transparency and Explainability: AI systems should be interpretable, allowing users and regulators to understand how decisions are made.
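As an illustration of the data curation point above, the sketch below compares group proportions in a hypothetical training set against assumed target proportions for the population the system is meant to serve; the column name, counts, and reference figures are placeholders, not real statistics.

```python
import pandas as pd

# Hypothetical training set: 80% lighter-skinned, 20% darker-skinned subjects.
train = pd.DataFrame({"skin_tone": ["light"] * 800 + ["dark"] * 200})

# Assumed proportions of the population the system is meant to serve.
reference = {"light": 0.55, "dark": 0.45}

observed = train["skin_tone"].value_counts(normalize=True)
for grp, target in reference.items():
    flag = "UNDER-REPRESENTED" if observed.get(grp, 0.0) < target else "ok"
    print(f"{grp}: dataset {observed.get(grp, 0.0):.2f} vs target {target:.2f} ({flag})")
```

Checks like this do not remove bias on their own, but they make representation gaps visible early enough to fix through further data collection or re-sampling.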
7.5.4. Recommended Videos#
The danger of AI is weirder than you think | Janelle Shane 2019 (10 mins)
AI is dangerous, but not for the reasons you think | Sasha Luccioni 2023 (10 mins)