Thousands of people are selling their identities to train AI – but at what cost?
#AI training #identity selling #data privacy #ethical concerns #biometric data #exploitation #regulation
📌 Key Takeaways
- Thousands of individuals are selling their personal data, including images and biometrics, to train AI models.
- This practice raises significant ethical concerns about privacy, consent, and exploitation.
- The financial compensation for sellers is often minimal compared to the value generated for AI companies.
- The long-term societal impacts, such as identity theft and misuse of data, remain largely unregulated.
🏷️ Themes
AI Ethics, Data Privacy
📚 Related People & Topics
Machine learning
Study of algorithms that improve automatically through experience
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions.
Deep Analysis
Why It Matters
This news matters because it reveals a hidden human cost behind AI development, where vulnerable populations may be exploited for data collection. It affects both the individuals selling their identities who risk privacy violations and future discrimination, and society at large as AI systems trained on potentially unethical data become embedded in critical systems. The practice raises urgent questions about consent, compensation, and the ethical foundations of the AI revolution that will shape everything from hiring algorithms to financial services.
Context & Background
- AI models require massive datasets of human faces, voices, and personal information to learn patterns and recognize emotions, demographics, and behaviors.
- Previous controversies have emerged around companies like Clearview AI scraping billions of faces without consent, and Amazon's Rekognition showing racial bias in law enforcement applications.
- The gig economy and economic precarity in many regions create conditions where people feel compelled to sell personal data for immediate cash, often without understanding long-term implications.
- Regulatory frameworks like GDPR in Europe provide some data protection, but enforcement is inconsistent globally and often doesn't cover 'consensual' data sales.
- AI ethics researchers have warned for years about 'data colonialism': the extraction of personal information from marginalized communities for corporate profit, with minimal benefit returning to those communities.
What Happens Next
Increased regulatory scrutiny is likely in 2024-2025, with possible legislation targeting 'data broker' platforms facilitating identity sales. Lawsuits may emerge from individuals whose sold data leads to identity theft or discrimination. AI companies will face growing pressure to audit training data sources and implement ethical sourcing standards, potentially slowing development timelines. Some platforms may shift to synthetic data generation as an alternative, though questions about bias in synthetic data will persist.
Frequently Asked Questions
What kinds of data are people selling?
People are selling facial scans, voice recordings, handwriting samples, and personal demographic details including age, ethnicity, and employment history. This data is packaged into 'training datasets' for AI systems that need to recognize human characteristics and behaviors.
Why do people sell their identities?
Primary motivations include immediate financial need, particularly in economically disadvantaged regions where even small payments for data represent meaningful income. Many participants don't fully understand how permanently their data will be used, or what risks they're accepting for relatively small compensation.
What risks do sellers face?
Risks include permanent loss of privacy, the potential for identity theft, and future discrimination if AI systems associate their data with negative outcomes. Once data enters a training set, it's nearly impossible to remove, creating a lifelong digital footprint that could affect employment, insurance, or legal situations.
How does this practice affect bias in AI systems?
Systems trained on commercially purchased identity data may inherit and amplify societal biases if datasets overrepresent certain demographics or contexts. This can lead to discriminatory outcomes in hiring algorithms, facial recognition systems, and other AI applications that affect real people's lives.
What legal protections do sellers have?
Protections are minimal and vary by jurisdiction. Most platforms use broad consent forms that waive future claims, and data protection laws often don't cover voluntarily sold information. Once data is aggregated and anonymized (often imperfectly), it typically falls outside privacy regulations.
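The "often imperfectly" caveat is worth unpacking: stripping names from a dataset does not anonymize it if quasi-identifiers (ZIP code, birth year, gender) remain, because those columns can be joined against public records to re-attach identities. The sketch below is purely illustrative; all names, fields, and records are invented for the example.

```python
# Toy linking attack: re-identifying an "anonymized" dataset by joining
# quasi-identifiers against a hypothetical public record. All data invented.

released = [  # records sold and "anonymized" (names removed)
    {"zip": "10001", "birth_year": 1990, "gender": "F", "sample": "face_scan"},
    {"zip": "94105", "birth_year": 1985, "gender": "M", "sample": "voice_clip"},
]

public_records = [  # e.g. a voter roll or scraped social-media profiles
    {"name": "Alice Doe", "zip": "10001", "birth_year": 1990, "gender": "F"},
    {"name": "Bob Roe",   "zip": "94105", "birth_year": 1985, "gender": "M"},
]

def reidentify(released, public):
    """Join on quasi-identifiers; a unique match recovers the identity."""
    hits = []
    for row in released:
        matches = [p for p in public
                   if (p["zip"], p["birth_year"], p["gender"])
                   == (row["zip"], row["birth_year"], row["gender"])]
        if len(matches) == 1:  # combination is unique -> identity recovered
            hits.append((matches[0]["name"], row["sample"]))
    return hits

print(reidentify(released, public_records))
```

With only two records the join is trivially unique, but the same mechanism scales: research on real census data has shown that a handful of quasi-identifiers uniquely pinpoints most of a population, which is why "anonymized" biometric datasets rarely stay anonymous.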
What alternatives do AI companies have?
Alternatives include synthetic data generation, publicly available data with proper licensing, and carefully curated datasets with transparent sourcing. However, these approaches carry their own challenges, including computational costs, potential bias in synthetic data, and limits on capturing real-world diversity.