Training the Trainers: Gig Workers Teaching Robots to Be Human

In a studio apartment in central Nigeria, a medical student named Zeus finishes a long hospital shift, straps his iPhone to his forehead, and starts recording himself making his bed.

He raises his hands like a sleepwalker. He moves deliberately, carefully, making sure his fingers stay visible to the camera. For $15 an hour — solid income in Nigeria’s high-unemployment economy — he films himself doing laundry, ironing clothes, and cooking meals. Not for YouTube. Not for social media.

He’s training the next generation of humanoid robots.

This is the hidden labour market powering the robotics gold rush. While the press focuses on billion-dollar valuations, cutting-edge AI models, and factory deployments, much of the training data behind these machines comes from thousands of gig workers across more than 50 countries, whose footage of their most mundane domestic routines is sold on to the world’s biggest robotics companies.

Why Robots Need to Watch Humans Do Chores

The problem with humanoid robots has always been manipulation — the ability to pick things up, fold them, place them, stack them, open them. Tasks that toddlers master by age three remain stubbornly difficult for machines worth millions of dollars.

The rise of large language models changed the thinking. If LLMs learned to generate language by training on billions of text examples scraped from the internet, why couldn’t robots learn physical movement from billions of video examples of humans doing physical tasks?

The challenge: that data doesn’t exist on the internet. You can’t scrape Reddit for footage of someone folding a fitted sheet. Virtual simulations can teach robots acrobatics but not how to grasp irregular objects, because physics simulations still can’t perfectly model the real world’s unpredictability.

So the industry invented a solution: pay humans to generate the data themselves.

The Data Brokers in the Middle

Companies like Micro1, Scale AI, and Encord have become the intermediaries in a new global supply chain. They recruit workers — disproportionately from India, Nigeria, and Latin America — vet them through AI interview processes, and assign them weekly video quotas.

Micro1, based in Palo Alto, has hired thousands of contractors across more than 50 countries. Workers are screened by an AI agent named Zara, which conducts video interviews and reviews sample chore footage before onboarding. Accepted workers submit weekly videos of domestic tasks — following strict protocols about hand visibility, pacing, and frame coverage.
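To make the idea of a recording protocol concrete, here is a minimal sketch of how a submission might be checked for hand visibility and length. Everything in it is an assumption for illustration: the `FrameCheck` record, the per-frame `hands_visible` flag (imagined as the output of some hand detector), and the 90% and 30-second thresholds are hypothetical, not Micro1’s actual rules or tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameCheck:
    hands_visible: bool  # hypothetical flag: were both hands detected in this frame?
    timestamp_s: float   # seconds since the start of the clip


def hand_visibility_ratio(frames: List[FrameCheck]) -> float:
    """Fraction of frames in which the hands were visible (0.0 for an empty clip)."""
    if not frames:
        return 0.0
    return sum(f.hands_visible for f in frames) / len(frames)


def passes_protocol(
    frames: List[FrameCheck],
    min_ratio: float = 0.9,        # assumed threshold, for illustration only
    min_duration_s: float = 30.0,  # assumed threshold, for illustration only
) -> bool:
    """Accept a clip only if it is long enough and the hands stay in view."""
    if not frames:
        return False
    duration_s = frames[-1].timestamp_s - frames[0].timestamp_s
    return (
        hand_visibility_ratio(frames) >= min_ratio
        and duration_s >= min_duration_s
    )
```

A real pipeline would presumably also score pacing and frame coverage, but the shape would be similar: per-frame signals rolled up into a pass or fail decision for each clip.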

The videos are then processed by AI, annotated by human labellers, and sold to robotics companies. Micro1 CEO Ali Ansari estimates the industry is now spending over $100 million per year on real-world data, and that demand is “increasing really fast.”

The specific clients remain undisclosed for confidentiality reasons, though the field of humanoid builders includes Tesla, Figure AI, Agility Robotics, and others. Workers never know which robot their footage is training.

DoorDash Joins the Data Race

It’s not just specialist firms. DoorDash has begun paying couriers from its 8-million-strong network to generate training data alongside their delivery work. Couriers in select markets are offered paid tasks, such as filming themselves doing chores, documenting physical environments, and capturing real-world motion, which turns an existing logistics workforce into a data collection engine.

The strategic logic is clear. DoorDash already has infrastructure, insurance, and relationships with millions of gig workers, so redirecting part of that capacity toward AI training data adds little marginal cost. The company is effectively monetising its labour network twice: once for food delivery, once for robot intelligence.

The pattern is spreading. Any company with a large physical workforce is sitting on a potential data goldmine.

China Goes Industrial-Scale

While Western companies crowdsource data through individual workers at home, China has industrialised the process entirely.

Dozens of state-owned robot training centres have been built across the country, where workers wear VR headsets and full-body exoskeletons to teach humanoid robots physical tasks in controlled environments. Workers demonstrate how to open a microwave, wipe down a table, sort objects by shape and weight — and the exoskeleton captures every joint angle, force measurement, and motion vector simultaneously.
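As a rough illustration of what capturing joint angles, forces, and motion at the same instant could look like as data, the sketch below defines one time-stamped sample and a small helper over a recording. The `MotionSample` record, its field names, and its units are hypothetical; the article’s sources do not describe the format these centres actually use.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MotionSample:
    """One synchronised reading from a (hypothetical) exoskeleton rig."""
    timestamp_s: float
    joint_angles_rad: Dict[str, float]             # e.g. {"right_elbow": 1.2}
    contact_forces_n: Dict[str, float]             # e.g. {"right_grip": 4.5}
    hand_velocity_m_s: Tuple[float, float, float]  # (x, y, z) motion vector


def peak_force(samples: Iterable[MotionSample], sensor: str) -> float:
    """Largest force seen on one sensor across a demonstration (0.0 if none)."""
    return max(
        (s.contact_forces_n.get(sensor, 0.0) for s in samples),
        default=0.0,
    )
```

Keeping every signal under a single timestamp is the point of the design: a model learning to open a microwave needs to know what the joints and the grip were doing at the same moment, not in separate streams.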

This is teleoperation at national scale. (For more on how state-level infrastructure investment is reshaping tech, see our piece on Microsoft’s $10B Japan sovereignty play.) Where Micro1 gets diverse, messy real-world data from a student’s cramped studio, China’s centres get precise, controlled, high-fidelity data from purpose-built facilities. Both approaches have trade-offs — but China’s industrial model can scale in ways the distributed gig model cannot.

The competition isn’t just between Tesla and Figure AI. It’s between two fundamentally different approaches to building robot intelligence — and two very different economic systems powering them.

The $6 Billion Bet on Physical Data

Investors poured over $6 billion into humanoid robotics in 2025 alone. A significant chunk of that is going not into the robots themselves, but into the data infrastructure required to train them.

This is a pattern anyone who watched the LLM boom will recognise. The real money in AI has often been in the unsexy infrastructure layer — the compute, the data, the annotation pipelines — rather than the models themselves. The same dynamic is playing out in physical AI.

MIT Technology Review readers recently voted humanoid robots the “11th breakthrough technology” to add to their 2026 list — a reader-poll addition that reflects just how rapidly mainstream perception is catching up with technical reality.

The Workers Left in the Dark

Behind the optimistic headlines, the gig workers at the centre of this system operate with almost no information about what they’re actually building.

Workers interviewed by MIT Technology Review understood their data was being used to train robots — but none knew how it would be stored, shared, or which companies it would end up with. Privacy questions compound the ambiguity: workers are asked not to show their faces or reveal personal information, but their footage captures the interior of their homes, their possessions, their daily routines.

For Arjun, a tutor in Delhi, the logistical challenge is constant: his two-year-old daughter keeps wandering into frame. For Sasha, a Nigerian banker-turned-data-recorder, filming laundry means tiptoeing around a shared residential compound to avoid capturing neighbours who watch her with bewilderment.

These are the real conditions under which robot intelligence is being built.

UC Berkeley robotics professor Ken Goldberg offered a note of caution: “It’s going to take longer than people think.” The data problem is being partially solved by this global gig workforce — but creating robots that can generalise across environments, handle novel objects, and adapt to human spaces remains an open research challenge. More data helps. It doesn’t solve everything.

What This Means for the Robotics Race

The emergence of this data economy changes the competitive calculus in humanoid robotics in several ways:

1. Data is the new moat. The companies that lock in the best real-world training pipelines earliest will have durable advantages. This is less about the robot hardware and more about proprietary datasets that can’t be easily replicated.

2. The Global South is building the future. The labour for this data collection is concentrated in Nigeria, India, Argentina, and similar markets — where $15/hour is genuinely transformative income. There’s a certain irony in the fact that the robots that may eventually automate jobs in wealthy countries are being trained by workers in developing economies.

3. China’s industrial model threatens Western distributed approaches. State-owned VR training centres can produce higher-fidelity, more consistent data at scale. The gig model is flexible and cheap but inherently noisy.

4. The data layer is where regulation hasn’t caught up. Workers don’t know who holds their data or how it’s used. Consent frameworks are minimal. This is a significant privacy exposure that regulators have not yet addressed.

The Bigger Picture

The race to build humanoid robots capable of working in factories and homes is ultimately a race for data — and that race is being run by gig workers in studio apartments in Lagos, Delhi, and Buenos Aires.

Every time Zeus irons his shirt with an iPhone strapped to his head, he’s contributing to a training dataset that will eventually teach a robot to do the same task. The robot that replaces a factory worker in Ohio is being trained by a student in Nigeria who will never meet it.

This is the supply chain of the physical AI revolution. It’s unglamorous, it’s geographically distributed, it raises serious ethical questions — and it’s moving extraordinarily fast.

The robots are coming. They’re learning from us. And we’re getting paid $15 an hour to teach them.
