Indian Startups Drive Egocentric Data Collection for Robotics

Synopsis

Indian startups are entering the lucrative egocentric data collection business. This data, captured from a first-person view, is vital for training robots. Leading robotics labs require billions of hours of this data for advanced manipulation and safe operation. Companies like Humyn AI and Objectways are meeting this demand, collecting data across various contexts to fuel the future of robotics.
Captured via wearable cameras, egocentric data is emerging as a key input for robot training
Close to 50 people on a factory floor in Ahmedabad are assembling electronic components: soldering, screwing and finally packing the finished product into a box, each wearing a GoPro camera on the forehead.

The camera records the process, which is then annotated, passed through quality checks, and finally delivered to customers, who can use it to train their robots. This type of data collection is called egocentric, which refers to data collected from a first-person point of view using wearable cameras.

There is a huge market for such data. A report by Stellaris Venture Partners estimates that leading robotics labs will need 100 million to 1 billion hours of egocentric data in the next 2-3 years.

To tap into this demand, multiple Indian startups such as Humyn AI, FPV Labs and Neo Cambrian are entering the business to build data pipelines for robotics companies. In addition, established data collection companies such as Objectways are expanding to collect data for physical AI companies.

Ishank Gupta, cofounder, Humyn AI, explained that to train robots in a single context, the training data required is anywhere between 100,000 and 1 million hours. He defines a single context as one task, for instance, picking up a glass and placing it on a designated shelf in the kitchen.

The current consensus, he said, is that those using egocentric videos to train robotic arms and limbs will need an estimated few billion hours of data.

"These billions of hours of data cannot be scraped and have to be created because there is no repository in the world which has such data," he explained.

This is the biggest bottleneck for robotics labs, which require egocentric data so that robots can learn better hand manipulation and operate safely in complex real-world environments.

Ravi Shankar, president, Objectways, said, "We started noticing this trend in mid 2025." The company, which was previously in data collection for LLMs, started offering egocentric data as well as RGB-D data, which robots use to calculate depth. "We are doing 1,000 hours of data per day, and the demand is for 200,000 to 300,000 hours of data," he said. The company works with Encord, which counts global robotics labs as clients.

Data Collection

Humyn Labs has a verified network of people in 18 countries spanning India, Latin America, Europe and Southeast Asia, who collect data based on customer needs. The company is currently collecting data for manufacturing and for residential tasks such as washing dishes and folding laundry.

Abhinav Kukreja, cofounder, Neo Cambrian, said in a LinkedIn post that the company is deploying proprietary hardware across manufacturing units in India to collect accurate, detailed data that is closer to real-world conditions.

This editorial summary reflects ET Tech and other public reporting on Indian Startups Drive Egocentric Data Collection for Robotics.

Reviewed by WTGuru editorial team.