
Form Follows Data

July 12, 2025 / 10 min / Stefan Weirich

One of the first questions we are asked at mimic is: "Why the human hand form factor?" It's a fair question. It is not obvious why we would focus on this specific morphology when the design space of grippers allows for so many alternatives, from simpler two-finger grippers to hands more capable than the human hand, for example with six fingers.

 

Here I want to lay out the design philosophy guiding our quest to solve general-purpose robotic dexterity, and what I think the future of the field will look like. As the title already suggests, the reason isn't purely functional: it's a choice driven by data availability.

Note: Are you also excited about general-purpose dexterity? We are hiring.

The Data Problem in Robotics

 

One of the greatest challenges in building robotic foundation models is the scarcity of robot behavior data. Unlike modalities such as natural language or vision, where cheap, abundant data already existed thanks to the internet, robotics lacks extensive, systematic records of behavior data. Unfortunately, collecting such data is extraordinarily expensive and time-consuming: with teleoperation, training a first-generation robotics foundation model like pi0 requires a small army of human operators driving robots for months. If we want this to really work, there must be a way to scale robot training data more efficiently.


The Human Data Scale


However, we do have access to one enormous, readily available data source: human behavior. The first obvious step is human video data: countless hours of video showing humans performing all sorts of manipulation tasks with their own hands, freely available on the internet. This is something more and more embodied AI researchers are now leveraging for pre-training, as models trained on large-scale video data appear to encode useful and robust physical priors. Interestingly, start-ups are emerging that focus entirely on providing such training data to robotics companies, offering quick access to egocentric video recorded by an on-demand workforce around the world.
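To make this more concrete, here is a minimal sketch of the kind of objective such video pre-training can use: a small convolutional encoder trained to predict the embedding of the next frame, which pushes it to capture how scenes evolve physically. The architecture, clip shapes and training loop are illustrative assumptions, not any specific production pipeline; real systems use far larger models, much more data and objectives such as masked video modeling or contrastive learning.

```python
# Minimal sketch: pre-training a visual encoder on unlabeled video clips
# with a next-frame prediction objective. All shapes and modules are
# illustrative placeholders, not an actual production pipeline.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

encoder = FrameEncoder()
predictor = nn.Linear(128, 128)  # predicts the embedding of the next frame
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-4
)

# Stand-in for a dataloader over egocentric manipulation clips:
# a batch of 8 clips, each with 2 consecutive 64x64 RGB frames.
clips = torch.rand(8, 2, 3, 64, 64)

for step in range(10):
    z_t = encoder(clips[:, 0])         # embedding of frame t
    with torch.no_grad():
        z_next = encoder(clips[:, 1])  # target: embedding of frame t+1
    loss = nn.functional.mse_loss(predictor(z_t), z_next)
    opt.zero_grad()
    loss.backward()
    opt.step()
```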

 

But video data, although rich in diversity, is often of low quality and misses crucial information such as accurate positions and applied forces. It therefore cannot be the sole ingredient in a robotics foundation model recipe.

Quality vs. Quantity

 

One of the key questions, then, is how we can create higher-quality data in sufficient quantity, at unprecedented scale and speed.


Our solution was to create a wearable data collection method: a sensorized glove setup with integrated cameras that records accurate positions, applied forces and point-of-view video.
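To make the kind of data this produces concrete, here is a rough sketch of what a single timestep of such a recording could look like as a data structure. The field names, dimensions and units are assumptions chosen for illustration, not our actual recording format.

```python
# Illustrative schema for one timestep of a wearable recording.
# Field names, dimensions and units are assumptions, not an actual format.
from dataclasses import dataclass
import numpy as np

@dataclass
class GloveFrame:
    timestamp_s: float            # time since start of recording, seconds
    joint_angles_rad: np.ndarray  # e.g. ~20 finger/wrist joint angles
    fingertip_pos_m: np.ndarray   # (5, 3) fingertip positions in the wrist frame
    contact_force_n: np.ndarray   # (5,) normal force per fingertip, Newtons
    rgb_frame: np.ndarray         # (H, W, 3) egocentric camera image

# Example: one synthetic frame
frame = GloveFrame(
    timestamp_s=0.0,
    joint_angles_rad=np.zeros(20),
    fingertip_pos_m=np.zeros((5, 3)),
    contact_force_n=np.zeros(5),
    rgb_frame=np.zeros((480, 640, 3), dtype=np.uint8),
)
```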


The focus should not be on the wearable hardware or the sensors, but on the fact that this method lets humans perform tasks with their own hands, as they already do in their day-to-day work. No training on robot teleoperation or VR is needed. No physical presence of a robot is needed. This makes our method approximately 5x cheaper and 7x faster than collecting conventional teleoperation data, while allowing for more diverse and realistic data collection with less friction.

Bridging the Embodiment Gap

 

This new approach to robot learning and data collection drastically changes the way we need to think about robot hardware. To make effective use of the wealth of high-quality human data obtainable with video and wearables, we need to minimize the "embodiment gap": the difference between the physical form generating the training data (humans) and the robot's form. We no longer engineer gripper hardware hyper-optimized to one task.

 

We optimize the hardware design to match our data collection methods: "form follows data".

Consequently, if we want to leverage human demonstration data effectively, our robots need to physically resemble humans, particularly in their manipulation capabilities.
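One way to make the embodiment gap tangible is to measure how well a recorded human fingertip position can be reproduced by a robot finger after retargeting. The sketch below does this for a deliberately simple, made-up two-joint planar finger: the kinematics, link lengths and joint limits are hypothetical, but the pattern (optimize robot joint angles to match the human keypoint, then look at the residual error) is the general idea.

```python
# Sketch: retarget a recorded human fingertip position onto a toy 2-joint
# planar robot finger and report the residual position error.
# The kinematics and link lengths here are made up for illustration.
import numpy as np
from scipy.optimize import minimize

LINKS = np.array([0.04, 0.03])  # hypothetical link lengths in meters

def fingertip(q):
    """Forward kinematics of a planar 2-joint finger: joint angles -> (x, y)."""
    x = LINKS[0] * np.cos(q[0]) + LINKS[1] * np.cos(q[0] + q[1])
    y = LINKS[0] * np.sin(q[0]) + LINKS[1] * np.sin(q[0] + q[1])
    return np.array([x, y])

def retarget(human_tip, q0=np.array([0.3, 0.3])):
    """Find joint angles whose fingertip best matches the recorded human tip."""
    cost = lambda q: np.sum((fingertip(q) - human_tip) ** 2)
    res = minimize(cost, q0, bounds=[(0.0, 1.6), (0.0, 1.6)])  # joint limits
    return res.x

human_tip = np.array([0.05, 0.03])  # a recorded human fingertip position
q = retarget(human_tip)
gap = np.linalg.norm(fingertip(q) - human_tip)
print(f"retargeted joints: {q}, residual embodiment gap: {gap * 1000:.1f} mm")
```

The smaller this residual is across the whole hand, the more directly human recordings can supervise the robot; a hand far from human proportions leaves a gap that no amount of retargeting can close.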


At this point we come back to the system design question: suddenly it makes sense to approximate human anatomy and degrees of freedom. Turning this into an engineering problem, there are obvious tradeoffs between anthropomorphism and complexity or reliability. At its core, mimic's hardware is designed to be the simplest and most reliable system that matches key human functionalities, such as an opposable thumb and independently controllable fingers, including abduction and adduction. To fully match human data, we also need to match the integration of position, tactile and torque sensing, as well as vision, between the robot and the wearable data collection method.
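As a back-of-the-envelope illustration of this tradeoff, the sketch below tallies the actuated degrees of freedom of a hand that keeps the functionalities named above. The per-finger joint counts are assumptions for illustration, not mimic's actual kinematic layout.

```python
# Back-of-the-envelope DoF budget for a hand that keeps an opposable thumb
# and independently controllable fingers with abduction/adduction.
# Counts are illustrative assumptions, not an actual kinematic layout.
DOF_BUDGET = {
    "thumb":  {"flexion": 2, "abduction/opposition": 2},
    "index":  {"flexion": 2, "abduction": 1},
    "middle": {"flexion": 2, "abduction": 1},
    "ring":   {"flexion": 2, "abduction": 1},
    "pinky":  {"flexion": 2, "abduction": 1},
}

total = sum(sum(joints.values()) for joints in DOF_BUDGET.values())
print(f"actuated degrees of freedom: {total}")  # 16 in this sketch
# For reference, the human hand has over 20 degrees of freedom; a sketch
# like this trades some of them away for simplicity and reliability.
```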


The Limits of Anthropomorphism

 

Interestingly, this principle doesn't necessarily extend to the entire robot body. While manipulation benefits enormously from human-like design thanks to the available demonstration data, legged locomotion follows different rules. Most legged locomotion is trained in simulation using reinforcement learning, a method well suited to this less complex problem (a minimal sketch of that recipe follows below).
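Here is what that recipe looks like in its most minimal form, using the off-the-shelf stable-baselines3 PPO implementation. The environment is a stand-in: a real legged-locomotion setup would train in a physics simulator such as MuJoCo or Isaac Gym, with a carefully shaped reward and orders of magnitude more samples.

```python
# Minimal sketch of the sim + RL recipe commonly used for locomotion,
# using off-the-shelf PPO. "Pendulum-v1" stands in for a proper legged
# locomotion environment.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)  # a real run would use millions of steps

# Roll out the learned policy for a few steps
obs, _ = env.reset()
for _ in range(100):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
```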

Because locomotion can be learned this way, we can be more flexible with its design, potentially deviating from human anatomy where beneficial (wheeled platforms can be quite efficient). While the pace of recent developments in bipedal humanoid robots is surely fascinating from an engineering perspective, we have rarely encountered real-world customer requirements that would justify this complexity. Instead, we are choosing a platform-agnostic approach compatible with most standard robotic arms and mobile robots.


What’s next?

 

Looking ahead, we'll likely see methods combining imitation learning and reinforcement learning, as well as real-world and simulation data. The emergence of sophisticated cross-embodiment learning will allow AI models to generalize across different robotic manipulators, from human-like hands to more specialized grippers (as we have shown recently). However, this future capability doesn't negate our current need for human-like robots; it builds upon it.


The "form follows data" principle isn't about permanently binding robots to the human form. Rather, it's about recognizing that right now, human-like robot manipulators represent our best opportunity to leverage cheap data for rapid advancement towards truly general foundation models. As we build better systems for collecting robot behavior data and generalize across embodiments, future designs may well transcend human limitations. But for now, the path to more capable robots runs through human-like design, because that's where the data is.​
