The humanoid robot has re-entered the spotlight, not as a science-fiction fantasy but as a product demo, a national competition, and a household promise. 1X recently unveiled its domestic robot NEO; XPeng introduced the humanoid robot IRON at its 2025 Tech Day as part of its embodied intelligence matrix, announcing open SDKs and plans for mass production; meanwhile, Beijing hosted the first “World Humanoid Robot Games.” It feels as though the world of Detroit: Become Human is no longer distant.

NEO Robot
Yet this is not the first wave of fascination with humanlike machines. Fascination with the “human form” long predates the present moment: 18th-century automata imitated human gestures, Asimov’s Three Laws of Robotics gave machines moral agency, and Japan’s Wabot-1 in the 1970s sought to recreate human movement itself. Between the 1980s and 2010s, however, the rise of pragmatism and the AI turn shifted robotics away from bodily resemblance. Industrial robots pursued precision, efficiency, and modularity, while artificial intelligence evolved into purely computational intelligence that lived in code: machine learning detached from any physical embodiment. During this period, the humanoid robot became a symbol of the impractical, the wasteful, and the unnecessary.
But today, in an age where Roombas and smart homes already serve our daily needs, why do we still refuse to abandon the dream of replicating ourselves? Why has the human form returned — not as an obsolete fantasy, but as the central pursuit once again? What psychological, cultural, or political desires are reflected in this obsession? And why, despite our awareness of the “uncanny valley,” do we still insist on building robots in our own image? This essay grows out of that curiosity: an attempt to chase the question without any guarantee of correctness. 🤓
At its most practical level, the persistence of humanoid robots can be explained through a simple material fact: the world has been built for human bodies. Doors are designed for hands that grasp and wrists that turn. Stairs assume a certain leg length and gait. Tools anticipate fingers, opposable thumbs, and a vertical posture. Infrastructure from kitchens to factories is calibrated to the proportions and movements of the human body.
If intelligence must operate within this preexisting environment, then giving machines a human-like morphology appears to promise immediate compatibility. A robot with arms can reach where we reach. A robot with legs can traverse spaces we traverse. A robot with a head and binocular vision can align itself to the height of counters and tables. In this sense, humanoid design is not necessarily a symbolic choice but an infrastructural one. It reflects an assumption that the most efficient way to insert machines into human life is to make them conform to the physical grammar of our world. Instead of redesigning the environment, we redesign robots to fit the environments we already have. The human body becomes a universal adapter.
But the more I sit with this argument, the less stable it feels.
Efficiency does not always follow resemblance. Wheels are steadier than legs. A fixed robotic arm is more precise than a torso trying to balance. A Roomba quietly does its job without needing to learn how to crouch. The effort required to reproduce human balance, dexterity, and coordination often seems wildly disproportionate to the simplicity of the tasks themselves.
So something doesn’t quite add up.
If wheels work better, why insist on legs? If a clamp is stronger, why insist on fingers? If specialized machines already outperform humanoids in most controlled tasks, why keep chasing a body that is harder, more fragile, and far more complicated?
If it is not only technical, could it have something to do with psychology?
We have always had a habit of seeing ourselves in things. We name our cars (or maybe that’s just me). We talk to pets. We apologize to objects when we bump into them. It does not take much for us to project intention, emotion, even personality onto what surrounds us. When a machine begins to resemble a body—when it has a face, a posture, a gaze—the projection becomes easier, almost automatic.
And perhaps that resemblance does something more than make interaction intuitive. It creates intimacy.
A humanoid form feels present. It feels like something is there with you, not just processing input but occupying space. I once asked my mother, and later several friends, a simple question. Many of them use ChatGPT to vent about their daily frustrations. I asked: if ChatGPT had a human body, a face sitting across from you, would that make it harder to speak, or easier? Surprisingly, many of them said it would make it easier. A body, they felt, would make the interaction feel more real, more present, more understood. It would not replace the safety of the machine, but it would deepen the sense of being heard.
Yet not everyone responded the same way. Some said it would depend on what the humanoid looked like. When they encounter robots that are strikingly humanlike, they sometimes feel a strange unease instead of comfort. I feel that too. The more realistic the robot appears, the more something feels slightly off. Engineers are trying to overcome this barrier. Researchers such as Yuhang Hu have worked on human–robot facial coexpression, refining the synchronization between mouth movements and vocal output to reduce the uncanny valley effect. But for many of my friends and often for me, these improvements can make the experience even more unsettling. The closer it gets, the more fragile the illusion becomes.

Yuhang Hu’s work on human–robot facial coexpression