
Yo Wall-E

Personal project · 2026 · From an Amazon kit to an autonomous AI companion

Last year I watched a clip of humanoid robots doing kung fu, and it rearranged something in my head. Robots aren't science fiction anymore. They're products. You can order a humanoid on AliExpress for twenty thousand euros the way you'd order a desk lamp.

AliExpress search results for Unitree G1 humanoid robots, priced around 20,000 euros.
Amazon listing for a KEYESTUDIO 4WD Arduino robot car kit, about 60 euros.
Humanoids by the pallet on AliExpress; the €60 Arduino kit my wife found on Amazon. This project started at the cheap end.

Around the same time my wife gave me a Wall-E-inspired robot kit off Amazon. Treads, a couple of servos, an Arduino. A toy. The last time I'd touched an Arduino was fifteen years ago, and all I managed then was a blinking LED. Zero robotics knowledge, zero electronics experience.

So I did what I keep telling clients to do: I sat down with the AI and started.

Drive it like a game

The first milestone was making it move with a PS5 controller: Bluetooth pairing, joystick mapping, differential drive, triggers for speed, buttons for the servos. The first time it rolled across the floor under my control was pure magic, and also clearly not enough. What if it could see? Hear? Talk? Think?

First milestone: differential drive on a PS5 controller.
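
For anyone curious what "differential drive" means in practice, here is a minimal Python sketch of the mixing step: one stick axis for throttle, one for turning, and a trigger value that caps the speed. It is an illustration, not the robot's actual firmware. The function name, the deadzone value and the axis conventions (all inputs normalized to the range -1 to 1, positive turn meaning "steer right") are assumptions made for the example.

```python
def mix_differential(throttle: float, turn: float,
                     speed_scale: float = 1.0,
                     deadzone: float = 0.08) -> tuple[float, float]:
    """Map joystick axes in [-1, 1] to (left, right) track commands in [-1, 1].

    throttle    -- forward/backward stick axis (assumed: +1 is full forward)
    turn        -- left/right stick axis (assumed: +1 is full right)
    speed_scale -- trigger position in [0, 1], used as a speed cap
    deadzone    -- hypothetical threshold that ignores stick drift near centre
    """
    def apply_deadzone(value: float) -> float:
        return 0.0 if abs(value) < deadzone else value

    throttle = apply_deadzone(throttle)
    turn = apply_deadzone(turn)

    # Turning right speeds up the left track and slows the right one.
    left = throttle + turn
    right = throttle - turn

    # Normalize so neither side exceeds full scale, keeping the left/right ratio.
    peak = max(1.0, abs(left), abs(right))

    # Clamp the trigger value before applying it as a speed cap.
    scale = max(0.0, min(1.0, speed_scale))
    return (left / peak * scale, right / peak * scale)
```

With these conventions, full throttle and no turn drives both tracks equally, while full turn and no throttle spins the robot in place because the tracks run in opposite directions.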

Yo Wall-E

It now runs a full voice pipeline: a locally trained wake word ("Yo Wall-E"), Whisper for transcription, Gemini or Claude as the brain, Piper or ElevenLabs for the voice, and a choreography layer that syncs speech with head movement, servo gestures and a 16x8 LED face that is tiny but surprisingly expressive. When it's happy, it does a little dance.

The happy routine: speech, head movement and the LED face running off one choreography layer.
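
One way to keep the brain and the voice interchangeable is to treat each stage of the pipeline as a plain function and let the pipeline hold references to whichever ones are active. The sketch below shows that shape only. The class name, the stage signatures and the stand-in functions are all hypothetical; it does not call Whisper, Gemini, Claude, Piper or ElevenLabs, and it is not a description of how the robot's own code is organized.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures: audio in, text through the middle, audio out.
Transcriber = Callable[[bytes], str]
Brain = Callable[[str], str]
Voice = Callable[[str], bytes]


@dataclass
class VoicePipeline:
    """Holds the active stage for each step; any field can be reassigned live."""
    transcribe: Transcriber
    think: Brain
    speak: Voice

    def handle_utterance(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)   # speech-to-text stage
        reply = self.think(text)        # language-model stage
        return self.speak(reply)        # text-to-speech stage


# Stand-in stages so the sketch runs without any models or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def brain_a(text: str) -> str:
    return f"A heard: {text}"


def brain_b(text: str) -> str:
    return f"B heard: {text}"


def fake_voice(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, brain_a, fake_voice)
first = pipeline.handle_utterance(b"hello")

# Swapping the brain is a single assignment; the other stages are untouched.
pipeline.think = brain_b
second = pipeline.handle_utterance(b"hello")
```

Because each stage only agrees on its input and output types, replacing one backend with another does not require changes anywhere else in the chain.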

It sees, too. Object detection runs on the camera's own VPU at 15 frames a second. The robot recognizes faces and remembers who you are, and it reads eight emotions and reacts to your mood. In autonomous mode it roams, avoids obstacles, scans the room and narrates what it finds in one of four personalities, from a David Attenborough homage to a protocol droid.

Six models run on the device, no internet needed. A web dashboard with 35+ endpoints controls everything from any phone on the network, and the LLM and voice are swappable at runtime.

The dashboard. Drive, voice, vision, personality and diagnostics, from any phone on the network.

The finished robot: sensors, camera, LED face and a lot of visible wiring.
The current state: ultrasonic sensor, OAK-D camera, LED face, and wiring I am no longer ashamed of.

What the hardware taught me

Software is forgiving; hardware is not. If you wire it wrong, it literally smokes. I learned voltage and current limits, why servos jitter when motors spin up, the difference between battery chemistries, and the deep satisfaction of a clean solder joint.

The AI didn't build it for me. It gave me the knowledge to build it myself.

That's the honest lesson, and it's the same one I bring to product work. Claude and Gemini were the patient experts beside me: firmware, async servers, ML pipelines, debugging serial protocols at midnight. Capability arrived instantly. The judgment about what to build, what "done" feels like, and when the little dance is charming rather than gimmicky stayed a human job.

Make it real, make it usable, keep human taste in the loop. Even when it has treads.