Test automation infrastructure at robotics companies remains stuck in the mobile app era while the machines themselves edge toward human-level dexterity. Atharv Kolhar, who builds testing systems for humanoid robots at Figure AI, says the industry's validation methods have not kept pace with the complexity of modern autonomous systems. The disconnect matters because every additional degree of freedom in a robot's movement multiplies the number of edge cases engineers must account for, and traditional unit testing frameworks collapse under that combinatorial load.
Figure AI operates at the sharp end of this problem. The Sunnyvale-based company builds general-purpose humanoid robots designed for warehouse and manufacturing work, machines that must navigate unstructured environments and manipulate objects with two hands. Kolhar's role involves designing the automated test suites that catch failures before robots reach customer sites. He describes a testing philosophy that treats robots more like distributed systems than mechanical devices, borrowing concepts from cloud infrastructure validation. The approach reflects a broader industry realization: as robots gain autonomy, their behavior becomes less deterministic, and testing must account for probabilistic outcomes rather than binary pass-fail results.
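To make the contrast with binary pass-fail concrete, here is a minimal sketch of a probabilistic acceptance check. It is not Figure AI's method. The function names, the 95% confidence level, and the required success rates are assumptions chosen for illustration. The check repeats a behavior many times and passes only if a Wilson score lower bound on the success rate clears a required threshold.

```python
import math


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a success rate.

    z = 1.96 corresponds to a two-sided 95% interval (an assumed default).
    """
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def passes(successes: int, trials: int, required_rate: float, z: float = 1.96) -> bool:
    """Pass only if we are reasonably confident the true rate meets the bar."""
    return wilson_lower_bound(successes, trials, z) >= required_rate


# Hypothetical example: 95 successful grasps out of 100 attempts.
print(passes(95, 100, required_rate=0.85))  # enough evidence for an 85% bar
print(passes(95, 100, required_rate=0.95))  # not enough evidence for a 95% bar
```

The design point is that an observed 95% success rate over 100 runs does not by itself demonstrate a true 95% rate, so a probabilistic test states both a threshold and the evidence needed to claim it.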
The number of scenarios grows combinatorially with capability. A mobile robot navigating a warehouse might require hundreds of test scenarios covering different floor surfaces, lighting conditions, and obstacle types. Add manipulation tasks with two seven-degree-of-freedom arms, and the scenario count enters the millions. Traditional approaches rely on human testers running through checklists or developers writing scripted test cases for anticipated situations. Both methods break down when the robot encounters novel combinations of environmental factors, which happens constantly in real deployments. Kolhar advocates for simulation-heavy testing that generates edge cases automatically, paired with continuous validation loops that feed real-world failure data back into the test generation system. The strategy mirrors what autonomous vehicle developers learned after early deployments revealed gaps in their scenario coverage.
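The arithmetic behind that jump can be sketched in a few lines. Every value below is a made-up illustration, not data from Figure AI: the environment factors, the two coarse levels per joint, and the function names are all assumptions. The sketch counts the full scenario space and then draws seeded random samples from it, which is one simple way a generator can cover combinations nobody scripted by hand.

```python
import math
import random

# Hypothetical environment factors; the specific values are illustrative only.
ENVIRONMENT = {
    "floor": ["concrete", "epoxy", "tile", "rubber mat", "metal grate"],
    "lighting": ["bright", "dim", "backlit", "flickering", "mixed"],
    "obstacle": ["pallet", "cart", "person", "cable", "box", "spill", "forklift", "shelf"],
}
ARM_JOINTS = 14        # two arms x seven degrees of freedom
LEVELS_PER_JOINT = 2   # assumption: each joint coarsely binned as low/high


def scenario_count(factors: dict, joints: int = 0, levels: int = 1) -> int:
    """Size of the full cross product of environment factors and joint bins."""
    env = math.prod(len(values) for values in factors.values())
    return env * levels ** joints


def sample_scenario(factors: dict, joints: int, levels: int, rng: random.Random) -> dict:
    """Draw one random scenario instead of enumerating the whole space."""
    scenario = {name: rng.choice(values) for name, values in factors.items()}
    scenario["joint_levels"] = tuple(rng.randrange(levels) for _ in range(joints))
    return scenario


print(scenario_count(ENVIRONMENT))                                # 200
print(scenario_count(ENVIRONMENT, ARM_JOINTS, LEVELS_PER_JOINT))  # 3276800

rng = random.Random(0)  # fixed seed so a failing scenario can be reproduced
print(sample_scenario(ENVIRONMENT, ARM_JOINTS, LEVELS_PER_JOINT, rng))
```

Even with only two bins per joint, 200 navigation scenarios become more than three million once the arms are included, which is why exhaustive checklists stop being practical. A feedback loop of the kind Kolhar describes could bias the sampling toward combinations that have failed in the field, though that weighting is not shown here.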
Several robotics companies now dedicate entire engineering teams to testing infrastructure, a shift from five years ago, when validation was a secondary concern. Boston Dynamics maintains a dedicated simulation team that builds digital twins of deployment environments before shipping hardware. Tesla's Optimus program runs millions of simulated grasping attempts daily, training the control algorithms and the test systems at the same time. Figure AI's approach, according to Kolhar's public statements, emphasizes modular test components that can be recombined as the robot's capabilities expand. The methodology treats each new behavior as a testing problem first and an implementation problem second. This inverted priority slows initial development but reduces the lag between deploying a capability and having validation coverage for it. The tradeoff becomes critical as robots move from controlled pilot programs to scaled commercial deployments, where failure modes carry financial and safety consequences.
What to Watch: Figure AI's next-generation humanoid platform, expected in late 2026, will test whether modular testing frameworks can scale with hardware iterations. Look for announcements from Boston Dynamics and Agility Robotics around simulation partnerships, as both companies face similar validation bottlenecks with their commercial robot deployments. The IEEE Robotics and Automation Society has scheduled a dedicated testing standards workshop for October 2026, which may produce the industry's first formal guidelines for autonomous system validation.




