RLHF for Robots
Bringing the human-feedback reinforcement learning loop that transformed language models to physical robots.
We've spent the last several months building the equivalent of ChatGPT's RLHF system for robots. A pre-trained foundation model generates physical motions, humans evaluate success or failure, and reinforcement learning steers the model toward better performance. We're currently in Phase 2: collecting feedback data on real hardware.
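To make the loop concrete, here is a minimal toy sketch of the generate, evaluate, update cycle. It is not our production system. It assumes a policy over four discrete candidate motions (standing in for the pre-trained model), a hypothetical `simulated_human_label` function standing in for a human rater, and a plain REINFORCE update with a running baseline as the reinforcement learning step. All names and numbers are illustrative.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_human_label(motion_id, rng):
    """Hypothetical stand-in for a human rater: 1.0 = success, 0.0 = failure.

    Assumption for illustration only: motion 2 succeeds 90% of the time,
    every other motion 20% of the time.
    """
    p_success = 0.9 if motion_id == 2 else 0.2
    return 1.0 if rng.random() < p_success else 0.0


def run_feedback_loop(num_motions=4, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * num_motions  # toy stand-in for the pre-trained model
    baseline = 0.0                # running average of observed feedback

    for _ in range(steps):
        # 1. The model generates a motion.
        probs = softmax(logits)
        motion = rng.choices(range(num_motions), weights=probs)[0]

        # 2. A human evaluates success or failure.
        reward = simulated_human_label(motion, rng)

        # 3. Reinforcement learning steers the model (REINFORCE).
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for a in range(num_motions):
            # Gradient of log pi(motion) with respect to logit a.
            grad = (1.0 if a == motion else 0.0) - probs[a]
            logits[a] += lr * advantage * grad

    return softmax(logits)


if __name__ == "__main__":
    final_probs = run_feedback_loop()
    print([round(p, 3) for p in final_probs])
```

Running the script prints the final motion probabilities; in this toy setup the policy should shift most of its probability mass onto the motion that raters mark as successful most often. A real system replaces the four-way choice with continuous motions on hardware and the simulated rater with actual human feedback, but the three steps in the loop stay the same.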