These models were deployed on Stretch, a robot consisting of a wheeled unit, a tall pole, and a retractable arm holding an iPhone, to test how well they could execute the tasks in new environments without additional tweaking. On their own, the models achieved a completion rate of 74.4%. The researchers raised this to a 90% success rate by taking images from the iPhone and the robot’s head-mounted camera, giving them to OpenAI’s recent GPT-4o model, and asking it whether the task had been completed successfully. If GPT-4o said no, they simply reset the robot and tried again.
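The check-and-retry procedure described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name, the three callbacks, and the `max_attempts` cap are all hypothetical, and the article does not say how many retries were allowed.

```python
from typing import Callable


def run_with_verification(
    attempt_task: Callable[[], object],
    verify: Callable[[object], bool],
    reset: Callable[[], None],
    max_attempts: int = 3,  # assumed cap; the source gives no retry limit
) -> tuple[bool, int]:
    """Retry a task until the verifier accepts it or attempts run out.

    Returns (succeeded, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        # Hypothetical: run the task and return what the cameras saw afterwards.
        observation = attempt_task()
        # Hypothetical: ask a vision-language model whether the task succeeded.
        if verify(observation):
            return True, attempt
        # Only reset when another attempt will follow.
        if attempt < max_attempts:
            reset()
    return False, max_attempts


if __name__ == "__main__":
    # Stand-in callbacks: the first attempt fails, the second succeeds.
    outcomes = iter([False, True])
    succeeded, used = run_with_verification(
        attempt_task=lambda: next(outcomes),
        verify=lambda observation: bool(observation),
        reset=lambda: print("resetting robot"),
    )
    print(succeeded, used)
```

In a real system, `verify` would wrap the call that sends the camera images to the model and parses its yes-or-no answer. The overall success rate then depends on how often that verifier is itself correct.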
A significant challenge facing roboticists is that training and testing their models in lab environments isn’t representative of what could happen in the real world. That makes research that helps machines behave more reliably in new settings very welcome, says Mohit Shridhar, a research scientist specializing in robotic manipulation who wasn’t involved in the work.
“It’s nice to see that it’s being evaluated in all these diverse homes and kitchens, because if you can get a robot to work in the wild in a random house, that’s the true goal of robotics,” he says.
The project could serve as a general recipe for building utility robotics models for other tasks, helping to teach robots new skills with minimal extra work and making it easier for people who aren’t trained roboticists to deploy future robots in their homes, says Shafiullah.
“The dream that we’re going for is that I could train something, put it on the internet, and you should be able to download and run it on a robot in your home,” he says.