Engineers have combined the AI model powering ChatGPT with a humanoid bust to create a robot receptionist for the UK National Robotarium, a center for robotics and AI.
“We are exploring how to use and further develop the recent AI advances in LLMs to create more useful, usable, and compelling systems for collaboration between humans, robots, and AI systems in general,” researcher Oliver Lemon told Tech Xplore.
The AI revolution: Large language models (LLMs) — AI systems that can understand and respond to natural language — are exploding in popularity, largely due to the release of ChatGPT in 2022.
Because these systems are trained on huge datasets, they can typically respond to questions on a broad range of topics, but they usually aren’t experts in anything in particular. They can also “hallucinate,” giving responses that sound true but aren’t, which limits their usefulness.
Additionally, because most LLMs communicate via text, chatting with them isn’t as natural as talking to another person — verbal and nonverbal communication are hugely important to human interaction.
What’s new? Now, engineers at Heriot-Watt University and Alana AI have combined OpenAI’s GPT-3.5 — the same LLM used for ChatGPT — with a humanoid bust to create a robot receptionist that can interact with visitors to the UK National Robotarium.
To minimize the chances of the robot providing false information about the Robotarium, they scraped the center’s website and stored the information in a special database that its AI accesses before responding to users.
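The article doesn't describe how that database is built or queried, so the following is only a rough sketch of the general idea: look up relevant passages first, then hand them to the model along with the question. The passages, the word-overlap scoring, and the function names (`retrieve`, `build_prompt`) are all illustrative assumptions, not the researchers' implementation.

```python
import re
from collections import Counter

# Hypothetical stand-ins for text scraped from the center's website.
PASSAGES = [
    "The National Robotarium is a center for robotics and AI.",
    "The receptionist robot is built on the Furhat social robot.",
    "Visitors can ask the receptionist about research and events.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(question, passages, k=2):
    """Return up to k passages sharing the most words with the question.

    Plain word overlap is an assumption made for brevity; a real system
    would more likely use a search index or embeddings.
    """
    question_words = Counter(tokenize(question))
    scored = []
    for passage in passages:
        overlap = sum((question_words & Counter(tokenize(passage))).values())
        scored.append((overlap, passage))
    # Highest overlap first; sort is stable, so ties keep original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:k] if score > 0]

def build_prompt(question, passages):
    """Put retrieved passages ahead of the question for the language model."""
    context = retrieve(question, passages)
    context_block = "\n".join(f"- {p}" for p in context) or "- (no matching passage)"
    return (
        "Answer using only the facts below.\n"
        f"Facts:\n{context_block}\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    print(build_prompt("What is the Furhat robot?", PASSAGES))
```

The prompt returned by `build_prompt` would then be sent to the LLM, so the model answers from stored facts about the center rather than from memory alone.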
They say this is the first system that combines an LLM that’s particularly knowledgeable about one subject with an animated robot capable of verbal and nonverbal communication.
“We wanted to investigate several aspects of embodied AI for natural interaction with humans,” said Lemon. “In particular, we were interested in combining the sort of general ‘open domain’ conversation that you can have with LLMs like ChatGPT with more useful and specific information sources.”
How it works: The basis for the robot receptionist is Furhat, a social robot that is essentially a humanoid bust capable of lifelike expressions, movements, and speech.
When a person talks to the robot, their speech is transcribed into text. Various systems then work together to determine what the person is asking and produce a text response and appropriate facial expressions and movements.
Text-to-speech tech is then used to convert the text into audio, which is played through a speaker in the bust.
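The steps described above (speech to text, a text reply plus an expression, then text to speech) can be pictured as a simple chain. This is a minimal sketch under stated assumptions: the article doesn't name the individual components, so each stage here is a hypothetical placeholder function, and the toy versions at the bottom exist only so the example runs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RobotTurn:
    text: str        # what the robot will say
    expression: str  # label for the facial expression to show
    audio: bytes     # synthesized speech to play through the speaker

def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    pick_expression: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> RobotTurn:
    """Run one conversational turn through the four stages in order."""
    user_text = transcribe(audio_in)      # speech -> text
    reply = respond(user_text)            # text -> text reply
    expression = pick_expression(reply)   # reply -> facial expression
    audio_out = synthesize(reply)         # reply -> audio
    return RobotTurn(text=reply, expression=expression, audio=audio_out)

# Toy stand-ins (assumptions): real stages would call speech recognition,
# a language model, and a text-to-speech engine.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")

def fake_respond(text: str) -> str:
    return f"You asked: {text}"

def fake_expression(reply: str) -> str:
    return "smile" if "welcome" in reply.lower() else "neutral"

def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")

if __name__ == "__main__":
    turn = run_turn(
        b"Where is the lab?",
        fake_transcribe, fake_respond, fake_expression, fake_synthesize,
    )
    print(turn.text, turn.expression)
```

Because each stage here waits for the previous one to finish, the delays add up, which is one plausible reason a robot built this way might pause before answering.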
In a demo video shared by the researchers, the bot is able to describe the Robotarium and answer several questions about pop culture and the future of robotics.
While the answers themselves are conversational and appropriate, the delivery is still a bit robotic, with odd inflections and unnaturally long pauses while the bot “thinks” — those pauses are accompanied by an unblinking stare straight out of the uncanny valley.
Looking ahead: The researchers say their robot receptionist was able to interact naturally with Robotarium visitors and provide accurate information about its research, events, and more.
They’re now exploring ways to enable the bot to interact with multiple people at once — rather than just one-on-one — while continuing to look for ways to further minimize the chance of hallucinations.