The Dream of AI Agents: Can We Trust the Machines?
Imagine a world where intelligent machines, so-called agentic AI, are in control. Humans would no longer be burdened with mundane tasks, and innovation would be limitless. For many experts, however, this vision is more pipe dream than reality. A recent research paper has cast doubt on the feasibility of AI agents, suggesting they may not live up to their promise.
According to Vishal Sikka, a former SAP CTO who now heads an AI services startup called Vianai, AI agents are inherently flawed because of "hallucinations," the made-up information models present as fact. The limitation is fundamental to the large language models (LLMs) that power these agents.
"Sloppy thinking will be tolerated," Sikka warns. While not all human errors can be replicated by AI, it's unlikely that every single mistake will be entirely eliminated. "So we should forget about AI agents running nuclear power plants?" he jokingly asks.
Some prominent figures in the field, however, disagree with Sikka's pessimism. Robinhood CEO Vlad Tenev and Stanford-trained mathematician Tudor Achim have founded a startup called Harmonic that claims breakthroughs toward verifiably trustworthy AI systems. By applying formal mathematical methods to check an LLM's output, they believe hallucinations can be filtered out and more reliable agents built.
Harmonic's solution encodes model outputs in the Lean programming language, a proof assistant whose kernel can mechanically check them for correctness. Their focus so far has been narrow, but coding is an organic extension of their main mission: mathematical superintelligence.
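To make the idea concrete, here is a minimal Lean sketch of the kind of machine-checkable artifact such a pipeline might produce. It illustrates formal verification in general, not Harmonic's actual output format:

```lean
-- A claim stated as a Lean theorem, together with a proof that
-- Lean's kernel verifies. Only statements that check out compile.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A false claim such as `a + b = a` would admit no proof and would be
-- rejected at compile time; that rejection is the hallucination filter.
```

The design point is that the checker, not the model, is the arbiter: a generated answer is trusted only if it arrives with a proof the kernel accepts.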
Despite these advances, most models still hallucinate, as OpenAI scientists have demonstrated. In a paper published last September, they asked three models, including ChatGPT, for the title of the paper's lead author's doctoral dissertation, and all three invented fake titles.
Rather than abandoning agentic AI, however, some argue that guardrails can mitigate hallucinations: systems can be wrapped in independent checks that catch fabricated output before it does harm, as in the sketch below.
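One simple guardrail pattern is a verify-then-act loop: a draft answer is passed through an independent validator, and only validated output is allowed through. The sketch below is a hypothetical illustration; generate_draft, validate, and MAX_RETRIES are placeholder names, not any vendor's API.

```python
# Sketch of a verify-then-act guardrail: the agent's draft output is
# independently validated before it is used. All names here are
# illustrative placeholders, not a real library API.

MAX_RETRIES = 3

def generate_draft(prompt: str, attempt: int) -> str:
    # Stand-in for a call to a language model.
    return f"draft answer to {prompt!r} (attempt {attempt})"

def validate(draft: str) -> bool:
    # Stand-in for an independent checker: a formal verifier,
    # a retrieval-based fact check, or a rule-based filter.
    return "attempt 3" in draft  # toy rule so the demo terminates

def guarded_answer(prompt: str) -> str | None:
    for attempt in range(1, MAX_RETRIES + 1):
        draft = generate_draft(prompt, attempt)
        if validate(draft):
            return draft  # only validated output escapes the guardrail
    return None  # refuse to answer rather than emit unverified output

if __name__ == "__main__":
    print(guarded_answer("What was the lead author's dissertation title?"))
```

The key design choice is the final return: when validation keeps failing, the system refuses rather than guesses, trading availability for trustworthiness.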
Ultimately, the question remains whether these advances will lead to a world where automating human cognitive activity genuinely improves the quality of our lives. Mathematical verifiability may not settle that question, but one thing is certain: we are hurtling toward an AI-driven future, and it is up to us to consider what that means for society.
For now, the debate rages on, with no definitive resolution in sight. As Alan Kay so aptly put it, "The mathematical question is beside the point." Instead, we must focus on understanding the impact of these machines on our lives.