Press ESC to close

AI’s Flaws Are Showing and Researchers Say It’s Time for Stricter Standards

As artificial intelligence systems are rapidly integrated into society, researchers are raising red flags: AI models are still producing harmful content — from hate speech and sexual material to copyright violations — and the current safety checks are far from sufficient.

“The truth is, after nearly 15 years of work, we still don’t know how to guarantee model alignment — and we’re not getting much closer,” said Javier Rando, a specialist in adversarial machine learning. According to him and other experts, AI development has outpaced both regulatory oversight and safety testing, leaving major blind spots in how these systems behave in the real world.

One of the key concerns is the lack of standardized, comprehensive evaluation. Current efforts, like red teaming — a method borrowed from cybersecurity where experts intentionally probe models for weaknesses — are too limited in scope and personnel. “There simply aren’t enough qualified people doing this work,” said Shayne Longpre, lead of the Data Provenance Initiative. He and his co-authors argue that the vetting process should involve not just internal testers but also independent third parties: researchers, ethical hackers, and domain experts.

Longpre’s team has proposed a framework similar to “bug bounty” programs in software security: a formal structure for submitting AI flaw reports, incentivizing disclosure, and publicly sharing information about risks. This, he argues, would create a more transparent and resilient system — one better suited for dealing with the growing complexity of modern AI.

From Testing to Trust

Singapore’s Project Moonshot represents one attempt to bridge this gap. Developed by the Infocomm Media Development Authority in collaboration with companies like IBM and DataRobot, the open-source toolkit combines benchmarks, red teaming protocols, and evaluation baselines to help startups audit their large language models before and after launch.

“Some startups embraced it quickly,” said Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific. “But much more can be done.” Future iterations of Moonshot aim to support multilingual red teaming and offer tailored testing for specific industry needs.

Still, the challenge isn’t just technical — it’s systemic.

Time to Raise the Bar

Pierre Alquier, a professor of statistics at ESSEC Business School, argues that AI needs regulatory oversight more akin to drug or aviation safety. “When pharmaceutical companies create a new drug, they must go through months of rigorous testing before release,” he said. “Why should AI — with the potential to impact millions of lives — be any different?”

Alquier and others also warn against overgeneralized AI systems. Broad, do-everything models like large language models (LLMs) are harder to test, harder to control, and easier to misuse. By contrast, task-specific models can be evaluated more precisely and are less likely to exhibit unexpected behaviors.

Developers must also stop overselling their defenses. “Too often, companies market their models as safer than they really are,” said Rando. In reality, even the most advanced safeguards struggle to keep up with evolving misuse scenarios.

The consensus from researchers is clear: AI isn’t a moonshot anymore — it’s here, and its risks are growing. Building systems that are safe, accountable, and well-understood is no longer optional. It’s essential.

Prepared by Navruzakhon Burieva

Leave a Reply

Your email address will not be published. Required fields are marked *