You can hardly go a day now without hearing of new advances in large language model (LLM) artificial intelligence programs as they leap ahead of humans, fed as they are by huge amounts of data scraped from the web, taken from hundreds of thousands of books, and lifted (unattributed) from the New York Times and other press outlets.

Whether it is GPT-4 (OpenAI and Microsoft), Gemini (Google), Claude 3 (Anthropic), or Llama (Meta), the artificial intelligence goldrush is well underway. Tech giants and AI innovators are rushing toward the goal of an eventual general-purpose AI, and the riches this may produce.

All want to create the first trillion-dollar AI firm. But the issue of data looms larger and larger, causing the engineers headaches. The data problem, and the engineers’ dangerous solution, ought to cause us Terminator-fuelled dystopian nightmares, followed by a demand for swift corrective policy action. What do I mean?

To train ever smarter LLMs requires vast amounts of data on which the LLMs can feast and learn, faster and faster. GPT-4 is reported to have on the order of a trillion parameters, and the data needed to train models at that scale is finite and being rapidly used up. The firms with access to huge amounts of data, like Microsoft and Google, do not want to share it. Firms which own the data sources, like newspapers, book publishers, and artists, are taking legal action to safeguard their intellectual property, as they should.

The rush for AI riches is a huge prize for business (Image: Getty Images)

Other AI firms – like Anthropic – find that the data to teach and make the LLMs smarter is almost used up. What is a tech bro in a hurry to do in that situation?

The proposed solution is foolish, shocking, and dangerous: the use of synthetic data. Firms including Anthropic are addressing the lack of new data by using one LLM to generate synthetic (fake) data, which is then fed to another LLM that judges whether the first LLM's output is correct or not. This construct is potentially ominous and problematic.

We already know LLMs are prone to hallucinating. That is, when you ask a question the LLM cannot find an answer to, it will often simply invent one (to keep you, the user, happy).

The early instance of a chatbot trying to convince a journalist to leave his wife and run off with the AI was concerning. There was the laughable, but also worrisome, example of lawyers who used ChatGPT to write legal briefs, submitted them to court, and then discovered that all the cases the LLM had cited were fake. Or consider the recent example of an LLM trying to apply diversity and inclusion standards to questions about World War Two, only to produce images of Asian, black, and brown people as Nazi troops. Oops.

The notion that we should drive towards a more and more powerful AI via LLMs taught on reams of fake information, provided by their hallucinatory fellow machines, is madness.

If we feed the models with synthetic rubbish, we will get distorted rubbish out, further stretched and twisted. This unwise approach is driven by a desire to win the race for AI investor cash and dominance by engineers who, despite protestations to the contrary, apparently do not care about problematic and dangerous outcomes.



As with talented engineers of the past such as Robert Oppenheimer, it is not reassuring when the very people who warn about AI's dangers (like OpenAI boss Sam Altman, Elon Musk, and many others) are those simultaneously sprinting toward the goal.

We know the dangers of disregarding the risks of rubbish in, rubbish out. Britain experienced this in the 1980s, and the economic and human costs were real and alarming. I am talking about the mad cow disease (BSE) that swept through Britain's cattle herd from 1986 onwards.

This was a case in which UK farmers' desire for rapid growth in the cow herd led them to feed cattle the diseased remnants of sheep that had died of scrapie. This widespread practice led to the disease jumping from sheep to cattle as BSE. By the mid-1990s, 120,000 cattle had died of BSE, and several score UK citizens were struck down with variant CJD (the human form of BSE) after eating or coming into contact with infected cattle. Rubbish in led, in the 1980s and 1990s, to deaths and disease on farms and in hospitals across the UK.

Now, if we believe AI leaders such as Altman, who describe recent breakthroughs as more important than fire or electricity, we also need to think about the heightened dangers of foolish, rushed decisions on tech development: models that learn rubbish will regurgitate new forms of it across the digital landscape and beyond. We already know the harm social media causes to our digital information landscape, powered as it is by algorithms trying to please the viewer with a feedback loop of misinformation and falsities.

Even more powerful AIs, self-taught and fed with delusional data generated by other AIs, should alarm citizens and policymakers alike. I believe we need robust action to curb the animal spirits of engineers reaching for their next billion, seemingly oblivious to the risks.

ChatGPT (Image: free)

AI designs need guardrails, and not just those suggested by their self-interested inventors. The good news is that the EU is already taking such steps. The bad news is that the US Congress is in awe of the tech bros and seems unable to understand, let alone restrain, firms that all want to win the AI race, no matter the risks or the real lessons of past mistakes.

Unfortunately, as with previous goldrushes, tech bubbles, and manias, there is so much money at stake in AI that firms will not stop, and AI-fuelled mistakes are sure to be made.

Let us hope we identify those mistakes before they mutate, transmit, and infect the global digital information landscape with possibly serious long-term economic and other consequences.

Stuart P.M. Mackintosh is Executive Director of the Group of Thirty, an international body of financiers and academics which aims to deepen understanding of economic and financial issues and to examine consequences of decisions made in the public and private sectors