The controller and the bots that chat

There are moments in time when you feel that the set of plausible future states of the world shifts noticeably. You wake up to just another day but end up going to bed in a distinctly different reality.

Sometimes these moments stretch out, spanning years. Almost two years passed between Mr Krenz watching helplessly as the Berlin Wall came down (on 9 November 1989) and the tired look on Mr Gorbachev’s face as he returned to Moscow (on the morning of 22 August 1991), victorious but irrelevant. Sometimes they are compressed in time: everyone in the western world born before that wall came down remembers the long day when the twin towers fell.

Sometimes these inflexion points bunch up, and in the past year, I have had that feeling twice. This is written around the first anniversary of the start of the war in Ukraine, a momentous crime that is not the topic here. And then, in the late autumn of that same year of 2022, they let loose ChatGPT and its relatives, notably Bing’s Sydney, Bard, and friends. Some of these are still rattling their cages, not yet loose in the wild, but the bottle is open, and the genie is out. Generative AI in general, and large language models (LLMs) with chat interfaces in particular, are now part of the landscape we inhabit. Everyone, controllers and finance professionals most certainly included, is thinking about the implications for themselves and their work. Will they replace me?

Rise of the Chatbot

A useful way to think about LLM-based chatbots is as advanced autocomplete systems: they generate the sentences most likely to follow the input given by the user, taking into account a fair amount of built-in, hidden text and parameters in addition to the actual user input, and drawing, of course, on a truly enormous reference model of ingested text. Our friends at Silo AI have a very handy overview and FAQ about the underlying technology and its implications.
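To make the autocomplete framing concrete, here is a minimal, hypothetical sketch of the core generation loop – not any vendor’s actual implementation. A model assigns a score to every candidate next token given the text so far, the scores are turned into probabilities, and one token is sampled and appended. The toy vocabulary and scoring function below are stand-ins for a neural network with billions of parameters.

```python
import math
import random

# Toy vocabulary; a real model works with tens of thousands of tokens.
VOCAB = ["the", "controller", "closes", "the books", "reports", "promptly", "."]

def score_next(context):
    """Stand-in scorer: returns one raw score (logit) per vocabulary token.
    A real LLM computes these with a neural network conditioned on the
    entire context window, visible user input and hidden prompt alike."""
    rng = random.Random(" ".join(context))  # deterministic toy scores
    return [rng.uniform(-1.0, 1.0) for _ in VOCAB]

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt, steps=6):
    """Autoregressive loop: score, sample a token, append it, repeat.
    Note what is absent: no fact check, no truth criterion --
    a plausible next token is the only objective."""
    context = list(prompt)
    for _ in range(steps):
        probs = softmax(score_next(context))
        token = random.choices(VOCAB, weights=probs, k=1)[0]
        context.append(token)
    return " ".join(context)

print(generate(["the", "controller"]))
```

The essential point survives the simplification: nothing in the loop distinguishes fact from fiction, which is exactly the behaviour discussed below.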

ChatGPT and its kind may not be at the cutting edge of the AI research frontier, but this is certainly not an empty bubble of hype. ChatGPT can generate fluent, idiomatic and grammatically correct page-length text on demand, about almost any topic, in many different languages, making this remarkable feat seem almost easy. That is extremely impressive, and yes, the world has changed significantly because this capability is now available. As the journalists who interacted with early versions of Sydney (the Bing search chatbot based on ChatGPT) discovered, if you prod an LLM-based advanced autocompletion chatbot with a sequence of suitably leading questions, you can get it to generate paranoid, psychotic, and downright disturbing responses. This ought not to be surprising – the autocompletion algorithm is just doing its job, and doing it well.

 

"The result will look very impressive and will largely be correct – but surprisingly, often partly not."

 

The linguistic skills of ChatGPT and its LLM cousins are scarily good. It is excellent at translating, rewriting, condensing, clarifying and rephrasing almost any text you feed it. However, there is major trouble when you ask it to generate text based on its internal corpus of “knowledge,” not just to process what you give it. It will generate text, computer code, or whatever you ask for. The result will look very impressive and will largely be correct – but, surprisingly often, partly not. An LLM-based autocompletion algorithm doesn’t aim to provide accurate answers or facts. Its aim is to generate plausible responses that sound good. The problem is that you cannot tell what is fact-based and what is not, because both sound equally good. ChatGPT cannot tell either; it isn’t built that way. LLMs do hallucinate – it’s how they manage to appear so impressive.

ChatGPT – Stage magic?

The first encounters with ChatGPT reminded me, and many others, of Clarke’s third law: “Any sufficiently advanced technology is indistinguishable from magic.” When this law is invoked in discussions, it is usually in the context of how advanced the technology needs to be, and for whom, to qualify as magic. It is relative, of course – firearms or aeroplanes are magic if your own technology level is spears and canoes. But ChatGPT did make me realise that we also need to be careful about what sort of magic we are talking about. It’s not always of the Hogwarts kind.

The ability of ChatGPT to produce fluent text about almost anything does seem indistinguishable from magic. But it turns out to be stage magic: a simulation for entertainment purposes – misdirection, sleight of hand, suspension of disbelief. We applaud the craft, admire the feat, pay to be entertained, and indeed we are. But in fact, the magician’s hat does not add a single bunny to the rabbit count of the universe – and ChatGPT generates no new insights. It just pretends to. In doing so, it is not concerned with the truth, validity, or reliability of what it writes, and it is impossible to tell where the facts taper off and the hallucinations begin.

 

"ChatGPT generates no new insights. It just pretends to."

 

I asked ChatGPT about Clarke’s law. It chatsplained, in a slightly condescending and pompous manner, in flawless prose and a list of numbered items, about the need to educate people about advanced technology. Padding its reply with platitudes, it completely missed the point – and lacked the capability to realise this and pipe down. I could of course have asked it to reply in verse, in the style of Coleridge after three pipes of opium, and it would have done that just as well. The entertainment value of this is limited, and it diminishes rapidly when you are trying to get actual work done against a deadline.

And this is supposed to replace or threaten business controllers and financial professionals? People who actually understand what is going on, who can analyse and investigate, figure things out and know what is important, significant, grounded in facts – and what is not? Maybe – and when they shoot the next suspected spy balloon out of the sky, it will smell of fried bacon.

The current generation of generative AI and large language models has drastically lowered the cost of generating BS. There’s a tsunami of the stuff approaching, an avalanche, a perfect storm – and this particular vintage is well and truly uncorked; there is nothing we can do. But the fact that arbitrary quantities of low-quality but plausible-looking information are now, in effect, free for the asking means that high-quality, fact-based, checked and edited information – with a pedigree and an audit trail – will become more valuable, not less. The skill sets of financial professionals and controllers will be needed and appreciated more than ever before. At the same time, though, the baseline for acceptable language quality in communication is rising. There’s really no excuse for being obtuse and confusing when it’s so easy to use an LLM to clarify the message – and this goes for controllers too.

To stay relevant in a changing landscape, we need to think ahead. ChatGPT is not a threat; it is at once a massive nuisance, a major convenience, and a significant opportunity. And it does have entertainment value.

But what about the children of its children?

We’ll discuss these developments, and many others, in our benchmarking and development programmes: the Controller Performance Programme for business controllers and the Accounting Performance Programme for financial accounting professionals.

About the author

Anders Tallberg is a Senior Fellow at Hanken & SSE Executive Education. He previously worked at Hanken School of Economics as professor of accounting and head of the Department of Accounting. Anders has authored books, software applications and scientific papers on various aspects of accounting. Among other roles, he currently serves as vice chairman of the Finnish Accounting Standards Board and as a member of the Finnish Auditing Board.