The remarkable capabilities of generative artificial intelligence (AI) are clear the moment you try it. But remarkableness is also a problem for managers. Working out what to do with a new technology is harder when it can affect so many activities; when its adoption depends not just on the abilities of machines but also on pesky humans; and when it has some surprising flaws.
Study after study rams home the potential of large language models (LLMs), which power AIs like ChatGPT, to improve all manner of things. LLMs can save time, by generating meeting summaries, analysing data or drafting press releases. They can sharpen up customer service. They cannot put up IKEA bookshelves—but nor can humans.
AI can even boost innovation. Karan Girotra of Cornell University and his co-authors compared the idea-generating abilities of the latest version of ChatGPT with those of students at an elite university. A lone human can come up with about five ideas in 15 minutes; arm the human with the AI and the number goes up to 200. Crucially, the quality of these ideas is better, at least judged by purchase-intent surveys for new product concepts. Such possibilities can paralyse bosses; when you can do everything, it’s easy to do nothing.
LLMs’ ease of use also has pluses and minuses. On the plus side, more applications for generative AI can be found if more people are trying it. Familiarity with LLMs will make people better at using them. Reid Hoffman, a serial AI investor (and a guest on this week’s final episode of “Boss Class”, our management podcast), has a simple bit of advice: start playing with it. If you asked ChatGPT to write a haiku a year ago and have not touched it since, you have more to do.
Familiarity may also counter the human instinct to be wary of automation. A paper by Siliang Tong of Nanyang Technological University and his co-authors that was published in 2021, before generative AI was all the rage, captured this suspicion neatly. It showed that AI-generated feedback improved employee performance more than feedback from human managers. However, disclosing that the feedback came from a machine had the opposite effect: it undermined trust, stoked fears of job insecurity and hurt performance. Exposure to LLMs could soothe concerns.
Or not. Complicating things are flaws in the technology. The Cambridge Dictionary has named “hallucinate” as its word of the year, in tribute to the tendency of LLMs to spew out false information. The models are evolving rapidly and ought to get better on this score, at least. But some problems are baked in, according to a new paper by R. Thomas McCoy of Princeton University and his co-authors.
Because off-the-shelf models are trained on internet data to predict the next word in an answer on a probabilistic basis, they can be tripped up by surprising things. Get GPT-4, the LLM behind ChatGPT, to multiply a number by 9/5 and add 32, and it does well; ask it to multiply the same number by 7/5 and add 31, and it does considerably less well. The difference is explained by the fact that the first calculation is how you convert Celsius to Fahrenheit, and therefore common on the internet; the second is rare and so does not feature much in the training data. Such pitfalls will exist in proprietary models, too.
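To see why the gap is surprising, it helps to write the two calculations down. The sketch below is illustrative only: the function names are hypothetical and appear in neither the paper nor the article. For ordinary code the two tasks are equally trivial, so the difference in an LLM's accuracy comes from how often each pattern turns up in its training data, not from the arithmetic.

```python
# Illustrative sketch; the function names are hypothetical.

def common_transform(x):
    """Multiply by 9/5 and add 32: the Celsius-to-Fahrenheit conversion."""
    return x * 9 / 5 + 32


def rare_transform(x):
    """Multiply by 7/5 and add 31: the same shape, but no everyday meaning."""
    return x * 7 / 5 + 31


if __name__ == "__main__":
    # Both are a single multiply-and-add; neither is harder for a program.
    for value in (0, 100):
        print(value, common_transform(value), rare_transform(value))
```

A deterministic program treats both functions identically. A model that predicts the next word by probability has seen the first pattern far more often than the second.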
On top of all this is a practical problem: it is hard for firms to keep track of employees’ use of AI. Confidential data might be uploaded and potentially leak out in a subsequent conversation. Earlier this year Samsung, an electronics giant, clamped down on usage of ChatGPT by employees after engineers reportedly shared source code with the chatbot.
This combination of superpowers, simplicity and stumbles is a messy one for bosses to navigate. But it points to a few rules of thumb. Be targeted. Some consultants like to talk about the “lighthouse approach”—picking a contained project that has signalling value to the rest of the organisation. Rather than banning the use of LLMs, have guidelines on what information can be put into them. Be on top of how the tech works: this is not like driving a car and not caring what is under the hood. Above all, use it yourself. Generative AI may feel magical. But it is hard work to get right.■
Correction (28th November): An earlier version of this article stated that the study by Karan Girotra and his co-authors took place at several elite American universities. It actually took place at just one elite university. The earlier version also stated that R. Thomas McCoy’s co-authors are at Princeton University, too. Not all of them still are. Apologies.