New research shows that coding AIs such as ChatGPT suffer from the Dunning-Kruger Effect, often acting most confident when they are least competent. When tackling unfamiliar or obscure programming languages, they claim high certainty even as their answers fall apart. The study links model overconfidence to both poor performance and lack of training data, raising new concerns about how much these systems really know about what they don’t know.
Anyone who has spent even a moderate amount of time interacting with Large Language Models about factual matters will already know that LLMs are frequently disposed to give a confidently wrong response to a user query.
As with more overt forms of hallucination, the reason for this empty boastfulness is not entirely clear. Research published over the summer suggests, for instance, that models give confident answers even when they know they are wrong, while other theories ascribe overconfidence to architectural choices, among other possibilities.
What the end user can be certain of is that the experience is incredibly frustrating, since we are hard-wired to put faith in people’s estimations of their own abilities (not least because a person who over-promises and under-delivers faces consequences, legal and otherwise), and a kind of anthropomorphic transference leads us to extend the same trust to conversational AI systems.
But an LLM is an unaccountable entity that can and will effectively return a ‘Whoops! Butterfingers…’ after it has helped the user inadvertently destroy something important, or at least waste an afternoon of their time, assuming it will admit liability at all.
Source: “Coding AIs Tend to Suffer From the Dunning-Kruger Effect” (www.unite.ai)
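To make the “most confident where least competent” pattern concrete, here is a minimal, hypothetical sketch of how such a gap might be quantified: record the model’s self-reported confidence for each coding task alongside a pass/fail check of its answer, then compare mean confidence with actual accuracy per language. The `Trial` structure, the `overconfidence_by_language` helper, and the toy data are illustrative assumptions, not the study’s actual protocol.

```python
# Illustrative sketch only: quantifying a confidence/competence gap.
# None of this reproduces the cited study's methodology.

from dataclasses import dataclass
from statistics import mean

@dataclass
class Trial:
    language: str       # programming language of the coding task
    confidence: float   # model's self-reported confidence, 0.0-1.0
    correct: bool       # whether the answer passed verification (e.g. unit tests)

def overconfidence_by_language(trials: list[Trial]) -> dict[str, float]:
    """Mean self-reported confidence minus actual accuracy, per language.

    A positive value means the model claims more certainty than its
    answers warrant -- the Dunning-Kruger-style gap described above.
    """
    by_lang: dict[str, list[Trial]] = {}
    for t in trials:
        by_lang.setdefault(t.language, []).append(t)
    return {
        lang: mean(t.confidence for t in ts) - mean(float(t.correct) for t in ts)
        for lang, ts in by_lang.items()
    }

# Toy data standing in for real model transcripts: a well-represented
# language (Python) vs. a low-resource one (COBOL, say).
trials = [
    Trial("Python", 0.90, True), Trial("Python", 0.85, True),
    Trial("Python", 0.80, False),
    Trial("COBOL", 0.95, False), Trial("COBOL", 0.90, False),
    Trial("COBOL", 0.85, True),
]

for lang, gap in overconfidence_by_language(trials).items():
    print(f"{lang}: overconfidence gap = {gap:+.2f}")
```

A positive gap means the model claims more certainty than its accuracy supports; on this toy data the gap is largest for the low-resource language, mirroring the pattern the article describes.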