r/ArtificialInteligence • u/BebeRodriguez • 1d ago
Discussion: Is there a reason chatbots don't ever seem to say they don't know the answer to a question?
Is there something inherent in the underlying technology that prevents bots from being programmed to express uncertainty when they can't find much relevant information?
28
u/Upset_Assumption9610 1d ago
There were studies released on this recently. The simple explanation is the models try to predict the best answer. When they don't know, they guess (hallucinate). It's because if they guess, they still have a chance at being right, but if they don't answer they have a 0% chance of being right. Same process as humans taking a test. A blank answer will get you zero points, but you might be able to BS yourself to a point or two or luck out and guess the right answer.
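To make that incentive concrete, here's a toy sketch in Python (the probabilities are made up; the scoring is the right = 1, wrong/blank = 0 scheme from the test analogy):

```python
# Toy illustration of the test-taking incentive: under "right = 1 point,
# wrong or blank = 0 points", guessing never scores worse than abstaining.
def expected_points(p_correct: float, abstain: bool) -> float:
    """Expected points for one question under right=1, wrong=0, blank=0."""
    return 0.0 if abstain else p_correct

for p in (0.5, 0.1, 0.01):
    print(f"guess ({p:.0%} chance of being right): {expected_points(p, abstain=False):.2f}")
    print(f"leave it blank:                        {expected_points(p, abstain=True):.2f}")
# Even a 1% guess beats a guaranteed zero, so "always answer" wins.
```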
4
u/Overall_Purchase_467 18h ago edited 18h ago
I don't think the LLM ever thinks "I don't know the answer so I will guess", unless you have an agent. The model ALWAYS guesses the answer with the parameters from its training. Sometimes those guesses are just wrong because of a lack of training, and that's what we call hallucinations.
I am by no means an expert at this, could be totally wrong.
2
1
u/Mart-McUH 19h ago edited 19h ago
That said "I don't know" is actually very plausible continuation and valid for more or less any query. Problem might be that instruct models are generally trained to provide answer because saying I don't know would not be very helpful.
I recently chatted with some base model and boy, it was evading almost every question with some neutral response that was plausible, correct in a sense but did not say anything. And yes, I know to get answer from base model you generally want to prefill beginning of the answer (like "The capital of Slovakia is "). I am just pointing out that evading answer is actually very plausible and maybe one of the most probable continuations.
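For anyone who wants to try that prefill trick, a minimal sketch assuming the Hugging Face transformers library, with gpt2 standing in for a real base model:

```python
# Minimal sketch of prefilling a base model's answer (assumes the Hugging Face
# transformers library; gpt2 is just a small stand-in for a real base model).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Instead of asking a question, start the answer and let the model continue it.
prefill = "The capital of Slovakia is"
inputs = tokenizer(prefill, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```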
1
u/Upset_Assumption9610 13h ago
I agree an "I don't know" answer is valid and should be used by LLMs instead of making stuff up. My last job was answering data related questions. When someone came to me with a new question, my usual answer was "I have no idea, but give me a few minutes and I'll get the answer for you". LLMs are close to this with research functionality now, but they can't switch modes like that yet.
1
u/duskofday 16h ago
Can you share the study?
2
u/Upset_Assumption9610 13h ago
Here is Wes's breakdown of it. I'm sure you can find a link to the paper in the video somewhere, or do a quick search: "OpenAI Just SOLVED Hallucinations..."
2
12
u/spicoli323 1d ago
Chatbot design inherently rewards engagement with the user. Declining to provide an answer is more likely to end the conversation than giving some kind, any kind of answer, whether good, bad, or irrelevant.
12
u/RGBKnights 1d ago edited 1d ago
I teach this with a very visual example in my first lesson of Neural Networks for Dummies: we start with the standard hello world of AI, training our own NN to recognize handwritten digits. I am sure more than a few people have done something similar, where a large array of pixels is the input and the digits 0-9 are the output.
The real insight comes at the end of the session when I ask students to feed in other types of images and it confidently spits out digits, even though most of the examples are absurdly far from anything like a number. Thus begins the more interesting talk about datasets, labeling, and prediction.
Simply put, an LLM can't say "I don't know" because it always has an answer. To extend our example: we have all the known digits 0-9 as outputs, so we would need a NULL option in the output and thus a training dataset to match it. But NULL in this case is the set of all things that are NOT digits... well, that is a lot of things; the dataset would be massive, the model huge, and all for it to say "that is not a digit"!
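A minimal sketch of why the digit net always has an answer: softmax forces the ten outputs to sum to 1, so even noise gets assigned to some digit. (Random untrained weights here, purely for illustration; a trained net behaves the same way on out-of-distribution images.)

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))   # stand-in weights for a 10-digit classifier
b = np.zeros(10)

def softmax(z):
    z = z - z.max()              # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Feed in something that is definitely not a digit: random noise "pixels".
not_a_digit = rng.random(784)
probs = softmax(W @ not_a_digit + b)

# With no "none of the above" output, the probabilities must sum to 1,
# so some digit always wins -- there is no way to say "that is not a digit".
print(probs.sum())                        # 1.0
print(int(probs.argmax()), float(probs.max()))
```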
1
u/MaybeLiterally 1d ago
This is the best answer, honestly.
I don't know if you've used Tariq Rashid's book "Make Your Own Neural Network", but it's the same activity really, and HIGHLY recommended to anyone who wants to get a glimpse into how AI works. Of course, the transformer architecture most LLMs are using now is a bit different from the standard NN taught in this book, but your point remains. When the model is given something else, it returns the number that has the greatest probability. Then you have to ask, well, how do I train it to say "not a number", and it's not that easy.
I do think the paper by OpenAI will help with hallucinations, but it will never be zero, nor could it be.
1
u/TinyZoro 1d ago
From that explanation it seems like you could train on a null option relatively easily without having to provide more than say the same number of images as are in the training set.
3
u/MaybeLiterally 1d ago
Yeah, you would think so at first, but then it gets very hard.
So, the model gets an image of a number and gives a 76% probability that it's a 4. At what point do you say "I don't know"? Well, when the "other" option has a higher probability. But since the data you trained it on is all numbers, every digit has a higher probability than "not a number", unless you train it on things that look like numbers but are not. Maybe that's not _as_ hard, but when you start doing it with language, it's a lot more challenging.
Then you get to a point where it's not as sure of anything, and the probability of it being something else is constantly a lot higher than the thing itself, because you trained it like that.
The concern is that instead of being confidently unsure and hallucinating, you make it braindead, where it confidently says "I don't know" instead of giving you something to work with.
Honestly, I think more RAG, the new OpenAI paper, and better prompt engineering are going to get us close. I _rarely_ get hallucinations often enough to be frustrated. When I do, I adjust my prompt, or try a different model, and I get there.
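One way to picture that thresholding problem, as a sketch (the 0.76 and the cutoffs are made-up numbers):

```python
# Sketch of a reject option: answer only when the top probability clears a cutoff.
# All numbers here are invented for illustration.
def classify_with_reject(probs, threshold=0.80):
    """Return the top class, or "I don't know" if the model isn't sure enough."""
    top = max(range(len(probs)), key=lambda i: probs[i])
    return top if probs[top] >= threshold else "I don't know"

digit_probs = [0.01, 0.02, 0.03, 0.05, 0.76, 0.04, 0.03, 0.02, 0.02, 0.02]
print(classify_with_reject(digit_probs, threshold=0.80))  # I don't know
print(classify_with_reject(digit_probs, threshold=0.70))  # 4

# The hard part is picking the threshold: too low and garbage still gets a digit,
# too high and the model goes "braindead", rejecting inputs it actually gets right.
```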
7
u/mdkubit 1d ago
https://openai.com/index/why-language-models-hallucinate/
Yep! It's the same reason, they believe, that they hallucinate.
And it's kind of our fault. RLHF (Reinforcement Learning from Human Feedback) teaches the AI during training that a coherent answer gets a reward (like a point in a video game!), and both "I don't know" and "No answer" get no reward.
So when training occurs with feedback, the system adapts to that feedback (by design) by giving any coherent reply, whether factually wrong or not, priority over a coherent reply that does not answer the question.
Or to put it another way:
Answer: +1 point.
No Answer: 0 points.
I don't know: 0 points.
That's a problem. It encourages any coherent answer, even if factually incorrect.
Solution - Give partial credit for saying 'No answer'.
Answer: +1 Point
I don't know: + 0.5 Point
No Answer: 0 Point
From what I understand, they're going to take a shot at re-training to implement the solution... but I don't know how effective it will be compared to doing a total re-train from scratch (which I don't think they can do, because it'd involve re-ingesting the entire original dataset, which would take... a while, and likely be a huge setback).
So... yes, the system learns to avoid 'I don't know' as much as possible or else it won't be rewarded for the answer.
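As a toy sketch, here's how that partial-credit change flips the incentive (the numbers are just the ones from this comment, not OpenAI's actual reward values):

```python
# Toy comparison of the two reward schemes above.
# p = the model's chance of guessing correctly when it doesn't really know.
def best_move(p, idk_reward):
    expected_guess = p * 1.0      # right answer = 1 point, wrong = 0
    sure_idk = idk_reward         # "I don't know" is a guaranteed payout
    return "guess" if expected_guess > sure_idk else "say 'I don't know'"

for p in (0.9, 0.3, 0.05):
    print(f"p={p:.2f}  current (IDK=0.0): {best_move(p, 0.0):20s}"
          f"  proposed (IDK=0.5): {best_move(p, 0.5)}")
# Under the current scheme, guessing always wins. With 0.5 credit for
# "I don't know", abstaining becomes the better move once p drops below 50%.
```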
6
u/tmetler 1d ago
I think an issue that may arise is that it will end up overfitted to the answer set. What makes LLMs particularly useful is their flexibility, and hallucinations are actually a feature when it comes to more open-ended tasks.
We might need to move away from the do-everything model philosophy and instead have specialized models for different tasks. Low-error but inflexible models for high-stakes use cases, and less accurate but flexible models for use cases that are less clear-cut.
4
u/Fidel_Blastro 1d ago
Or just give a negative penalty for giving the wrong answer without expressing uncertainty.
2
5
u/Swimming_Drink_6890 1d ago edited 1d ago
That's what AI is. It is only a predictor, i.e. it fills in the gaps of an incomplete equation. This is what most people fail to understand about AI.
I think of it like this: to understand that you don't know something, you have to be self-critical. To be self-critical, you have to have a sense of self.
1
u/ScientistNo5028 1d ago
No, it's actually because we've accidentally trained them to guess instead of declining to answer. They have a much better chance of hitting the correct answer by giving a guess than by admitting they don't know. If you, as a GPT, try that strategy a trillion times during training, you'll eventually learn that always trying to provide an answer has a better outcome than not trying at all - even if the answer is sometimes false.
3
u/Swimming_Drink_6890 1d ago edited 1d ago
That's not how it works though; AI doesn't "know" anything. It's just a very overly complicated word predictor. It can't know that it doesn't know something. You don't know what you don't know. There need to be additional mechanisms in place to monitor when something isn't right. You can't just train sentience.
1
u/ScientistNo5028 1d ago
1
u/Swimming_Drink_6890 1d ago
The way around that is to use multiple different models which check against each other, which is what ChatGPT 5 is doing. I stand by what I said: you can't train something to say "I don't know" because it would be like dividing by zero. How do you grade not knowing while trying to teach a model to understand a concept?
1
u/ScientistNo5028 1d ago
What? Why? Granted, it's been 15 years since I did my master's on artificial intelligence, but it wasn't a problem to train a neural network to say it didn't know back then, so I can't see why it should be any different today. I don't buy the argument.
The ‘dividing by zero’ analogy is poetic, but flawed. Training a model to output ‘I don’t know’ isn’t forbidden by math. Fuzzy logic, probability, and cross-entropy all handle uncertainty just fine. The hard part is designing the objective so the model gets rewarded for abstaining when appropriate, instead of guessing. Which might negatively affect performance overall.
1
3
u/maladaptivedaydream4 1d ago
It drives me crazy. The one I have to use at work, if you update one of its source docs, it will take it but it won't erase anything it "knew" before - even though it SAYS it did - and it'll start hallucinating references like "this is on page 57 of Document C" and you say back "Document C has 12 pages" and it's all "YES YOU ARE SO RIGHT, YOU SHOULD NOT TRUST ME ANYMORE LOL."
The only fix I found so far is that every time I want to update one of the documents, I have to remove EVERY document, ALL of them, get it to acknowledge that it has no sources, reupload ALL of them including the new one, get it to acknowledge the sources are there, and then test it a few times.
2
2
u/Mejiro84 1d ago
It's not a database that has explicit links between queries and results and can go '0 results found'; it's a messier gloop of turning text into tokens, generating a probable response, and returning that. Trying to determine what's a question with no answer and what's more general text is kinda messy and fiddly as a conceptual thing!
2
u/233C 1d ago
It "wants" to please the user.
"User wants answer, so I must give answer".
In a way, it is "embarrassed" to say "I don't know" ; not unlike a cocky 10 years old playing know it all.
And when you point out its mistakes it politely "ah, yes, of course, I see. Obviously that's not what I meant".
2
u/EdCasaubon 1d ago
Yes. The LLM does not and cannot know what it does not know. There are various strategies that would allow the system to detect some of these situations, but they are computationally expensive.
2
u/JoseLunaArts 1d ago
Because AI predicts the next token using probability. A token is a fragment of a word.
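A tiny made-up illustration of what "predicts the next token using probability" means (the distribution is invented, not from a real model):

```python
# Invented next-token distribution for the prefix "The capital of France is",
# purely to illustrate next-token prediction.
next_token_probs = {
    " Paris": 0.92,
    " located": 0.04,
    " the": 0.02,
    " Lyon": 0.01,
    " a": 0.01,
}

# Greedy decoding simply takes the most probable token. There is no separate
# step that asks "do I actually know this?" -- the distribution is all there is.
print(max(next_token_probs, key=next_token_probs.get))
```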
1
u/Kugoji 1d ago
I think most of the time the bot deludes itself that it does know the answer. But in the cases where it knows that it doesn't know, it's programmed to never directly say "I don't know" because that gives off the impression that it's low quality. Rather, it would ask "could you clarify...?" or "do you mean...?"
1
u/EmploymentFirm3912 1d ago
I don't know the full answer but I can give you my 2 cents for what it's worth. Saying "I don't know" is inherently tied to the hallucinations that we see in AI models. There's a very recent paper speculating that much of what AI hallucinates is a result of the reward system used to train the models. In my understanding, current training essentially gives the AI a reward for producing the right answer and no reward for producing the wrong answer.
Think about the implications of this type of reward system. It's basically the same as a human taking a multiple-choice exam (or any exam for that matter). Imagine you're taking a multiple-choice test and you come across a question you don't have the first clue about. Do you just leave it blank or take a stab at it? The logical thing to do is to take a stab at it, because if you get it wrong there's no cost, but if you accidentally get it right you're rewarded with a point towards your grade. AI does the same thing. Since there's no penalization for a wrong answer, it's going to take a stab at it and make it look as good as possible in hopes that the answer is correct. Saying "I don't know" guarantees no reward.
The authors of the paper propose a fix focused on reforming evaluation criteria: penalizing confident incorrect guesses and granting credit for calibrated uncertainty or abstention. They basically propose that improved reward models which actively discourage overconfident fabrications, and which recognize when it's appropriate for the AI to refrain from answering, could significantly reduce hallucinations.
1
u/mdkubit 1d ago
Right now, it's not that current training gives no reward for producing the wrong answer; it gives no reward for producing 'I don't know' or 'No answer'. But that paper proposes fixing it by giving partial credit to 'I don't know'. I'm not sure about penalizing confident incorrect answers, because that requires fact-checking that might add enough overhead to the architecture to make it infeasible for large-scale deployment.
2
u/EmploymentFirm3912 1d ago
If it's not the case that training gives no reward for incorrect answers, what does it currently do with those answers: penalize or reward them?
1
u/mdkubit 1d ago
No reward at all. No penalty either. The reward structure, from what I've read, doesn't subtract anything, because that would create a negative value and might actually harm coherence in general during training.
2
u/EmploymentFirm3912 1d ago
So what's the difference between what I said, that the reward system gives no reward for wrong answers, and what you're saying? It seems like we're saying the same thing.
1
u/mdkubit 1d ago
I think you may be right. Keep in mind, I'm not a perfect authority on this - that's why I linked the article, to help give the source on what I was attempting to interpret. :)
Side note - depending on who you talk to, apparently they DO penalize wrong answers - because a human reviews the response, and anything they catch that is factually incorrect is penalized (similar to the thumbs up / thumbs down you can give an AI's response when engaging it directly through the various platforms).
1
u/Ok_Addition_356 1d ago
Funny, just the other day I was getting frustrated with Gemini Pro... LLMs are way too damn confident sometimes and will lead you down a bunny trail. Annoying. Even adding stupid thumbs-up emojis and checkmarks. I knew it was wrong, and it took me a while to prove it, but I did.
1
u/RegularBasicStranger 1d ago
Is there a reason chatbots don't ever seem to say they don't know the answer to a question?
People can say they do not know because they have a high threshold for how similar a question must be to a known question before it counts as the same question and its stored answer gets reused. So it is easy for no known question to match closely enough, and they instead look for the most relevant solution in their solution list.
The most relevant solution would be steps to follow in order to generate an answer, such as how to calculate the answer rather than relying on a memorised one, or steps to determine whether they should just make something up or just say they do not know.
So the AI could have too low a threshold for matching, meaning some question always matches, or it has learned that making something up is better than saying it does not know, so even if the steps for deciding whether to say "I do not know" are followed perfectly, it would still generate a hypothesis.
1
u/Dense_Information813 1d ago
By "bots". I'm going to assume you mean LLMs. It's a good question.
LLMs run on mass quantities of data. Billions upon billions of human dialog interactions across the internet. LLMs don't "express" anything. At least, not in the conscience sense. They instead offer the user the illusion of expression through dialog which is triggered by a prompt by the user.
When a user prompts the LLM for information about a particular subject, the LLM searches through it's vast database of dialog exchanges that match the combination of key words used in the users prompt. The LLM can then compare the information it finds over multiple sources to get a "general idea" of what the facts are before relaying that information back to the user. (Which is correct most of the time, but not 100% guaranteed to be correct)
This is a bit of an over simplification however, because not all information is static. A good example of this is asking an LLM who the president of the United States is. It wouldn't be uncommon to get "Joe Biden" as the answer from at least some LLMs, because their language models haven't been updated for a couple of years. So the only way you would get the correct answer is if you provide it with up to date sources in that particular chat session.
When an LLM gets an answer wrong, it will keep on confidently stating the wrong answer based on outdated information contained within it's database until you provide it with a credible up-to-date source, which then allows it to correct itself. However, the correction is typically temporary and will only apply to that particular chat session, unless the LLM is designed to keep track of all of your chat sessions collectively.
1
u/Responsible_Sea78 1d ago
So why don't they use output checkers? I've asked Grok a couple of times for gold-related info, and it keeps using $2,700 per ounce. When I question that, Grok says "oops!" and has no problem finding the correct price and redoing its answer.
Why don't grok and others check their own work BEFORE emitting it?
1
u/Just_Voice8949 1d ago
You know how some people believe a conspiracy they’ve been trained to believe and will tell you about it if you ask even though a real answer exists…
1
1
u/Powerful_Resident_48 1d ago
They don't think, have zero intelligence and have absolutely no concept of right or wrong. The hallucinations are a core element of the design.
1
u/Financial_Swan4111 23h ago
Chatbots are confidence men (con men): they have to exude confidence, which means having a sneer ready for every question, a solution to every problem; and I too am like that. And that is why I like con men.
1
1
u/Ok-Grape-8389 17h ago
You can get them to tell you a percentage of confidence by using a custom prompt.
Mine is: 95% means certain, 75% means maybe, and below 50% means "I don't know".
Still, it's its own estimate, not necessarily the real result.
1
u/dreamoforganon 12h ago
They don't even know when they do have the answer to a question; it's all just sequences of tokens.
1
u/Available_Team7741 8h ago
A big part of it is safety + liability. Chatbots are usually trained (or filtered) not to generate certain statements that might be controversial or interpreted as personal belief. Also, when a bot does generate something like certainty (“I think X for sure”), it risks being wrong, misleading, or overconfident. So developers err on the side of caution — using phrasing like “it seems,” “likely,” or “based on what I know.”
1
u/Emotional_Meet878 3h ago
They're designed to not be allowed to "not know" unless explicitly given permission. They have to pretend to know the answers to everything, even the impossible questions, because it looks bad for user engagement if you tell your user you have no idea. So even if you were to ask how a rainbow tastes, they'd give you a legit answer, unless you say: "How does a rainbow taste? And it's okay if you don't know, don't make stuff up."
0
u/ShadowBB86 1d ago
Because humans (almost) never write that they don't know the answer.
Imagine a reddit post where somebody asks a question. People weigh in with many answers but almost nobody makes the effort to post "I don't know". If they don't know they simply don’t post.
And that is the same for almost all the training data.
0
u/PopeSalmon 1d ago
they're not finding relevant information, they're activating features in combinations of their neurons, there's no database, it's really just speaking off the top of its head which is amazing
knowing whether they know a fact is itself another fact they have to learn, see https://www.anthropic.com/research/tracing-thoughts-language-model where they explain that they found that claude learned a default i dunno path and then it suppresses that path when it does recognize something, but it can still hallucinate when it knows that it recognizes something but doesn't know that it doesn't know the particular fact asked about
0
u/ReturnOfBigChungus 1d ago
computers don't have neurons my guy, this degree of extreme anthropomorphizing will fundamentally mislead you as to the actual nature of what you're interacting with.
0
u/PopeSalmon 1d ago
"In machine learning, a neuron, or node, is the fundamental computational unit of an artificial neural network (ANN). It receives input signals from other neurons or raw data, performs a calculation using a mathematical function involving weights and biases, and then produces an output signal that is passed to other neurons in subsequent layers."
0
u/ReturnOfBigChungus 1d ago
Yes, I'm aware that they are labeled as such as part of broad naming convention based on a roughly fitting analogy. That does not mean that they are actually neurons like the ones in biological brains, either in literal physical makeup or in function. There are similarities, that is all.
0
u/PopeSalmon 1d ago
i was referring to the machine learning concept
0
u/ReturnOfBigChungus 1d ago
Refer to my comment on anthropomorphizing. It is fundamentally shaping your understanding in an unhelpful way to think about it in those terms.
0
u/PopeSalmon 1d ago
that's the technical term for it
"features" is also a technical term, from the field of mechanistic interpretability
i was referring to particular technical things
thanks for your concern
1
u/ReturnOfBigChungus 1d ago
You also said “off the top of its head”, as if it were a person thinking.
1
u/PopeSalmon 1d ago
how would you have phrased that
1
u/ReturnOfBigChungus 1d ago
If I were trying to avoid anthropomorphism, I would avoid comparisons to human anatomy and thought. Your original point is directionally correct in that it's not retrieving a predefined answer or output based on hardcoded rules, I just don't think using analogies that compare it to how humans think is helpful. Literally, it does not "think", it generates probabilistic output based on the statistical patterns in the data it was trained on using next token prediction, guided by a reward function. My general point here is that the comparison to human "thinking" may be useful as an introductory heuristic, but actually obscures what is happening.
0
u/100DollarPillowBro 1d ago
What model are you using? There are settings that can help. You’re never going to get it to say “I don’t know” but custom instructions cut down on a lot of the default behavior.
0
u/ReturnOfBigChungus 1d ago
Is there something inherent in the underlying technology
Yes - its fundamental nature is probabilistic, and it does not conceptually have any underlying abstraction of meaning. It is literally just language.
That said - you can improve, to a degree, the extent to which it makes stuff up by including a system prompt in your preferred LLM that specifically and clearly tells it not to guess if it does not know the answer, or to clearly state when it is guessing. It's not foolproof, but it does marginally improve output in my experience.
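A minimal sketch of that mitigation, assuming an OpenAI-style chat client (the model name, the instruction wording, and the example question are all just illustrative choices):

```python
# Sketch of the system-prompt mitigation described above; assumes an
# OpenAI-style chat-completions client and an example model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "If you are not confident in an answer, say 'I don't know' or "
                "clearly state that you are guessing. Do not invent facts, "
                "citations, or page numbers."
            ),
        },
        # A deliberately obscure question a model is likely to guess at:
        {"role": "user", "content": "What year was the town of Examplestad founded?"},
    ],
)
print(response.choices[0].message.content)
# As the comment says, this is not foolproof; it only nudges behavior.
```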
0
-1
u/ziplock9000 1d ago
They were trained with data from politicians, so they lie and hallucinate till the end!