Drainpipe Knowledge Base
What is a Instruction Inconsistency AI Hallucination?
An Instruction Inconsistency AI Hallucination occurs when an AI model’s output either:
- Fails to adhere to explicit instructions, constraints, or formatting rules provided in the prompt.
- Directly contradicts foundational information given to it within the same prompt or conversation.
This type of hallucination isn’t about the AI’s general knowledge being wrong; it’s about its failure to correctly use the specific “rules of the game” or the factual groundwork. Instruction Inconsistencies are one type of AI Hallucination.
- Chance of Occurrence: Common (especially with complex or multi-step prompts).
- Consequences: Frustration for users, incorrect task execution, inefficient workflows, and AI acting contrary to its intended purpose (e.g., answering a question instead of translating it).
- Mitigation Steps: Unambiguous prompt phrasing; use delimiters to separate instructions from content; few-shot prompting with examples; adversarial testing to find and fix instruction-following failures.