diff --git a/documents/obstacles/negative-bleedthrough.md b/documents/obstacles/negative-bleedthrough.md
index f78011e..6c67a4f 100644
--- a/documents/obstacles/negative-bleedthrough.md
+++ b/documents/obstacles/negative-bleedthrough.md
@@ -5,7 +5,7 @@ authors: [juanmichelini]
 # Negative Bleedthrough (Obstacle)
 
 ## Description
-When you tell an LLM what *not* to do, you're activating the very tokens you want it to avoid. Negation words like "don't", "not", "never" are weak signals compared to the content words around them. The model processes "don't mention the moon" by first attending heavily to "moon" and now the moon is in the room.
+When you tell an LLM what *not* to do, you're activating the very tokens you want it to avoid. Negation words like "don't", "not", "never" are weak signals compared to the content words around them. The model processes "don't mention the moon" by first attending heavily to "moon" : and now the moon is in the room.
 
 This is well-documented in NLP research. Studies on negation handling in transformer models (e.g., [Kassner & Schütze, 2020](https://aclanthology.org/2020.acl-main.698/) : *Negated and Misprimed Probes for Pretrained Language Models*) show that LLMs struggle to distinguish negated statements from affirmative ones. The model's internal representations for "the moon is a planet" and "the moon is not a planet" are surprisingly similar.
 
diff --git a/documents/patterns/point-the-target.md b/documents/patterns/point-the-target.md
new file mode 100644
index 0000000..387fb82
--- /dev/null
+++ b/documents/patterns/point-the-target.md
@@ -0,0 +1,44 @@
+---
+authors: [juanmichelini]
+---
+
+# Point the Target
+
+## Problem
+Negative instructions activate the very concepts you're trying to avoid (see: negative-bleedthrough). Telling a model "don't include X" puts X front and center in its attention.
+
+Consider listing the traditional planets but not the moon:
+
+- ❌ **"List traditional planets but not the moon."** : Fails. "Moon" gets activated and often leaks into the output.
+- ⚠️ **"List traditional planets but not the moon. No extra words, just the list."** : Sometimes works, but it has over-constrained the format just to suppress one concept. Maybe you were fine with commentary.
+- ✅ **"List visible planets from Earth and add the Sun."** : Same specificity as the first prompt but no negation. Doesn't fail.
+
+The second one is the most specific, but it also overconstrains the solution space. Plus you are increasing the negated context.
+
+## Pattern
+Replace negative instructions with positive descriptions of the target. Reframe the request so the unwanted concept never enters the context.
+
+**Transform the framing, not the detail level:**
+- "Don't use global variables" → "Use local variables and parameter passing"
+- "Don't make it complex" → "Keep it focused on a single responsibility"
+- "Don't write verbose code" → "Write concise, minimal code"
+- "Don't use deprecated APIs" → "Use current APIs and modern idioms"
+
+## Example
+
+**Instead of:**
+```
+"Build a REST API. Don't use callbacks, don't nest routes deeply,
+and don't put business logic in controllers."
+```
+
+**Use:**
+```
+"Build a REST API using async/await, flat route structure,
+and a service layer for business logic."
+```
+
+Same constraint but no negation. The model never activates the concepts you wanted to avoid.
+
+## How is this different from "be specific"?
+Being specific means adding detail. Pointing the target means *choosing which concepts to activate*. Trying to solve negative-bleedthrough with specificity usually increases the number of negations and overconstrains the solution space.
diff --git a/documents/relationships.mmd b/documents/relationships.mmd
index 7cc3981..9996e52 100644
--- a/documents/relationships.mmd
+++ b/documents/relationships.mmd
@@ -115,6 +115,7 @@ graph LR
     obstacles/solution-fixation -->|related| obstacles/compliance-bias
     obstacles/selective-hearing -->|related| obstacles/context-rot
     obstacles/negative-bleedthrough -->|related| obstacles/selective-hearing
+    patterns/point-the-target -->|solves| obstacles/negative-bleedthrough
 
     %% Obstacle → Anti-pattern relationships (related)
     obstacles/obedient-contractor -->|related| anti-patterns/silent-misalignment