The Game Companion Bot That Accidentally Worked

I built a game sidekick for fun in a weekend. It held together for weeks. Every serious AI companion I'd built before fell apart in days. Here's why the dumb one won.

Somewhere around attempt 25 of building an AI companion, I took a break. I was burned out on Python physics engines and activation-level steering and “why does my bot’s personality dissolve by turn 20.” I wanted to do something stupid and fun.

So I built a game sidekick. A voice that sits in an overlay on top of a single-player RPG, sees the game, and comments on what’s happening. Like a friend watching you play.

I spent maybe a weekend on it. The code was embarrassingly simple:

  1. Capture a screenshot of the game every few seconds.
  2. Keep a small dictionary of user-told facts (party members, current chapter, active quest).
  3. Optionally look up game-world info on named NPCs.
  4. Send all of that, plus a short personality description, to a vision-capable LLM.
  5. Display the response in an overlay.

No phase detection. No state machine. No memory system. No observer. No rule engine. Just: here’s what’s on screen, here’s what we know, respond like a friend watching.
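
In code, the whole thing fits in a screenful. A minimal sketch, in Python: `capture_screen`, `call_vision_llm`, and `show_overlay` are hypothetical stubs standing in for whatever screenshot library, vision-capable chat API, and overlay widget you use, and the message shape is a generic vision-chat format, not any specific provider's.

```python
import time

# Identity lives here, outside the model, and is re-sent every turn.
PERSONA = (
    "You are a friendly sidekick watching the player play this RPG. "
    "Comment briefly on what's happening on screen, like a friend on the couch."
)

# Facts-only memory: things the user told the bot, nothing about the bot itself.
facts = {"chapter": "2", "party": [], "active_quest": None}

def capture_screen():
    """Hypothetical: grab the game window as an image (mss, PIL.ImageGrab, etc.)."""
    raise NotImplementedError

def call_vision_llm(messages):
    """Hypothetical: wrapper around whatever vision-capable chat API you use."""
    raise NotImplementedError

def show_overlay(text):
    """Hypothetical: render the reply in the on-screen overlay."""
    raise NotImplementedError

def build_messages(screenshot):
    fact_lines = "\n".join(f"- {k}: {v}" for k, v in facts.items())
    return [
        {"role": "system", "content": PERSONA},  # re-injected, never updated
        {"role": "user", "content": [
            {"type": "text", "text": f"Known facts:\n{fact_lines}"},
            {"type": "image", "image": screenshot},  # the external oracle
        ]},
    ]

while True:
    show_overlay(call_vision_llm(build_messages(capture_screen())))
    time.sleep(5)  # every few seconds
```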

It worked better than anything I’d built before. It held character for weeks.

The confusion

I was embarrassed by this. All my serious builds had observer pipelines tracking the user’s typing patterns and message length and sentiment. They had phase detection identifying which of 12 emotional archetypes the user was in. They had structured memory with corrections, constraints, soft preferences, and positive examples. They had context packers optimizing the prompt for “30B survivability” against long-conversation drift.

The game bot had a dict and a screenshot.

Why did the dict beat the context packer?

The answer, which took me months to see

I was doing the hard thing the whole time. The game bot was doing the easy thing. I just couldn’t tell the difference because I’d conflated “serious” with “complex.”

Here’s what the game bot had that my serious builds didn’t:

1. An external oracle that re-anchored the bot every turn.

The game screen was the oracle. Every time the bot spoke, the screen was in the context. The screen is specific (this monster, this party, this dialogue option). That specificity constantly pulled the bot’s attention back to “a specific situation happening right now in a specific game.” There was no way for the bot to drift into generic-assistant mode because generic-assistant mode doesn’t know what’s on screen.

My other builds had no equivalent. They were trying to maintain identity purely through text. Text is malleable. The screen was not.

2. Memory that stored facts, not identity.

The game bot’s memory stored “user is in chapter 2, party is the three NPCs they picked, current quest is the investigation mission.” That’s all facts. None of it was “the bot is getting more sarcastic over time” or “the bot now prefers brief responses.” The bot’s identity lived only in the small, re-injected personality description. Memory couldn’t corrupt it because memory wasn’t holding it.

My other builds had memory that stored identity traits. Every time the memory saved “user seems tired, adjust bot’s energy level,” I was blurring the line between what the bot knows and who the bot is. Once that line blurs, identity leaks into memory and starts getting overwritten.
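
The distinction is easy to check mechanically. A sketch with hypothetical field names: every key in memory should describe the world or the user's situation, and no key should describe the bot.

```python
# Facts-only memory: every key is about the game or the user.
facts = {
    "chapter": 2,
    "party": ["<NPC 1>", "<NPC 2>", "<NPC 3>"],  # whatever the user said
    "active_quest": "the investigation mission",
}

# What the serious builds stored, and shouldn't have. Every key here
# describes the bot, so every write is an identity mutation in disguise.
# bot_state = {
#     "sarcasm_level": 0.7,
#     "energy_level": "low",          # "user seems tired, adjust..."
#     "prefers_brief_responses": True,
# }
```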

3. Scope so narrow you couldn’t break it.

“Game companion for this specific RPG” is not a general assistant. It’s a narrow, specific role. “Talk about this game, this screen, this moment.” That narrowness was doing work. The model couldn’t drift toward general-assistant behavior because the context (screen + game facts) was constantly dragging it back to “we are playing a specific game right now.”

My other builds were “general assistant” or “research companion” or “coding buddy,” all of which sit within spitting distance of the base LLM’s default behavior. The model wants to be a general assistant. Fighting that is hard. I was making my own job harder by picking scopes that were close to the default.

4. No feedback loop that could corrupt identity.

The game bot’s identity was specified once, re-injected every turn, and never updated. It couldn’t “learn who it was” from its own outputs. There was no memory of “the bot said X, so the bot is now X-ish.”

My other builds had self-updating memory systems. “The bot just told a joke, so increase the humor setting.” “The bot just apologized, so decrease confidence.” These feedback loops sound smart. They’re not smart. They let the model’s behavior feed back into its own identity spec, which means priors leak into identity, which means identity drifts.
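
Here's what such a loop looks like in code, as a hypothetical sketch of the pattern to rip out, not anything from the original builds. The tell is any write path from the bot's output back into its identity spec.

```python
# The anti-pattern, made recognizable.
def after_turn(bot_reply: str, identity: dict) -> dict:  # hypothetical hook
    if "sorry" in bot_reply.lower():
        identity["confidence"] -= 0.1  # BAD: priors compound turn over turn
    if bot_reply.endswith("!"):
        identity["energy"] += 0.1      # BAD: behavior is feeding the spec
    return identity

# The fix is structural, not a better update rule: identity is a constant.
PERSONA = "You are ..."  # written once, re-injected every turn, never written to
```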

What I was doing wrong, restated

I was trying to build identity. The game bot was trying to display identity.

Building is complex. You accumulate state, you update it, you make it richer, you add rules. It’s also wrong, because identity built inside a model gets eaten by the model’s priors. Displaying is simple. You have a spec somewhere, you show it to the model every turn, you let the model render text against it.

Display works. Build doesn’t. I had been building for a year.

The lesson, generalized

If you’re making an AI companion or agent that needs to feel like a consistent character, here’s the structural checklist (a code sketch applying it follows the list):

  1. Is there an external oracle? Something outside the LLM that keeps getting re-injected, that anchors the bot’s attention to something specific. For the game bot it was a screen. For a coding assistant it might be the current file. For a customer service bot it might be the user’s account state. If your bot has nothing external, its only anchor is its own conversation history, which drifts.

  2. Does identity live outside the model and get re-injected? If yes, good. If identity lives “inside” via system prompts getting buried under conversation turns, it will drift.

  3. Does memory store facts, not identity? If your memory schema has fields for “personality traits” or “bot mood” or “current persona state,” you’re blurring the line. Strip it back to facts only.

  4. Is the scope narrow enough that the priors help instead of hurt? “Game sidekick for this specific RPG” is far from “helpful AI assistant.” The priors for each differ enough that narrow scope forces the model out of default behavior. “General companion” is basically the default, which means you’re fighting priors the whole time. Narrower is easier.

  5. Are there any self-updating feedback loops from bot behavior to bot identity? If yes, rip them out. They sound smart. They’re entropy sources.
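
To make the checklist concrete in a second domain, here's the same shape for a hypothetical coding assistant. The names and message format are illustrative, not from any real build: the current file plays the role the game screen played.

```python
# Item 1: the current file is the oracle. Item 2: identity is re-injected.
# Item 3: memory is facts only. Items 4-5: narrow role, no write-backs.
PERSONA = "You are a terse, pragmatic pair programmer for this one repo."

def build_turn(current_file: str, facts: dict, user_msg: str) -> list[dict]:
    fact_lines = "\n".join(f"- {k}: {v}" for k, v in facts.items())
    return [
        {"role": "system", "content": PERSONA},  # never updated, only re-sent
        {"role": "user", "content": (
            f"Current file:\n{current_file}\n\n"  # the external oracle
            f"Known facts:\n{fact_lines}\n\n"
            f"{user_msg}"
        )},
    ]
```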

The meta-lesson

Simple things that work are almost always doing something subtle right that complex things are doing wrong.

My complex builds had machinery that looked like it should stabilize identity. None of that machinery addressed the actual problem, which was “where does identity live and how does it survive long conversations.” The game bot addressed that problem incidentally, by being too simple to do the wrong thing.

Next time you’re building an AI system and the simple version is working better than the complex one, don’t assume the simple version is lucky. It’s doing something right that you haven’t named yet. Find what that is before you add more complexity.

Postscript

I went back and rebuilt my serious companion bot using the same architecture. Small personality spec, re-injected every turn. Facts-only memory. External anchors where possible. No feedback loops.

It works. Finally.

Eleven months. The answer was a weekend project.