An AI Agent Left Running Overnight Burned Through $340 Doing Absolutely Nothing.

The 847-API-call disaster that happens when your agent has no memory — and the architecture fix that actually stops it.

By Cortyxia

A support agent left running overnight made 847 API calls, consumed $340 worth of tokens, and accomplished exactly nothing. The dashboard showed a single user query: "What's the status of my refund?" The agent had spent eight hours calling every service in the stack, looping between order APIs, shipping endpoints, and inventory databases like a confused customer service rep with no notepad.

This is not a bug in the agent. It is architecture without memory. Every team building agents right now is one bad prompt away from the same bill.

Why Agents Loop

The agent loop is not a coding mistake. It's a memory architecture failure. Here's the chain:

  1. User asks a question. The agent decides it needs order data.
  2. Agent calls order API. The response includes a shipping reference.
  3. Agent calls shipping API. The response includes inventory data.
  4. Agent calls inventory API. The response includes a price.
  5. Agent calls pricing API. The response includes a customer segment.
  6. Agent calls customer API. The response references the original order.
  7. Agent calls order API again. It has no memory of step 2.

Without persistent state, every tool result is a fresh input. The agent cannot distinguish between "I already checked this" and "This is new information I should act on." The loop is inevitable.
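The seven steps above can be reproduced with a toy simulation. This is a minimal sketch, not the agent from the incident: the `NEXT_REFERENCE` table and `run_stateless_agent` function are hypothetical names invented here, and the `max_calls` cap exists only so the demo terminates.

```python
# Hypothetical tool graph taken from the steps above: each response mentions
# the next service, and the customer API points back at the order API.
NEXT_REFERENCE = {
    "order": "shipping",
    "shipping": "inventory",
    "inventory": "pricing",
    "pricing": "customer",
    "customer": "order",
}


def run_stateless_agent(start: str, max_calls: int) -> list[str]:
    """Follow every reference in a tool response, keeping no record of past calls."""
    calls = []
    tool = start
    # The cap is a property of this demo only; the stateless agent has no stop condition.
    while len(calls) < max_calls:
        calls.append(tool)
        tool = NEXT_REFERENCE[tool]  # every response is treated as new information
    return calls


calls = run_stateless_agent("order", max_calls=12)
print(calls[:6])             # the sixth call is "order" again
print(calls.count("order"))  # 3
```

Nothing in the loop body looks wrong in isolation. Each call is a reasonable reaction to the previous response, which is why the failure does not surface as an error.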

Why Prompting Doesn't Fix It

The standard response is "just prompt it better." Add instructions like "Remember what you've already done" or "Don't call the same API twice." This works in demos and fails in production for three reasons:

  • Context windows forget too. Even with perfect instructions, the conversation history is finite. After 20-30 tool calls, earlier actions can fall out of the context window entirely, taking the record of what was already checked with them.
  • Models hallucinate memories. Ask an LLM to "remember what it did" and it may confidently recall actions that never happened — or forget actions that did. Memory via prompting is fiction.
  • Determinism matters. "Don't call the same API twice" is ambiguous. Same endpoint? Same parameters? Same result? Production requires structured event logs, not semantic approximations.
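One way to remove the ambiguity in "the same API call" is to define identity explicitly. The sketch below assumes a call is identified by its tool name plus its inputs, serialized canonically and hashed. The function name `call_key` and the parameter names are illustrative, not part of any documented Cortyxia API.

```python
import hashlib
import json


def call_key(tool: str, params: dict) -> str:
    """Deterministic identity for a tool call: tool name plus canonical JSON of its inputs."""
    canonical = json.dumps(
        {"tool": tool, "params": params},
        sort_keys=True,          # key order must not change the identity
        separators=(",", ":"),   # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = call_key("order_api", {"order_id": "12345", "fields": ["status"]})
b = call_key("order_api", {"fields": ["status"], "order_id": "12345"})  # same inputs, different key order
c = call_key("order_api", {"order_id": "67890", "fields": ["status"]})  # same endpoint, different order

print(a == b)  # True
print(a == c)  # False
```

Under this definition, "same endpoint" alone is not a duplicate, while "same endpoint and same parameters" is. A model following a natural-language instruction has no way to make that distinction reliably.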

The Architecture That Actually Works

The fix is not more rules. It's structured memory.

Cortyxia stores every tool execution as a structured event with a unique hash of the inputs. Before the agent executes any new tool call, it queries its memory layer for functionally identical previous calls. If found, it returns the cached result. If not, it executes, stores, and continues.

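This is not conventional caching, which is time-based and brittle: entries expire on a timer whether or not the underlying question has changed. This is semantic deduplication: the agent recognizes that "check order #12345" is the same operation regardless of when or why it was triggered, because the event is identified by its inputs rather than by its timestamp. The loop never starts because the second iteration sees the first in memory and exits immediately.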
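The check-before-execute flow described above might look like the following. This is a minimal sketch under stated assumptions, not Cortyxia's implementation: `ToolMemory`, its `call` method, and `fake_order_api` are hypothetical, and the event log lives in an in-process dict where a production system would need durable storage.

```python
import hashlib
import json
from typing import Any, Callable


class ToolMemory:
    """Hypothetical event log keyed by a hash of each tool call's inputs."""

    def __init__(self) -> None:
        self.events: dict[str, dict[str, Any]] = {}

    @staticmethod
    def _key(tool: str, params: dict) -> str:
        canonical = json.dumps({"tool": tool, "params": params}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def call(self, tool: str, params: dict, execute: Callable[[dict], Any]) -> Any:
        key = self._key(tool, params)
        event = self.events.get(key)
        if event is not None:
            event["hits"] += 1      # record the repeat instead of re-executing
            return event["result"]
        result = execute(params)    # first time this exact call has been seen
        self.events[key] = {"tool": tool, "params": params, "result": result, "hits": 0}
        return result


executed = []  # tracks how many times the real tool actually ran


def fake_order_api(params: dict) -> dict:
    executed.append(params["order_id"])
    return {"order_id": params["order_id"], "refund_status": "processing"}


memory = ToolMemory()
first = memory.call("order_api", {"order_id": "12345"}, fake_order_api)
second = memory.call("order_api", {"order_id": "12345"}, fake_order_api)
print(len(executed))  # 1
```

The second call returns the stored result without touching the tool, so a cycle that revisits the order API has nothing new to react to. The `hits` counter is an addition for illustration: a rising count on one key is a cheap signal that the agent is circling.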

In the documented test, adding memory architecture reduced the 847-call loop to 3 calls. Total cost: $0.04. The refund query was answered in 400ms instead of 8 hours.
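For scale, the figures quoted above work out as follows. The snippet only does arithmetic on the numbers already stated in this article; the variable names are illustrative.

```python
calls_before, calls_after = 847, 3
cost_before, cost_after = 340.00, 0.04          # US dollars
time_before_ms, time_after_ms = 8 * 60 * 60 * 1000, 400

call_reduction = 1 - calls_after / calls_before
cost_ratio = cost_before / cost_after
time_ratio = time_before_ms / time_after_ms
per_call_before = cost_before / calls_before
per_call_after = cost_after / calls_after

print(f"Calls avoided: {call_reduction:.2%}")                  # 99.65%
print(f"Cost ratio: about {cost_ratio:,.0f}x")                 # about 8,500x
print(f"Latency ratio: about {time_ratio:,.0f}x")              # about 72,000x
print(f"Cost per call in the loop: ${per_call_before:.2f}")    # $0.40
print(f"Cost per call with memory: ${per_call_after:.3f}")     # $0.013
```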

Key Takeaways

  • AI agent loops are not bugs — they're memory architecture failures.
  • Without persistent state, agents cannot recognize they're repeating actions.
  • Better prompting fails because context windows forget and models hallucinate memories.
  • A single unresolved agent loop made 847 API calls and cost $340 overnight.
  • Structured memory with semantic deduplication eliminates loops at the source.

Agent Loops & Memory — Frequently Asked Questions

Why do AI agents get stuck in loops?
Agents without persistent memory cannot track what actions they've already taken. When tool calls trigger each other in a cycle, the agent has no record of previous calls and repeats them indefinitely.

How does persistent memory stop the loop?
Persistent memory stores every action as a structured event. Before executing a new tool call, the agent checks for identical previous actions and returns cached results instead of re-executing.

Can better prompting fix agent loops?
Prompting helps but is unreliable. Context windows have limited capacity and models may hallucinate false memories. Structured persistent storage is the only robust production solution.

How much can an agent loop cost?
In the documented test, a single unresolved loop consumed $340 and 847 API calls overnight. At production scale, this becomes a major line item that arrives with no error messages, only silently escalating costs.

The Bottom Line

Building agents without memory is like hiring a consultant with no notepad. They'll ask the same questions, call the same APIs, and charge you for every repetition. The $340 overnight bill was not an edge case — it was the predictable result of architecture that discards state after every turn. Cortyxia gives your agents a memory. The loops stop. The bills drop. And your agent finally starts acting like it knows what it's doing.
