Sertan Helvacı / 8 min read

A Real History Test: 30x-50x Lower Context Cost

In one history-heavy task, Pathrule replaced broad shell search output with a focused context response that pointed to the relevant prior work.

Pathrule
Pathrule routes scoped team knowledge into AI coding sessions.

What this covers

  • A real test, provided by a user, compared broad local shell read/search output with a Pathrule history response.
  • Without Pathrule, the task used 8 shell read/search outputs and produced roughly 35K-45K tool-output tokens seen by the assistant, out of about 71K of raw output.
  • With Pathrule, one pathrule_get_context response produced roughly 900-1.2K tokens.
  • The observed ratio was about 30x-50x lower context or tool-output token cost for this history-heavy query.
  • The article treats the result as a real case study, not a universal benchmark.

Key metrics

  • Without Pathrule: 35K-45K approximate seen tool-output tokens
  • Raw local output: ~71K from broad shell search and read output
  • With Pathrule: 900-1.2K tokens in one pathrule_get_context response
  • Observed ratio: 30x-50x lower context or tool-output token cost in this test

Before and after

| Area       | Without Pathrule                         | With Pathrule                                                  |
|------------|------------------------------------------|----------------------------------------------------------------|
| Method     | 8 shell read/search outputs              | 1 pathrule_get_context output                                  |
| Route      | Broad local rg and file reads            | Cloud history narrowed the task to prior C5 billing work       |
| Evidence   | Large billing plan matches in raw output | Two activity evidence items plus relevant spec and plan paths  |
| Cost shape | Very expensive context payload           | Small focused context response                                 |

The test was a history question

The task was the kind of question that makes AI assistants expensive: what happened before, where is the relevant plan, and which prior work should shape the next answer?

Without a context layer, the assistant has a reasonable default move. It searches locally. It reads files. It runs broad matches. It tries to reconstruct history from whatever the repo returns.

That can work, but it can produce a huge tool-output payload before the assistant even reaches the useful part of the answer.

The broad-search path was expensive

In the non-Pathrule run, the assistant used 8 shell read/search outputs.

The approximate seen tool-output token count landed around 35K-45K, with raw output around 71K. The heavy part was not the final reasoning. It was the broad local search output, including large matches from a billing plan file.
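One rough way to see where a payload like this comes from is to tally the size of each tool output. The sketch below is not how the figures in this test were measured. It assumes a crude 4-characters-per-token heuristic, and the `ToolOutput` type, the labels, and the sample strings are hypothetical placeholders rather than data from the run.

```python
from dataclasses import dataclass

# Assumption: about 4 characters per token. Real tokenizers vary by model and content.
CHARS_PER_TOKEN = 4


@dataclass
class ToolOutput:
    label: str  # hypothetical name for one shell read/search call
    text: str   # the output text returned to the assistant


def estimate_tokens(text: str) -> int:
    """Estimate a token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def tally(outputs: list[ToolOutput]) -> tuple[list[tuple[str, int]], int]:
    """Return per-output token estimates and their total."""
    per_output = [(o.label, estimate_tokens(o.text)) for o in outputs]
    total = sum(count for _, count in per_output)
    return per_output, total


# Placeholder data, not taken from the test described in this article.
sample = [
    ToolOutput("search-1", "x" * 4000),
    ToolOutput("read-1", "y" * 10),
]
per_output, total = tally(sample)
print(per_output, total)
```

A per-output breakdown like this makes it easy to spot a single large match dominating the payload.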

This is the hidden cost of rediscovery. The assistant is not doing something irrational. It is trying to find evidence without knowing where prior work lives.

The Pathrule path was narrower

In the Pathrule run, one pathrule_get_context response produced roughly 900-1.2K tokens.

The important difference was not that Pathrule summarized the same huge output more aggressively. It was that Pathrule routed the question to the relevant prior work.

For this test, cloud history already narrowed the task to C5 billing work with two activity evidence items. The response could point to the relevant spec and plan paths and summarize the decision context directly.

That produced a 30x-50x gap

Setting 35K-45K seen tool-output tokens against 900-1.2K tokens gives a rough ratio of 30x-50x lower context or tool-output token cost with Pathrule for this history-heavy query.
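The arithmetic behind that range can be checked in a few lines of Python. This is only a sketch using the approximate figures reported in this article; the function name `ratio_range` is made up for illustration.

```python
def ratio_range(broad_low: float, broad_high: float,
                focused_low: float, focused_high: float) -> tuple[float, float]:
    """Return the (smallest, largest) ratio of broad to focused token counts."""
    smallest = broad_low / focused_high   # least broad output vs. largest focused response
    largest = broad_high / focused_low    # most broad output vs. smallest focused response
    return smallest, largest


# Approximate figures from the test: 35K-45K seen tokens vs. 900-1.2K tokens.
low, high = ratio_range(35_000, 45_000, 900, 1_200)
print(f"{low:.0f}x to {high:.0f}x")  # prints "29x to 50x"
```

The low end pairs the smallest broad-search estimate with the largest Pathrule estimate, and the high end does the reverse, which is why the range is as wide as it is.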

This should not be read as a universal benchmark. It is one real test, on one kind of task, with one clear failure mode: broad local search was forced to discover prior work that Pathrule already knew how to route.

That is still exactly the kind of task Pathrule is built for. When the question is about project history, the assistant should not have to rediscover the history from raw files if the team already has structured activity evidence.

Why this matters beyond tokens

The token difference is easy to notice, but the workflow difference is more important.

A broad search produces many opportunities for distraction. The assistant may over-weight a large file because it appeared in output. It may spend reasoning budget sorting irrelevant matches. It may answer with confidence after seeing volume instead of relevance.

A focused history response changes the review surface. The team can ask whether the surfaced prior work is the right prior work, not whether the assistant searched enough random places.

This is the test that matters

This kind of test is more useful than a polished demo. It uses real workflow pressure: prior work, billing context, local files, activity evidence, and a measurable difference in tool-output cost.

Pathrule does not need to store your repository source code to make this work. It stores and routes the team knowledge and activity context your team chooses to capture.

The messy, history-heavy tasks are the ones that prove whether the context layer is doing real work. Questions can go to [email protected].