Why Your AI-Powered Learning Tools Should Use Less
When I started building AI-powered tools for education, my instinct was the same one most people have: send the user's input to the AI, get a response, display it. The AI was the tool. Everything ran through it.
It worked, until it didn't. An API key expired over a long break, and nobody noticed for weeks. The service went down occasionally, and the tool went with it. And at some point, I realized I was paying a monthly AI API bill for what amounted to answering the same twelve questions over and over, which started to feel less like a solution and more like a dependency I had built for myself.
I have built about half a dozen learning tools over the past year, and somewhere around the third or fourth one, an approach started to take shape that I wish I had understood from the beginning: the AI should be the finishing layer, not the foundation.
In this article...
- The Principle: Do The Work Before You Call The API
- What Happens When The AI Is Unavailable?
- A Quick Check Before You Build (Or Buy)
- Why This Matters Beyond Cost Savings
The Principle: Do The Work Before You Call The API
Every AI API call costs money, measured in tokens sent and received. But the financial cost is actually the easiest one to see. What took me longer to appreciate was the latency cost (an API call adds seconds to a user's experience that a local look-up does not), the reliability cost (any external dependency is a point of failure you do not control), and the environmental cost that is easy to ignore at the scale of one query but accumulates across a tool serving hundreds of users a day.
The approach I have landed on is straightforward: do as much as possible with structured data and deterministic logic before the AI ever gets involved. If you can answer a question with a keyword match against a curated knowledge base, do that. If you can filter a set of recommendations down to the relevant ones using a user's drop-down selections, do that. Save the AI API call for the part of the task that actually requires generative capabilities, where the response needs to be personalized or synthesized in a way that a pre-written answer cannot provide.
To be clear, this is not an argument against using AI. It is an argument for using it precisely.
What This Looks Like In Practice
The FAQ chatbot I built for an organization's help desk is probably the most intuitive example. The version most people would build first, the version I almost built, takes every user question and sends it to the AI along with the entire FAQ database, letting the model figure out which answer to return. That works, but it means every question, including "What are your office hours?", costs tokens, adds seconds of wait time, and requires an active connection to an external service.
What I ended up building uses a hybrid approach. When a user types a question, the tool first runs keyword matching against a curated set of FAQ entries. Each entry has associated keywords and synonyms, and the matching logic calculates a confidence score. If the confidence is high enough, the tool returns the pre-written answer immediately. No AI API call needed.
Only when the keyword matching cannot produce a confident result does the tool escalate to AI. And even then, it does not dump the entire knowledge base into the prompt. It sends only the top candidate entries, the ones that scored highest but fell below the confidence threshold, along with the user's original question. The AI gets a focused set of possibilities instead of everything.
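The flow above can be sketched in a few dozen lines. This is a minimal illustration, not the production code: the entries, the scoring formula (fraction of an entry's keywords present in the question), and the 0.6 threshold are all assumptions you would tune against real queries.

```python
# Hybrid FAQ lookup: keyword match first, escalate to AI only on low confidence.
# Entries, scoring, and threshold are illustrative assumptions.

FAQ_ENTRIES = [
    {
        "question": "What are your office hours?",
        "answer": "We are open Monday through Friday, 8am to 5pm.",
        "keywords": {"office", "hours", "schedule"},
    },
    {
        "question": "How do I reset my password?",
        "answer": "Use the 'Forgot password' link on the sign-in page.",
        "keywords": {"password", "reset", "forgot"},
    },
]

CONFIDENCE_THRESHOLD = 0.6  # assumed value; tune against real traffic


def score(entry, query_words):
    """Fraction of the entry's keywords that appear in the user's question."""
    return len(entry["keywords"] & query_words) / len(entry["keywords"])


def answer_question(question):
    query_words = set(question.lower().replace("?", "").split())
    scored = sorted(
        ((score(e, query_words), e) for e in FAQ_ENTRIES),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best_score, best_entry = scored[0]
    if best_score >= CONFIDENCE_THRESHOLD:
        # Confident match: return the pre-written answer. No API call.
        return {"source": "faq", "answer": best_entry["answer"]}
    # Low confidence: escalate only the top candidates, not the whole base.
    candidates = [e["question"] for s, e in scored[:3] if s > 0]
    return {"source": "ai", "candidates": candidates, "question": question}
```

A real implementation would add synonym expansion and stemming, but the shape is the point: the cheap deterministic path runs first, and the expensive path receives only what the cheap path could not resolve.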
What I like about this pattern is how broadly it transfers. Whether the tool is an internal support bot, a coaching recommendation engine, or a resource finder integrated with an LMS, the architecture is the same: handle what you can with structured data, and send only what you cannot resolve to the generative model.
Less Context Often Means Better Answers
This is the part that surprised me. I expected that sending less data to the AI would save money. I did not expect it to produce noticeably better responses.
In a behavior intervention tool I built for educators, users select a grade band, a behavior category, a framework, and a setting. The tool uses those selections to filter a research base of strategies before the AI sees any of them. Instead of sending 200 strategies and asking the model to pick the relevant ones, it sends 15 to 20 that have already been matched to the user's context and asks the model to synthesize them into personalized recommendations.
That changes the AI's job from search-and-filter to synthesis-and-personalization. The prompt is smaller and the material is already relevant, so the model is not spending its capacity sifting through noise. The output is more focused as a result. I have seen this pattern hold across multiple tools now, consistently enough that I think of pre-filtering as a quality improvement rather than just a cost reduction.
The broader takeaway applies to anyone building or evaluating a tool that sends a knowledge base to an AI: can that knowledge base be filtered first using deterministic methods? Drop-down selections, keyword matching, categorical filters, role-based scoping. Any structured data that narrows the field before the generative call is doing double duty, saving cost and clearing out noise the model would otherwise have to navigate around.
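A sketch of that pre-filtering step, using the behavior tool's selections as the example. The field names and the two sample strategy records are assumptions for illustration; a real research base would hold a couple hundred records loaded from storage.

```python
# Categorical pre-filtering: the user's drop-down selections narrow the
# research base before anything reaches the generative model.

STRATEGIES = [
    {"name": "Behavior-specific praise", "grade_bands": {"K-2", "3-5"},
     "category": "off-task", "framework": "PBIS", "settings": {"classroom"}},
    {"name": "Check-in/check-out", "grade_bands": {"3-5", "6-8"},
     "category": "disruption", "framework": "PBIS",
     "settings": {"classroom", "hallway"}},
]


def prefilter(grade_band, category, framework, setting):
    """Deterministic narrowing: only strategies matching every selection."""
    return [
        s for s in STRATEGIES
        if grade_band in s["grade_bands"]
        and s["category"] == category
        and s["framework"] == framework
        and setting in s["settings"]
    ]


def build_prompt(user_context, strategies):
    """The model's job is synthesis: everything it sees is already relevant."""
    listing = "\n".join(f"- {s['name']}" for s in strategies)
    return (
        f"Context: {user_context}\n"
        f"Synthesize personalized recommendations from these strategies:\n"
        f"{listing}"
    )
```

The prompt builder never sees the full base, which is what shifts the model's work from search-and-filter to synthesis.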
What Happens When The AI Is Unavailable?
This is the question that separates tools built for demos from tools built for sustained use, and it is the one I have learned to ask first rather than last. The tools I build operate in tiers. When a valid API key is present and the call succeeds, the user gets the full AI-enhanced experience. When the key is missing or the call fails, the tool falls back to returning curated content directly. Users still get filtered, relevant resources matched to their inputs. They just do not get the AI-generated personalization layer on top.

I default to having this fallback mode turned on for any tool I hand off to another organization, because the situations where the AI disappears are predictable: API keys expire, billing lapses over the summer, a new administrator does not realize the tool depends on an external service.
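The tiering can be expressed as a small wrapper around the AI client. This is a sketch under assumptions: the resource records, the `AI_API_KEY` environment variable name, and the `call_ai` hook are all placeholders for whatever the real tool uses.

```python
import os

# Stand-in for a curated resource base; a real tool would load this from
# a database or a JSON file.
RESOURCES = [
    {"topic": "onboarding", "title": "New hire checklist"},
    {"topic": "feedback", "title": "Giving actionable feedback"},
]


def filter_curated_resources(user_inputs):
    """Tier 0: deterministic filtering that always runs, AI or no AI."""
    return [r for r in RESOURCES if r["topic"] == user_inputs["topic"]]


def get_recommendations(user_inputs, call_ai=None):
    """Curated content always; AI personalization only when available."""
    resources = filter_curated_resources(user_inputs)
    api_key = os.environ.get("AI_API_KEY")
    if not api_key or call_ai is None:
        # Fallback tier: no key configured (or no client wired in).
        return {"resources": resources, "personalized": None}
    try:
        return {"resources": resources,
                "personalized": call_ai(api_key, user_inputs, resources)}
    except Exception:
        # The call failed; the curated layer still answers.
        return {"resources": resources, "personalized": None}
```

The key property is that the curated path is not an error handler bolted on afterward. It is the base tier, and the AI result is layered on top only when it actually arrives.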
Here is the thing I think gets overlooked in conversations about AI-powered learning tools. For most of them, curated content from known sources is the real value. A tool that returns "here are four research-based strategies matched to your selections" with descriptions and source information is providing something an educator or L&D professional can actually act on and verify. The AI adds conversational polish and situational specificity, but the substance does not depend on it. If your tool cannot function at all without an active AI connection, that fragility is going to show up at the worst possible time.
A Quick Check Before You Build (Or Buy)
Whether you are building a learning tool internally or evaluating one from a vendor, a few questions help clarify how much of the tool's value actually depends on AI.
- Can the core function work without AI at all?
  This is the most revealing question because it forces a distinction between a tool that is fundamentally an AI product and a tool that uses AI to enhance something that already works. Those two things have very different risk profiles.
- What data can be filtered or structured before the AI call?
  Every drop-down selection, keyword match, or categorical filter that narrows the context before the generative call is pulling double duty on cost and quality.
- Are there responses that can be pre-written instead of generated?
  For common questions and standard recommendations, human-authored responses stored in a database are faster, cheaper, and often more trustworthy. Generating them fresh each time is solving a problem that does not exist.
- What happens when the API is unavailable?
  The answer should not be "the tool stops working."
- Does the data need to leave the user's environment?
  For anything involving sensitive information, particularly in education and HR contexts, the default should be local processing. Even a simple, imperfect local check (regex pattern matching for names or ID numbers, for instance) keeps data within the organization's control rather than routing it through an external API to determine whether it is sensitive. The irony of sending potentially private data to an external service in order to check whether it contains private data is one of those design problems that becomes obvious only in hindsight.
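That local check can be as small as a few compiled patterns. The patterns below are illustrative, not a complete PII taxonomy; the point is that an imperfect check that runs locally still beats a thorough one that ships the data off-site.

```python
import re

# Rough local pre-check for obviously sensitive strings. Deliberately
# simple: these patterns are examples, not an exhaustive list.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-shaped numbers
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),  # emails
    re.compile(r"\b(?:student|employee)\s*id[:#]?\s*\d+\b", re.IGNORECASE),
]


def looks_sensitive(text):
    """True if any pattern matches; flagged text never leaves the machine."""
    return any(p.search(text) for p in SENSITIVE_PATTERNS)
```

A tool can run this before every outbound API call and route flagged input to the local-only path, which keeps the decision about sensitivity inside the organization's boundary.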
Why This Matters Beyond Cost Savings
The argument here is not primarily about saving money on AI API credits, although for organizations on tight budgets that is a real consideration. The deeper issue is about what kind of technology we are choosing to depend on.
A tool that requires a constant connection to an external AI service to function at all is fragile in exactly the ways that matter most when organizational circumstances shift. And in every organization I have worked with, circumstances shift regularly. Budgets change, staff turns over, priorities move. A tool that degrades gracefully when the AI layer disappears is a tool that survives those transitions. One that goes dark is a tool that gets replaced.
I keep coming back to a question that sits underneath all of this: if we believe AI should serve learning rather than the other way around, what does that actually look like at the level of tool architecture? For me, it has started to look like tools that depend on AI as little as possible while using it as effectively as possible. The goal is not to avoid AI. The goal is to use it where it genuinely adds something, and to handle everything else with methods that are simpler and that your team can actually see into and control.