The IP Framing Problem in AI

Your Data Isn't a Moat. Your Loop Is.

Strategy May 22, 2026 Justin Donnaruma 12 min read

A contrarian read on the AI industry's favorite defensibility argument — and a three-filter test for telling a real moat from data hoarding theater.

[Image: Part One title card, featuring The Three-Filter Moat Test framework]

On February 27, 2026, thousands of federal employees lost a tool they had been using every day. Anthropic had refused the Pentagon’s demand to grant unrestricted military use of Claude, and within hours the Trump administration ordered every agency and defense contractor to cease business with the company (the agency-wide directive was later enjoined in court; only the Defense Department blacklist survived on appeal). By the following Monday, GSA had pulled Anthropic from the governmentwide testing tool, HHS staff were given a few hours’ notice to save their work, and defense tech companies were emailing their engineers to switch models immediately. The Pentagon itself estimated months to replace Claude across its programs.

I want you to notice something about that story. None of those organizations had lost their data. The contracts, case files, intelligence summaries, logistics records — the actual proprietary corpora — were sitting safely on government servers. What they lost was the loop: the working relationship between their data, the model that processed it, the prompts the analysts had refined, the integrations a contractor had built, the muscle memory their staff had developed. The corpus was untouched. The moat was gone.

This is not how the AI industry has been talking about defensibility for the last two years. The dominant story — recited by Sequoia, by every CIO panel, by half the LinkedIn essays in your feed — is that proprietary data is the moat. Models commoditize. Compute commoditizes. The one thing competitors cannot buy or replicate, the argument goes, is your unique data.

That story is mostly wrong. It’s so close to right that it has fooled the industry for two years. But the part it gets wrong is the part that matters, and getting it wrong has led organizations to spend enormous amounts of money protecting assets they don’t have, while leaving the assets they do have undefended.

Let me tell you what’s actually true.

The strongest version of the data-as-moat argument

I want to give the dominant view its best shot before I take it apart, because this is one of those arguments where the steelman is genuinely compelling and the weakened version is genuinely confused.

The strongest case for data as moat is not the static-corpus case. It’s the flywheel case. Tesla, the example everyone reaches for, doesn’t actually have a moat in 50 billion miles of driving data. It has a moat in 5 million cars producing more miles tomorrow, every one of them tagged with sensor readings, edge cases, driver corrections, and outcomes. Bloomberg’s archive isn’t a moat because it’s old and big; it’s a moat because every Bloomberg terminal in the world is feeding the system new prices, new news, new annotations, every second of every trading day. The compounding asset is the system that produces fresh proprietary data through real usage, in tight enough loops that the model improves faster than competitors can catch up.

This is the argument worth taking seriously. It’s the argument I find most compelling in my own work building AI infrastructure. Foundation models are commoditizing on a clock you can almost set. Synthetic data and retrieval-augmented generation are closing the gap on static corpora at speed. Patents on prompts are mostly performance art. Trade-secret protection on prompt libraries leaks the moment a queryable agent gets prompt-injected. If you put me in a room and asked me what survives — what kind of AI asset will still be defensible in 2031 — I would point to the flywheel.

So why am I telling you the data-as-moat story is wrong?

Because the flywheel is not a data asset. It’s a workflow asset. The data is a byproduct of the loop, and people have been mistaking the byproduct for the prize.

The category error

Here’s the move that has cost the industry two years and an enormous amount of misallocated investment: somebody saw Tesla’s flywheel, named the durable thing “data,” and let everyone else run with that label.

Tesla doesn’t have a data moat. Tesla has a fleet moat. The fleet is the asset. The data is exhaust.

If you took Tesla’s 50 billion miles and gave them — every byte, every label, every edge case — to General Motors tomorrow, GM would not have a moat. They would have a one-time training advantage that erodes as Tesla collects 10 billion more miles in the next year and as GM struggles to instrument its own fleet to keep producing comparable signal. The compounding system that makes the data valuable does not transfer when you transfer the data. The thing that’s defensible is the system, not its outputs.

This is true for every flywheel example you have ever heard, and the industry has consistently mislabeled it. Bloomberg’s moat is the terminal on every desk. Stripe’s moat is the integration in every checkout. Salesforce’s moat is the workflow your sales team has built habits around. The data each of those companies sits on is real, valuable, and beside the point. If you forced any of them to publish their corpora tomorrow morning, none of them would lose meaningful market share by Wednesday — because the corpus isn’t what’s keeping the customer in the seat.

The mistake matters because it sends people down the wrong defensive path. Once you believe the data is the asset, you start protecting the data: NDAs, data-loss prevention, “AI confidentiality” policies, repository access controls, and a thousand other instruments built on the implicit assumption that if you keep the bits inside the building, you keep the moat. That assumption is wrong. The bits are not the moat. The loop is. And the loop is protected by completely different mechanisms — workflow integration, switching costs, feedback closures, contractual posture, and architectural choices about where information flows through your stack.

The Anthropic-Pentagon story is so useful because it isolates this mistake in a clean experiment. None of the agencies lost their data. They lost their loop. Every dollar those organizations had spent treating “AI IP” as a data-protection problem turned out to be insurance against a risk that wasn’t the one that actually fired.

A test for telling the difference

Here’s the test I use when somebody — a founder, an executive, a board member — tells me they have a data moat. Three filters, in sequence. If a dataset fails any one of them, it’s not a moat; it’s something else with a moat-shaped costume on.

[Figure: The Three-Filter Moat Test. A flowchart applying the flywheel, replicability, and decision-utility filters to determine whether a data asset is a real moat.]

Filter one: the flywheel test. Does this dataset produce new data daily through customer use of your workflow? Not “could it, in principle, with the right pipeline” — does it, today, in production, generate fresh proprietary signal as a byproduct of normal operation? If the answer is no, you don’t have a flywheel. You have an archive. Archives are useful, but they are not moats. Foundation models will eat their value on a depressingly predictable schedule, and the more interesting your archive, the faster it gets eaten because it is exactly what some well-funded competitor will pay to license, scrape, or synthesize next quarter.

Filter two: the replicability test. If a well-funded competitor — pick a number that means “serious,” say five million dollars and twelve months — could reproduce eighty percent of this dataset’s value, you don’t have a moat; you have a timing asset. Timing assets are real. They give you a window of eighteen to thirty-six months where you have something competitors can’t easily match. But the window closes, and if you have not converted that window into workflow ownership before it closes, you arrive at month thirty-seven with no advantage and a lot of sunk cost. The strategic question for timing assets is not “how do we keep the data secret” but “how do we use this window to lock in workflow position before the data advantage erodes.”

Filter three: the decision-utility test. Is anyone — internal user, customer, machine — making decisions from this dataset that they couldn’t otherwise make? Or is it sitting in a warehouse because somebody told the CIO that data is the new oil? This filter is the one that catches the largest single category of “data moat” claims, which is data hoarding dressed as strategy. If you cannot point to specific decisions getting made, specific actions getting taken, specific outcomes that change because of this dataset, what you have is storage cost. The most expensive theater in enterprise AI is the theater of pretending storage is strategy.

A real moat passes all three filters. Most “proprietary data” passes none. Some passes one or two — and the ones that pass one or two are the most interesting cases, because they tell you exactly where to invest.

If you pass the flywheel test but fail replicability, you have a workflow-bound timing asset and you should be racing to extend its life through integration depth, not legal protection. If you pass replicability and decision utility but fail the flywheel, you have a static archive that is genuinely irreplaceable today and will be commoditized in two years; you should be using that window to build the loop, not to staff up your IP attorneys. If you pass the flywheel and replicability but fail decision utility, you have built an exquisite engine that produces unique data nobody is using, and you should fix that before you do anything else.
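The test and its mixed outcomes can be written down as a small decision procedure. What follows is a minimal sketch, not a definitive scoring tool: the `DataAsset` type, its field names, and the `classify` function are hypothetical names introduced here for illustration, and each filter is reduced to a yes/no answer that a real assessment would have to argue for. Note the direction of filter two: an asset passes replicability when a well-funded competitor could not reproduce most of its value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """Yes/no answers to the three filters for one dataset (hypothetical fields)."""
    name: str
    passes_flywheel: bool          # produces fresh proprietary data daily through real usage
    passes_replicability: bool     # a funded rival could NOT rebuild ~80% of its value
    passes_decision_utility: bool  # someone makes decisions they otherwise could not


def classify(asset: DataAsset) -> str:
    """Map filter results to the categories named in the text."""
    f = asset.passes_flywheel
    r = asset.passes_replicability
    d = asset.passes_decision_utility

    if f and r and d:
        return "real moat: deepen the position"
    if f and not r:
        # The text describes this case without reference to decision utility.
        return "workflow-bound timing asset: extend its life through integration depth"
    if r and d and not f:
        return "static archive: use the window to build the loop"
    if f and r and not d:
        return "unused engine: get the data into decisions first"
    # Everything else fails the flywheel test plus at least one other filter.
    return "not a moat: archive, timing asset, or storage cost"


if __name__ == "__main__":
    archive = DataAsset("claims-archive", False, True, True)
    print(archive.name, "->", classify(archive))
```

The value of writing it this way is that the answer is never just "moat" or "no moat"; every branch returns the investment the failure points to.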

What this means for the Anthropic story

Look at the Anthropic-Pentagon crisis through the three-filter test and the right lessons fall out immediately.

The agencies that experienced this as a catastrophe were the ones whose loops had been built directly on top of Claude — applications calling Anthropic’s API, prompts tuned to Claude’s specific behaviors, contractor-built integrations with Claude in the middle, training and onboarding materials with Claude in the screenshots. None of those loops would have passed the replicability test even one day before the blacklist hit, because the loop was not bound to the agency’s own infrastructure; it was bound to a single vendor’s continued willingness to do business. The data each agency owned was fine. The workflow was a borrowed apartment they had been decorating, and the landlord changed the locks.

What about the agencies and contractors that had built provider abstraction — a gateway between their applications and the underlying models — before the crisis hit? I want to be honest about the evidence here, because the temptation in a piece like this is to tell a story I cannot actually substantiate. There is, as of this writing, no publicly named organization that has come forward to describe its workflow surviving the February crisis intact because of architectural decoupling. The defense contractors, intelligence-adjacent contractors, and regulated enterprises most likely to have built this kind of architecture are also the kinds of customers who do not publish case studies about how they handled a procurement crisis. The silence is consistent with the story being true and intentionally unpublished. It is also consistent with the story not being true at scale yet. Silence is not evidence either way, and I am not going to assert what I cannot back.

What I can assert is the structural point. An architecture in which applications call a gateway rather than calling any single provider’s API directly, in which prompts are versioned against the customer’s own organization rather than baked into a vendor relationship, in which the model behind the gateway is a substitutable parameter rather than a load-bearing dependency — that architecture would, mechanically, have turned the Anthropic blacklist into a configuration change rather than a service outage. The architectural lesson is forward-looking, not backward-looking. The next adverse event in the AI stack has not happened yet, and the question worth asking is not who survived February but who is positioned to survive what comes next.
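To make the structural point concrete, here is a minimal sketch of provider abstraction. It is an illustration under stated assumptions, not a description of AOSentry or of any vendor's API: `Gateway`, `ProviderUnavailable`, `blocked_vendor`, and `echo_vendor` are hypothetical names, and the two "vendors" are local stubs standing in for real provider adapters. Applications call `Gateway.complete`, and the list of providers behind it is configuration.

```python
from typing import Callable, Dict, List

# A provider adapter is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class ProviderUnavailable(Exception):
    """Raised by an adapter when its vendor cannot be used."""


class Gateway:
    """Routes each request to the first usable provider in a configured order."""

    def __init__(self, providers: Dict[str, Provider], order: List[str]) -> None:
        self._providers = dict(providers)
        self._order: List[str] = []
        self.set_order(order)

    def set_order(self, order: List[str]) -> None:
        """Swap or reorder providers; this is the 'configuration change'."""
        unknown = [name for name in order if name not in self._providers]
        if unknown:
            raise ValueError(f"unknown providers: {unknown}")
        self._order = list(order)

    def complete(self, prompt: str) -> str:
        errors = []
        for name in self._order:
            try:
                return self._providers[name](prompt)
            except ProviderUnavailable as exc:
                errors.append(f"{name}: {exc}")  # remember why, then try the next one
        raise ProviderUnavailable("; ".join(errors) or "no providers configured")


# Stub adapters (assumptions for the example, not real vendor clients).
def blocked_vendor(prompt: str) -> str:
    raise ProviderUnavailable("contract terminated")


def echo_vendor(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    gateway = Gateway(
        {"vendor_a": blocked_vendor, "vendor_b": echo_vendor},
        order=["vendor_a", "vendor_b"],
    )
    print(gateway.complete("summarize this memo"))  # falls through to vendor_b
    gateway.set_order(["vendor_b"])                 # drop the blocked vendor entirely
    print(gateway.complete("summarize this memo"))
```

The sketch leaves out everything that makes a production gateway hard, such as prompt versioning, per-model behavioral differences, and audit logging. It shows only the dependency direction: the application depends on the gateway's interface, and no single vendor is load-bearing.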

This is the architectural posture I built into AOSentry — the AI gateway product I founded AOCyber to build — for exactly this reason. I spent two years before AOSentry exploring gateway architectures and AI tooling, watching organizations integrate directly with AI providers as though that were a defensibility strategy. AOSentry is the product that came out of that exploration, designed from the ground up around the assumption that any single AI provider can become unavailable on any given day. The point isn’t that the gateway is novel; gateways for non-AI traffic have existed for decades. The point is that for the last three years, the AI industry has acted as though the right way to build with foundation models was to integrate directly with each provider’s API, treat the prompts and integrations as IP, and trust that whichever provider you picked would still be there next quarter. The Anthropic blacklist proved that this posture is not a defensibility strategy. It’s an unhedged bet on a single vendor’s continued availability. AOSentry isn’t the only way to hedge that bet — you can build a gateway in-house, and many large organizations should — but the underlying lesson stands regardless of how you implement it. If your moat depends on any single AI provider continuing to exist and continuing to do business with you, you don’t have a moat. You have a counterparty risk dressed in moat clothing.

What to do instead

If the data isn’t the moat and the loop is, the practical implications are not subtle.

Stop measuring AI defensibility in terms of corpus size. Start measuring it in terms of feedback closure rate — how fast user actions in your workflow turn into improvements in the model that serves the next action. A small dataset with a tight closure beats a huge dataset with a slow one, every time, on a clock that the foundation model providers cannot help you beat.
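One way to put a number on feedback closure is to measure the time between capturing a user action and shipping the model or system change it informed. The sketch below is an assumption about how such a metric could be computed, not an established definition: `FeedbackEvent` and `median_closure_time` are hypothetical names, and deciding which shipped change "closes" which action is the hard part that the code takes as given.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Iterable, Optional, Tuple

# (when the user action was captured, when the improvement it fed shipped or None)
FeedbackEvent = Tuple[datetime, Optional[datetime]]


def median_closure_time(events: Iterable[FeedbackEvent]) -> Optional[timedelta]:
    """Median capture-to-ship delay over closed loops; None if nothing has closed."""
    seconds = [
        (shipped - captured).total_seconds()
        for captured, shipped in events
        if shipped is not None  # open loops have no closure time yet
    ]
    if not seconds:
        return None
    return timedelta(seconds=median(seconds))


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 9, 0)
    events = [
        (start, start + timedelta(hours=2)),
        (start, start + timedelta(hours=4)),
        (start, None),  # captured but never fed back into the system
    ]
    print(median_closure_time(events))
```

Tracking the share of events that never close is as informative as the median itself, since an action that never feeds back is archive, not loop.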

Stop investing in IP protection for prompts and start investing in workflow integration depth. The most defensible AI products in 2026 are not the ones with the cleverest system prompts. They are the ones that have become the system of record for a workflow that mattered before AI showed up. The “AI” is the least valuable thing about them; it’s a productivity layer on a position they had already earned.

Stop signing AI vendor contracts as though IP indemnification is the most important clause and start fighting hardest for the no-training-on-our-data clause, the portability and termination rights, and the model-substitution rights. The Anthropic story did not turn on copyright doctrine; it turned on which contracts had termination provisions that survived a political event. The next story will not be different.

And — the part that’s hardest because it requires admitting two years of mislabeled investment — stop treating “AI IP” as a category that maps onto twentieth-century intellectual property. It mostly doesn’t. The Copyright Office concluded the prompt question in January of last year. The Bartz settlement priced training data at roughly three thousand dollars per work. The Thomson Reuters ruling and the NYT case in front of it are not adjudicating doctrine; they are establishing a price-discovery market for training inputs. Treating AI assets as IP — copyrights, patents, trade secrets — is borrowing a frame from a different era and applying it to a category of asset that does not behave like the things those frames were built for. The frame mostly produces theater. The frame’s main beneficiaries are the law firms that bill for the theater.

The line

There is a clean way to say all of this. I have been working it over for a few months and I think it’s just this:

Stop protecting. Start owning the loop.

The protection instinct is the right impulse pointed at the wrong target. The thing worth defending is not the data, not the prompts, not the model weights, not the integrations as artifacts — it is the workflow position those things sit inside of, the loop in which user actions, model outputs, system actions, and outcomes compound on each other faster than competitors can match. The loop is the moat. Everything else is exhaust, accident, or theater.

If you run the three-filter test on your own organization right now and find that you don’t pass all three, that’s not a problem; it’s information. Most organizations will fail at least one filter, and the failure tells you exactly where to invest. If you pass all three, congratulations — you have a real moat, and the worst thing you can do for it is treat it like IP that needs to be locked down. Treat it like a position that needs to be deepened.

In the next post in this series, I’ll do for prompts what this post did for data: take the dominant “prompts are trade secrets” framing, give it its best shot, and then show why the half-life of a prompt is shorter than your NDA — and what the actual protection model looks like in 2026.

Justin Donnaruma is the founder and CEO of AOCyber. He built AOSentry from scratch after two years exploring gateway architectures and AI tooling. AOSentry is an AI security gateway and governance platform that gives organizations one API across every major AI provider, with PII tokenization, immutable audit logs, and post-quantum cryptography from Day 1. If you’re rethinking your AI architecture after the events of February 2026 — or you’d rather not learn the same lessons the hard way — start a conversation.
