The Myth of the Data Moat in Vertical AI

Written by Andreas Hauri | Apr 14, 2025 12:41:28 PM

“Data moats” have become a buzzword in AI. In vertical AI especially, the idea is often treated as a core requirement for defensibility. Greylock recently shared a take on this, sparking interest among VCs. At Unique, we’ve asked many investors where they see real data moats, and got no clear answers. Their feedback mostly echoed Greylock. But when we asked other GenAI companies, the response was blunt: “data moat is bullshit.” Here’s why they might be right.

Defining a Moat

 

An "economic moat" is a business's ability to maintain an edge over its competitors, much as the moats around medieval castles kept attackers out. Investopedia uses the analogy to describe the protective measures a company can employ to safeguard its market position. A friend of mine once offered a more straightforward definition: if someone gave a competitor $50 million and they still couldn't catch up to you, then you have a moat. That test emphasizes how resilient and hard to copy a real competitive advantage has to be.

Where the 'Data Moat' Idea Comes From

 

Greylock’s take: AI companies can build moats by using data generated through their product interactions. On paper, it sounds reasonable. But in practice, it's flawed. The data tied to those interactions is deeply entangled with private customer data. You can’t export it, and you can’t reuse it safely. I haven’t seen a single vertical AI company with a real data moat built this way. For more details, see Greylock's article.

 

The Illusion of Data Moats

 

VCs love the idea of data moats, believing that vertical AI companies must possess them to succeed. But the reality? Companies like Harvey, Truewind, Hebbia, DeepJudge, Roggo, Anterior, and Silna haven't proven a defensible data moat. While there might be some data that can be separated and exported, such as metadata on how users interact with the software, its value is debatable. This data often lacks the depth needed to provide a competitive advantage. What truly matters is understanding your customers: how they work, what they need, where they struggle. That’s where the real moat is. 

 

It’s Not Just Financial Services


The issue of data access is not limited to financial services; it spans industries. Healthcare companies can’t use private patient data for training. Legal AI firms can’t touch client correspondence. If that kind of data leaks, trust is gone and significant deals are lost.

Public and Synthetic Data Aren’t Moats

 

While direct access to customer data is restricted, companies can utilize publicly available data to train models. Public datasets offer a starting point for developing AI solutions, but they do not provide a unique competitive advantage since they are accessible to everyone. Another approach is the creation of synthetic datasets, which mimic real data without compromising privacy. Although synthetic data can be useful, it does not constitute a true data moat.

The True Competitive Edge

 

For Unique, the real advantage lies in our deep understanding of our customers' workflows and needs. We spend time inside customer workflows. We see the friction points, the inefficiencies, the unspoken needs. That context shapes everything we build. Horizontal players can’t match that level of fit. Our expertise in the financial services industry gives us an edge, but it is not a data moat in the traditional sense.


First-Mover Advantage

 

Another form of competitive advantage is the first-mover advantage. Being the first to market with a solution allows a company to establish relationships with customers and build a reputation before competitors can catch up. You build trust and get ahead in adoption. But again, that’s not a moat in the data sense. It’s about timing and execution. 


Conclusion

 

In vertical AI, the kind of data moat Greylock describes just doesn’t exist. Customer interaction data is tightly tied to private information. You can’t extract it, reuse it, or build on it. Sure, you can collect some metadata, but it rarely moves the needle. The real advantage comes from knowing your customers deeply: their workflows, pain points, and needs. That’s what gives you an edge. At Unique, that’s where we focus.

At Unique, we’ve recognized that the latest data management approaches, particularly vector stores, are not built for the complexities of financial AI agents. That’s why we’ve developed an entirely new file system for financial AI agents, designed to address the fundamental issues of data organization, searchability, and access control.