Original article: The data OS by Benn Stancil
Motivation
Fracturing tools and workflows into many small pieces is destroying the UX. However, “too many tools” in the Modern Data Stack isn’t the correct diagnosis:
“Consolidating the ecosystem is a futile effort. Instead, we should try to figure out how to better manage the tools we have and the shiny future ones we might want.”
The journey from “data mess” through “data mesh” to “data OS”
There are several ways to solve the fragmentation problem, each with its own drawbacks:
Extreme centralization - one vendor owns the entire stack (for example, a cloud data warehouse vendor or cloud provider that keeps integrating more data tools and services). This is unlikely to become the widely preferred option because the industry values modularity and open-source technologies.
Extreme decentralization - developing a standard to exchange metadata between tools and APIs, controlled by nobody but agreed to by everybody. This outcome is also unlikely because we don’t need complex, fully decentralized systems just to integrate a couple of core tools in the data stack.
Hybrid - a delegated rather than entirely decentralized architecture: individual teams manage their own data and make those datasets available to others via a data mesh.
While the hybrid data mesh approach seems to be the least bad option, it serves more as a catalog and a query wrapper on top of data sources than as a system providing a holistic experience.
In contrast, a true “data OS” would do more than a data mesh. Benn suggests that dbt could become a standardized platform for accessing, transforming, and governing data, unifying data and tools.
Core message & CTA
Great apps are not sufficient on their own to provide a functional modern data experience. While decentralized contracts and modular tools provide a lot of flexibility, Benn accurately points out that:
“Convenience can be worth more than autonomy.”