While the literature on data mesh is extensive, it often describes a target state but rarely explains how to achieve it in practice. The question then arises:
What approach should be adopted to transform data management and implement a data mesh?
This series of articles is an excerpt from our Practical Guide to Data Mesh, in which we propose an approach for kicking off a data mesh journey in your organization. The approach is structured around the four principles of data mesh (domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance) and leverages existing human and technological resources.
- Part 1: Scoping Your Pilot Project
- Part 2: Assembling a Development Team & Data Platform for the Pilot Project
- Part 3: Creating Your First Data Products
- Part 4: Implementing Federated Computational Governance
Throughout this series of articles, and in order to illustrate this approach for building the foundations of a successful data mesh, we will rely on an example: that of the fictional company Premium Offices – a commercial real estate company whose business involves acquiring properties to lease to businesses.
—
In the previous articles of the series, we’ve identified the domains, defined an initial use case, assembled the team responsible for its development, and created our first data products. Now, it’s time to move on to the final data mesh principle, federated computational governance.
What is federated computational governance?
Federated computational governance refers to a system of governance in which decision-making processes are distributed across multiple entities or organizations, using computational algorithms and distributed technologies. Each participating entity retains a degree of autonomy while collaborating within a broader framework. Its key characteristics are:
- Decentralization: Decision-making authority is distributed among multiple entities rather than concentrated in a single central authority.
- Computational Algorithms: Algorithms play a significant role in governing processes, helping to automate decision-making, enforce rules, and ensure transparency and fairness (a minimal code sketch of such a rule follows this list).
- Collaborative Framework: Entities collaborate within a broader framework, sharing resources, data, and responsibilities to achieve common goals.
- Transparency and Accountability: Using computational algorithms and distributed ledgers can enhance transparency by providing a clear record of processes and ensuring accountability among participating entities.
- Adaptability and Resilience: Federated computational governance systems are designed to be adaptable and resilient, capable of evolving and responding to changes in the environment or the needs of participants.
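To make the computational aspect more tangible, here is a minimal policy-as-code sketch in Python. Everything in it is illustrative and assumed rather than taken from any specific platform: a central body publishes global rules (the hypothetical GLOBAL_POLICIES) and every domain runs the same automated check on its data products.

```python
from dataclasses import dataclass, field

# Hypothetical global rules, published by the central governance body.
GLOBAL_POLICIES = {
    "required_metadata": {"owner", "domain", "update_frequency"},
}

@dataclass
class DataProduct:
    name: str
    metadata: dict = field(default_factory=dict)

def check_compliance(product: DataProduct) -> list[str]:
    """Return the list of global-policy violations for one data product."""
    missing = GLOBAL_POLICIES["required_metadata"] - product.metadata.keys()
    if missing:
        return [f"{product.name}: missing metadata {sorted(missing)}"]
    return []

# Each domain runs the same check locally; the central body only
# aggregates results instead of collecting manual reports.
leases = DataProduct("leases", {"owner": "re-team", "domain": "real-estate"})
print(check_compliance(leases))  # flags the missing 'update_frequency' field
```

In a real setup such checks would run on the platform or in a CI pipeline, so that the compliance evidence a domain owes the central body becomes a by-product of automation rather than a manual reporting exercise.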
The challenges of federated governance in a data mesh
The fourth data mesh principle, federated computational governance, implies that a central body defines the rules and standards that domains must adhere to. Local leaders are responsible for implementing these rules in their domain and providing the central body with evidence of their compliance – usually in the form of reporting.
Although the model is theoretically simple, its implementation often runs into internal cultural challenges. This is particularly true in heavily regulated sectors, where centralized governance teams are reluctant to delegate all or part of the controls for which they have historically been responsible.
Federated governance also faces a reality on the ground that is rarely favorable: data governance is closely tied to risk management and compliance, two areas that rarely excite operational teams.
Consequently, it is difficult to identify local owners or to transfer certain aspects of governance to data product owners, most of whom are already busy learning a new role. In most large organizations, therefore, the federated structure will likely be emulated by the central body at first and then gradually implemented in the domains as their maturity grows.
To avoid exploding governance costs or fragmentation, Dehghani envisions that the data platform could eventually take over entire aspects of governance automatically.
The aspects of governance that can be automated
At Zeenea, we firmly believe in harnessing automation to address this challenge on multiple fronts:
- Quality controls: many solutions already exist.
- Traceability: development teams can already automatically extract complete lineage information from their data products and document their transformations.
- Fine-grained access policy management: solutions already exist here too, and all of them rely at least on tagging information (see the sketch after this list).
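As a rough illustration of the tag-based approach mentioned in the last item, here is a minimal Python sketch. The tags, roles, and column names are invented for the example; the point is simply that the access decision is computed from metadata alone, without ever touching the data.

```python
# Hypothetical tag-based access control: the policy engine looks only at
# metadata (column tags) and the consumer's entitlements, never at the data.

# Tags attached to the columns of a data product (e.g., by the Data Office).
COLUMN_TAGS = {
    "tenant_name": set(),
    "lease_amount": {"confidential"},
    "contact_email": {"pii"},
}

# Tags each role is entitled to read; untagged columns are open to everyone.
ROLE_ENTITLEMENTS = {
    "analyst": set(),
    "data_product_owner": {"confidential", "pii"},
}

def readable_columns(role: str) -> list[str]:
    """A column is readable when all of its tags are covered by the role."""
    entitled = ROLE_ENTITLEMENTS.get(role, set())
    return [col for col, tags in COLUMN_TAGS.items() if tags <= entitled]

print(readable_columns("analyst"))             # ['tenant_name']
print(readable_columns("data_product_owner"))  # all three columns
```

Whatever enforcement engine ultimately applies such policies (column-level security in the warehouse, views, a query gateway), the prerequisite is the same: reliable tags, which is one more reason metadata matters.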
The road is long, of course, but decentralization allows for iterative progress, domain by domain and product by product. Let's also remember that any progress in automating governance, whatever the aspect, relies on the production and processing of metadata.
PREMIUM OFFICES EXAMPLE
At Premium Offices, the Data Office has a very defensive governance culture: as the company operates in the capital markets, it is subject to strict regulatory constraints.
As part of the pilot, it was decided not to alter the governance framework. Quality and traceability remain the responsibility of the Data Office and will be addressed retroactively with its tools and methods. Access control will also remain its responsibility; a process is already in place in the form of a ServiceNow workflow (setting permissions on BigQuery requires several manual operations and reviews). The only concession is that the workflow will be modified so that access requests are verified by the Data Product Owner before being approved and processed by the Data Office. In other words, it is a small step toward federated governance.
Regarding metadata, the new tables and views in BigQuery must be documented, at both the conceptual and physical levels, in the central data catalog (which is unaware of the concept of data product). It is a declarative process that the pilot team already knows. Any column tagging will be done by the Data Office after evaluation.
For the rest, user documentation for data products will be disseminated in a dedicated space on the internal wiki, organized by domain, which allows for very rich and structured documentation and has a decent search engine.
The Practical Guide to Data Mesh: Setting Up and Supervising an Enterprise-Wide Data Mesh
Written by Guillaume Bodet, co-founder & CPTO at Zeenea, our guide was designed to arm you with practical strategies for implementing data mesh in your organization, helping you:
✅ Start your data mesh journey with a focused pilot project
✅ Discover efficient methods for scaling up your data mesh
✅ Acknowledge the pivotal role an internal marketplace plays in facilitating the effective consumption of data products
✅ Learn how Zeenea emerges as a robust supervision system, orchestrating an enterprise-wide data mesh