While the literature on data mesh is extensive, it usually describes a target end state and rarely explains how to reach it in practice. The question then arises:
What approach should be adopted to transform data management and implement a data mesh?
In this series of articles, get an excerpt from our Practical Guide to Data Mesh where we propose an approach to kick off a data mesh journey in your organization, structured around the four principles of data mesh (domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance) and leveraging existing human and technological resources.
- Part 1: Scoping Your Pilot Project
- Part 2: Assembling a Development Team & Data Platform for the Pilot Project
- Part 3: Creating your First Data Products
- Part 4: Implementing Federated Computational Governance
Throughout this series of articles, and in order to illustrate this approach for building the foundations of a successful data mesh, we will rely on an example: that of the fictional company Premium Offices – a commercial real estate company whose business involves acquiring properties to lease to businesses.
—
In the previous article, we discussed the essential prerequisites for defining the scope of your data management decentralization pilot project, by identifying domains and selecting a use case. In this article, we will explain how to establish its development team and data platform.
Building the pilot development team
As mentioned, the first step in our approach is to identify an initial use case and, more importantly, to develop it by implementing the four principles of data mesh with existing resources. Forming the team responsible for developing the pilot project will help implement the first principle of data mesh: domain-oriented decentralized data ownership.
PREMIUM OFFICES EXAMPLE
The data required for the pilot belongs to the Brokerage domain, where the team responsible for developing the pilot will be created. This multidisciplinary team includes:
- A Data Product Owner
- Should have both a good understanding of the business and a strong data culture to fulfill the following responsibilities: designing data products and managing their lifecycle, defining and enforcing usage policies, ensuring compliance with internal standards and regulations, and measuring and overseeing the economic performance and compliance of their product portfolio.
- Two Engineers
- One from the Brokerage domain teams, bringing knowledge of operational systems and of the domain's software engineering practices, and one from the data team, familiar with DBT, GCP, and BigQuery.
- A visualization developer
- Who can design and build the dashboard.
Domain tooling: the data platform of the data mesh
One of the main barriers to decentralization is the risk of multiplying the efforts and skills required to operate pipelines and infrastructures in each domain. But in this regard, there is also a solid state-of-the-art inherited from distributed architectures.
The solution is to structure a team responsible for providing domains with the technological primitives and tools needed to extract, process, store, and serve data from their domain.
This model has existed for several years for application infrastructures and has gradually become generalized and automated through virtualization, containerization, DevOps tools, and cloud platforms. Although data infrastructure tooling is not as mature as software infrastructure, especially in terms of automation, most solutions are transferable, and capabilities are already present in organizations as a result of past investments. Therefore, nothing is preventing the establishment of a data infrastructure team, setting its roadmap, and gradually improving its service offering: simplification and automation being the main axes of this progression.
The three planes of the Data Mesh platform
The data platform for data mesh covers a wide range of capabilities, broader than infrastructure services. This platform is divided into three planes:
1. The Data infrastructure provisioning plane – provides low-level services to allocate the physical resources needed for big data extraction, processing, storage, distribution (real-time or batch), encryption, caching, access control, networking, co-location, etc.
2. The Data product developer experience plane – provides the tools needed to develop data products: declaration of data products, continuous build and deployment, testing, quality controls, monitoring, securing, etc. The idea is to provide abstractions above the infrastructure to hide its complexity and automate the conventions adopted on the mesh scale.
3. The Data mesh supervision plane – provides a set of global capabilities for discovering data products, lineage, governance, compliance, global reporting, policy control, etc.
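To make the developer experience plane more concrete, here is a minimal sketch in Python of what a declarative data product descriptor might look like. All names (`DataProductDescriptor`, `OutputPort`, the validation rules) are illustrative assumptions, not an actual platform API: the point is that developers declare their product, and the platform validates mesh-wide conventions before provisioning any infrastructure.

```python
from dataclasses import dataclass, field

@dataclass
class OutputPort:
    """A consumable interface of the data product (e.g. a BigQuery table)."""
    name: str
    format: str  # e.g. "bigquery-table", "parquet"

@dataclass
class DataProductDescriptor:
    """Declarative spec a developer submits; the developer experience
    plane would turn it into provisioned, monitored infrastructure."""
    name: str
    domain: str
    owner: str  # the Data Product Owner
    output_ports: list[OutputPort] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Enforce (hypothetical) mesh-wide conventions before deployment."""
        errors = []
        if not self.name.islower():
            errors.append("name must be lowercase for addressability")
        if not self.output_ports:
            errors.append("a data product must expose at least one output port")
        return errors

# Example: the Brokerage domain declares its pilot data product.
descriptor = DataProductDescriptor(
    name="brokerage-leases",
    domain="brokerage",
    owner="data-product-owner@premium-offices.example",
    output_ports=[OutputPort(name="leases", format="bigquery-table")],
)
print(descriptor.validate())  # [] -> no convention violations
```

In a real platform, a descriptor like this would also drive the supervision plane: the same declaration that provisions resources can feed the catalog, lineage, and policy controls.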
Some organizations operate a single shared platform, but others have several, with some entities or domains running their own infrastructure. It is entirely possible to deploy the data mesh on these hybrid infrastructures: as long as the data products respect common standards for addressability, interoperability, and access control, the technical details of their execution matter little.
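This idea, that a shared addressing convention matters more than the hosting platform, can be sketched in a few lines of Python. The `<domain>/<product>/<port>` scheme below is a hypothetical convention invented for illustration; any product that follows it is equally discoverable whether it runs on GCP, on-premises, or elsewhere.

```python
import re

# Hypothetical mesh-wide addressing convention:
# every data product port is reachable as <domain>/<product>/<port>.
ADDRESS_PATTERN = re.compile(r"^[a-z0-9-]+/[a-z0-9-]+/[a-z0-9-]+$")

def is_addressable(address: str) -> bool:
    """Check that an address follows the shared scheme, regardless of
    which platform actually hosts the underlying data."""
    return bool(ADDRESS_PATTERN.match(address))

print(is_addressable("brokerage/leases/bigquery-table"))  # True
print(is_addressable("Brokerage\\Leases"))                # False
```

Enforcing a check like this at registration time is one way the supervision plane can keep a hybrid mesh coherent without dictating technology choices to each domain.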
PREMIUM OFFICES EXAMPLE
Premium Offices has invested in a shared cloud platform – specifically, GCP (Google Cloud Platform) – and maintains a central team of experts who understand its intricacies. For its pilot project, Premium Offices simply chose to integrate one of these experts into the project team. This person will be responsible for automating the deployment of data products as much as possible, and for identifying manual steps that could be automated later, as well as any missing tools.
In our next article, learn how to execute your data mesh pilot project through the design and development of your first data products.
The Practical Guide to Data Mesh: Setting up and Supervising an enterprise-wide Data Mesh
Written by Guillaume Bodet, co-founder & CPTO at Zeenea, our guide was designed to arm you with practical strategies for implementing data mesh in your organization, helping you:
✅ Start your data mesh journey with a focused pilot project
✅ Discover efficient methods for scaling up your data mesh
✅ Understand the pivotal role an internal marketplace plays in facilitating the effective consumption of data products
✅ Learn how Zeenea emerges as a robust supervision system, orchestrating an enterprise-wide data mesh