Introducing the Federated Data Catalog: Feature Note
The rise of the data mesh architecture has revolutionized data management, emphasizing a domain-specific, decentralized model that treats data as a product. This approach ensures that data ownership, accountability, and management are handled by the teams most knowledgeable about their data.
To support this transition, we have already implemented data product management capabilities in Zeenea, enabling our users to easily create, maintain, and consume enterprise data products. We are proud to introduce our federated data catalog capabilities to support these decentralized efforts further.
This feature note dives into how Zeenea’s Federated Data Catalog empowers your organization to decentralize metadata management and scale your data architecture.
What is a Federated Data Catalog?
Zeenea enables organizations to decentralize metadata management by building a federated data catalog. This allows you to split metadata curation and stewardship responsibility between domains by creating several catalogs inside your platform.
Autonomy in metadata management allows for faster decision-making, better data quality, and more efficient resource use. In a federated catalog, organizations can give domains autonomy in metamodeling, data source crawling, and curation, as well as in user and permission management and information sharing.
Moreover, you can level up this federated catalog to an Enterprise Data Marketplace (EDM), an e-commerce-like solution where data producers publish their products, and data consumers explore, understand, and acquire these published products. The EDM enhances data decentralization, transforming data assets into shareable, discoverable data products that foster cross-domain collaboration.
How does it work?
Zeenea offers one shared platform that supports both federated data catalog and EDM use cases. All metadata is managed in a single knowledge graph that can be split between various domains by creating several catalogs on the same platform. Because every catalog lives in the same graph, no technical integration or data mapping between catalogs is required.
You can associate specific connections and user permissions with each of these catalogs, so that each domain is responsible for curating its own assets and controls how visible those assets are to the rest of the federation.
Zeenea Studio serves as the back-end tool for managing and curating data in the federated catalog or EDM, while Zeenea Explorer provides an intuitive front-end for users to explore and access data products across domains.
Once the federated catalog option is activated, your default data catalog becomes a shared container at the organization level, where items of all types can be shared with all users across domains. This common catalog can then be used to define a standard classification across domains, or even to catalog physical items that are not meant to be owned by a specific domain.
Within this framework, the common catalog also contains the business glossary to share a common language at the organization level.
Building a federated catalog
Building a federated catalog is mostly about applying Zeenea’s generic concepts (user permissions, connections, etc.) to subsections of your platform. Zeenea always provides a single catalog, which is the common catalog. The federated catalog option lets you create additional catalogs from the admin interface.
A catalog is like a container for your items (e.g., datasets, data products, and metadata entries), similar to a folder in a file system: even if each item is part of the global platform’s graph, it belongs to only one catalog, and you can define read or write permissions for users for each catalog. That means items are not copied or synchronized between catalogs.
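The container analogy can be illustrated with a minimal Python sketch. It is not Zeenea's implementation or API: the `Platform` class, its methods, and the `common` identifier are hypothetical names chosen for illustration. It only shows the rule described above, namely that every item lives in one shared structure but belongs to exactly one catalog and is never copied between catalogs.

```python
from dataclasses import dataclass, field

COMMON = "common"  # hypothetical identifier for the common catalog


@dataclass
class Platform:
    """Toy model: one shared store, each item owned by exactly one catalog."""

    # The common catalog always exists; others are created on demand.
    catalogs: set[str] = field(default_factory=lambda: {COMMON})
    # item id -> name of the single catalog that owns it
    item_catalog: dict[str, str] = field(default_factory=dict)

    def create_catalog(self, name: str) -> None:
        self.catalogs.add(name)

    def add_item(self, item_id: str, catalog: str = COMMON) -> None:
        if catalog not in self.catalogs:
            raise KeyError(f"unknown catalog: {catalog}")
        if item_id in self.item_catalog:
            # No copies: an item cannot be registered in a second catalog.
            owner = self.item_catalog[item_id]
            raise ValueError(f"{item_id} already belongs to {owner}")
        self.item_catalog[item_id] = catalog

    def items_in(self, catalog: str) -> list[str]:
        return sorted(i for i, c in self.item_catalog.items() if c == catalog)


if __name__ == "__main__":
    platform = Platform()
    platform.create_catalog("sales")
    platform.add_item("orders", "sales")
    platform.add_item("country_codes")  # lands in the common catalog
    print(platform.items_in("sales"), platform.items_in(COMMON))
```

Storing a single owner per item, rather than a list of catalogs, is what makes copying or synchronizing between catalogs unnecessary in this toy model.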
Creating catalogs – Zeenea Admin – © 2024
Defining domains (scope)
Defining the scope of these catalogs should be approached in the same way as defining domains in a data mesh, as it serves the same purpose. However, even if the data mesh defines an ideal goal, your organization’s context and constraints can make this goal difficult to reach. Zeenea’s federated catalog is flexible enough to support a wide range of organizational topologies.
The concept of domains is widely understood, and domain division is often stable—whether structured according to value chains, major business processes, or your organization’s operational capabilities.
When you have defined the scope of your first catalogs, you can define permissions and access for users with groups.
Managing Groups: our new permission system
Our new permission system, Groups, offers fine-grained, role-based control, enabling organizations to tailor access to the needs of each domain while maintaining centralized oversight. In a federated catalog, Groups work much as they do in a single-catalog platform. The difference is that you can define a different permission level for each catalog inside the same group. You can also assign several groups to a single user, which makes it much easier to adjust a user’s permissions by adding them to or removing them from specific groups.
For example, you could give Group A read/write permissions on Catalog 1 and give Group B read-only access to the same catalog. Users who belong to multiple groups inherit the permissions of all their assigned groups.
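As a rough illustration of how permissions from several groups could combine, here is a small Python sketch. The data layout, the permission names `read` and `write`, and the `effective_permissions` helper are assumptions made for the example, not Zeenea's data model. The extra `catalog_2` entry is also invented to make the merge visible.

```python
from collections import defaultdict

# Hypothetical layout: group -> catalog -> set of permissions.
GROUPS = {
    "group_a": {"catalog_1": {"read", "write"}},
    "group_b": {"catalog_1": {"read"}, "catalog_2": {"read", "write"}},
}


def effective_permissions(user_groups, groups=GROUPS):
    """Union the per-catalog permissions of every group the user belongs to."""
    merged = defaultdict(set)
    for group in user_groups:
        for catalog, perms in groups.get(group, {}).items():
            merged[catalog] |= perms
    return dict(merged)


if __name__ == "__main__":
    # A user in both groups gets read/write on both catalogs.
    print(effective_permissions(["group_a", "group_b"]))
```

Taking the union means that removing a user from one group only withdraws the permissions that no other assigned group grants.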
Users can be assigned to Groups through the SCIM API.
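For readers unfamiliar with SCIM, the sketch below builds the request bodies that the SCIM 2.0 standard (RFC 7644) defines for adding a member to a group and removing one, typically sent as a `PATCH` on a group resource. The user identifier is a placeholder, and the helper names are hypothetical. Which SCIM operations and endpoints Zeenea exposes is not covered in this note, so treat this as an illustration of the standard rather than of Zeenea's API.

```python
import json

# Schema URN for PATCH requests, as defined by SCIM 2.0 (RFC 7644).
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def add_member_patch(user_id: str) -> dict:
    """Body of a SCIM PATCH that adds one user to a group's members."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [
            {"op": "add", "path": "members", "value": [{"value": user_id}]}
        ],
    }


def remove_member_patch(user_id: str) -> dict:
    """Body of a SCIM PATCH that removes one user from a group's members."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [
            {"op": "remove", "path": f'members[value eq "{user_id}"]'}
        ],
    }


if __name__ == "__main__":
    # "user-123" is a placeholder identifier, not a real user.
    print(json.dumps(add_member_patch("user-123"), indent=2))
```

In practice these bodies are usually generated by an identity provider rather than written by hand, with the content type `application/scim+json`.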
Managing Groups – Zeenea Admin – © 2024
Feeding the catalogs
In the federated catalog:
- Each connection must belong to a single catalog.
- Connections are, by default, attached to the common catalog, but you can configure them for a specific catalog.
- Associating connections with specific catalogs ensures that the right teams manage data sources and metadata, contributing to overall efficiency.
The connection list in the Administration section shows which catalog each connection belongs to. Then, in Zeenea Studio, you choose a connection and select the items to import from it.
Managing Connections – Zeenea Admin – © 2024
Defining the metamodel
Catalog design and metamodeling remain global features in the federated catalog to ensure consistency across domains. Therefore, the catalog design section in Zeenea Studio defines item types, properties, responsibilities, and templates for the whole federation.
We’re currently working on new metamodeling capabilities. While keeping one central metamodel for the whole federation, these will allow you to customize specific sections and set visibility options at the catalog level.
Defining the metamodel – Zeenea Studio – © 2024
Operating the federated catalog
Documenting in Zeenea Studio
Zeenea Studio’s stewardship job remains unchanged: edit names, descriptions, properties, etc. As mentioned above, the Studio is the back-office tool for one specific catalog. Users who have access to several catalogs can switch from one to another by using the catalog selector in the Studio header.
Managing the Business Glossary
The business glossary allows you to share a common language at the organization level. Therefore, the common catalog contains all the glossary items, meaning two things: all glossary items are public to all users in Zeenea Explorer, and you will need the “Manage glossary” permission on the common catalog to create or edit glossary items in Zeenea Studio.
As in a single-catalog platform, you can design the metamodel of your glossary (parent/child relations, implementations, templates). Because the glossary belongs to the common catalog, curators can link items from their own catalogs to these shared business definitions.
Exploring the Business Glossary – Zeenea Explorer – © 2024
Monitoring completion and usage
In the Analytics section, you can track the completion level of your catalog items, monitor metadata curation progress, and measure how frequently different data products are accessed across domains.
Monitoring completion and usage – Zeenea Studio – © 2024
Publishing in the marketplace
Zeenea also allows you to set up a marketplace at the organization level by creating a federated knowledge graph. From the Studio, curators can give all users in the federation access to a specific item, even if those users do not have read access to the catalog to which the item belongs.
By sharing an item in the EDM, they enable curators from other catalogs to create links between it and other items in their catalog (e.g., lineage construction). Shared items will also be searchable by any user in the Explorer.
The EDM supports metadata management decentralization and breaks data silos. It turns your federated catalog into an organizational asset by transforming data into a shareable, discoverable resource that fosters collaboration between teams, domains, and business units.
Publishing to another catalog – Zeenea Studio – © 2024
Searching in the Federated Data Catalog
In the federated catalog, end-users can search and explore the data landscape of the whole organization through a single interface: Zeenea Explorer. Based on your user groups and the related catalog access rights, you can search in one place across all the items you have access to and all the items shared in the EDM by any catalog.
A badge in the search results shows which catalog each item belongs to. You can also refine your search by filtering on the catalog(s) you are interested in.
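The visibility rule described here can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Zeenea Explorer is implemented: the `Item` class, the `shared` flag, and the `search` function are hypothetical, and matching is reduced to a case-insensitive substring test on the item name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    catalog: str          # the single catalog that owns the item
    shared: bool = False  # True once the item is shared in the marketplace


def search(items, query, readable_catalogs, catalog_filter=None):
    """Return (name, catalog) pairs the user may see and that match the query.

    An item is visible if the user can read its catalog or if it is shared.
    The returned catalog name plays the role of the badge in the results.
    """
    results = []
    for item in items:
        if item.catalog not in readable_catalogs and not item.shared:
            continue  # not visible to this user
        if catalog_filter and item.catalog not in catalog_filter:
            continue  # excluded by the catalog filter
        if query.lower() in item.name.lower():
            results.append((item.name, item.catalog))
    return results


if __name__ == "__main__":
    items = [
        Item("Sales orders", "sales"),
        Item("Customer orders", "marketing", shared=True),
        Item("Payroll", "hr"),
    ]
    print(search(items, "orders", readable_catalogs={"sales"}))
```

In this model, a user with read access to the sales catalog only still finds the shared marketing item, while unshared items from other catalogs stay hidden.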
Topics are also available in the federated data catalog. Each catalog curator is responsible for defining Topics for their perimeter. Therefore, Topics offer a way to determine entry points to explore and search inside a single catalog.
The Enterprise Data Marketplace Topic is an exception. This built-in Topic is available to all Explorer users, whatever their Groups, and covers all shared items in the federation regardless of their catalog.
Searching in the federated data catalog – Zeenea Explorer – © 2024
Ready to build your federated data catalog?
Zeenea’s Federated Data Catalog brings the future of decentralized data management to your organization. Start building your federated catalog today and unlock the potential of seamless data access, discoverability, and collaboration. For more information or to get started, contact your Customer Success Manager or our Sales Team.