A Data Catalog is NOT a Business Modeling Solution
Some organizations, usually large ones, have invested for years in the modeling of their business processes and information architecture.
They have developed several layers of models (conceptual, logical, physical) and have put in place an organization to maintain these models and share them with specific audiences (mostly business experts and IT people).
We do not question the value of these models. They play a key role in IS urbanization, schema blueprints, and IS management, as well as in regulatory compliance. But we seriously doubt that these modeling tools can provide a decent Data Catalog.
There is also a market phenomenon at play here: certain historical business modeling players are looking to widen the scope of their offer by positioning themselves on the Data Catalog market. After all, they do already manage a great deal of information on physical architecture, business classifications, glossaries, ontologies, information lineage, processes and roles, etc. But we can identify two major flaws in their approach.
The first is organic. By their nature, modeling tools produce top-down models to outline the information in an IS. However accurate it may be, a model remains a model: a simplified representation of reality.
Models are very useful communication tools in a variety of domains, but they are not an exact reflection of the day-to-day operational reality which, in our view, is crucial to keeping the promises of a Data Catalog (enabling teams to find, understand, and know how to use the datasets).
The second flaw? Modeling tools are not user-friendly.
A modeling tool is complex and handles a large number of abstract concepts, which come with a steep learning curve. It’s a tool for experts.
We could, of course, consider improving its user-friendliness to open it up to a wider audience. But the built-in complexity of the information won’t go away.
Understanding the information provided by these tools requires a solid grasp of modeling principles (object classes, logical levels, nomenclatures, etc.). That is quite a challenge for data teams, and one that seems difficult to justify from an operational perspective.
The truth is, modeling tools that have been turned into Data Catalogs face serious adoption issues with the teams (they have to make huge efforts to learn how to use the tool, only to not find what they are looking for).
A prospective client recently presented us with a metamodel they had built and asked us whether it was possible to implement it in Zeenea. Derived from their business models, the metamodel had several dozen classes of objects and thousands of attributes. To their question, the official answer was yes (the Zeenea metamodel is very flexible). But instead, we tried to dissuade them from taking that path: A metamodel that sophisticated ran the risk, in our opinion, of losing the end users, and turning the Data Catalog project into a failure…
Should we therefore abandon business models when putting a Data Catalog in place? Absolutely not.
It must, however, be remembered that business models address one set of issues and the Data Catalog another. Some of the information contained in the models helps structure the catalog and enrich its content in a very useful way (for instance responsibilities, classifications, and of course business glossaries).
The best approach is therefore, in our view, to design the catalog metamodel by focusing exclusively on the added value to the data teams (always with the same underlying question: does this information help find, locate, understand, and correctly use the data?), and then to integrate the modeling tool with the Data Catalog in order to automate the supply of those elements of the metamodel that are already present in the business model.
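To make this approach concrete, here is a minimal sketch in Python. It is not Zeenea's API or any modeling tool's export format: the record layout, the attribute names, and the `sync_from_business_model` helper are all assumptions made up for illustration. It shows a deliberately small catalog metamodel being fed from a much richer business-model export, keeping only the attributes that help data teams find, understand, and use a dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical whitelist: the only business-model attributes the catalog keeps,
# chosen because they help data teams find, understand, and use a dataset.
CATALOG_ATTRIBUTES = ("owner", "classification", "glossary_terms")


@dataclass
class CatalogEntry:
    """A deliberately small catalog record for one dataset (illustrative only)."""
    dataset: str
    owner: Optional[str] = None
    classification: Optional[str] = None
    glossary_terms: List[str] = field(default_factory=list)


def sync_from_business_model(exported_objects):
    """Build catalog entries from a modeling-tool export (a list of dicts).

    Attributes outside CATALOG_ATTRIBUTES are dropped on purpose, so the
    complexity of the business model does not leak into the catalog.
    """
    entries = []
    for obj in exported_objects:
        kept = {key: obj[key] for key in CATALOG_ATTRIBUTES if key in obj}
        entries.append(CatalogEntry(dataset=obj["physical_name"], **kept))
    return entries


# Made-up export record carrying far more detail than the catalog needs.
export = [
    {
        "physical_name": "crm.customers",
        "owner": "Sales Ops",
        "classification": "Confidential",
        "glossary_terms": ["Customer"],
        "object_class": "LogicalEntity",
        "logical_level": 2,
        "nomenclature_code": "ENT-0042",
    }
]

entries = sync_from_business_model(export)
print(entries[0])
```

The design choice worth noting is the explicit whitelist: the catalog metamodel decides what it needs, and the business model only supplies it, rather than the reverse.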
Takeaway
As useful and complete as they may be, business models are still just models: they are an imperfect reflection of the operational reality of the systems and therefore they struggle to provide a useful Data Catalog.
Modeling tools, as well as business models, are too complex and too abstract to be adopted by data teams. Our recommendation is that you define the metamodel of your catalog with a view to answering the questions of the data teams, and populate some aspects of that metamodel from the business model.