A data catalog tool will of course reduce the workload but won’t in and of itself guarantee the success of the project.
In this series of articles, discover the pitfalls and preconceived ideas to avoid when rolling out an enterprise-wide data catalog project. The traps described in this series are organized around four central themes that are crucial to the success of the initiative:
- Data culture within the organization
- Internal project sponsorship
- Project leadership
- Technical integration of the Data Catalog
—
As with all projects, a metadata management initiative has to be properly steered to meet its objectives on time and within budget. It is important, however, that the steering of the project doesn’t fall into the ruts we describe below.
The quantity of the metadata should never become more important than metadata quality
The purpose of the data catalog is to document the company’s data assets. When the project starts, the lack of information often triggers the same reflex: adding as much information as possible.
A good data catalog, however, isn’t characterized by the quantity of objects it contains, but rather by the quality and coherence of the information. Achieving this requires close supervision and clear priorities, both for the scope covered and for the information selected for inclusion.
Although this may cause frustration at first, it will very quickly prove effective and crucial to the project’s success. Indeed, users will rightly treat the data catalog as a source of truth, in the same way a dictionary is for a language. It is always better to start with a targeted audience and carefully selected, high-quality content, offering an experience that brings people back to the tool for future searches. If the first exposure is a failure, it will be difficult to keep users interested.
A data catalog won’t be filled spontaneously, even when it is open to users
The data catalog is open to many users, some of whom (sometimes many) have knowledge of the assets in question. It would be tempting to expect them to update the catalog spontaneously and regularly from the start, but this is exceedingly rare.
The reality is quite different: contributors need support at the beginning of the project and throughout it.
Both the quality and the quantity of the information have to be supervised, and it is just as important to raise contributors’ awareness, present the tool to them, and train them. Contributions can also be managed by creating virtuous processes that both control what is entered and invite contributors to correct and enrich the catalog.
It’s impossible to set all the objectives of the data catalog project from the start without making them evolve
The data catalog has to meet the expectations of many users, each with their own requirements.
It is therefore unreasonable to assume that you already have the complete list of expectations at the start of the project, just as it would be naive to believe that this list will remain fixed once the project is underway. It is, therefore, the role of the Data Office to continuously collect and analyze the requirements, interpret them accurately, prioritize them, and transform them into appropriate content.
Generally, the requirements evolve according to different parameters that are not established at the outset. For instance, the maturity of the enterprise and its staff with regard to data management will grow over time, as will the number of use cases built around data, not to mention changes in data-related regulations.
All these parameters can have a strong impact on the content that the data catalog will have to cover, both in terms of the scope covered and the nature of the information provided for the assets it contains.
