The 7 lies of Data Catalog Providers – #6 A Data Catalog must rely on automation!

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog must rely on automation!

Some Data Catalog vendors, who hail from the world of cartography, have developed the rhetoric that automation is a secondary topic, which can be addressed at a later stage.

They will tell you that a few manual file imports suffice, along with a generous user community collaborating on their tool to feed and use the catalog. A little arithmetic is enough to understand why this approach is doomed to failure in a data-centric organization.

An active Data Lake, even a modest one, quickly hoovers up, in its different layers, hundreds and even thousands of datasets. To these datasets can be added those from other systems (application databases, various APIs, CRMs, ERPs, NoSQL stores, etc.) that we usually want to integrate into the catalog.

The orders of magnitude quickly go beyond thousands, sometimes tens of thousands of datasets. Each dataset contains dozens of fields. Datasets and fields alone represent several hundreds of thousands of objects (we could also include other assets: ML models, dashboards, reports, etc). In order for the catalog to be useful, inventorying those objects isn’t enough.

You also need to attach to them all the properties (metadata) that will enable end users to find, understand, and exploit these assets. There are several types of metadata: technical information, business classification, semantics, security, sensitivity, quality, standards, uses, popularity, contacts, etc. Here again, for each asset, there are dozens of properties.

Back to the arithmetic: overall, we are dealing with millions of attributes that need to be maintained.

Such volumes alone should disqualify any temptation to choose the manual approach. But there is more. The stock of informational assets isn’t static. It is constantly growing. In a data-centric organization, datasets are created daily, others are moved or changed.

The Data Catalog needs to reflect these changes.

 

Otherwise, its content will be permanently obsolete and the end users will reject it. Who is going to trust a Data Catalog that is incomplete and wrong? If you feel that your organization can absorb the load and keep your catalog up to date, that’s wonderful. Otherwise, we would suggest you evaluate, as early as possible, the level of automation provided by the different solutions you are considering.

 

What can we automate in a Data Catalog?

In terms of automation, the most important capability is the inventory.

A Data Catalog should be able to regularly scan all your data sources and automatically update the asset inventory (datasets, structures, and technical metadata at a minimum) to reflect the day-to-day reality of the hosting systems.

Believe us: a Data Catalog that cannot connect to your data sources will quickly become useless, because its content will always be in doubt.

 

Once the inventory is completed, the next challenge is to automate the metamodel feed.

Here, beyond the technical metadata, complete automation seems a little hard to imagine. It is nevertheless possible to significantly reduce the workload required to maintain the metamodel. The value of certain properties can be determined by simply applying rules when objects are integrated into the catalog.

It is also possible to suggest property values using more or less sophisticated algorithms (semantic analysis, pattern matching, etc.).

Lastly, it’s often possible to feed part of the catalog by integrating with the systems that produce or contain metadata. This applies, for instance, to quality measurements, lineage information, business ontologies, etc.

For this approach to work, the Data Catalog must be open and offer a complete set of APIs that allow the metadata to be updated from other systems.

 

Take Away

A Data Catalog handles millions of pieces of information in a constantly shifting landscape.

Maintaining this information manually is virtually impossible, or extremely costly. Without automation, the content of the catalog will always be in doubt, and the data teams will not use it.

Download our eBook: The 7 lies of Data Catalog Providers for more!

Pricing your Enterprise Data Catalog: What should it really cost?

For the last couple of years, the Zeenea international sales team has been in contact with prospective clients the world over: presenting Zeenea, running product demos, responding to RFPs (mostly against the big guns in the industry), advising CDOs on data management strategies, accompanying existing customers with their catalog deployment, helping address technical challenges when they arise, and always discussing pricing…

For anyone who has been actively in the market for a data catalog, pricing will feel like a complex affair that varies widely depending on use cases, providers (SaaS, on-prem, annual subscription, etc.), number of users, and so on.

This blog post seeks to demonstrate that putting a price tag on a data catalog should not really be a complex affair. While the technology that makes a catalog tick can be quite impressive (especially if it’s powered by a knowledge graph), its ultimate purpose is to help data users access, curate, and leverage information that already exists. Nothing more, nothing less.

Below, we’ll relate a couple of fairly typical examples of Zeenea’s approach to pricing which resulted in customer wins for us, and money saved for the customer.

 

Avoid the quagmire of maximum cost/minimal adoption.

One telling example of the financial repercussions of mismanaged data catalog adoption occurred with a large French bank (Zeenea is a Paris-based startup).

This bank had initially chosen a well-known data catalog provider to help with the management, governance, and quality of its data. Like all financial institutions, this one was subject to BCBS 239, and compliance was therefore a key issue. The catalog provider, whose platform boasts a very sophisticated predefined metamodel with tons of features, therefore set out to help the customer organize its data governance around BCBS 239. Unfortunately, the project quickly began to turn sour for a couple of reasons:

1-  The “out of the box” metamodel, it turned out, didn’t readily fit with the organization of the data teams. The predictable consequence of this mismatch was having to allocate far more internal resources than planned to handle their compliance initiative.

2- The focus on Data Governance didn’t really deal with the needs of business teams looking for easy access to their datasets. This unhappy situation had a negative impact on the overall adoption of the data catalog and end users simply stopped using it. 

By the time the client stopped the initiative, there were a very limited number of compliance experts actually using the catalog, for a bill of a few million euros per year… Suffice it to say that the purse holders were not entirely satisfied with the cost-benefit ratio…

The lesson here is that in order to successfully implement a data catalog long term, one needs early and durable user adoption. A simple, well-defined use case with a manageable number of stakeholders will go a long way to achieving that whilst ensuring costs are kept within reasonable limits. This is precisely what Zeenea did with this customer.

The bill was reduced 10-fold and adoption of our Explorer platform tripled.

Don’t crack a walnut with a (costly) sledgehammer.

 

Another customer (an SME from the UK in the retail sector), whose data landscape consists of a BI tool, PostgreSQL, an ETL platform and PowerBI, reached out to us having already received quotes from other well-established data catalog providers.

Their use case was straightforward: The data engineering team needed to clean up their data lake after years of neglect and the data users in the group, analytics folk mostly, wanted to access data assets more easily. 

I’m paraphrasing but the overriding sentiment was that the solutions they had looked at before consulting Zeenea, whilst undoubtedly useful for larger organisations with complex data landscapes, data governance and compliance imperatives, had too many features which were irrelevant to their actual requirements.

The customer was eventually sent a 6 figure quote which the CFO promptly discarded.

Today this mismatch is the most common one we come across, especially amongst small data teams with more straightforward use cases for a data catalog. Unlike the French bank mentioned above, most SMEs cannot afford to spend, let alone lose, vast amounts of money on a failed data experiment.

 

So how much should a data catalog really cost…?

Early in 2020, we decided to take a different approach to pricing, one that was coherent with our “Start small, Scale fast” approach to data catalog adoption. This pricing model, presented as a “Team Edition”, was designed to encourage data teams to start their data cataloging journey with a straightforward use case and roll out the catalog to other users/projects incrementally, thus ensuring maximum catalog adoption and, crucially, keeping a handle on costs.

 

Our Team Edition has an all-inclusive cost of 18k euros per annum and our conditions are simple (and easy to deliver on). This price tag includes the POC, 3 connectors, 5 data stewards, 50 data explorers and, of course, customer support.

 

Provide us with…

  • A Use Case for the POC.
  • The Data sources you need to connect to (up to 3 for the Team Edition).
  • The number of Data Stewards needed (up to 5 for the Team Edition).
  • The number of Data Explorers needed (up to 50 for the Team Edition). 

 

…and Zeenea will provide you with a quote your CFO can depend on. It really is that simple.*

  

Click here for more information, or to request a POC.

 

*These conditions were not chosen at random. In our experience, data teams looking to roll out a cataloging solution seldom choose a use case requiring more than 3 data connectors and a handful of data stewards, and we don’t necessarily recommend that they do.

The 7 lies of Data Catalog Providers – #5 A Data Catalog is not a Business Modeling Solution!

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog is NOT a Business Modeling Solution

Some organizations, usually large ones, have invested for years in the modeling of their business processes and information architecture.

They have developed several layers of models (conceptual, logical, physical) and have put in place an organization that helps the maintenance and sharing of these models with specific populations (business experts and IT people mostly).

We do not question the value of these models. They play a key role in IT landscape planning (urbanization), schema blueprints, IS management, and regulatory compliance. But we seriously doubt that these modeling tools can provide a decent Data Catalog.

There is also a market phenomenon at play here: certain historical business modeling players are looking to widen the scope of their offer by positioning themselves on the Data Catalog market. After all, they do already manage a great deal of information on physical architecture, business classifications, glossaries, ontologies, information lineage, processes and roles, etc. But we can identify two major flaws in their approach.

The first flaw is inherent to their nature. Modeling tools produce top-down models to outline the information in an IS. However accurate it may be, a model remains a model: a simplified representation of reality.

They are very useful communication tools in a variety of domains, but they are not an exact reflection of the day-to-day operational reality, which, for us, is crucial to keeping the promises of a Data Catalog (enabling teams to find, understand, and know how to use the datasets).

The second flaw? A lack of user-friendliness.

A modeling tool is complex and handles a large number of abstract concepts that demand a steep learning curve. It’s a tool for experts.

We could of course consider improving user-friendliness to open it up to a wider audience. But the built-in complexity of the information won’t go away.

Understanding the information provided by these tools requires a solid understanding of modeling principles (object classes, logical levels, nomenclatures, etc). It is quite a challenge for data teams and a challenge that seems difficult to justify from an operational perspective.

The truth is, modeling tools that have been turned into Data Catalogs face significant adoption issues with the teams (they have to make huge efforts to learn how to use the tool, only to not find what they are looking for).

A prospective client recently presented us with a metamodel they had built and asked us whether it was possible to implement it in Zeenea. Derived from their business models, the metamodel had several dozen classes of objects and thousands of attributes. To their question, the official answer was yes (the Zeenea metamodel is very flexible). But instead, we tried to dissuade them from taking that path: A metamodel that sophisticated ran the risk, in our opinion, of losing the end users, and turning the Data Catalog project into a failure…

Should we therefore abandon business models when putting a Data Catalog in place? Absolutely not.

It must, however, be remembered that business models are there to handle some issues, and the Data Catalog others. Some of the information contained within the models helps structure the catalog and enrich its content in a very useful way (for instance responsibilities, classifications, and of course business glossaries).

The best approach is therefore, in our view, to conceive the catalog metamodel by focusing exclusively on the added value for the data teams (always with the same underlying question: does this information help find, locate, understand, and correctly use the data?), and then to integrate the modeling tool and the Data Catalog in order to automate the supply of those elements of the metamodel already present in the business model.

 

Take Away

As useful and complete as they may be, business models are still just models: they are an imperfect reflection of the operational reality of the systems, and they therefore struggle to feed a useful Data Catalog.

Modeling tools, as well as business models, are too complex and too abstract to be adopted by data teams. Our recommendation is that you define the metamodel of your catalog with a view to answering the questions of the data teams, and feed some aspects of that metamodel from the business model.

Download our eBook: The 7 lies of Data Catalog Providers for more!

The 7 lies of Data Catalog Providers – #4 A Data Catalog is not a Query Solution!

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog is NOT a Query Solution

 

Here is another oddity of the Data Catalog market. Several vendors, whose initial aim was to allow users to query several data sources simultaneously, have “pivoted” towards a Data Catalog positioning on the market.

There is a reason for them to pivot.

The emergence of Data Lakes and Big Data has cornered them in a technological cul-de-sac that has weakened the market segment they were initially in.

A Data Lake is typically segmented into several layers. The “raw” layer integrates data without transformation, in formats that are more or less structured and in great quantities; a second layer, which we’ll call “clean”, will contain roughly the same data but in normalized formats, after a dust down. After that, there can be one or several “business” layers ready for use: a data warehouse and visualization tool for analytics, a Spark cluster for data science, a storage system for commercial distribution, etc. Within these layers, data is transformed, aggregated, and optimized for use, along with the tools supporting this use (data visualization tools, notebooks, massive processing, etc.).

 

In this landscape, a universal self-service query tool isn’t suitable.

 

It is of course possible to set up an SQL interpretation layer on top of the “clean” layer (like Hive) but query execution remains a domain for specialists. The volumes of data are huge and rarely indexed. 

Allowing users to define their own queries is very risky: On on-prem systems, they run the risk of collapsing the cluster by running a very expensive query. And on the Cloud, the bill could run very high indeed. Not to mention security and data sensitivity issues.

 

As for the “business” layers, they are generally coupled with more specialized solutions (such as a combination of Snowflake and Tableau for analytics) that offer very complete and secure tooling, with great performance for self-service queries. With their market space melting like snow in the sun, some multi-source query vendors have pivoted towards Data Catalogs.

Their pitch is now to convince customers that the ability to execute queries makes their solution the Rolls-Royce of Data Catalogs (in order to justify their six-figure pricing). We would invite you to think twice about it…

 

Take Away

In a modern data architecture, the capacity to execute queries from a Data Catalog isn’t just unnecessary, it’s also very risky (performance, cost, security, etc.).

Data teams already have their own tools to execute queries on data, and if they haven’t, it may be a good idea to equip them. Integrating data access issues in the deployment of a catalog is the surest way to make it a long, costly, and disappointing project.

Download our eBook: The 7 lies of Data Catalog Providers for more!

What is Data Mesh?

In this new era of information, new terms are used in organizations working with data: Data Management Platform, Data Quality, Data Lake, Data Warehouse… Behind each of these terms lie specific characteristics, technical solutions, and more. With Data Mesh, you go further by reconciling technical and functional management. Let’s decipher.

Did you say: “Data Mesh”? Don’t be embarrassed if you’re not familiar with the concept. The term wasn’t used until 2019 as a response to the growing number of data sources and the need for business agility. 

The Data Mesh model is based on the principle of a decentralized or distributed architecture exploiting a literal mesh of data.

While a Data Lake can be thought of as a storage space for raw data, and the Data Warehouse is designed as a platform for collecting and analyzing heterogeneous data, Data Mesh responds to a different use case. 

On paper, a Data Warehouse and Data Mesh have a lot in common, especially when it comes to their main purpose, which is to provide permanent, real-time access to the most up-to-date information possible. But Data Mesh goes further. The freshness of the information is only one element of the system.

Because it is part of a distributed model, Data Mesh is designed to address each business line in your company with the key information that concerns it.

To meet this challenge, Data Mesh is based on the creation of data domains. 

The advantages? Your teams are more autonomous thanks to local data management, your enterprise is decentralized so it can aggregate more and more data, and finally, you gain more control over the overall organization of your data assets.

 

Data Mesh: between logic and organization

If a Data Lake is ultimately a single reservoir for all your data, Data Mesh is the opposite. Forget the monolithic dimension of a Data Lake. Data is a living, evolving asset: a tool for understanding your market and your ecosystem, and an instrument of knowledge.

Therefore, in order to appropriate the concept of meshing data, you need to think differently about data. How can we do this? By laying the foundations for a multi-domain organization. Each type of data has its own use, its own target, and its own exploitation. From then on, all the business areas of your company will have to base their actions and decisions on the data that is really useful to them to accomplish their missions. The data used by marketing is not the same as the data used by sales or your production teams. 

The implementation of a Data Catalog is therefore the essential prerequisite for the creation of a Data Mesh. Without a clear vision of your data’s governance, it will be difficult to initiate your company’s transformation. Data quality is also a central element. But ultimately, Data Mesh will help you by decentralizing the responsibility for data to the domain level and by delivering high-quality transformed data.

The Challenges

Does adopting Data Mesh seem impossible because the project seems both complex and technical? No cause for panic! Data Mesh, beyond its technicality, its requirements, and the rigor that goes with it, is above all a new paradigm. It must lead all the stakeholders in your organization to think of data as a product addressed to the business. 

In other words, by moving towards a Data Mesh model, the technical infrastructure of the data environment is centralized, while the operational management of the data is decentralized and entrusted to the business.

With Data Mesh, you create the conditions for data acculturation across all your teams, so that each employee can base his or her daily actions on data.

 

The Data Mesh paradox

Data Mesh is meant to put data at the service of the business. This means that your teams must be able to access it easily, at any time, and to manipulate the data to make it the basis of their daily activities.

But in order to preserve the quality of your data, or to guarantee compliance with governance rules, change management is crucial and the definition of each person’s prerogatives is decisive. When deploying Data Mesh, you will have to lay a sound foundation in the organization. 

On the one hand, free access to data for each employee (what we call functional governance). On the other hand, management and administration, in other words, technical governance in the hands of the Data teams.

Decompartmentalizing uses by compartmentalizing roles, that’s the paradox of Data Mesh! 

The 7 lies of Data Catalog Providers – #3 A Data Catalog is not a Compliance Solution!

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog is NOT a Compliance Solution

 

As with governance, regulatory compliance is a crucial issue for any data-centric organization.

There is a plethora of data handling regulations spanning all sectors of activity and countries. On the subject of personal data alone, GDPR is mandatory across all EU countries, but each State has a lot of wiggle room on how it’s implemented, and most States have a large arsenal of legislation to supplement, reinforce, and adapt it (Germany alone, for instance, has several dozen regulations across different sectors of activity related to personal data).

In the US, there are hundreds of laws and regulations across States and sectors of activity (with varying degrees of adherence). And here we are only referring to personal data… Rules and regulations also exist for financial data, medical data, biometric data, banking data, risk data, insurance data, etc. Put simply, every organization has some regulation it has to be in compliance with.

 

So what does compliance mean in this case?

The vast majority of regulatory audits center on the following:

  • The ability to provide complete and up-to-date documentation on the procedures and controls put in place in order to meet the standards,
  • The ability to prove that the procedures described in the documentation are rolled out in the field,
  • The ability to supervise all the measures deployed with a view towards continuous improvement.

A Data Catalog is neither a procedure library, nor an evidence consolidation system, and even less a process supervision solution.

It strikes us as obvious that assigning those responsibilities to a Data Catalog will make it considerably less simple to use (standards are too obscure for most people) and will jeopardize adoption by those most likely to benefit from it (data teams).

Should we therefore forget about Data Catalogs in our quest for compliance?

 

No, of course not. Again, in terms of compliance, it would be much wiser to use the Data Catalog to build the data literacy of the data teams, and to tag the data appropriately, enabling the teams to quickly identify any standard or procedure they need to adhere to before using the data. The Catalog can even help place the tags using a variety of approaches. It can for example automatically detect sensitive or personal data.

That said, even with the help of ML, detection will never work perfectly (the notion of “personal data” defined by GDPR, for instance, is much larger and harder to detect than North American PII). The Catalog’s ability to manage these tags is therefore critical.

 

Take Away

Regulatory compliance is above all a matter of documentation and proof and has no place in a Data Catalog.

However, the Data Catalog can help identify (more or less automatically) data that is subject to regulations. The Data Catalog plays a key role in the acculturation of the data teams with respect to the importance of regulations.

Download our eBook: The 7 lies of Data Catalog Providers for more!

Data Lakes: The benefits and challenges

Data Lakes are increasingly used by companies for storing their enterprise data. However, storing large quantities of data in a variety of formats can lead to data chaos! Let’s take a look at the pros and cons of Data Lakes.

To understand what a Data Lake is, let’s imagine a reservoir or a water retention basin that runs alongside the road. Regardless of the type of data, its origin, its purpose, everything, absolutely everything, ends up in the Data Lake! Whether that data is raw or refined, cleansed or not, all of this information ends up in this single place where it isn’t modified, filtered or deleted before being stored. 

Sounds a bit messy, doesn’t it? But that’s the whole point of the Data Lake! 

It’s because it frees the data from any preconceived idea that a Data Lake offers real added value. How? By allowing data teams to constantly reinvent the use and exploitation of your company’s data.

Improving the customer experience with a 360° analysis of the customer journey, detecting personas to refine marketing strategies, rapidly integrating new data flows (from IoT in particular): the Data Lake is an agile response to deeply structural challenges for companies!

 

Data Lakes: the undeniable advantages

The first advantage of a Data Lake is that it allows you to store considerable volumes of data in all its forms. Structured or unstructured, data from NoSQL databases… a Data Lake is, by nature, agnostic about the type of information it contains. It is precisely because it has no strict data exploitation scheme that the Data Lake is a valuable tool. And for good reason: none of the data it contains is ever altered, degraded, or distorted.

This is not the only advantage of a Data Lake. Indeed, since the data is raw, it can be analyzed on an ad-hoc basis.

The objective: to detect trends and generate reports according to business needs without it being a vast project involving another platform or another data repository. 

Thus, the data available in the Data Lake can be easily exploited, in real time, and allows you to place your company in a data-centric approach so that your decisions, your choices, and your strategies are never disconnected from the reality of your market or your activities.

Nevertheless, the raw data stored in your Data Lake can (and should!) be processed in a specific way, as part of a larger, more structured project. But your company’s data teams will know that they have, within reach of a click, an unrefined ore that can be put to use for further analysis.

The challenges of a Data Lake

When you think of a Data Lake, poetic mental images come to mind. Crystalline waves waving in the wind of success that carries you away… But beware! A Data Lake carries the seeds of murky, muddy waters. This receptacle of data must be the object of particular attention because without rigorous governance, the risk of sinking into a “chaos of data” is real. 

In order for your Data Lake to reveal its full potential, you must have a clear and standardized vision of your data sources.

Controlling these flows is a first essential safeguard to guarantee the proper exploitation of data that is heterogeneous by nature. You must also be very vigilant about data security and the organization of your data.

The fact that the data in a Data Lake is raw does not mean that it should not have a minimum structure to allow you to at least identify and find the data you want to exploit.

Finally, a Data Lake often requires significant computing power in order to refine masses of raw data in a very short time. This power must be adapted to the volume of data that will be hosted in the Data Lake. 

Between method, rigor and organization, a Data Lake is a tool that serves your strategic decisions!

 

The 7 lies of Data Catalog Providers – #2 A Data Catalog is NOT a Data Quality Management Solution

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog is NOT a Data Quality Management (DQM) Solution

 

We at Zeenea do not underestimate the importance of data quality in successfully delivering a data project, quite the contrary. It just seems absurd to us to put this in the hands of a solution which, by its very nature, cannot perform the controls at the right time.

Let us explain.

There is a very elementary rule to quality control, a rule that can be applied virtually in any domain where quality is an issue, be it an industrial production chain, software development, or the cuisine of a 5-star restaurant: The sooner the problem is detected, the less it will cost to correct.

To demonstrate the point: a car manufacturer does not wait until a new vehicle is built, when all the production costs have already been incurred and fixing a defect would cost the most, to test its battery. No. Each part is closely controlled, each step of the production is tested, defective parts are removed before ever being integrated into the production circuit, and the entire chain of production can be halted if quality issues are detected at any stage. Quality issues are corrected at the earliest possible stage of the production process, where the fix is the least costly and the most durable.

 

“In a modern data organization, data production rests on the same principles. We are dealing with an assembly chain whose aim is to provide high added-value uses. Quality control and correction must happen at each step. The nature and level of controls will depend on what the data is used for.”

 

If you are handling data, you obviously have at your disposal pipelines to feed your uses. These pipelines can involve dozens of steps – data acquisition, data cleaning, various transformations, mixing various data sources, etc.

In order to develop these pipelines, you probably have a number of technologies at play, anything from in-house scripts to costly ETLs and exotic middleware tools. It’s within those pipelines that you need to insert and pilot your quality controls, as early as possible, adapting them to what is at stake for the end product. Only measuring data quality levels at the end of the chain isn’t just absurd, it’s totally inefficient.

It is therefore difficult to see how a Data Catalog (whose purpose is to inventory and document all potentially usable datasets in order to facilitate data discovery and usage) can be a useful tool to measure and manage quality.

A Data Catalog operates on available datasets, on any systems that contain data, and should be as least invasive as possible in order to be deployed quickly throughout the organization.

A DQM solution works on the data feeds (the pipelines), focuses on production data and is, by design, intrusive and time consuming to deploy. We cannot think of any software architecture that can tackle both issues without compromising the quality of either one.

 

Data Catalog vendors promising to solve your data quality issues are, in our opinion, in a bind and it seems unlikely they can go beyond a “salesy” demo.

 

As for DQM vendors (who also often sell ETLs), their solutions are often too complex and costly to deploy as credible Data Catalogs.

The good news is that the orthogonal nature of data quality and data cataloging makes it easy for specialized solutions in each domain to coexist without encroaching on each other’s lane.

Indeed, while a data catalog isn’t purposed for quality control, it can exploit the information on the quality of the datasets it contains, which obviously provides many benefits.

The Data Catalog uses this metadata, for example, to share the information (and any alerts it may identify) with the data consumers. The catalog can also use this information to adjust its search and recommendation engine and thus orient users towards higher-quality datasets.

And both solutions can be integrated at little cost with a couple of APIs here and there.

 

Take Away

Data quality needs to be assessed as early as possible in the data pipelines.

The role of the Data Catalog is not to perform quality control but to share the results of these controls as widely as possible. By their nature, Data Catalogs are bad DQM solutions, and DQM solutions are mediocre and overly complex Data Catalogs.

An integration between a DQM solution and a Data Catalog is very straightforward and is the most pragmatic approach.

Download our eBook: The 7 lies of Data Catalog Providers for more!

The 7 lies of Data Catalog Providers – #1 A Data Catalog is NOT a Data Governance Solution

The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. Victim of its own success, this market has attracted a number of players from adjacent markets.

 These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

Here are, in our opinion, the 7 lies of the Data Catalog vendors:

  1. A Data Catalog is a Data Governance platform,
  2. A Data Catalog can measure and manage data quality,
  3. A Data Catalog can manage regulatory compliance,
  4. A Data Catalog can query data directly,
  5. A Data Catalog can model logical architecture and business processes around data,
  6. A Data Catalog is a collaborative cartography and metadata management tool that cannot be automated,
  7. A Data Catalog is a long, complex, and expensive project.

A Data Catalog is NOT a Data Governance Solution

 

This is probably our most controversial stance on the role of a Data Catalog, and the controversy originates with the powerful marketing messages pumped out by the world leader in metadata management, whose solution is in reality a data governance platform being sold as a Data Catalog.

To be clear, having sound data governance is one of the pillars of an effective data strategy. Governance, however, has little to do with tooling.

Its main purpose is the definition of roles, responsibilities, company policies, procedures, controls, committees…In a nutshell, its function is to deploy and orchestrate, in its entirety, the internal control of data in all its dimensions.

Let’s just acknowledge that data governance has many different aspects (processing and storage architecture, classification, retention, quality, risk, conformity, innovation, etc.) and that there is no universal “one-size-fits-all” model suited to every organization. Like other governance domains, each organization must conceive and pilot its own landscape based on its capacities and ambitions, as well as a thorough risk analysis.

Putting effective data governance in place is not a project; it is a transformation program.

No commercial “solution” can replace that transformation effort.

 

So where does the Data Catalog fit into all this?

The quest for a Data Catalog is usually the result of a very operational requirement: Once the Data Lake and a number of self-service tools are set up, the next challenge quickly becomes to find out what the Data Lake actually contains (both from a technical and a semantic perspective), where the data comes from, what transformations the data may have incurred, who is in charge of the data, what internal policies apply to the data, who is currently using the data and why etc.

 

An inability to provide this type of information to the end users can have serious consequences for an organization, and a Data Catalog is the best means to mitigate that risk. When selecting a transverse solution involving people from many different departments, the choice is often given to those in charge of data governance, as they appear to be in the best position to coordinate the expectations of the largest number of stakeholders.

 

This is where the alchemy begins. The Data Catalog, whose initial purpose was to provide data teams with a quick solution to discover, explore, understand, and exploit the data, becomes a gargantuan project in which all aspects of governance have to be solved.

 

The project will be expected to:

  • Manage data quality,
  • Manage personal data and compliance (GDPR first and foremost),
  • Manage confidentiality, security, and data access,
  • Propose a new Master Data Management (MDM),
  • Ensure a field by field automated lineage for all datasets,
  • Support all the roles as defined in the system of governance and enable the relevant workflow configuration,
  • Integrate all the business models produced in the last 10 years for the urbanization program,
  • Authorize cross-source querying on the data sources while complying with user permissions on those same sources, as well as anonymizing the results,
  • Etc.

 

Certain vendors manage to convince their clients that their solution can be this unique one-stop shop for data governance. If you believe this is possible, by all means call them, they will gladly oblige. But to be frank, we at Zeenea simply do not believe such a platform is possible, or even desirable. Too complex, too rigid, too expensive, and too bureaucratic, this kind of solution can never be adapted to a data-centric organization.

For us, the Data Catalog plays a key role in a data governance program. This role should not involve supporting all aspects of governance but should rather be utilized to facilitate communication and awareness of governance rules within the company and to help each stakeholder become an active part of this governance.

 

In our opinion, a Data Catalog is one of the components that delivers the biggest return on investment in data-centric organizations that rely on Data Lakes with modern data pipelines…provided it can be deployed quickly and has a reasonable pricing associated with it.

 

Take Away

 

A Data Catalog is not a data governance management platform.

 

Data governance is essentially a transformation program with multiple layers that cannot be addressed by one single solution. In a data-centric organization, the best way to start, learn, educate, and remain agile is to blend clear governance guidelines with a modern Data Catalog that can share those guidelines with the end users.

Download our eBook: The 7 lies of Data Catalog Providers for more!

Zeenea Effective Data Governance Framework | S03-E02 – Start your Data Governance Journey in less than 6 weeks!

This is the last episode of the third and final season of “The Zeenea Effective Data Governance Framework”.

Divided into two episodes, this final season will focus on the implementation of metadata management with a data catalog.

In this final episode, we will help you start a 3-to-6-week data journey with Zeenea and deliver the first iteration of your Data Catalog.

Season 1: Alignment

  • Understand the context
  • Get the right people
  • Prepare for action

  S01 E01 – Evaluate your Data maturity
  S01 E02 – Specify your Data strategy
  S01 E03 – Getting sponsors
  S01 E04 – Build a SWOT analysis

Season 2: Adapting

  • Create your personas
  • Identify key roles
  • Set your objectives

  S02 E01 – Organize your Data Office
  S02 E02 – Organize your Data Community
  S02 E03 – Creating Data Awareness

Season 3: Implementing Metadata Management with a Data Catalog

  • Get to know your data
  • Iterate your data catalog

  S03 E01 – The importance of metadata
  S03 E02 – 6 weeks to start your data governance journey

Metadata Governance Iterations

We are using an iterative approach based on short cycles (6 to 12 weeks at most) to progressively deploy and extend the metadata management initiative in the Data Catalog.

These short cycles make it possible to quickly obtain value. They also provide an opportunity to communicate regularly via the Data Community on each initiative and its associated benefits.

Each cycle is organized in predetermined steps, as follows:

[Diagram: metadata governance iterations]

1. Identify the goal

How? Workshop: from the Data Strategy and the OKRs map, detail the objective precisely and the associated risks for the first iteration.

Deliverable: a perimeter (data, people) and a target.

2. Deploy / Connect

How? Set up a technical meeting and define what is needed to connect to the data perimeter.

Deliverable: technical configuration of the scanners and the ability to harvest the information; Zeenea scanners deployed and operational.

3. Conceive and configure

How? Workshop to define or adapt the metamodel to meet the expectations of the first cycles (a minimal example follows this list).

Deliverable: a metamodel tailored to meet expectations.

4. Import the items

How? Enrich your Metadata Management Platform: load and document in accordance with the target.

Deliverable: the core (minimum viable) information defined to properly serve the users.

5. Open and test

How? Let the users test the value produced; challenge and validate it.

Deliverable: validation that the effort produced the expected value.

6. Measure the gains

How? Retrospective workshop: check whether the targets are met and the users are satisfied.

Deliverable: a fine-grained analysis of the cycle to identify what worked, what didn’t, and how to improve the next cycle.

Start metadata management in just 6 weeks!

In our guide, we explain how to get your metadata management journey started in less than 6 weeks. Download to get your free guide!

Data strategy: how to break down data silos?

Whether it comes from product life cycles, marketing, or customer relations, data is omnipresent in the daily life of a company. Customers, suppliers, employees, partners… they all collect, analyze, and exploit data in their own way.

The risk: the appearance of silos! Let’s discover why your data is siloed and how to put an end to it.

A company is made up of different professions that coordinate their actions to establish themselves in their market and generate profit. Each of these professions fulfills specific missions and collects data. Marketing, sales, customer success teams, communication… all of these entities act on a daily basis and base their actions on their own data.

The problem is that, over the course of his or her journey, a customer will generate a certain amount of information.

A simple lead becomes a prospect, who then becomes a customer… and the same person may have different taxonomies depending on which part of the business is analyzing this data.

This reality is what we call a data silo. In other words, data is poorly shared, or never shared, and therefore too often untapped.

In a study by IDC entitled “The Data-Forward Enterprise”, published in December 2020, 46% of French companies forecast a 40% annual growth in the volume of data to be processed over the next two years.

Nearly 8 out of 10 companies consider data governance to be essential. However, only 11% of them believe they are getting the most out of their data. The most common reason for this is data silos.

             

What are the major consequences of data silos?

Among the frequent problems linked to data silos, we find first and foremost duplicated data. Since each business line uses data blindly, what could be more natural?

These duplicates have unfortunate consequences. They distort the knowledge you can have of your products or your customers. This biased, imperfect information often leads to imprecise or even erroneous decisions.

Duplicated data also takes up unnecessary space on your servers: storage space that represents an additional cost for your company! Beyond the impact of data silos on your company’s decisions, strategies, or finances, there is also the organizational deficit.

When your data is in silos, your teams can’t collaborate effectively because they don’t even know they’re mining the same soil!

At a time when collective intelligence is a cardinal value, this is undoubtedly the most harmful effect of data silos.

             

Why does your company suffer from data silos?

There are many causes of siloed data. Most often, they are linked to the history of your information systems. Over the years, these systems were built as a patchwork of business applications that were not always designed with interoperability in mind.

Moreover, a company is like a living organism. It welcomes new employees while others leave. In everyday life, spreading data culture throughout the workforce is a challenge! Finally, there is the place of data in the key processes of organizations.

Today data is central. But go back 5 or 10 years, and it was much less so. Now that you know that you are suffering from data silos, you need to take action.

How do you get rid of data silos?

To get started on the road to eradicating data silos, you need to proceed methodically.

Start by recognizing that the process will inevitably take some time. The prerequisite is creating a detailed mapping of all your databases and information systems. These can be produced by different tools and solutions such as emails, CRMs, various spreadsheets, financial documents, customer invoices, etc.

It is also necessary to identify all your data sources in order to centralize them in a unique repository. To do this, you can, for example, build bridges between the silos by using specific connectors, also called APIs. The second option is to implement a platform on your information system that will centralize all the data.

Working as a data aggregator, this platform will also consolidate data by tracking duplicates and keeping the most recent information. A Data Catalog solution will prevent the reappearance of data silos once deployed.

But beware: ensuring data quality, optimized circulation between departments, and coordinated use of data to improve performance is also a human project!

Sharing best practices, training, raising awareness – in a word, creating a data culture within the company – will be the key to eradicating data silos once and for all.

The keys to a successful Cloud Migration

            The recent COVID-19 pandemic has brought about major changes in the work culture and the Cloud is becoming an essential part by offering employees access to the company’s data, wherever they are.  But why migrate? How to migrate? And for what benefits? Here is an overview.

            Head in the clouds and feet on the ground, that’s the promise of the Cloud which, with the health crisis, has proven to be an essential tool for business continuity.

            In a study conducted by Vanson Bourne at the end of 2020, it appears that more than 8 out of 10 business leaders (82%), accelerated their decision to migrate their critical data and business functions to the Cloud, after facing the COVID-19 crisis. 91% of survey participants say they have become more aware of the importance of data in the decision-making process since the crisis began. 

            Cloud and data. A duo that is now inseparable from business performance.

            A reality that is not limited to a specific market. The plebiscite for Cloud data migration is almost worldwide! The Vanson Bourne study highlights a shared awareness on an international scale, with edifying figures:

            • United States (97%),
            • Germany and Japan (93%),
            • United Kingdom (92%).

            Finally, 99% of Chinese executives are accelerating their plans to complete their migration to the Cloud. In this context, the question “why migrate to the Cloud” is unequivocally answered: if you don't, your competitors will do it before you and gain the advantage.

             

            The main benefits of Cloud migration

            Ensuring successful Cloud data migration is first and foremost a question of guaranteeing its availability in all circumstances. Once stated, this benefit leads to many others! If data is accessible everywhere and at all times, a company is able to meet the demand for mobility and flexibility expressed by employees. 

            A requirement that was fulfilled during the successive lockdowns and that should continue now that a return to normalcy finally seems possible. Fully operational employees at home, in the office, or in the countryside promise not only increased productivity but also a considerable improvement in the user experience. HR benefits are not the only consequences of Cloud migration.

            From a financial point of view, the Cloud opens the way to better control of IT costs. By shifting data from a CAPEX dimension to an OPEX dimension, you can improve the TCO (Total Cost of Ownership) of your information system and your data assets. Better experience, budget control: the Cloud also paves the way to optimized data availability.

            Indeed, when migrating to the Cloud, your partners make commitments in terms of maintenance or backups that guarantee maximum access to your data. You should therefore pay particular attention to these commitments, which are referred to as SLAs (Service Level Agreements). 

            Finally, by migrating data to the cloud, you benefit from the expertise and technical resources of specialized partners who deploy resources that are far superior to those that you could have on your own.

             

            How to successfully migrate to the Cloud

            Data is, after human resources, the most valuable asset of a company.

            This is one of the reasons why companies should migrate to the Cloud. But the operation must be carried out in the best conditions to limit the risk of data degradation, as well as the temporary unavailability that impacts your business.

            To do this, preparation is essential and relies on one prerequisite: the project does not only concern IT teams, but the entire company. 

            Support, reassurance, training: the triptych essential to any change management process must be applied. Then make sure you give yourself time. Avoid the Big Bang approach, which could irritate your teams and dampen their enthusiasm. Even if the Cloud migration of your data should go smoothly, put the odds in your favor by backing up your data.

            Rely on redundancy to prepare for any eventuality, including (and especially!) the most unlikely. Once the deployment on the cloud is complete, ensure the quality of the experience for your employees. By conducting rigorous long-term project management, you can easily identify if you need to make adjustments to your initial choices. 

             

            The scalability of the Cloud model is a strength that you should seize upon to constantly adapt your strategy!

            Zeenea Effective Data Governance Framework | S03-E01 – The importance of metadata

            Zeenea Effective Data Governance Framework | S03-E01 – The importance of metadata

            This is the first episode of our third and final season of the “The Zeenea Effective Data Governance Framework”.

            Divided into two episodes, this final season will focus on the implementation of metadata management with a data catalog.

            For this first episode, we will give you the right questions to ask yourself in order to build a metamodel for your metadata.

            Season 1: Alignment (understand the context, get the right people, prepare for action)

            • S01 E01 – Evaluate your Data maturity
            • S01 E02 – Specify your Data strategy
            • S01 E03 – Getting sponsors
            • S01 E04 – Build a SWOT analysis

            Season 2: Adapting (create your personas, identify key roles, set your objectives)

            • S02 E01 – Organize your Data Office
            • S02 E02 – Organize your Data Community
            • S02 E03 – Creating Data Awareness

            Season 3: Implementing Metadata Management with a Data Catalog (get to know your data, iterate your data catalog)

            • S03 E01 – The importance of metadata
            • S03 E02 – 6 weeks to start your data governance journey

                  In our previous season, we gave you our tips on how to build your Data Office, organize your Data Community, and build your Data Awareness.

                  In this third season, you will step into the real world of implementing a Data Catalog, whereas Seasons 1 and 2 helped you specify your Data Journey Strategy.

                   

                  In this episode, you will learn how to ask the right questions for designing your Metamodel.

                  The importance of metadata

                  Metadata management is an emerging discipline and is necessary for enterprises wishing to bolster innovation or regulatory compliance initiatives on their data assets.

                  Many companies are therefore trying to establish their convictions on the subject and brainstorm solutions to meet this new challenge. As a result, metadata is increasingly being managed, alongside data, in a partitioned and siloed way that does not allow the full, enterprise-wide potential of this discipline to be realized.

                  Before beginning your data governance implementation, you will have to cover different aspects, ask yourself the right questions and figure out how to answer them.

                  Our Metamodel Template is a way to identify the main aspects of data governance by asking the right questions; in each case, you decide on their relevance.

                  These questions can also be used as support for your data documentation model and can provide useful elements to data leaders.

                    The Who

                    • Who created this data?
                    • Who is responsible for this data?
                    • Who does this data belong to?
                    • Who uses this data?
                    • Who controls or audits this data?
                    • Who is accountable for the quality of this data?
                    • Who gives access to this data?

                     

                    The What

                    • What is the “business” definition for this data?
                    • What are the associated business rules of this data?
                    • What is the security/confidentiality level of this data?
                    • What are the acronyms or aliases associated with this data?
                    • What are the security/confidentiality rules associated with this data?
                    • What is the reliability level (quality, velocity, etc.) of this data?
                    • What are the authorized contexts of use (related to confidentiality for example)?
                    • What are the (technical) contexts of use possible (or not) for this data?
                    • Is this data considered a “Golden Source”?

                     

                    The Where

                    • Where is this data located?
                    • Where does this data come from? (a partner, open data, internally, etc.)
                    • Where is this data used/shared?
                    • Where is this data saved?

                     

                    The Why

                    • Why are we storing this data (rather than simply processing it as a flow)?
                    • What is this data’s current purpose/usage?
                    • What are the possible usages for this data? (in the future)

                     

                    The When

                    • When was the data created?
                    • When was this data last updated?
                    • What is this data’s life cycle (update frequency)?
                    • How long are we keeping this data?
                    • When does this data need to be deleted?

                     

                    The How

                    • How is this data structured (diagram)?
                    • How do your systems consume this data?
                    • How do you access this data?

                    Start defining your metamodel template!

                     

                    These questions can serve as a foundation for building your data documentation model and providing data consumers with the elements that are useful to them.
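                    As an illustration only, here is a hypothetical sketch of how the answers to these questions could be captured as metadata attributes attached to a dataset. The field names and example values are our own assumptions for the sake of the example, not a prescribed Zeenea schema:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative metamodel entry: each attribute answers one of the Who/What/
# Where/When/Why/How questions above. Names and values are hypothetical.
@dataclass
class DatasetMetadata:
    name: str
    owner: str                    # Who is responsible for this data?
    steward: str                  # Who is accountable for its quality?
    business_definition: str      # What is the "business" definition?
    confidentiality: str          # What is the security/confidentiality level?
    source_system: str            # Where does this data come from?
    storage_location: str         # Where is this data located?
    purpose: str                  # Why are we storing this data?
    retention: str                # How long are we keeping it?
    update_frequency: str         # What is its life cycle?
    consumers: List[str] = field(default_factory=list)  # Who uses this data?

customer_orders = DatasetMetadata(
    name="customer_orders",
    owner="Sales Operations",
    steward="Jane Doe",
    business_definition="Orders placed by customers on the e-commerce site",
    confidentiality="Internal",
    source_system="ERP",
    storage_location="Data Lake / raw zone",
    purpose="Sales reporting and demand forecasting",
    retention="5 years",
    update_frequency="Daily",
    consumers=["Data Analysts", "Finance"],
)
```

                    Whatever the exact shape you choose, the point is that each question above maps to a concrete, maintainable attribute in your documentation model.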

                     

                    Don’t miss the next episode of the Zeenea Data Governance Framework next week:

                    “6 Weeks to Start Your Data Governance Journey”, where we will help you start a 3-6 week data journey with Zeenea and then deliver the first iteration of your Data Catalog.


                    Zeenea Effective Data Governance Framework | S02-E03 – Creating Data Awareness

                    Zeenea Effective Data Governance Framework | S02-E03 – Creating Data Awareness

                    This is the final episode of the second season of the Zeenea Effective Data Governance Framework series.

                    Divided into three episodes, this second season focuses on Adaptation. This consists of:

                    • Organizing your Data Office
                    • Building a data community  
                    • Creating Data Awareness

                    For this third and final episode of the season, we will help you use awareness-support techniques that reduce the effort needed for communication tasks, so that everyone is aware of what the Data Governance Team is doing and you gain buy-in and alignment at all levels.


                          In the last episode, we explained how to organize your Data Community by building your Data Chapters and Data Guilds.

                          In this episode, we will help you use awareness-support techniques that reduce the effort needed for communication tasks and create data awareness at the enterprise level.

                              At Zeenea, we advise using the SMART framework to plan and execute the Data Awareness program.

                               

                              What are SMART goals?

                              • Specific:  What do you want to accomplish?  Why is this goal important?  Who is involved?  What resources are involved?
                              • Measurable:  Are you able to track your progress?  How will you know when it’s accomplished?
                              • Achievable:  Is achieving this goal realistic with effort and commitment?  Do you have the resources to achieve this goal?  If not, how will you get them?
                              • Relevant:  Why is this goal important?  Does it seem worthwhile?  Is this the right time?  Does this match efforts/needs? 
                              • Timely:  When will you achieve this goal?

                              The “SMART” method for your data teams

                              If you think about the level of reach a team has, you can summarize it in 3 categories:

                              • The Control sphere is the one your Data Team can reach and interact with directly
                              • The Influence sphere is the level where you can find sponsors and get help
                              • The Concern sphere consists of the C-levels, who need to be informed of how things are progressing from a high-level perspective.

                              In other words, you will have to reach all the stakeholders involved, but with different means, timing, and interactions.

                              Spend time creating nice formats, and pay attention to the form of all your artifacts.

                              Examples of SMART tasks 

                              You will find below examples of SMART tasks:

                              For the Control sphere, we advise you to do the following:

                              • Deliver trainings (for both Data Governance teams as well as End users)
                              • Deliver presentations dedicated to teams (Strategy, OKRs, Roadmap, etc).
                              • Keep your burn-down charts and all visual management tools displayed at all times.

                              For the Influence sphere, we advise you to:

                              • Celebrate your first milestones
                              • Organize sprint demos
                              • Keep team OKRs displayed constantly

                              And for the Concern sphere, we advise you to:

                              • Celebrate the end of a project
                              • Organise product demos
                              • Record videos and make them available

                              Don’t miss our new season next week!

                              Find out how to put in place a data-driven strategy with our third and final season on implementing metadata management with a data catalog.


                              Zeenea Effective Data Governance Framework | S02-E02 – Organizing your Data Community

                              Zeenea Effective Data Governance Framework | S02-E02 – Organizing your Data Community

                              This is the second episode of the second season of the Zeenea Effective Data Governance Framework series.

                              Divided into three episodes, this second season focuses on Adaptation. This consists of:

                              • Organizing your Data Office
                              • Building a data community  
                              • Creating Data Awareness

                              For this second episode, we will give you the keys to organizing an efficient and effective data community in your company.


                                    Spotify Feature Teams: a good practice, or a failure?

                                     

                                    In the last episode, we explained how to build your Data Office with Personas and the Spotify Feature Teams paradigm.

                                    The Spotify model has been criticized because there have been failures at companies that tried to implement it.

                                    The three main reasons were:

                                    • Autonomy is valuable, but it does not mean that teams can do whatever they want; alignment needs to be emphasized
                                    • Key results need to be defined at the leadership level, and this is why building your OKRs is the right thing to do.
                                    • Autonomy means accountability: teams have to be measured, the increments they work on need to be completed, and the definition of “Done” has to be specified.

                                    In this episode, we will focus on Chapters and Guilds, and on how to organize and better leverage your Data Community.

                                     

                                    How to organize your Chapters and Guilds

                                    Chapters

                                    Collaboration in Chapters and Guilds requires specific knowledge and experience, and it is wrong to assume that teams already know Agile practices.

                                    When teams are growing, there is a need to have dedicated support and therefore, the Program Managers in charge of data related topics are accountable for the processes and organization of the Data Community.

                                    At the highest level, organizing your data community means sharing knowledge at all levels: technological, functional, or even specific practices around data related topics.

                                    The main drivers for focusing on the Chapter organisation are:

                                    • Teams lack information
                                    • Teams lack knowledge
                                    • Teams repeat mistakes
                                    • Teams need ceremonies and commonly agreed agile practices.

                                    Chapters meet regularly and often.

                                    We advise meeting once a month. When it gets too big, a Chapter can be split into smaller groups. Even if it is a position that can change over time, a Chapter needs a leader, not a manager.

                                    The Chapter leader is in charge of facilitating the Chapter and making it efficient by:

                                    • Getting the right people involved
                                    • Sharing outcomes with upper level management
                                    • Coordinating and moderating meetings
                                    • Helping to establish transparency
                                    • Finding a way to share all knowledge and keep it available.
                                    • Defining the Chapter: why, for whom and what it is meant for.

                                    A tip is to define an elevator pitch for the Chapter.

                                    The Chapter leader is also responsible for building a backlog to avoid endless discussions with no outcome.

                                    Typically, the backlog consists of the following topics:

                                     

                                    Data topics

                                    • Chapter Data People Culture
                                    • Chapter data related topics in continuous improvement
                                    • Chapter Data Practices
                                    • Chapter Data Processes
                                    • Chapter Data Tools

                                     

                                    Generic topics

                                    • Chapter continuous improvement
                                    • Chapter feedback collection
                                    • Chapter Agility Practices
                                    • Chapter generic tools
                                    • Chapter information sharing
                                    • Chapter education program

                                     

                                    The Chapter Lead is in charge of communicating outside of their Chapter with other Chapter leaders, and must be given dedicated time to facilitate it.

                                     

                                    How to start a Chapter

                                     

                                    • Identify the community and all members
                                    • Name the Chapter
                                    • Organize the first chapter meeting
                                    • Define the elevator statement
                                    • Initialize the Chapter web page (and keep it updated for onboarding future new members)
                                    • Negotiate and build the first Backlog
                                    • Plan the meetings

                                      Guilds

                                      Guilds should be organized differently, in a self-organized way.

                                      The reason for Guilds to exist is passion, and the teams are built on a voluntary basis only.

                                      In order to avoid the syndrome of too many useless meetings, we advise allowing Guilds to meet only in certain circumstances, such as:

                                       

                                      • Training sessions and workshops, in short formats such as BBLs (Brown Bag Lunches), on the topics the Guild was built for
                                      • Q&A sessions with top executives to emphasize the Why of the Data Strategy
                                      • Hack days to crack a topic 
                                      • Post mortem meetings after a major issue has occurred.

                                      Get our free Data Stewardship Chapter Lead Handbook

                                       

                                      Start building your Chapters by downloading our free Chapter Lead Handbook! 

                                      Don’t miss next week’s episode!

                                      We will cover all the basics of building Data Awareness to help you achieve enterprise-wide adoption and rollout of your Data Strategy.


                                      Zeenea Effective Data Governance Framework | S02-E01 – Organizing your Data Office

                                      Zeenea Effective Data Governance Framework | S02-E01 – Organizing your Data Office

                                      This is the first episode of the second season of the Zeenea Effective Data Governance Framework series.

                                      Divided into three episodes, this second season focuses on Adaptation. This consists of:

                                      • Organizing your Data Office
                                      • Building a data community  
                                      • Creating Data Awareness

                                      For this first episode, we will give you the keys to build your data personas in order to set up a clear and well defined Data Office. 


                                            In the first season, we shared our best practices to help you align your data strategy with your company.

                                            In this first episode, we will teach you how to build your Data Office.

                                            The evolution of Data Offices in companies

                                             

                                            At Zeenea, we believe in Agile Data Governance.

                                            Previous implementations of data governance within organizations have rarely been successful. The Data Office often focuses too much on technical management or a strict control of data.

                                            For data users who strive to experiment and innovate around data, Data Office behavior is often synonymous with restrictions, limitations, and cumbersome bureaucracy.

                                            Some will have gloomy visions of data locked up in dark catacombs, only accessible after months of administrative hassle. Others will recall the wasted energy at meetings, updating spreadsheets and maintaining wikis, only to find that no one was ever benefiting from the fruits of their labor.

                                            Companies today are conditioned by regulatory compliance to guarantee data privacy, data security, and to ensure risk management.

                                            That said, taking a more offensive approach towards improving the use of data in an organization by making sure the data is useful, usable and exploited is a crucial undertaking.

                                            Using modern organizational paradigms with new ways of interacting is a good way to set up an efficient, flat Data Office organisation.

                                            Below are the typical roles of a Data Office, although very often, some roles are carried out by the same person:

                                            • Chief Data Officer
                                            • Data-related Portfolio/Program/Project Managers
                                            • Data Engineers / Architects
                                            • Data Scientists
                                            • Data Analysts
                                            • Data Stewards

                                            Creating data personas

                                            An efficient way of specifying the roles of Data Office stakeholders is to work on their personas.

                                            By conducting one-on-one interviews, you will learn a lot about them: context, goals, and expectations. The OKR map is a good guide for building these personas by asking precise questions.

                                            Here is an example of a persona template:

                                            [Example of a data persona template]

                                              Some useful tips:

                                                  • Personas should be displayed in the office of all Data Office team members.
                                                  • Make it fun, choose an avatar or a photo for each team member, write a small personal and professional bio, list their intrinsic values and work on the look and feel.
                                                  • Build one persona for each person, don’t build personas for teams
                                                  • Be very precise in the personas definition interviews, rephrase if necessary.
                                                  • Treat people with respect and consider all ideas equally.
                                                  • Print them and put them on the office walls for all team members to see.

                                              Building cross-functional teams

                                              In order to get rid of Data and organisational silos, we recommend you organise your Data Office in Feature Teams (see literature on the Spotify feature teams framework on the internet).

                                              The idea is to build cross functional teams to address a specific feature expected by your company.

                                              The Spotify model defines the following teams:

                                              Squads

                                              Squads are cross-functional, autonomous teams  that focus on one feature area. Each Squad has a unique mission that guides the work they do. 

                                              In season 1, episode 2, in our OKRs example, the CEO has 3 OKRs and the first OKR (Increase online sales by 2%) has generated 2 OKRs:

                                                  • Get the Data Lake ready for growth, handled by the CIO
                                                  • Get the data governed for growth, handled by the CDO.

                                              There would then be 2 squads:

                                                  • Feature 1: get the Data Lake ready for growth
                                                  • Feature 2: get data governed for growth.

                                              Tribes

                                              At the next level, multiple Squads coordinate with each other on the same feature area. Together, they form a Tribe. Tribes help build alignment across Squads. Each Tribe has a Tribe Leader who is responsible for helping coordinate across Squads and encouraging collaboration.

                                              In our example, for the Squad in charge of the feature “Get Data Governed for growth”, our OKRs map tells us that there is a Tribe in charge of “Get the Data Catalog ready”.

                                              Chapter

                                              Even though Squads are autonomous, it’s important that specialists (Data Stewards, Analysts) align on best practices. Chapters are the family that each specialist has, helping to keep standards in place across a discipline.

                                              Guild

                                              Team members who are passionate about a topic can form a Guild, which essentially is a community of interest (for example: data quality). Anyone can join a Guild and they are completely voluntary. Whereas Chapters belong to a Tribe, Guilds can span different Tribes. There is no formal leader of a Guild. Rather, someone raises their hand to be the Guild Coordinator and help bring people together.

                                              Here is an example of a Feature Team organization:

                                              [Example of a Feature Team organization]

                                              Don’t miss next week’s S02 E02:

                                              Building your Data Community, where we will help you adapt your organisation in order to become more data-driven.

                                               


                                              Zeenea Effective Data Governance Framework | S01-E04 – SWOT Analysis

                                              Zeenea Effective Data Governance Framework | S01-E04 – SWOT Analysis

                                              This is the fourth episode of our series “The Zeenea Effective Data Governance Framework”.

                                               The series is split into three seasons; this first season focuses on Alignment: understanding the context, finding the right people, and preparing an action plan for your data-driven journey.

                                              [SEASON FINALE] This episode will give you the keys to build a concrete and actionable SWOT analysis.


                                                    In our previous episode, we discussed the different means to obtain the right level of sponsorship to ensure endorsement from decision makers.

                                                    This week, we will teach you how to build a concrete and actionable SWOT analysis to assess your company's Data Governance Strategy in the best possible way.

                                                     

                                                    What is a SWOT analysis?

                                                    Before we give our tips and tricks on building the best SWOT analysis possible, let’s go back and define what a SWOT analysis is. 

                                                    A SWOT analysis is a technique used to determine and define your Strengths, Weaknesses, Opportunities, and Threats (SWOT). Here are some examples:

                                                    Strengths

                                                    This element addresses the things your company or department does especially well. This can be a competitive advantage or a particular attribute of your product or service. An example of a “strength” for a data-driven initiative would be “Great data culture” or “Data shared across the entire company”.

                                                    Weaknesses

                                                    Once your strengths are listed, it is important to list your company’s weaknesses. What is holding your business or project back? Taking our example, a weakness in your data or IT department could be “Financial limitations”, “Legacy technology”, or even “Lack of a CDO”. 

                                                    Opportunities 

                                                    Opportunities refer to favorable external factors that could give an organization a competitive advantage. Few competitors in your market, emerging needs for your product… all of these are opportunities for a company. In our context, an opportunity could be “Migrating to the Cloud” or “Extra budget for data teams”.

                                                    Threats

                                                    The final element of a SWOT analysis is Threats – everything that poses a risk to either your company itself or its likelihood of success or growth. For a data team, a threat could be “Stricter regulatory environment for data” for example.


                                                      How to start building a smart SWOT analysis?

                                                      Building a good SWOT analysis means adopting a democratic approach that will ensure you don’t miss important topics.

                                                      There are 3 principles you should follow:

                                                      Gather the right people

                                                      Invite stakeholders from different parts of your Data Governance Team, from Business to IT, including CDO and CPO representatives. You'll find that different groups within your company will have entirely different perspectives that will be critical to making your SWOT analysis successful.

                                                      Throw your ideas against the wall

                                                      Doing a SWOT analysis consists, in part, in brainstorming meetings. We suggest giving out sticky-notes and encouraging the team to generate ideas on their own to start things off. This prevents group thinking and ensures that all voices are heard.

                                                      This first ceremony should be no more than 15 minutes of individual brainstorming. Put all the sticky-notes up on the wall and group similar ideas together.

                                                      You can allot additional time to enable anyone to add notes at this point if someone else’s idea sparks a new thought.

                                                      Rank the ideas

                                                      It is now  time to rank the ideas. We suggest giving a certain number of points to each participant. Each participant will rate the ideas by assigning points to the ones they consider most relevant. You will then be able to prioritize them with accuracy.

                                                      Toolkits for your SWOT analysis

                                                      In our first episode, we helped you analyze your Data Maturity.

                                                      We suggested you build a SWOT analysis for each aspect. It is worth focusing on the aspects for which your company scored low, spending more time on them, and drafting an improvement plan as described below:

                                                        [Improvement plan example]

                                                        The Improvement Plan should update your OKRs, with new actionable activities and potentially new stakeholders with Objectives, Key Results and Deadlines.

                                                        For example, in order to improve the Data Culture, you may want to involve the head of HR to launch specific training sessions, and create new roles, responsibilities or job descriptions.

                                                        You may also want to change the Data Access Request process for certain Data Sources in order to gain more flexibility and fluidity.

                                                        Don’t miss the beginning of season 2 next week where we will help you adapt your organization towards becoming more Data-driven.


                                                        JOIN US AT GARTNER DATA & ANALYTICS SUMMIT 2021!

                                                        JOIN US AT GARTNER DATA & ANALYTICS SUMMIT 2021!

                                                        Zeenea is proud to announce that we will be sponsoring this year's edition of Europe's premier meeting place for the Chief Data Officer community on June 15th – 16th! We are looking forward to meeting you at this 100% virtual event!

                                                        Why should you join us?

                                                        • Discover more about Zeenea Data Catalog 
                                                        • Attend our speaking session 
                                                        • Request a meeting with Zeenea Team

                                                        Why attend CDO Exchange 2021?

                                                        • Learn from industry leaders 
                                                        • Network with 60 cross-industry peers
                                                        • Empower yourself with the right solution

                                                        The simplicity of Zeenea’s implementation allows you to deliver & use its cloud-based data catalog anywhere in the world!

                                                        ATTEND OUR SPEAKING SESSION!

                                                        “THE 7 LIES OF DATA CATALOG PROVIDERS you should know before buying!”

                                                        Today, 95% of Data Catalog purchasing intentions are triggered by Data Lakes. You’ll understand why Data Catalogs have nothing to do with Data Quality, Data Privacy, Data Governance and more… In this session, Luc Legardeur, VP International Operations and Co-Founder, will share Zeenea’s beliefs on what a Data Catalog should and should not do, followed by a demo.

                                                        Don’t miss Zeenea’s presentation by Luc Legardeur on June 15th at 11:55 AM CEST to see how a smart data catalog can help you scale from local use cases to enterprise-wide metadata management.
                                                         

                                                        At the end of the session, you will be able to download the eBook “The 7 lies of Data Catalog Providers” by Guillaume Bodet, CEO.

                                                        [PRESS RELEASE] Record Year for Zeenea: triple-digit growth and a strong international expansion

                                                        [PRESS RELEASE] Record Year for Zeenea: triple-digit growth and a strong international expansion

                                                        Paris, May 3rd, 2021.

                                                        Zeenea, the Next-Gen Data Catalog provider, is proud to announce triple-digit growth and a strong international expansion.

                                                        “Our unique features, such as our SaaS native platform, our Universal Connectivity, our Knowledge Graph for dealing with complex ontologies, as well as the launch of Zeenea Explorer, have given us a strong competitive edge over more traditional Data Catalog providers.

                                                        We are beating our competitors and replacing them onsite because our Data Catalog is modern, easy to use, easy to implement, and easy to scale”

                                                        Guillaume Bodet

                                                        CEO , Zeenea

                                                        Zeenea’s international expansion only began in 2020, and the data catalog provider already managed to acquire customers in 10 different countries.

                                                        “With flags in the US, UK, Germany, Scandinavian countries, the Netherlands and South Africa, we have demonstrated that our SaaS model facilitates considerably rapid international expansion.

                                                        Our platform is not specific to any industry and we are proud to cater for customers in the following industries: Banking, Insurance, Manufacturing, Retail, Pharmaceuticals, Software, Media and Gaming.” 

                                                        Luc Legardeur

                                                        VP International Operations, Zeenea

                                                        Zeenea has signed with prestigious brands such as Natixis, BMW and Kering, amongst others.

                                                        SOME OF ZEENEA’S CLIENTS

                                                        Natixis, Kering, Healios

                                                        About Zeenea 

                                                        Zeenea is the Cloud-native Data Catalog that helps companies accelerate their data initiatives. Our cloud-based platform offers a reliable and comprehensible database, available with maximum simplicity and automation. In just a few clicks, you can find, discover, govern, and manage your company’s information. What makes Zeenea’s platform unique is that we offer a data catalog with two different user experiences to democratize data access for all.

                                                        What are the differences between a Data Analyst and a Business Analyst?

                                                        What are the differences between a Data Analyst and a Business Analyst?

                                                        So similar, yet so different! The roles of the Data Analyst and the Business Analyst are often unclear, even though their missions are very different. Their functions being more complementary than anything else, let’s have a look at these two highly sought-after profiles.

                                                        Data is now at the heart of all decision-making processes. According to a study conducted by IDC on behalf of Seagate, the volume of data generated by companies worldwide is expected to reach 175 zettabytes by 2025…

                                                        In this context, collecting information is no longer enough. What’s important is the ability to draw conclusions from this data to make informed decisions. 

                                                        However, the interpretation methods used and the way to exploit data can be very different. The ever-changing nature of data has created new domains of expertise with titles and functions that are often misleading or confusing.

                                                        What separates the missions of the Data Analyst from those of the Business Analyst may seem tenuous. And yet, their functions, roles, and responsibilities are very different… and complementary!

                                                         

                                                        Business Analyst & Data Analyst: a common ground 

                                                        If the roles of the Business Analyst and the Data Analyst are sometimes unclear, it is because their missions are both inherently linked to creating value from enterprise information.

                                                        What distinguishes them is the nature of this information.

                                                        While a Data Analyst works on numerical data, coming from the company’s information systems, the Business Analyst can exploit both numerical and non-numerical data. 

                                                        A data analyst must ensure the processing of data within the company to extract valuable analytic trends that enable teams to adapt to the organization’s strategy. The business analyst then provides answers to concrete business issues based on a sample of data that may exceed the data portfolio generated by the company. 

                                                         

                                                        A wide range of skills

                                                        Data Analysts must have advanced skills in mathematics and statistics. A true expert in databases and computer language, this data craftsman often holds a degree in computer engineering or statistical studies.

                                                        The Business Analyst, on the other hand, has a less data-oriented profile (in the digital sense of the term). While they use information to fulfill their missions, they are always in direct contact with management and all of the company’s business departments. Although the Business Analyst may have skills in algorithms, SQL databases, or even XML, these are not necessarily essential prerequisites.

                                                        A Business Analyst must therefore demonstrate real know-how in communicating, listening, and understanding the company’s challenges. For a Data Analyst, on the other hand, technical skills are essential: SQL, Python, data modeling, Power BI, and broader IT and analytics expertise will allow them to exploit data operationally.

                                                         

                                                        The differences in responsibilities and objectives

                                                        The Data Analyst’s day-to-day work consists above all of enhancing the company’s data assets. To this end, he or she will be responsible for data quality, data cleansing and data optimization.

                                                        Their objective? To provide internal teams with usable databases in the best conditions and to identify all the improvement levers likely to impact the data project. 

                                                        The Business Analyst will benefit from the work of the Data Analyst and will contribute to making the most of it by putting the company’s native data into perspective with peripheral data and information. By reconciling and enhancing different sources of information, the Business Analyst will contribute to the emergence of new market, organizational or structural opportunities to accelerate the company’s development.

                                                        In short, the Data Analyst is the day-to-day architect of the company’s data project. The Business Analyst is the one who intervenes, in the long run, on the business strategy. To meet this challenge, he or she bases his or her action on the quality of the data analyst’s work. 

                                                         

                                                        Two complementary missions, two converging profiles that will allow organizations to make the most of their data culture! 

                                                         

                                                        Zeenea Effective Data Governance Framework | S01-E03 – Getting sponsorship

                                                        Zeenea Effective Data Governance Framework | S01-E03 – Getting sponsorship

                                                        This is the third episode of our series “The Zeenea Effective Data Governance Framework”.

                                                        The series is split into three seasons; this first season focuses on Alignment: understanding the context, finding the right people, and preparing an action plan for your data-driven journey.

                                                        This third episode will give you the keys on how to get good sponsorship for your data projects.
