What is Data Fingerprinting and Similarity Detection?

December 3, 2019

With the emergence of Big Data, enterprises found themselves with a colossal amount of data. To understand and analyse that data, and to meet regulatory requirements, organizations must document their data assets. However, documenting and giving context to thousands of datasets by hand is very difficult, if not impossible.

This is where Data Fingerprinting comes in!

What is Data Fingerprinting?

In the data domain, a fingerprint is a “signature” computed from a data column. The goal is to give context to that column.

With this technology, similar datasets in your databases can be detected automatically and documented more easily, making data stewards’ tasks less tedious and more efficient. For example, under the data steward’s supervision, data fingerprinting can recognize that a column containing the values “France”, “United States”, and “Australia” represents “Countries”.
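
To make the idea concrete, here is a minimal sketch in Python. It is not Zeenea's algorithm: it assumes a deliberately naive fingerprint (the set of normalized distinct values in a column) and compares two fingerprints with Jaccard similarity. The function names and sample columns are hypothetical.

```python
def fingerprint(values):
    """Naive fingerprint (an assumption for illustration):
    the set of trimmed, lower-cased distinct values of a column."""
    return frozenset(str(v).strip().lower() for v in values if v is not None)


def jaccard(fp_a, fp_b):
    """Jaccard similarity of two fingerprints, between 0.0 and 1.0."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Hypothetical columns taken from different tables.
countries = fingerprint(["France", "United States", "Australia"])
customer_country = fingerprint(["france", "Australia ", "Germany", "United States"])
order_date = fingerprint(["2019-12-03", "2020-01-01"])

print(jaccard(countries, customer_country))  # 0.75
print(jaccard(countries, order_date))        # 0.0
```

The two country columns share three of their four distinct values, so they score 0.75, while the date column shares none. A real fingerprint would need to be far more compact and robust than a raw set of values, but the comparison step follows the same pattern.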

Data Fingerprinting at Zeenea

At Zeenea, our metadata management platform aims to give meaning and context to your catalogued datasets as automatically as possible. Using Machine Learning, Zeenea identifies the columns of each dataset schema, analyses them, and gives each one its own “signature”. When two fingerprints are similar, our Data Catalog suggests that the Data Steward apply the documentation of one column to the other.
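
The suggestion step can be sketched as follows. This is only an illustration under stated assumptions, not a description of Zeenea's Machine Learning models: the signature here is the frequency of value "shapes" (letters become "a", digits become "9"), signatures are compared with cosine similarity, and the 0.8 threshold, the function names, and the sample catalog are all hypothetical.

```python
import math
from collections import Counter


def shape(value):
    """Reduce a value to its pattern: letters -> 'a', digits -> '9'."""
    return "".join(
        "a" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in str(value)
    )


def signature(values):
    """Relative frequency of each value shape in the column."""
    counts = Counter(shape(v) for v in values)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}


def cosine(sig_a, sig_b):
    """Cosine similarity between two shape-frequency signatures."""
    dot = sum(w * sig_b.get(s, 0.0) for s, w in sig_a.items())
    norm_a = math.sqrt(sum(w * w for w in sig_a.values()))
    norm_b = math.sqrt(sum(w * w for w in sig_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest(values, documented, threshold=0.8):
    """Return (description, score) of the best documented match, or None."""
    sig = signature(values)
    best = None
    for description, sample in documented.items():
        score = cosine(sig, signature(sample))
        if score >= threshold and (best is None or score > best[1]):
            best = (description, score)
    return best


# Hypothetical columns that a data steward has already documented.
documented = {
    "Order date": ["2019-12-03", "2020-01-15", "2020-02-28"],
    "Postal code": ["75001", "69002", "33000"],
}

suggestion = suggest(["2021-06-30", "2021-07-01"], documented)
print(suggestion)  # ('Order date', 1.0)
```

The result is only a suggestion: in the workflow described above, the Data Steward still decides whether to accept it.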

This technology also helps DPOs (Data Protection Officers), among others, to identify and flag personal or sensitive information that the organization holds in its databases.
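
As a rough illustration of how such flagging might work, the sketch below marks a column as likely containing personal data when most of its values look like email addresses. The regular expression is intentionally simplistic, and the 0.8 threshold, the function names, and the sample table are assumptions made for this example, not part of Zeenea's product.

```python
import re

# Simplistic email pattern, for illustration only.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def match_ratio(values, pattern):
    """Share of non-null values in a column that match the pattern."""
    cleaned = [str(v).strip() for v in values if v is not None]
    if not cleaned:
        return 0.0
    return sum(1 for v in cleaned if pattern.match(v)) / len(cleaned)


def flag_personal_columns(table, pattern=EMAIL, threshold=0.8):
    """Names of columns whose match ratio reaches the threshold."""
    return [
        name
        for name, values in table.items()
        if match_ratio(values, pattern) >= threshold
    ]


# Hypothetical table, given as column name -> sample values.
table = {
    "contact": ["ana@example.com", "li@example.org", "sam@example.net"],
    "country": ["France", "United States", "Australia"],
}

flagged = flag_personal_columns(table)
print(flagged)  # ['contact']
```

A flagged column is a candidate for review by the DPO, not a final classification.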

At Zeenea, we work hard to create a data fluent world by providing our customers with the tools and services that allow enterprises to be data driven.
