Search for personal data anonymization software


This case study describes the search for personal data anonymization software.

 

Client’s problem statement

The client is a French group of publishing companies, a subsidiary of a large conglomerate, and the second-largest French publishing group. They are looking for a solution to anonymize personal data in their corporate non-production environments. Their IT landscape covers a large SAP scope (on-premise and cloud) as well as many other systems that may need to be anonymized.

 

They have 3 main KPIs:

 

KPI#1: Efficiency and ease of use; an anonymizer tool that can run on SAP and non-SAP systems

KPI#2: High ROI (sensitive to cost)

KPI#3: Compatibility with both on-premise and cloud solutions

 

List of relevant vendors

Informatica, Delphix, Talend, SAP UI Masking, IBM Optim, Snowflake, DataProf, Baffle, EPI-USE Labs, Alteryx, Oracle, OpenText, Fortanix, Mentis, GenRocket, Arcad Software, Privitar, Tonic.AI, VGS, Precisely, Libelle, Gretel.ai, Imperva, OneTrust, KNIME, K2view, Brighter AI Technologies, Tumult Labs, PrivacyOne, Protegrity, Droon

 

In just three weeks, Vektiq's AI-powered platform pinpointed the ideal personal data anonymization software for the client's CISO (RSSI).

 

Ready to start your search? Link to our self-service platform here.

 

Performance Matrix

Vendors' answers

Informatica’s answer

 

Informatica provides its customers with various tools and features to help them with data anonymization, ensuring compliance with data protection regulations and safeguarding sensitive information. Here’s how Informatica helps its customers with data anonymization:

Data Masking

Sensitive Data Discovery

Data Subsetting and Sampling

Referential Integrity Preservation

Encryption

Policy Management and Governance

Audit and Compliance Reporting

Automated Anonymization
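These bullets name generic masking techniques rather than Informatica-specific APIs. To make one of them concrete, here is a minimal Python sketch of referential integrity preservation, written under our own assumptions rather than taken from Informatica: masking a key deterministically in every table keeps joins between anonymized tables valid.

```python
import hashlib

def mask_key(value: str, secret: str = "masking-key") -> str:
    """Deterministic masking: the same input always yields the same
    output, so foreign-key relationships survive anonymization."""
    digest = hashlib.sha256((secret + value).encode()).hexdigest()
    return "CUST_" + digest[:10]

customers = [{"id": "C001", "email": "jane@example.com"}]
orders = [{"order_no": 1, "customer_id": "C001"}]

# Mask the key consistently in both tables so joins still match.
for c in customers:
    c["id"] = mask_key(c["id"])
for o in orders:
    o["customer_id"] = mask_key(o["customer_id"])

assert orders[0]["customer_id"] == customers[0]["id"]  # integrity preserved
```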

 

What other vendors say about Informatica

 

Vs Talend: license cost, ease of use (ability to enable business users)

 

Vs Imperva:

– Automated operations

– Retain realistic data attributes for test integrity

– Avoid the negative compliance and security impact of breaches and audit failures

Vs OneTrust:

-Automated Regulatory Intelligence

-Field Level Discovery Classification

-Automated remediation actions (anonymization, deletion, redaction, masking, etc.)

Vs Precisely: Our pricing model is simpler.

 

Vs Delphix: Delphix is a low-code/no-code solution, which simplifies its integration into the client's information system. It offers automated discovery of personal data. Delphix natively provides multi-cloud referential integrity across multiple applications. Performance: with its container-based Hyperscale architecture, Delphix can anonymize very large data volumes (over 10 TB, or several hundred million rows) in a short time.

 

Vs Alteryx:

 

Informatica and Alteryx play in different markets.

-The need for curated/trusted data makes Alteryx users more successful within organizations. Alteryx is a complement to your data integration strategy.

-Alteryx can make your data integration strategy better. Even in the context of a strong data integration strategy, allowing business and IT to work closely together on the same platform helps organizations unlock value from their data, faster.

-Alteryx is a platform for solving analytical problems, data analysis is not decoupled from data preparation.

-Alteryx is a perfect complement as we overcome the limitations of traditional data integration vendors with:

– Agility: time to value for IT who can use Alteryx to prototype data integration flows and time to insight for the business who can answer any ad-hoc requests, experiment and solve problems.

– Accessibility / self-service: Alteryx offers a platform that can be used by all users, not just IT, especially by business users who best know the data and can help IT build the right workflows.

-Analytics, not just data preparation

 

Vs GenRocket:

 

Traditional TDM vendors such as Informatica use a Gold Copy approach for test data that has many drawbacks:

-The test data only meets 50% of requirements for volume, variety and formats, so organizations also have to create test data manually (spreadsheets, scripts, batch runs)

-Test data reservation is required because there isn't enough test data for different testing teams

-Test data storage is expensive (all those terabytes of Gold Copy data)

-Test data goes “stale” quickly and refresh cycles for Gold Copy data in regulated businesses can take days to weeks

-Test data is not easily mapped into a test case as part of a CI/CD pipeline

-Test data use isn’t easily tracked so test data CoE teams lack visibility into ROI

In contrast, GenRocket’s approach for test data automation has many advantages:

-Test data meets close to 100% of test data requirements with no manual data creation

-Test data does not need to be reserved; all testing teams get as much data volume and variety as they need

-Test data does not have to be stored; it is delivered “on demand” in seconds to minutes

-Test data does not go stale because it is dynamically updated, always accurate and always “fresh”

-Test data is mapped directly into a test case, called by the test script as part of a CI/CD pipeline

-Test data use is automatically tracked so test data CoE teams have full visibility into ROI
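GenRocket's generation engine is proprietary, but the core idea above, generating test data on demand instead of storing a Gold Copy, can be sketched in a few lines of Python with the open-source Faker library (an illustrative stand-in, not GenRocket's actual API):

```python
from faker import Faker  # pip install faker

fake = Faker()
Faker.seed(42)  # seed for reproducible test runs

def generate_test_rows(n: int):
    """Yield synthetic rows on demand; nothing is stored on disk, so
    the data never goes stale and needs no reservation."""
    for _ in range(n):
        yield {"name": fake.name(), "email": fake.email(), "iban": fake.iban()}

# A test case pulls exactly the volume it needs, when it needs it.
rows = list(generate_test_rows(1000))
```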

 

Delphix’s answer

 

 

The Delphix data platform automates the delivery of anonymized data to non-production environments (developers, QA, analysts...), as well as the discovery and protection of sensitive data in compliance with applicable regulations (GDPR). Customers thereby accelerate the modernization of their applications while reducing compliance risk.

The platform consists of two main solutions:

Continuous Data Engine: a platform that provides lightweight virtual copies of databases for development/test, reporting, AI/ML and production-support use cases. Thanks to this virtualization, we can provision and refresh very large databases (10 TB+) in minutes, without the need for subsetting or synthetic data.

Advantages of Continuous Data: 10:1 consolidation of non-production environments, synchronization with the production database without interruption, fast provisioning, data version control, transactional granularity, multi-cloud, CI/CD integration, a fully API-driven platform, and a self-service portal.

Continuous Compliance: an anonymization solution that secures and automates governance around sensitive data, ensuring that your data complies with applicable regulations and your internal policy. Advantages of Continuous Compliance: automatic profiling of sensitive data, algorithms covering 98% of customer cases (low to zero code), automatic anonymization, multi-cloud referential integrity across multiple databases and applications, and anonymization of petabytes of data in record time.

 

What others think of Delphix:

 

 

Vs Mentis (Mage): Delphix offers rudimentary masking methods like substitution, redaction, tokenization, etc. Mage, on the other hand, offers over 80 anonymization methods spanning encryption, masking, and tokenization. Unlike Delphix, Mage's anonymization can ensure referential integrity across data sources while maintaining data usability and function without compromising data security.

 

Vs Tonic.AI: Tonic provides more realistic data, preserving the messiness and the noise of the original dataset. Delphix provides standard masking.

 

Vs Imperva: Easy to deploy, simple to use, not dependent on a specific platform

 

Vs GenRocket: Traditional TDM vendors like Delphix leverage a Gold Copy approach for test data that has many drawbacks:

-The test data only meets 50% of requirements for volume, variety and formats, so organizations also have to create test data manually (spreadsheets, scripts, batch runs)

-Test data reservation is required because there isn't enough test data for different testing teams

-Test data storage is expensive (all those terabytes of Gold Copy data)

-Test data goes “stale” quickly and refresh cycles for Gold Copy data in regulated businesses can take days to weeks

-Test data is not easily mapped into a test case as part of a CI/CD pipeline

-Test data use isn’t easily tracked so test data CoE teams lack visibility into ROI

In contrast, GenRocket’s approach for test data automation has many advantages:

-Test data meets close to 100% of test data requirements with no manual data creation

-Test data does not need to be reserved; all testing teams get as much data volume and variety as they need

-Test data does not have to be stored; it is delivered “on demand” in seconds to minutes

-Test data does not go stale because it is dynamically updated, always accurate and always “fresh”

-Test data is mapped directly into a test case, called by the test script as part of a CI/CD pipeline

-Test data use is automatically tracked so test data CoE teams have full visibility into ROI

 

Talend’s answer

 

Our data masking and pseudonymization features are embedded in our Data Management platform, which has multiple connectivity options to SAP and to many other data sources (databases, files, COBOL, cloud storage, APIs...). Talend's data masking features provide built-in and extensible masking policies to intelligently de-identify sensitive data without losing referential integrity, in order to meet regulatory, security and privacy requirements. It can mask, encrypt, shuffle or cleanse data through standard components. Its hybrid architecture allows processing on-premise, in your VPC or in the cloud. It is used by customers with stringent data security guidelines, for example in the financial, pharmaceutical and defense sectors. Talend also has a Data Catalog solution in its portfolio, enabling PII detection and usage tracking. Our pricing is independent of the volume of data or connections; it is based on the number of users, giving our customers greater visibility into the TCO of the solution.
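Talend packages these operations as graphical components, but the "shuffle" technique named above is easy to picture. Here is a minimal Python sketch (our illustration, not Talend code): permuting a sensitive column keeps the value distribution realistic while severing the link between each value and its row.

```python
import random

rows = [
    {"employee": "E1", "salary": 52000},
    {"employee": "E2", "salary": 87000},
    {"employee": "E3", "salary": 61000},
]

# Shuffle masking: permute the sensitive column across rows so the
# overall distribution is preserved but value-to-person links are broken.
salaries = [r["salary"] for r in rows]
random.shuffle(salaries)
for row, salary in zip(rows, salaries):
    row["salary"] = salary
```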

 

What others think of Talend

 

Vs Informatica:

-Robustness and Scalability: Informatica has much more advanced algorithms, boosted in particular by AI, and continues to innovate and invest in the data anonymization segment (it is in the process of acquiring a major player in the sector).

-Broad Connectivity: Informatica offers a wide range of pre-built connectors and adapters to connect to various data sources, applications, and cloud platforms. This extensive connectivity makes it easier to integrate with different systems and data types.

-Strong Customer Base

Vs Alteryx:

– Analytic for All, end-to-end & self-service Analytics Platform

– User Experience, Best-in-class ease of use

– Easy offer, easy deployment (pricing)

– Data Prep for Business users

– Alteryx Intelligence Suite for Citizen Data Scientist

– Descriptive, Diagnostic, Predictive & Prescriptive analytics

– Insight Generation, Specialized Analytics, Root-Cause analysis

– Low code – No code approach

 

Talend Open Studio is not for business analysts with a problem to solve, but rather for data engineers / IT. Alteryx and Talend remain fundamentally different and complementary. Although our data preparation capabilities can be used centrally by IT, our product was built for the business analyst, which makes it much easier to use. Alteryx Designer is at the core of our platform and contains a wide range of data manipulation capabilities to meet the most sophisticated needs in one single product. This makes it very accessible, in comparison to Talend's 15 different products with overlapping capabilities.

 

IBM’s answer 

 

IBM’s data masking solutions utilize advanced techniques like tokenization, encryption, and randomization to protect sensitive data without compromising its usability for testing and development purposes.

You can also benefit from IBM's broader portfolio of services, such as cloud solutions, AI-powered analytics, and cybersecurity, to further enhance your data protection and overall IT infrastructure.

IBM's ongoing support and maintenance ensure that the data masking solution stays up to date with evolving security threats and changing business requirements.

Overall, partnering with IBM for data masking in SAP environments (connectors available both on-premises and in the cloud) comes with a trusted and experienced integrator, a tailor-made solution, and a commitment to data security and compliance, contributing to your success in securely managing sensitive information.

The solutions that could help you, based on the use case: IBM Knowledge Catalog on Cloud Pak for Data (business-ready data for AI and analytics with intelligent cataloging, backed by active metadata and policy management)

https://www.ibm.com/products/knowledge-catalog

https://www.ibm.com/products/cloud-pak-for-data

IBM Security Guardium (Wide visibility, compliance and protection throughout the data security lifecycle)

Additional info

 

What others think of IBM

 

Vs Talend: Cost, user interface

 

Vs Delphix: Simplicity of implementation. Delphix is a low-code/no-code solution, which simplifies its integration into the client's information system. It offers automated discovery of personal data. Delphix natively provides multi-cloud referential integrity across multiple applications. Performance: with its container-based Hyperscale architecture, Delphix can anonymize very large data volumes (over 10 TB, or several hundred million rows) in a short time.

 

Vs Libelle: From our understanding, there are complaints about usability.

 

What other vendors think of SAP

 

Vs Protegrity: SAP can only provide some access control and identity management in their systems. You would not have the ability to anonymize or pseudonymize data or to apply multi-layered protection. This approach would not adhere to regulations and guidelines.

 

Vs EPI-USE Labs: Data Secure offers a comprehensive and customizable solution for masking sensitive data. A significant competitive advantage over SAP lies in our provision of both dictionary and database masking capabilities. SAP solely offers dictionary masking (UI masking only), so sensitive data remains visible on the database; with Data Secure, such risks are mitigated, ensuring enhanced data security and compliance. EPI-USE Labs provides a tool that allows masking sensitive data at the dictionary and database levels. Our solution is optimized for mass processing; all functional dependencies are automatically addressed; we provide an accelerated implementation with delivered content; EPI-USE Labs can customize any scrambling rule for Publishing company's purposes at both the dictionary and database levels; the masking rules will keep working after system upgrades or support packs; we are able to anonymize data across SAP and non-SAP systems; and the solution supports compliance with data privacy regulations.

 

Vs Delphix: SAP UI Masking is dedicated to SAP and does not support other solutions. Delphix can consistently anonymize across several different database and file technologies (SAP, Oracle, XML, SAP HANA, SQL Server, etc.).

 

Vs Libelle: Only offers pseudonymization

 

Vs Snowflake: While SAP BW offers robust solutions, particularly for existing SAP customers, Snowflake provides several distinct advantages:

-Cloud-Native Architecture: Snowflake's cloud-native architecture offers easy scalability and cost-effectiveness. SAP BW, on the other hand, was originally designed as an on-premise solution and might not deliver the same level of performance and cost-effectiveness in the cloud as Snowflake.

-Easier Integration with Non-SAP Data Sources: While SAP BW integrates well with other SAP products, integration with non-SAP data sources can sometimes be complex. Snowflake, however, is designed to work seamlessly with a wide array of data sources, making it a more flexible solution when dealing with diverse data ecosystems.

-Separation of Storage and Compute: Snowflake’s architecture allows compute and storage resources to scale independently. This is different from SAP BW where scaling usually means scaling both, which can be less cost-effective.

-Simplified Pricing Model: Snowflake’s pricing is clear and predictable, based on storage and compute usage. SAP BW’s pricing can be more complex, depending on various factors including the number of users and the size of the organization.

-Support for Semi-Structured Data: Snowflake can handle structured and semi-structured data types, like JSON, XML, Avro, and Parquet natively. While SAP BW also supports various data types, dealing with semi-structured data typically requires additional steps or tools.

-Fully-Managed Service: Snowflake takes care of all maintenance, infrastructure management, and optimization tasks. In comparison, SAP BW might require more in-house maintenance, particularly in on-premise deployments.

 

Snowflake’s answer

 

Below, we’ll break down how Snowflake can solve Publishing company’s specific challenges in alignment with the stated KPIs:

KPI#1: Efficiency, Easy to Use, Anonymizer Tool for SAP and Non-SAP Systems → Solution with Snowflake:

-Dynamic Data Masking: Snowflake's dynamic data masking feature allows Publishing company to apply specific policies that can mask or hide sensitive information. The raw data remains unchanged, ensuring the integrity of the information while maintaining privacy. (See the sketch after this list.)

-Integration with SAP and Non-SAP Systems: Snowflake’s ability to connect with various data sources ensures a seamless integration process with both SAP and non-SAP environments.

-User-Friendly Interface: Snowflake’s platform is designed with ease of use in mind, allowing Publishing company’s team to execute and manage anonymization tasks without deep technical expertise.

KPI#3: Compatibility with Both On-Premise and Cloud Solutions → Solution with Snowflake:

-Hybrid Architecture Support: Snowflake’s flexibility in supporting both on-premise and cloud platforms will enable Publishing company to deploy the solution across their diverse IT environment.

-Real-Time Data Sharing: The platform facilitates secure data sharing across different regions and platforms, ensuring cohesion between different parts of Publishing company’s organization.

KPI#2: High ROI (Sensitive to Cost) à Solution with Snowflake:

-Pay-as-You-Go Model: Snowflake’s consumption-based pricing allows Publishing company to pay only for the storage and compute resources they use, avoiding unnecessary costs.

-Scalability: Snowflake can effortlessly scale up or down based on Publishing company’s needs, ensuring that they are neither overpaying for unnecessary resources nor constrained by lack of capacity.

-Risk Mitigation: By implementing a robust anonymization process, Publishing company can minimize the risk of potential data breaches or non-compliance penalties, translating to financial savings.
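To make the dynamic data masking mentioned under KPI#1 concrete, here is a sketch of Snowflake's masking-policy pattern, submitted through the snowflake-connector-python driver; the connection parameters, table, column and role names are hypothetical placeholders, not details from the client's environment.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Connection parameters are placeholders for illustration only.
conn = snowflake.connector.connect(
    account="myorg-myaccount", user="admin", password="***"
)
cur = conn.cursor()

# A masking policy: privileged roles see the real value at query time,
# everyone else sees a mask; the stored data itself is never changed.
cur.execute("""
    CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING)
    RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('DATA_ADMIN') THEN val
           ELSE '*** MASKED ***' END
""")

# Attach the policy to a hypothetical column.
cur.execute(
    "ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask"
)
```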

Conclusion

Snowflake provides a comprehensive solution to Publishing company’s need for data anonymization across its varied landscape. By offering a seamless integration with SAP and non-SAP systems, a user-friendly experience, a cost-effective pay-as-you-go model, and robust support for both on-premise and cloud solutions, Snowflake is well-positioned to meet Publishing company’s KPIs and provide a high return on investment. Additionally, Snowflake’s security features and dynamic data masking ensure that sensitive information is handled with the utmost care, aligning with Publishing company’s commitment to privacy and compliance.

 

EPI-USE Labs' answer

 

DSM is designed as an add-on that is installed directly into each of the SAP systems in scope, removing any need for middleware and keeping execution logic ‘close to data’ for efficiency. DSM is a highly adaptable and user-friendly solution, ensuring seamless compatibility with both on-premise and cloud environments while guaranteeing significant reductions in costs and space requirements.

 

DSM itself doesn't need much scaling in its architecture or design, but it is built on principles that ensure it always takes minimum viable datasets of excellent quality, so that it can run in any size of SAP system, from 100 GB to 85 TB+. It also scales well across system types, as shown by our wide range of supported systems and Industry Solutions.

 

Modularity and extensibility are built into DSM. It is written in object-oriented ABAP, ensuring all coding is done to allow maximum extensibility, and the Business Object Workbench (BOW) provides a configurable and customizable semantic layer for mapping and affecting data, which means any customization is done at the data-mapping level and inherited by every licensed function.

 

DSM provides a detailed Monitor Desk and extended “Trace” Options on every process.

Diagnostic Logs (DLs) are also captured for every run and can optionally be uploaded to our Client Central platform for analysis. It also provides a specific Audit Log to track the masking Policies.

 

EPI-USE Labs has been handling the export and import of SAP data for more than 25 years, providing security through data anonymization for the last 10 years within the Data Secure Solution. With the advent of GDPR, EPI-USE Labs redesigned Data Secure to allow full customization to specific requirements and enhanced coverage of the integrated SAP Data Model.

 

Leveraging the SAP domain knowledge already built inside the Data Sync Manager solution, we went one step further to map each individual Personally Identifiable data field within the standard SAP data model – then further integrated these fields with the relationships between SAP objects and the system. These have been collectively named the EPI-USE Labs Integrity Maps and with these, our solution understands both:

– An Employee can also be a Vendor/Customer and also be replicated to CRM/SRM as a Business Partner

– Functionally how the data interrelates both inside an ECC environment and between SAP systems

 

Building on these integrity maps with conditional rules to consider individual 'Person' business partners and groups/companies, we now provide a default policy covering all the master data you would expect, along with interpreted cluster tables and linked transactional tables, to ensure that when we anonymize a person in your SAP environment it is done without a trace of the original value and consistently throughout the solution.

During implementation, our Professional Services team will work with Publishing company on the customizations required to make the specific changes for the Publishing company database. Our default policy and rules are detailed in this link, where you will have access to view the document called 'DSM Standard scrambling content scope.xlsx'.

Additional info

 

Protect sensitive data

 

To mask sensitive SAP data, use pre-defined masking rules from our online user community, extend the rules if needed, and then apply them to scramble data during Syncs or in place.

Administrators can define masking policies in one place and apply them across the entire SAP landscape, thus masking all SAP systems consistently. Data Secure can even discover sensitive data in custom info types and protect it. The product handles large data volumes easily.
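Data Secure's rules are configured inside SAP itself, but the "define a policy once, apply it everywhere" idea generalizes. Here is a minimal Python sketch under our own assumptions (the field names and rules are illustrative, not EPI-USE Labs content):

```python
# One central policy: field name -> masking function.
POLICY = {
    "name": lambda v: "ANON-" + str(abs(hash(v)) % 10_000),
    "email": lambda v: "user@example.invalid",
    "iban": lambda v: v[:4] + "X" * (len(v) - 4),
}

def apply_policy(row: dict) -> dict:
    """Apply the same masking rules to a row from any system."""
    return {k: (POLICY[k](v) if k in POLICY else v) for k, v in row.items()}

# The same policy masks rows from an SAP extract and a non-SAP one.
sap_row = {"name": "Jane Doe", "email": "jane@corp.com", "dept": "HR"}
crm_row = {"name": "Jane Doe", "iban": "FR7630006000011234567890189"}
masked = [apply_policy(sap_row), apply_policy(crm_row)]
```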

 

The standard delivery of Data Secure includes more than 2000 individual sensitive field mappings, collected into logical rules to deal with:

– Names

– Telephone number

– Email Address

– Banking information

– Company Car Registration

– Street Addresses and many more.

 

The spreadsheet called 'DSM Standard scrambling content scope.xlsx' (additional info) will provide detail on the default rules and SAP objects which are covered out of the box by EPI-USE Labs. However, you will of course have customisations which you have implemented in SAP Z-tables and fields; as such, the solution is fully extendable once the detailed requirements are understood. To help in identifying these requirements, we will support you in running a Data Discovery and Workshop as outlined below.

 

Discover your custom Personally Identifiable Information

 

Although EPI-USE Labs has extensive experience with multiple clients, we also know that each SAP instance is unique in what functionality will have been configured/activated and what has been customised. To identify Personally Identifiable information from the customised SAP solution, an EPI-USE Labs Professional Services Consultant will independently complete a Data Discovery within the SAP environment.

We leverage our unique, market-leading IP containing defined mapping of data throughout SAP ERP and its relationship with SAP CRM / SRM / SCM etc. contained in Integrity Maps (or IMAPS for short). These IMAPS are held as part of our full object definition for the SAP landscape and identify each data item and any linked data into a single grouping. We then combine this with our expert consulting services to search your SAP landscape for the Personally Identifiable Information (PII) data that may exist.

 

The Data Discovery will complete a search of the Data Element and Domain for each identified sensitive data item and report all areas of the client(s) which have shared data. The consulting team will then further analyze the connections and data contained to deliver a connected solution proposal and analysis of sensitive data exposure. To ensure these results are accurate, our consultants will need to perform the analysis against Production (or a very recent copy of Production) as the checks also confirm where data is populated, and therefore this also needs to be considered.

 

Transports containing our IP will be provided as part of the engagement. We will request that Publishing company imports these and provides the relevant access for our consultants to complete the activity. The authorization to be provided must be in Production or a recent copy of Production and will mandatorily include the following (additions may be needed depending on the findings):

– Open access to any table through SE16/N

– Transactional access to /EUC/DS_DISCOVERY (provided within the transport)

– Transaction access to interrogate the data structures through SE11

– Job scheduling and monitoring through SM37, SM50, etc.

 

This is the absolute minimum access that is needed for the EPI-USE Labs consultant to be able to complete the technical discovery activity. The results of this analysis will be summarised into the Data Privacy Workshop and System Analysis Report. A default example of the output report has been shared along with this document, titled 'Data Privacy System Analysis example.docx'. Link to access the document: additional info

Handling customisations in the delivery

 

In the preparation phase and in parallel to this RFP, EPI-USE Labs will ask Publishing company to execute the Data Sync Manager Readiness report in each SAP ABAP stack system involved. Within this report, there is a specific check for common Personally Identifiable Information within your custom content. This will help us produce a 'DSMr Privacy analysis summary.xlsx', which will be shared and used for possible pricing.

 

Although the identified fields may contain sensitive data, without a direct analysis and a requirement review for what Publishing company considers to be Personally Identifiable, it is not certain if this data should be in scope.

 

Following the workshop and detailed requirement definition, a commercial review and change request process will be completed to finalise the amount of customisation which EPI-USE Labs will complete, and what Publishing company will complete themselves following training and standard policy implementation.

Comply with data protection laws

 

Data Secure significantly reduces the risk of security breaches of non-production systems. It helps you comply with globally accepted data protection standards, such as Sarbanes-Oxley and the General Data Protection Regulation (GDPR). This is crucial for the security of your employees, clients, and partners.

 

We propose to collaborate on this project with our esteemed partner, DATAPROF, a leading expert in data masking for non-SAP systems, who have independently provided their responses to this questionnaire. By leveraging the collective strengths of DATAPROF's cutting-edge technology and our own expertise, we possess the full capability to meet Publishing company's requirement for consistent test data masking across integrated SAP and non-SAP landscapes. The following link is a webinar where we explain the cutting-edge approach employed by EPI-USE Labs and DATAPROF to effectively mask data across SAP and non-SAP landscapes.

 

https://www.youtube.com/watch?v=5nM1vNgRmpo Pricing options: perpetual and subscription. In order to provide prices to Publishing company, we need the sizing of the systems and to run DSMr. The DSMr is a system assessment tool developed by EPI-USE Labs that helps assess your SAP landscape before you install one of our DSM solutions and estimate the cost of the project.

 

Baffle’s answer

 

Baffle's data protection solution is well-equipped to address Publishing company's data anonymization needs for their corporate non-production environments. Here's how Baffle can fulfill their requirements based on the stated KPIs:

 

Efficiency and ease of use (KPI#1): Baffle offers a user-friendly anonymizer tool that can seamlessly run on both SAP and non-SAP systems. The tool is designed to simplify the data anonymization process, making it easy for Publishing company's IT team to anonymize personal data across their diverse set of systems. The solution provides intuitive interfaces and workflows, ensuring a smooth and efficient anonymization process without the need for extensive technical expertise.

 

Compatibility with both on-premises and cloud solutions (KPI#3): Baffle’s data protection platform is designed to be agnostic to the underlying infrastructure. It supports both on-premise and cloud environments, including popular cloud service providers like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). This compatibility ensures Publishing company can seamlessly deploy the anonymization solution across their hybrid IT landscape, covering a wide range of systems and databases without any compatibility issues.

 

High ROI and cost-effectiveness (KPI#2): Baffle’s data anonymization solution offers a high return on investment (ROI) due to its cost-effectiveness and comprehensive capabilities. By anonymizing sensitive data, Publishing company can significantly reduce the risk of data breaches and comply with data protection regulations, thereby avoiding potential legal and reputational costs. Additionally, Baffle’s solution is scalable, allowing Publishing company to pay for the exact resources they need, optimizing cost efficiency.

 

Furthermore, Baffle employs advanced cryptographic techniques such as format-preserving encryption and tokenization, ensuring that the anonymized data retains its referential integrity and usability for non-production purposes while still maintaining privacy and compliance. This means that development, testing, and other non-production teams can work with realistic, yet fully anonymized data, minimizing potential disruptions to business operations.
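Baffle's cryptography is proprietary, but the properties described above, format preservation and referential integrity, can be illustrated with a simplified Python sketch using a keyed HMAC (our assumption for illustration; unlike true format-preserving encryption, this variant is one-way rather than reversible). The same input always yields the same token of the same shape, which is what keeps joins across anonymized tables intact.

```python
import hashlib
import hmac

KEY = b"demo-secret-key"  # hypothetical key; use a real KMS in practice

def tokenize_digits(value: str) -> str:
    """Deterministic, format-preserving token: same length, digits only,
    and the same input always maps to the same token."""
    digest = hmac.new(KEY, value.encode(), hashlib.sha256).digest()
    return "".join(str(b % 10) for b in digest)[: len(value)]

card = "4111111111111111"
token = tokenize_digits(card)
assert len(token) == len(card) and token.isdigit()
assert token == tokenize_digits(card)  # deterministic, so joins still work
```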

 

In summary, Baffle's data protection solution meets Publishing company's requirements by providing an efficient, easy-to-use anonymizer tool that works across both SAP and non-SAP systems. It ensures compatibility with on-premises and cloud environments, enabling seamless integration with their existing infrastructure. Moreover, Baffle offers a cost-effective approach to data anonymization, resulting in a high ROI, and helps Publishing company maintain compliance with data protection regulations, protecting their sensitive data from unauthorized access and potential breaches.

 

Libelle’s answer

 

Publishing company wants to anonymize test data on SAP and non-SAP systems for GDPR compliance and the protection of sensitive data (e.g. trade secrets, salaries of board members). DataMasking anonymizes test data in a realistic manner and preserves consistency across an SAP landscape. DataMasking anonymizes irreversibly. Possible edge cases for testing purposes are preserved. DataMasking works on hybrid infrastructure, on-premises as well as cloud, and is available for ECC and S/4.

 

Arcad’s answer

 

The Publishing company problem we were presented with was the need to provide a service provider with anonymized data derived from production data on SAP and other DBMSs. We understand, of course, that this anonymized data must remain fully consistent to be exploitable as if it were real production data. DOT-Anonymizer addresses this issue as follows:

 

1. Use of a unique "cache" technology that guarantees consistency between data from different anonymized databases. John Smith will always be anonymized as Peter Gordon, whether he comes from SAP or Oracle, from the vacation table or the employee table. If you choose, for example, to randomly change the gender, he may be anonymized as Silvia Durban, and his social security number in the anonymized databases will then begin with 2. In this way, the "usefulness" of the data is preserved. (A minimal sketch of this caching idea follows this list.)

2. Use of ready-to-use transformation algorithms such as data generation from directories, random generation, and transformation of dates, addresses, IBANs, social security numbers, etc., which can be customized (Groovy language).

 

3. Use of connectors with numerous DBMSs, relational and unstructured databases, SAP, etc. A single tool to handle all your databases.

4. Automatic discovery of sensitive data in your tables. You automatically identify the fields to be anonymized according to the rules you have defined, and all that's left to do is launch your transformation algorithm.

DOT-Anonymizer enables you to provide your service provider with a consistent set of data, whether for testing, development or analysis purposes. You can restart the anonymization project at any time to provide updated data, on demand or on a scheduled basis.
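The "cache" technology described in point 1 is proprietary, but the consistency guarantee it provides can be sketched in Python; the replacement directory and function below are hypothetical illustrations, not DOT-Anonymizer internals.

```python
import random

REPLACEMENT_NAMES = ["Peter Gordon", "Silvia Durban", "Ana Reyes"]  # directory
_cache: dict = {}  # in a real tool, persisted and shared across databases

def pseudonymize(name: str) -> str:
    """Return the same replacement for the same original, every time,
    whichever database or table the original value comes from."""
    if name not in _cache:
        _cache[name] = random.choice(REPLACEMENT_NAMES)
    return _cache[name]

# Same person in the SAP and Oracle extracts -> same pseudonym in both.
assert pseudonymize("John Smith") == pseudonymize("John Smith")
```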

 

Some of the personal data anonymization vendor responses are not published in this blog. Please email contact@vektiq.com for more information.

 
