GDPR : data minimization is the key

Saturday 10 February 2018, by Dataplana

« Big Data » is frequently invoked as if it were a real deus ex machina, as if dumping all existing data into a large distributed architecture with off-the-shelf Business Intelligence software would magically reveal its best-hidden secrets. In reality, no purely technical solution can generate value and meaning from data.

Worse, heavy Hadoop architectures stuffed with bloated data often complicate the task by multiplying components and stakeholders. So-called « big data analysis » requires something far more valuable than machines : real expertise in data science and in demanding algorithms.

At Dataplana, we are convinced that relevant Big Data analyses cannot be obtained by brute-force computing over a vast ocean of uncontrolled data : mathematical algorithms applied to questionable data will always produce questionable results (« garbage in, garbage out »).

Data minimization is the key. Instead of storing the whole data stream as traditional Big Data solutions do, the Dataplana processing engine builds a « data lake » containing only useful, qualified and consolidated data. In this lake, it is easier and faster to fish out the answers to your business needs. The benefits : savings in energy, time and cost, but also a better command of the data and a lower risk of incorrect analyses.
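As a rough illustration of the principle (not the actual Dataplana engine, whose internals are not described here), the sketch below consumes a stream of records, rejects those that fail a qualification rule and consolidates the rest into per-customer aggregates, so raw events are never stored. The field names `customer_id` and `latency_ms` and the rule itself are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def is_qualified(record: dict) -> bool:
    """Hypothetical qualification rule: an identified customer and a usable measurement."""
    latency = record.get("latency_ms")
    return (
        "customer_id" in record
        and isinstance(latency, (int, float))
        and latency >= 0
    )


def consolidate(stream: Iterable[dict]) -> dict:
    """Reduce a stream to per-customer aggregates; raw records are not retained."""
    lake = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for record in stream:
        if not is_qualified(record):
            continue  # unqualified data never reaches storage
        entry = lake[record["customer_id"]]
        entry["count"] += 1
        entry["total_ms"] += record["latency_ms"]
        entry["max_ms"] = max(entry["max_ms"], record["latency_ms"])
    return dict(lake)


# Illustrative input: four raw events, one of them unusable.
events = [
    {"customer_id": "A", "latency_ms": 20},
    {"customer_id": "A", "latency_ms": 40},
    {"customer_id": "B", "latency_ms": None},  # rejected: no measurement
    {"customer_id": "B", "latency_ms": 15},
]
summary = consolidate(events)
```

Here `summary` holds one small aggregate per customer, whatever the number of incoming events, which is what keeps storage proportional to the question asked rather than to the volume received.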

Our approach is unorthodox because it is truly centered on the very DNA of your data : the preliminary analysis of the data by a data scientist is what allows us to optimize processing on software architectures at low cost and, above all, to produce results that perfectly match your business logic.

The Dataplana « smart engine » classifies information upstream and stores only the data that can generate value. It rests on a multi-disciplinary approach combining data science, system architecture and design to create agile Business Intelligence solutions. Our data scientists, who are also « full-stack » developers, have a 360-degree view of the project and its data. The same person can therefore understand the business rules, model the data and quickly develop the appropriate algorithms as part of a continuous improvement loop.

Columnar, in-memory, locally cached and pipelined, our « data lake » can be pictured as a set of matrices with vectorized business rules. It interfaces easily with any linear algebra software, including popular machine learning packages such as Scikit-learn.
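To make the matrix picture concrete, here is a minimal sketch that assumes NumPy as the array library (the engine's actual storage layer is not specified here). The column names and the threshold rule are hypothetical. Each column is one array, a business rule becomes a single vectorized boolean expression, and the columns stack into the two-dimensional feature matrix that packages such as Scikit-learn accept as input.

```python
import numpy as np

# Columnar layout: one array per attribute (hypothetical column names).
bitrate_kbps = np.array([4200.0, 800.0, 3900.0, 650.0])
packet_loss_pct = np.array([0.1, 4.5, 0.3, 6.0])

# A business rule written as one vectorized operation over whole columns:
# a session is flagged as degraded when bitrate is low AND packet loss is high.
degraded = (bitrate_kbps < 1000.0) & (packet_loss_pct > 2.0)

# Stack the columns into an (n_samples, n_features) matrix.
features = np.column_stack([bitrate_kbps, packet_loss_pct])
```

The rule is evaluated on all rows at once, without an explicit Python loop, and `features` can be handed as-is to a linear algebra routine or a machine learning estimator.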

Our solution does not impact your IT : you simply export the necessary data as is (CSV, XML, JSON...) into a deposit area. This way, you stay in complete control. With data minimization at the very core of our engine, data storage is strictly limited to what the intended purpose requires, which makes us compliant with the GDPR by design.
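A loader for such a deposit area could look like the sketch below. It is an illustration under stated assumptions, not the product's interface : the field whitelist `NEEDED_FIELDS`, the file names and the sample content are all hypothetical, JSON exports are assumed to be an array of objects, and XML is left out for brevity. Fields outside the whitelist are dropped at read time, so they are never stored.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical whitelist: only the fields the stated goal requires.
NEEDED_FIELDS = ("device_id", "status")


def read_export(path: Path) -> list:
    """Load one exported file and keep only the fields in NEEDED_FIELDS."""
    if path.suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
    elif path.suffix == ".json":
        # Assumes the export is a JSON array of objects.
        rows = json.loads(path.read_text(encoding="utf-8"))
    else:
        raise ValueError(f"unsupported export format: {path.suffix}")
    return [{field: row.get(field) for field in NEEDED_FIELDS} for row in rows]


# Demonstration with a temporary directory standing in for the deposit area.
with tempfile.TemporaryDirectory() as deposit:
    deposit_dir = Path(deposit)
    (deposit_dir / "probes.csv").write_text(
        "device_id,status,owner_name\nd1,ok,Alice\n", encoding="utf-8"
    )
    (deposit_dir / "probes.json").write_text(
        '[{"device_id": "d2", "status": "down", "owner_name": "Bob"}]',
        encoding="utf-8",
    )
    records = [
        record
        for path in sorted(deposit_dir.iterdir())
        for record in read_export(path)
    ]
```

After the run, `records` contains the device identifiers and statuses from both files, while the `owner_name` column, which the goal does not need, has been discarded.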

Dataplana has demonstrated that its technology can supervise and investigate the quality of service of 6 million IPTV customers, who generate one terabyte of network monitoring data every week, on a single workstation. We obtained exceptional results in investigating complex malfunctions, which quickly translated into a dramatic rise in customer satisfaction. This fully validates the relevance of our solution, which is especially suited to heavy flows of technical data (logs, sensors, probes...) that require specific business rules.

If you really want to understand your data and bring out its hidden wealth, you have come to the right place : let us cook your data and extract its quintessence.

Dataplana, your « data-partner » :
- Consulting services to analyze your data and provide you with solutions for extracting value from your data capital.
- Development of turnkey solutions fulfilling your business needs.
- Training in applied data science.

Contact us :