European financial authorities make their rich collection of personal data on European citizens available to financial institutions in a synthesized form

The Directorate-General for Financial Stability, Financial Services and Capital Markets (DG FISMA) of the European Commission (EC) announced:
Synthetic data enables national authorities to make their financial data publicly available.

These data are shared through the EU Data Hub, part of the EU Digital Finance Platform.

According to this announcement, a machine learning model analyses real customer data to understand relationships between factors, such as how income level affects loan default risk. The announcement continues:

Then, this analysis is used to train an algorithm to replicate these trends. Finally, the model generates new data that mirrors the patterns of the original data but does not represent any actual person. For instance, it might create a synthetic “customer” with certain financial characteristics, but this customer doesn’t exist.
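The analyse, train and generate steps described in the announcement can be illustrated with a deliberately small sketch. It is not the software used for the EU Data Hub, whose internals the announcement does not describe. The "real" records here are made-up toy data, and the function names (`make_toy_customers`, `fit`, `generate`) and the income-band model are assumptions chosen only for illustration.

```python
import bisect
import math
import random
import statistics


def make_toy_customers(n, rng):
    """Made-up stand-in for real records: (income, defaulted) pairs in
    which the chance of default falls as income rises."""
    rows = []
    for _ in range(n):
        income = rng.gauss(40_000, 10_000)
        p_default = 1 / (1 + math.exp((income - 30_000) / 5_000))
        rows.append((income, 1 if rng.random() < p_default else 0))
    return rows


def fit(rows, n_bands=5):
    """Analyse/train step: learn the income distribution and the
    default rate observed in each income band."""
    incomes = [income for income, _ in rows]
    cuts = statistics.quantiles(incomes, n=n_bands)  # n_bands - 1 cut points
    counts = [0] * n_bands
    defaults = [0] * n_bands
    for income, defaulted in rows:
        band = bisect.bisect_right(cuts, income)
        counts[band] += 1
        defaults[band] += defaulted
    return {
        "mean": statistics.fmean(incomes),
        "stdev": statistics.stdev(incomes),
        "cuts": cuts,
        "default_rate": [d / c if c else 0.0 for d, c in zip(defaults, counts)],
    }


def generate(model, n, rng):
    """Generate step: draw new rows from the learned model only,
    without looking at the original rows again."""
    rows = []
    for _ in range(n):
        income = rng.gauss(model["mean"], model["stdev"])
        band = bisect.bisect_right(model["cuts"], income)
        defaulted = 1 if rng.random() < model["default_rate"][band] else 0
        rows.append((income, defaulted))
    return rows


if __name__ == "__main__":
    real = make_toy_customers(5000, random.Random(0))
    model = fit(real)
    synthetic = generate(model, 5000, random.Random(1))
    print("default rate per income band:", model["default_rate"])
    print("real default rate:     ", sum(d for _, d in real) / len(real))
    print("synthetic default rate:", sum(d for _, d in synthetic) / len(synthetic))
```

In this toy setting the synthetic rows reproduce the link between income and default without copying any original row. Production synthesizers use far richer models, and whether their output really protects individuals has to be tested rather than assumed.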

There are several reasons for using synthetic data. One major benefit is privacy protection, as the synthetic nature of the data removes the risk of revealing real customer information, ensuring compliance with privacy regulations. It also enables data sharing, as central banks and national authorities can share synthetic data with external partners or researchers without privacy concerns. Additionally, the synthetic data retains enough quality to train models effectively, meaning there is no significant loss in accuracy.
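The utility and privacy claims above can each be probed with simple checks. The sketch below is only an illustration under stated assumptions: the data are made-up toy rows of (income, debt) in thousands of euros, the "synthetic" set is simply a second draw from the same toy process, and the helper names (`summary_gaps`, `closest_real_distance`) are hypothetical. It compares summary statistics of the two datasets and measures how close each synthetic row lies to its nearest real row.

```python
import math
import random
import statistics


def toy_rows(n, rng):
    """Made-up (income, debt) rows in thousands of euros."""
    rows = []
    for _ in range(n):
        income = rng.gauss(40, 10)
        debt = 0.5 * income + rng.gauss(0, 5)
        rows.append((income, debt))
    return rows


def pearson(rows):
    """Pearson correlation between the two columns of `rows`."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def summary_gaps(real, synth):
    """Utility check: absolute gaps in mean, standard deviation and
    income/debt correlation between the two datasets."""
    gaps = {}
    for i, name in enumerate(("income", "debt")):
        r = [row[i] for row in real]
        s = [row[i] for row in synth]
        gaps[name + "_mean"] = abs(statistics.fmean(r) - statistics.fmean(s))
        gaps[name + "_stdev"] = abs(statistics.stdev(r) - statistics.stdev(s))
    gaps["correlation"] = abs(pearson(real) - pearson(synth))
    return gaps


def closest_real_distance(real, synth):
    """Privacy check: distance from each synthetic row to its nearest
    real row. Values at or near zero flag copies or near-copies."""
    return [min(math.dist(s, r) for r in real) for s in synth]


if __name__ == "__main__":
    real = toy_rows(300, random.Random(0))
    synth = toy_rows(300, random.Random(1))  # stand-in for generator output
    print(summary_gaps(real, synth))
    print("smallest distance to a real row:", min(closest_real_distance(real, synth)))
```

Small gaps suggest the synthetic data keeps the main statistical properties, and distances that stay clearly above zero suggest no row was copied outright. Neither check is a proof of anonymity; both are only first screens.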

The announcement does not mention where the personal data of Europeans come from.

 

More information:

Publications:

  • Report: Synthetic data in the Data Hub of the Digital Finance Platform, announcement 9 April 2024, technical report (pdf). Abstract: “In a cooperation with DG FISMA, the JRC has tested a data synthetization software intended for the EU Data Hub. The purpose of this testing is to ensure that the new dataset, which will be made available as open data, maintains the properties of the original dataset while also protecting privacy and confidentiality. The report describes the datasets used to test the methodology and the steps taken for the synthetization. In addition, the report compares the main statistical properties of the original and new database, and summarizes the tests performed to draw conclusions on potential confidentiality and privacy issues. By testing and validating the data synthesis software, the JRC and DG FISMA are working to ensure that the new dataset will be a valuable resource for firms and researchers, while also respecting confidentiality issues.”
  • Study: European Data Spaces – Scientific Insights into Data Sharing and Utilisation at Scale, announcement of the study 12 June 2023, study (pdf). From the abstract: “This report distils technical and organisational lessons learned from the scientific work of the JRC that can inform the scoping and implementation of common European data spaces as envisioned by the European Strategy for data.”
  • Study: Technological Enablers for Privacy Preserving Data Sharing and Analysis, announcement 17 August 2023, study (pdf). From the abstract: “As data becomes more important, so does the need to protect privacy, in the sense of protecting the personal, confidential and/or private information that it contains. Privacy Enhancing Techniques (PETs) are a key enabler technology for ensuring that privacy is maintained while extracting value from the data.
    The first objective of this report is to analyse various PETs with an assessment of their usability and maturity in the context of data sharing scenarios, and in particular within common European data spaces. The second objective is to demonstrate the application of one PET in a collaborative scenario between different entities. Over the course of this analysis, these objectives were accomplished in two phases: first, a detailed state-of-the-art analysis and evaluation of the pros, cons, and maturity (i.e., their TRL level) of the different PETs was carried out, in order to select the one of greatest interest to propose a collaborative scenario between different entities. Secondly, a realistic use case from the healthcare domain, therefore relevant to the European Health Data Space was implemented with the selected PET (Federated Learning) and the results were evaluated.”
  • Study: FABLES: Framework for Autonomous Behaviour-rich Language-driven Emotion-enabled Synthetic populations, announcement 13 October 2023, study (pdf). From the abstract: “The research investigates how large language models (LLMs) emerge as reservoirs of a vast array of human experiences, behaviours, and emotions. Building upon prior work of the JRC on synthetic populations, it presents a complete step-by-step guide on how to use LLMs to create highly realistic modelling scenarios and complex societies of autonomous emotional AI agents. This technique is aligned with agent-based modelling (ABM) and facilitates quantitative evaluation.”
About Ellen Timmer

Weblog: https://ellentimmer.com/ ||| Microblog: https://mastodon.nl/@ellent ||| Motto: good intentions do not justify bad rules