Data Snacks

Data Snack: Opportunities in generating synthetic financial data

  • Collecting, organizing and sharing data can be a costly and risky process.
  • FIs and Big Tech are looking into generating 'synthetic data', which mimics real data but removes the risk of privacy breaches caused by exposure of Personally Identifiable Information (PII).

While data can be exceptionally useful for analytics and strategizing, mismanaging access to it can lead to significant security risks for both organizations and consumers. Personally Identifiable Information (PII) poses a challenge for organizations, which generally want to retain as much detail as they can without exposing customers to privacy risks.

One solution is synthetically generated data, which mimics real data sets but does not hold any PII. Moreover, synthetic data circumvents the labor and costs attached to data collection and organization, allowing teams to develop algorithms faster and with less red tape.

In the past year, companies like Microsoft, Google, and Amazon have all spoken about the importance of synthetic data and its use in their current architecture. San Diego-based startup and synthetic data creator Gretel.ai closed a $50 million Series B funding round in October, led by Anthos Capital. Its products, such as a privacy toolkit, safeguard synthetic data from adversarial attacks and enable teams to de-bias and anonymize their data sets, while also allowing teams to share data more securely.

JP Morgan’s AI research has developed the following model for generating synthetic data sets:

[Figure: flow diagram of the synthetic data generation process]

Source: JP Morgan

The flow diagram is explained by JP Morgan as follows: 

Step 1:  Compute metrics for the real data
Step 2:  Develop a Generator (may be statistical methods or an agent-based simulation)
Step 3:  (Optional) Calibrate the Generator using the real data
Step 4:  Run the Generator to generate synthetic data
Step 5:  Compute metrics for the synthetic data
Step 6:  Compare the metrics of the real data and synthetic data
Step 7:  (Optional) Refine the Generator to improve against comparison metrics
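
To make the seven steps concrete, here is a minimal Python sketch of the loop. It is not JP Morgan's implementation: the Gaussian generator, the choice of mean and standard deviation as metrics, the tolerance, and every function and variable name are assumptions chosen for illustration, and the "real" values are made-up numbers.

```python
import random
import statistics


def compute_metrics(values):
    """Steps 1 and 5: summarize a data set (assumed metrics: mean, stdev)."""
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
    }


class GaussianGenerator:
    """Step 2: a simple statistical generator (hypothetical choice)."""

    def __init__(self, mean=0.0, stdev=1.0, seed=0):
        self.mean = mean
        self.stdev = stdev
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def calibrate(self, real_metrics):
        """Step 3 (optional): fit the parameters to the real data's metrics."""
        self.mean = real_metrics["mean"]
        self.stdev = real_metrics["stdev"]

    def generate(self, n):
        """Step 4: draw n synthetic values."""
        return [self.rng.gauss(self.mean, self.stdev) for _ in range(n)]


def compare_metrics(real_metrics, synthetic_metrics):
    """Step 6: absolute gap between real and synthetic for each metric."""
    return {k: abs(real_metrics[k] - synthetic_metrics[k]) for k in real_metrics}


def run_pipeline(real_values, n_synthetic=5000, tolerance=2.0, max_rounds=5):
    real_metrics = compute_metrics(real_values)            # Step 1
    generator = GaussianGenerator()                        # Step 2
    generator.calibrate(real_metrics)                      # Step 3
    for _ in range(max_rounds):
        synthetic = generator.generate(n_synthetic)        # Step 4
        synthetic_metrics = compute_metrics(synthetic)     # Step 5
        gaps = compare_metrics(real_metrics, synthetic_metrics)  # Step 6
        if all(gap <= tolerance for gap in gaps.values()):
            break
        # Step 7 (optional): nudge the parameters toward the real metrics.
        generator.mean += real_metrics["mean"] - synthetic_metrics["mean"]
        generator.stdev *= real_metrics["stdev"] / synthetic_metrics["stdev"]
    return synthetic, gaps


# Invented transaction amounts standing in for real data.
real_values = [12.5, 40.0, 23.1, 75.9, 18.4, 52.3, 33.0, 61.7, 27.8, 44.2]
synthetic_values, metric_gaps = run_pipeline(real_values)
print(metric_gaps)
```

A single normal distribution is far too crude for real banking data (it can produce negative amounts, for example), and matching summary metrics does not by itself guarantee privacy. The sketch only shows how the steps connect into a loop of generating, measuring, comparing, and refining.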

In its research on the subject, JP Morgan found that tabular data in retail banking and time series of market microstructure data are the data types most in need of protection by financial institutions.

Tune in to our Data Day Conference on the 21st of June to find out more about how data is changing the fintech landscape.

