Synthetic 'twin' dataset balances privacy, data accuracy

October 11, 2019 // By Rich Pell
AI platform startup Diveplane (Raleigh, NC) has announced the availability of what it claims is the industry's first verifiable synthetic 'twin' dataset.

GEMINAI, says the company, enables businesses and government organizations to easily and safely sell, share, and analyze sensitive datasets without the fear of mishandling, loss, or theft. The 'twin' dataset looks, acts, and feels realistic for the purposes of data modeling and analysis, but does not contain any personally identifiable information, which is critical for businesses that need to adhere to national and international privacy laws and compliance requirements.

"We love seeing AI increasingly adopted by many industries, but we're finding that not all AI is created and trained equally," says Dr. Michael Capps, CEO of Diveplane. "Many businesses are forced to use inaccurate or incomplete data to train their AI due to privacy requirements, which can lead to the AI making poor or misleading decisions. With GEMINAI, we're eliminating that risk by creating a verifiable synthetic 'twin' of the dataset, so that businesses don't need to sacrifice the quality of their AI for the sake of privacy."

While other privacy techniques take data and simply mask certain slices of information, GEMINAI takes a dataset and uses it to create entirely new datapoints that did not previously exist while maintaining the statistical relationships and nuances contained in the original dataset. GEMINAI, says the company, can be used to:

  • Assist with medical research efforts. If a hospital were able to share truly anonymized patient records with nonprofits and research universities, it could help speed advances in the medical field and save more lives. GEMINAI can create those anonymized records so medical organizations do not need to worry about breaching HIPAA.
  • Secure the multibillion-dollar data-sharing industry. Data is being shared and sold at an incredible rate, and organizations rarely take the extra step to de-identify their datasets. GEMINAI easily creates synthetic data that does not contain any personally identifiable information, so there is no danger of unintentionally revealing an individual or entity if the information gets into the
