Sneak Peek: Use and Share Medical Data Freely
Practical Insights
July 16, 2024
Written by: Richard Schreiber

Get a quick preview of our medical whitepaper. Learn how artificial data can be used in the medical sector to enable free sharing of sensitive data.

Vadim Borisov

CTO

The Challenge of Patient Data Privacy


Healthcare providers and pharmaceutical companies face the dual challenge of protecting patient data while extracting meaningful insights for research and development. Conventional anonymization often degrades data quality and still leaves the data vulnerable to re-identification. Research by Latanya Sweeney and others shows that supposedly anonymized datasets can be re-identified using publicly available records. The need for robust data privacy solutions is critical, especially with the growth of data-driven decision-making and the coming AI era in healthcare. Traditional methods fail to ensure privacy and data usability at the same time, so analysis, collaboration, and data sharing, for example between clinics and healthcare companies, are usually impossible.
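To make the re-identification risk concrete, here is a minimal sketch of how one might check how many records are unique on a few indirect identifiers. The records, the column names, and the choice of `zip`, `birth_year`, and `sex` as quasi-identifiers are hypothetical and serve only as an illustration; they are not taken from the whitepaper.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
records = [
    {"zip": "02138", "birth_year": 1945, "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02138", "birth_year": 1945, "sex": "M", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "migraine"},
]

# Assumed quasi-identifiers: fields that also appear in public records.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def unique_fraction(rows, keys):
    """Fraction of rows whose quasi-identifier combination occurs exactly once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if counts[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


fraction = unique_fraction(records, QUASI_IDENTIFIERS)
print(f"{fraction:.0%} of records are unique on {QUASI_IDENTIFIERS}")
```

A record that is unique on such a combination can in principle be linked to a named entry in a public register that holds the same fields, even though the name was removed from the medical dataset.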
For a deeper analysis, more examples and more specific use cases than in this blog post, download our medical whitepaper.


Tabularis Technology: A New Solution Using Artificial Data Conversion


Tabularis uses transformer-based large language models (LLMs) to generate highly realistic artificial tabular data from the original, sensitive data. The LLM learns the distributions and correlations of the sensitive dataset and then samples a completely new dataset that reflects the learned correlations but does not contain or reveal any of the original data points. This approach avoids the lossy preprocessing and the re-identification risks of anonymization while preserving data quality. The process involves:

Flowchart of the conversion steps of artificial data with Tabularis.
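This post does not spell out how a table is presented to the language model. One approach described in the research literature on language-model-based tabular generators is to serialize each row as a short sentence of "column is value" clauses, train on those sentences, and parse sampled sentences back into rows. The sketch below shows only that encoding and decoding step, under the assumption that such a textual encoding is used; the function names and the example row are hypothetical, and this is not a description of the actual Tabularis pipeline.

```python
import random


def row_to_text(row, rng):
    """Serialize one table row as 'column is value' clauses in random order."""
    items = list(row.items())
    rng.shuffle(items)  # random order avoids implying a fixed column sequence
    return ", ".join(f"{col} is {val}" for col, val in items)


def text_to_row(text):
    """Parse a serialized sentence back into a row (all values as strings)."""
    row = {}
    for clause in text.split(", "):
        col, _, val = clause.partition(" is ")
        row[col] = val
    return row


# Hypothetical example row; values must not contain ", " or " is ".
rng = random.Random(0)
example = {"age": 54, "sex": "F", "diagnosis": "type 2 diabetes"}
sentence = row_to_text(example, rng)
print(sentence)
print(text_to_row(sentence))
```

In a full pipeline, the model would be trained on many such sentences and new sentences would be sampled from it and decoded; this sketch covers neither training nor sampling.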


A Real-World Case Study


To demonstrate our artificial data’s efficacy, we compared a primary healthcare dataset with its artificial version generated by Tabularis. The comparison shows how Tabularis maintains the statistical properties and correlations, so that the artificial data is as usable for analysis as the primary data, while preventing re-identification and therefore preserving privacy.
To highlight the similarities between the primary dataset and our artificial dataset, we show visual representations of the comparison: box plots of the distributions of three parameters in both datasets, and a correlation matrix showing the difference in correlations between the two datasets.

The box plots compare the distributions of the variables in the primary dataset and the artificial dataset. The matrix shows the differences in the percentage correlations between the primary and artificial datasets. The differences are consistently low; the highest deviation is 0.04. For a larger view, download the detailed medical whitepaper.
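A correlation-difference matrix of this kind can be sketched in a few lines. The example below assumes Pearson correlation and uses small made-up columns (`age`, `bmi`, `systolic_bp`); neither the correlation measure nor the data are taken from the case study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(table):
    """Pairwise correlations for a table given as {column_name: values}."""
    cols = list(table)
    return {(a, b): pearson(table[a], table[b]) for a in cols for b in cols}


def correlation_gap(primary, artificial):
    """Absolute difference of each pairwise correlation between two tables."""
    corr_p = correlation_matrix(primary)
    corr_a = correlation_matrix(artificial)
    return {pair: abs(corr_p[pair] - corr_a[pair]) for pair in corr_p}


def max_correlation_gap(primary, artificial):
    """Largest entry of the correlation-difference matrix."""
    return max(correlation_gap(primary, artificial).values())


# Made-up illustration data with identical column names in both tables.
primary = {
    "age": [34, 51, 47, 62, 29],
    "bmi": [22.1, 27.4, 25.0, 30.2, 21.5],
    "systolic_bp": [118, 135, 128, 142, 115],
}
artificial = {
    "age": [36, 49, 45, 60, 31],
    "bmi": [22.8, 26.9, 25.6, 29.5, 22.0],
    "systolic_bp": [120, 133, 130, 140, 117],
}

print(f"largest correlation difference: {max_correlation_gap(primary, artificial):.3f}")
```

The largest entry of the difference matrix is a convenient single number for summarizing how closely the artificial data follows the primary data, in the same spirit as the maximum deviation reported above.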

Benefits

  • Data Privacy and Legal Compliance: Ensures privacy and compliance with GDPR and HIPAA by eliminating connections to real individuals. Personal information is removed in a legally valid form.
  • Preservation of Statistical Integrity: Retains distributions, correlations, and variances, making artificial data well suited for analysis and machine learning.
  • Cost-Effective Data Management and Risk Mitigation: Reduces the costs and risks associated with preparing and handling sensitive data.
  • Enables Free Data Sharing: Allows collaboration on high-quality data through legal compliance.

Summary


Tabularis offers a powerful, secure, and efficient way to leverage healthcare data. By ensuring privacy and preserving statistical integrity, it enables comprehensive use of the data without compromising patient privacy. Tabularis artificial data can drive innovation and improve patient outcomes by enabling safe and effective data use and sharing. For a deeper analysis, more examples, and more specific use cases, download our medical whitepaper.



Get access to the full medical Whitepaper

Please enter your email address and we will send you the complete medical white paper with more detailed analyses, explanations and examples.

Richard Schreiber

CEO
