Exact Data Matching

Understanding Exact Data Matching

Rather than relying on pattern matching to identify sensitive data, Exact Data Matching (EDM) compares scanned data against actual data that is known to be sensitive. It looks for copies of the sensitive values themselves instead of a pattern that describes them. EDM is geared towards structured data organized into fields, such as database tables, and when used properly, it nearly eliminates false positives, freeing up your team to focus on real security events.
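The difference between describing data with a pattern and looking for copies of known values can be illustrated with a short sketch. This is not Skyhigh's implementation: the `KNOWN_IDS` set, the identifier pattern, and both helper functions are assumptions made for the example, using identifiers from the sample data later in this section.

```python
import re

# Hypothetical set of values that are known to be sensitive.
KNOWN_IDS = {"HIC10011", "HIC10012", "HIC10013"}

# A pattern that only describes what an identifier looks like.
ID_PATTERN = re.compile(r"\bHIC\d{5}\b")


def pattern_matches(text: str) -> list[str]:
    """Return every token that merely looks like an identifier."""
    return ID_PATTERN.findall(text)


def exact_matches(text: str) -> list[str]:
    """Return only tokens that are copies of known sensitive values."""
    return [tok for tok in re.findall(r"\w+", text) if tok in KNOWN_IDS]


scanned = "Ticket HIC99999 was closed; patient HIC10012 was notified."
print(pattern_matches(scanned))  # ['HIC99999', 'HIC10012']
print(exact_matches(scanned))    # ['HIC10012']
```

The pattern flags `HIC99999` even though it is not a real record (a false positive), while the exact lookup reports only the value that actually exists in the known data.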

Exact Data Match (EDM):

  • Optimized for Large Datasets: Skyhigh’s Enhanced EDM can match against billions of cells in under a second.

  • Data Precision: EDM matches specific data values rather than general patterns, reducing the risk of false positives.

  • Supports Traditional DLP Logic: Policies can combine EDM with traditional logic such as proximity and keyword validation, where a data field must appear within a certain distance of a keyword or another data field.
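The proximity logic mentioned in the last bullet can be sketched as follows. This is a minimal illustration, not Skyhigh's implementation: the tokenizer, the `within_proximity` helper, and the distance of five tokens are all assumptions made for the example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into tokens, keeping dates whole."""
    return re.findall(r"[a-z0-9][a-z0-9\-]*", text.lower())


def within_proximity(text: str, first: str, second: str, max_distance: int) -> bool:
    """Return True if `first` and `second` occur within `max_distance` tokens."""
    tokens = tokenize(text)
    first_pos = [i for i, tok in enumerate(tokens) if tok == first]
    second_pos = [i for i, tok in enumerate(tokens) if tok == second]
    return any(abs(a - b) <= max_distance for a in first_pos for b in second_pos)


near = "Patient Turner, born 1983-09-25, was seen today."
far = "Turner Street is closed. " + "Unrelated filler text. " * 10 + "Invoice dated 1983-09-25."

print(within_proximity(near, "turner", "1983-09-25", max_distance=5))  # True
print(within_proximity(far, "turner", "1983-09-25", max_distance=5))   # False
```

Requiring two fields from the same record to appear close together means that a common surname or a date on its own does not raise an incident.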

Protects Sensitive Data with One-Way Hashes

A prudent observer might note that providing sensitive data to a third party such as Skyhigh is a risk in itself. To avoid this exposure, only the hashed values of your sensitive data are sent to the Skyhigh cloud, which means that Skyhigh does not store your data in any original or reversible form.

Sample Original Data

HIC | First Name | Middle Name | Last Name | Birthdate | Gender | Diagnosis Codes | Primary Physician | Company | Address | City | State | Zip Code | Email
HIC10011 | Rachel | Marie | Turner | 1983-09-25 | F | L30.9 | Dr. Christopher Lee | Tech Innovators | 789 Elm St | TechCity | CA | 90123 | rachel.turner@emai
HIC10012 | Benjamin | Thomas | White | 1978-03-12 | M | I10 | Dr. Melissa Harris | Health Solutions | 567 Oak St | Healthtown | TX | 34567 | benjamin.white@ema
HIC10013 | Grace | Elizabeth | Taylor | 1990-06-30 | F | F41.1 | Dr. Andrew Smith | Wellness Corp | 123 Pine St | Wellsville | NY | 56789 | grace.taylor@email
HIC10014 | William | James | Adams | 1986-11-18 | M | G82.9 | Dr. Sophia Brown | Med Solutions | 456 Birch St | Medville | FL | 23456 | william.adams@emai
HIC10015 | Emma | Rose | Johnson | 1997-02-14 | F | N23.9 | Dr. Daniel Miller | HealthTech | 101 Cedar St | Techville | CA | 67890 | emma.johnson@emai

Hashed Version of the Same Data

HIC | First Name | Middle Name | Last Name | Birthdate | Gender | Diagnosis Codes | Primary Physician | Company | Address | City | State | Zip Code | Email
b6af09d8f5a90e72 | a2e5c4d9c163fd0e | a94a8fe5ccb19ba6 | 78abfb18d8281455 | 7c979a7e785c62bf | a3f390d88e4c41f2 | 1a3968efea0b59e2 | 44cc9e24e5eb3dd8 | 579cbaca35f204a4 | 02e4a515a77a61cd | 0c6457b1f4978f97 | d907a5b29e78c55e | 84a516841ba77a5b | 9e69d3654a2e772c
d7834f61b132b334 | 5b1a7c762408b330 | 4a0679561e37b2d8 | 5c4e150b9b0e3c41 | e5502a7c68d847d0 | 4a0679561e37b2d8 | 7dd8d9d5e42f2e7b | 0467079ec39ce477 | 189c8726ad7a0a1f | f249c0e6b9750200 | 8fc7c72b649350bf | 6cf12ebd85f90e9e | d596f4bfe8a6d0bc | 10a86c3a4c776f0e
56e8d8a5d2b1e5b6 | 0fcdfd31b3992a9f | a764ef4a8e6cb0e2 | 71fdda2e8b3e47c5 | b4dd4ee006d97b0d | c59d4083ccceccee | 4df1c97c1fc2b839 | 6797fbd7c3e25be9 | f3ff8e793a092bb6 | a4655d55ea6f78de | 9d8c99252e86fe8c | 5715a29c338d3d0e | 9b8f0f12548cdd8e | 7d04f11b993a93d0
b6af09d8f5a90e72 | a2e5c4d9c163fd0e | a94a8fe5ccb19ba6 | 78abfb18d8281455 | 7c979a7e785c62bf | a3f390d88e4c41f2 | 1a3968efea0b59e2 | 44cc9e24e5eb3dd8 | 579cbaca35f204a4 | 02e4a515a77a61cd | 0c6457b1f4978f97 | d907a5b29e78c55e | 84a516841ba77a5b | 9e69d3654a2e772c
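The idea of hashing each cell before it leaves your environment, and then matching scanned content against those hashes, can be sketched in a few lines. The source does not say which hash function or normalization Skyhigh uses, so SHA-256, the lowercase-and-trim normalization, and the helper names `hash_cell`, `build_index`, and `scan` are all assumptions made for illustration.

```python
import hashlib


def hash_cell(value: str) -> str:
    """Normalize a cell and return its one-way SHA-256 digest as hex."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(rows: list[list[str]]) -> set[str]:
    """Hash every cell; only these digests would be shared, never the raw values."""
    return {hash_cell(cell) for row in rows for cell in row}


def scan(text: str, index: set[str]) -> list[str]:
    """Hash each whitespace-separated token and report those found in the index."""
    return [tok for tok in text.split() if hash_cell(tok) in index]


rows = [
    ["HIC10011", "Rachel", "Turner", "1983-09-25"],
    ["HIC10012", "Benjamin", "White", "1978-03-12"],
]
index = build_index(rows)

print(scan("Forwarding record hic10011 for review", index))  # ['hic10011']
```

Because the same input always produces the same digest, the scanner can detect a copy of a sensitive value by comparing hashes alone, without ever holding the original value.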

Understanding Intelligent Data Matching

Skyhigh also supports Intelligent Data Matching (IDM), which works similarly to EDM except that it is geared towards unstructured data such as documents. Rather than requiring an exact match, IDM allows you to set a similarity threshold as a percentage. For example, you can configure it to match a scanned file if its content is 90% the same as a file that is known to be sensitive. This method works best where the organization relies heavily on templated documents such as health, finance, or tax documents.
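A percentage-similarity check of this kind can be approximated with Python's standard library. This is only a sketch: the source does not describe how IDM measures similarity, so the word-level comparison with `difflib.SequenceMatcher`, the `similarity` and `is_match` helpers, and the sample texts are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similarity(known: str, scanned: str) -> float:
    """Return a ratio between 0 and 1 for how alike two texts are, word by word."""
    return SequenceMatcher(None, known.split(), scanned.split()).ratio()


def is_match(known: str, scanned: str, threshold: float = 0.90) -> bool:
    """Flag the scanned text if it is at least `threshold` similar to the known text."""
    return similarity(known, scanned) >= threshold


known = (
    "This agreement is confidential and may not be shared outside the finance "
    "team without written approval from the chief financial officer"
)
near_copy = known.replace("written", "prior")
unrelated = "Lunch menu for Friday includes soup and salad"

print(is_match(known, near_copy))  # True
print(is_match(known, unrelated))  # False
```

A copy with one word changed still clears the 90% threshold, while an unrelated document does not, which is the behavior an exact comparison could not provide.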

Intelligent Data Match (IDM):

IDM uses contextual analysis to identify sensitive data. Its advantages in the context of DLP include:

  • Contextual Analysis: IDM considers data context, enabling it to identify sensitive information even when it’s not an exact match.

  • Adaptive Protection: It can adapt to evolving data threats and dynamically adjust data protection measures.

Tip

Due to time constraints, we will not be configuring IDM in today’s lab. However, the process is similar to EDM, and you should feel free to try it out after completing the guided portion of the lab.

EDM and IDM are Complementary Technologies

By integrating Exact Data Match (EDM) and Intelligent Data Match (IDM) into a holistic DLP strategy, you can enhance data protection, reduce false positives, and improve your organization’s overall security posture.

Let’s Get Started with EDM