Classification Components

Skyhigh Classification Components

Skyhigh offers many DLP classification components, which you will see in today’s lab. It is useful to review the most common ones first:

Boolean Logic

Boolean logic lets you combine or exclude conditions that indicate a DLP violation. For example, you could require that content match the regular expressions for both a credit card number and an expiration date before it is flagged.
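To make the idea concrete, here is a minimal Python sketch of AND/NOT logic over pattern matches. It is not Skyhigh's rule syntax. The patterns, the "test card" exclusion, and the function name is_violation are all assumptions made for illustration.

```python
import re

# Illustrative patterns only (assumptions, not Skyhigh's built-in definitions).
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")                     # 16 digits, optional separators
EXPIRY_RE = re.compile(r"\b(?:0[1-9]|1[0-2])/(?:\d{2}|\d{4})\b")   # MM/YY or MM/YYYY
TEST_MARKER_RE = re.compile(r"\btest card\b", re.IGNORECASE)       # hypothetical exclusion


def is_violation(text: str) -> bool:
    """Flag text when: card number AND expiration date AND NOT test data."""
    has_card = CARD_RE.search(text) is not None
    has_expiry = EXPIRY_RE.search(text) is not None
    is_test_data = TEST_MARKER_RE.search(text) is not None
    return has_card and has_expiry and not is_test_data
```

With these assumed patterns, "Card 4111 1111 1111 1111 exp 12/27" is flagged, while the same text without the expiration date, or with the phrase "test card" added, is not.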

Proximity

When you are looking for data types that have more than one element, the elements often must be close together to be considered relevant. For example, a 16-digit number that passes the Luhn check and is followed immediately by an expiration date is more likely to be a true positive (a real credit card number) than one whose expiration date is several hundred characters away.
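The following Python sketch shows one way a proximity condition could work: a Luhn-valid 16-digit number counts only if an expiration date starts within a set number of characters after it. The patterns, the 30-character default, and the function names are assumptions for illustration, not Skyhigh's implementation.

```python
import re

# Illustrative patterns (assumptions): unseparated 16-digit number, MM/YY expiry.
CARD_RE = re.compile(r"\b\d{16}\b")
EXPIRY_RE = re.compile(r"\b(?:0[1-9]|1[0-2])/\d{2}\b")


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def card_near_expiry(text: str, max_gap: int = 30) -> bool:
    """True if a Luhn-valid number is followed by an expiry within max_gap characters."""
    expiries = list(EXPIRY_RE.finditer(text))
    for card in CARD_RE.finditer(text):
        if not luhn_valid(card.group()):
            continue
        for expiry in expiries:
            gap = expiry.start() - card.end()  # characters between the two matches
            if 0 <= gap <= max_gap:
                return True
    return False
```

Under these assumptions, "4111111111111111 exp 12/27" matches because the gap is only five characters, while the same number with several hundred characters of filler before the date does not.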

Dictionaries

Dictionaries are lists of words or phrases relevant to the data you are looking for. Keeping with the credit card example, the presence of keywords like “CCN” or “Expiration date” near the 16-digit number and/or expiration date further increases confidence that it is a true positive.
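As a rough illustration of how dictionary hits can raise confidence, the Python sketch below counts distinct keywords found near a 16-digit number. The keyword list, the 40-character window, and the scoring scheme are hypothetical. Skyhigh's own dictionaries and scoring are configured in the product and may work differently.

```python
import re

# Hypothetical dictionary for the credit card example.
CARD_KEYWORDS = ["ccn", "credit card", "card number", "expiration date", "exp"]
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in CARD_KEYWORDS) + r")\b",
    re.IGNORECASE,
)
CARD_RE = re.compile(r"\b\d{16}\b")


def confidence(text: str, window: int = 40) -> int:
    """Score 0 with no number, 1 for a bare number, +1 per distinct keyword nearby."""
    best = 0
    for card in CARD_RE.finditer(text):
        start = max(0, card.start() - window)
        context = text[start:card.end() + window]
        hits = {m.group().lower() for m in KEYWORD_RE.finditer(context)}
        best = max(best, 1 + len(hits))
    return best
```

With this scheme, "Order 4111111111111111 shipped" scores 1, while "CCN: 4111111111111111 Expiration date: 12/27" scores 3 because two dictionary terms appear near the number.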

True File Types

A file’s type can also be indicative of its sensitivity. An architecture firm might look for CAD drawings leaving the organization, for instance. Skyhigh can look at the structure of data in the file to determine its true file type regardless of the file’s extension.
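One common way to identify a file's true type is to compare its leading bytes against known signatures ("magic numbers"). The Python sketch below illustrates that general technique. It is not a description of Skyhigh's detection engine, the signature table is deliberately tiny, and the function names are hypothetical.

```python
# A few widely published leading-byte signatures (illustrative, far from complete).
MAGIC_SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"AC10", "dwg"),  # AutoCAD DWG version strings such as AC1032 start this way
]


def true_file_type(data: bytes):
    """Return a type name based on content, or None if no signature matches."""
    for signature, name in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def is_disguised(filename: str, data: bytes) -> bool:
    """True when the content matches a known type that differs from the extension."""
    detected = true_file_type(data)
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return detected is not None and detected != extension
```

For example, a CAD drawing renamed to notes.txt would still be detected as dwg from its content, so is_disguised("notes.txt", data) returns True.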

Data Location

If your company uses a data tagging solution, it likely inserts its tags in the headers or footers of files. Skyhigh can limit its search for tags (or any other data) to certain parts of the file such as the header, body, or footer.
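The sketch below illustrates the idea of limiting a search to one part of a file, using Python on plain text. It makes strong simplifying assumptions: the header and footer are approximated as the first and last two lines, and the "Classification:" tag format is invented for the example. Real document formats store headers and footers in format-specific structures, and Skyhigh's handling of them is not shown here.

```python
import re

# Hypothetical tag format, for illustration only.
TAG_RE = re.compile(
    r"\bClassification:\s*(Public|Internal|Confidential)\b", re.IGNORECASE
)


def split_regions(text: str, edge_lines: int = 2) -> dict:
    """Approximate header/body/footer as the first, middle, and last lines."""
    lines = text.splitlines()
    n = min(edge_lines, len(lines) // 2)  # keep regions from overlapping in short files
    return {
        "header": "\n".join(lines[:n]),
        "body": "\n".join(lines[n:len(lines) - n]),
        "footer": "\n".join(lines[len(lines) - n:]),
    }


def find_tag(text: str, regions=("header", "footer")):
    """Search only the requested regions; return (region, label) or None."""
    parts = split_regions(text)
    for region in regions:
        match = TAG_RE.search(parts[region])
        if match:
            return region, match.group(1).capitalize()
    return None
```

By default the search ignores the body, so a tag that appears only in running text is not reported unless you pass regions=("body",).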