Classification Components

Skyhigh Classification Components

Skyhigh has many different DLP components, which you will see in today's lab. It is useful to review some of the most common first:

Boolean logic

Boolean logic lets you combine or exclude conditions that would indicate a DLP violation. For example, you could use Boolean logic to require a match on both a regular expression for a credit card number and a regular expression for an expiration date.
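To make the idea concrete, here is a minimal Python sketch of combining and excluding conditions. It is not how Skyhigh implements its rules. The patterns, the "test data" exclusion, and the function name is_violation are all assumptions chosen for illustration.

```python
import re

# Illustrative patterns only; these are assumptions, not Skyhigh's classifiers.
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")                    # 16 digits, optional separators
EXPIRY_RE = re.compile(r"\b(?:0[1-9]|1[0-2])/(?:\d{2}|\d{4})\b")  # MM/YY or MM/YYYY
TEST_MARKER_RE = re.compile(r"\btest data\b", re.IGNORECASE)      # hypothetical exclusion


def is_violation(text: str) -> bool:
    """Return True when card AND expiry are present AND NOT marked as test data."""
    has_card = CARD_RE.search(text) is not None
    has_expiry = EXPIRY_RE.search(text) is not None
    is_test = TEST_MARKER_RE.search(text) is not None
    return has_card and has_expiry and not is_test
```

A card number on its own does not satisfy this rule, and neither does a card number with an expiration date in text marked as test data.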


Proximity

When you are looking for data types that have more than one element, the elements often must be close together to be considered relevant. For example, a 16-digit number that passes the Luhn algorithm followed immediately by an expiration date is more likely to be a true positive (a real credit card number) than if the expiration date were several hundred characters away.
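The following Python sketch shows one way a Luhn check and a proximity test could fit together. It is an illustration under stated assumptions, not Skyhigh's implementation. The 30-character max_gap, the simplified patterns, and the function names are hypothetical.

```python
import re

# Simplified, assumed patterns: unseparated 16-digit numbers and MM/YY dates.
CARD_RE = re.compile(r"\b\d{16}\b")
EXPIRY_RE = re.compile(r"\b(?:0[1-9]|1[0-2])/\d{2}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def card_near_expiry(text: str, max_gap: int = 30) -> bool:
    """True if a Luhn-valid number is followed by an expiry within max_gap characters."""
    for card in CARD_RE.finditer(text):
        if not luhn_valid(card.group()):
            continue
        for expiry in EXPIRY_RE.finditer(text):
            gap = expiry.start() - card.end()
            if 0 <= gap <= max_gap:
                return True
    return False
```

With these settings, an expiration date a few characters after the number counts as a match, while one several hundred characters later does not.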


Dictionaries

Dictionaries are essentially lists of words, and the words can be anything relevant. Keeping with the credit card example, the presence of the keyword “CCN” or “Expiration date” near the 16-digit number and/or expiration date further increases confidence that it is a true positive.
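As a rough sketch of how dictionary terms might raise confidence, the Python below counts how many distinct keywords appear near a 16-digit number. The keyword list, the 40-character window, and the name keyword_hits are assumptions for illustration, not Skyhigh's dictionary format or scoring.

```python
import re

# Hypothetical dictionary; a real one would be maintained in the DLP product.
KEYWORDS = ("ccn", "expiration date", "credit card")
CARD_RE = re.compile(r"\b\d{16}\b")


def keyword_hits(text: str, window: int = 40) -> int:
    """Count distinct dictionary terms within `window` characters of a 16-digit number."""
    lowered = text.lower()
    hits = set()
    for match in CARD_RE.finditer(lowered):
        start = max(0, match.start() - window)
        context = lowered[start:match.end() + window]
        for keyword in KEYWORDS:
            # Plain substring test keeps the sketch short; it ignores word boundaries.
            if keyword in context:
                hits.add(keyword)
    return len(hits)
```

A rule could then treat a higher count as stronger evidence, for example by requiring at least one hit before reporting a match.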

True file types

A file’s type can also be indicative of its sensitivity. An architecture firm might look for CAD drawings leaving the organization, for instance. Skyhigh can look at the structure of the data in a file to determine its true file type, regardless of the file’s extension.
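One common way to identify a file by its content is to compare its leading bytes against known signatures ("magic numbers"). The Python sketch below illustrates that general technique. It is not a description of how Skyhigh detects file types, and the small signature table and the name true_file_type are assumptions.

```python
# A few well-known leading-byte signatures; a real detector would know far more.
MAGIC_SIGNATURES = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",          # also the container format for .docx and .xlsx
    b"\x89PNG\r\n\x1a\n": "png",
    b"AC10": "dwg",                # assumption: covers AutoCAD version strings such as "AC1032"
}


def true_file_type(data: bytes) -> str:
    """Guess the file type from content alone; the file name is never consulted."""
    for signature, kind in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return "unknown"
```

Because only the bytes are inspected, a drawing renamed from plan.dwg to notes.txt would still be reported as "dwg" in this sketch.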

Data location

If your company uses a data tagging solution, it likely inserts its tags in the headers or footers of files. Skyhigh can limit its search for tags (or any other data) to certain parts of the file such as the header, body, or footer.
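The sketch below shows the idea of restricting a search to part of a document. It assumes a plain-text file in which the first and last two lines stand in for the header and footer. Real document formats store headers and footers differently, and the function names and the edge_lines parameter are hypothetical, not Skyhigh settings.

```python
def split_regions(text: str, edge_lines: int = 2) -> dict:
    """Split text into header, body, and footer by line position (an assumed layout)."""
    lines = text.splitlines()
    header = lines[:edge_lines]
    # Start the footer no earlier than the end of the header so regions never overlap.
    footer = lines[max(edge_lines, len(lines) - edge_lines):]
    body = lines[edge_lines:len(lines) - len(footer)]
    return {
        "header": "\n".join(header),
        "body": "\n".join(body),
        "footer": "\n".join(footer),
    }


def tag_in_regions(text: str, tag: str, regions=("header", "footer")) -> bool:
    """Search for `tag` only inside the named regions."""
    parts = split_regions(text)
    return any(tag in parts[region] for region in regions)
```

Limiting the search this way means a tag word that merely appears in the body text does not trigger a match unless the body is included in the regions to search.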