What's new in SQL Server 2016: Dynamic Data Masking

07/06/2016

Gérald Barré

This post is part of the series 'Microsoft SQL Server 2016'. Be sure to check out the rest of the blog posts of the series!

What's new in SQL Server 2016: JSON
Comparing execution plans with SQL Server
What's new in SQL Server 2016: Dynamic Data Masking (this post)
What's new in SQL Server 2016: Row-Level Security
What's new in SQL Server 2016: Stretch Database
What's new in SQL Server 2016: Live Query Statistics
What's new in SQL Server 2016: Temporal Tables
What's new in SQL Server 2016: Always encrypted
SQL Server 2016 SP1 - Licensing changes

The final version of SQL Server 2016 has been released recently. So it's time to take a look at the new features provided by this version. Today we will see Dynamic Data Masking. This feature already available on SQL Azure for some time is now available to SQL Server.

In some tables we may be brought to store sensitive information, or at least information that we do not want all users can access: blue card number, email address, etc. Currently to protect this data one can create views or stored procedures masking the desired columns, but this takes time and is difficult to set depending on the logged-in user.

Dynamic Data Masking tries to answer this need by allowing to hide the contents of certain columns according to the rights of the connected user. Before seeing how this is parameterized, here is the result of the same query with 2 different users not having the same permissions:

As we can see, the idea is not to remove columns, but to anonymize the content. For example, a telephone company may ask you for the last X digits of your bank card and verify that it is yours, but will not have access to the full amount. On the other hand, a user with sufficient rights will be able to see the data not masked.

#Setting up Mask

Before using this feature, activate it:

SQL

DBCC TRACEON(209,219,-1)

Let's create the table:

SQL

CREATE TABLE [dbo].[Customer](
    [Id] [uniqueidentifier] NOT NULL,
    [Name] [nvarchar](100) NOT NULL,
    [CardNumber] [nchar](20) NULL,
    [Email] [nvarchar](255) NULL
)

To enable data masking, you must change the definition of the desired columns:

SQL

ALTER TABLE Customer
ALTER COLUMN Name ADD MASKED WITH (FUNCTION = 'default()')

The replacement of the data is defined with FUNCTION = 'default()'. 3 functions are available:

default(): replace the value with "XXXX" for a character string or its default value (0 for a number, 1/1/0001 for a date, etc.)
email(): keeps the first letter of the email as well as the extension of the domain
partial(prefix_length, 'XXX', suffix_length): replace the text with the first N characters of the original string followed by the desired string ('XXX' in the example) followed by the N characters of the original string

For example to filter a credit card number one can use the following filter:

SQL

ALTER TABLE Customer
ALTER COLUMN CardNumber ADD MASKED WITH (FUNCTION = 'partial(0, "XXXX-XXXX-XXXX-", 4)')
-- XXXX-XXXX-XXXX-1234

For a phone number you can keep the first 4 digits:

SQL

ALTER TABLE Customer
ALTER COLUMN PhoneNumber ADD MASKED WITH (FUNCTION = 'partial(4, "000000", 0)')
-- 0175000000

For the email address we use the function dedicated to this use:

SQL

ALTER TABLE Customer
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()')
-- cXXX@XXXX.com

#Set permissions

If you execute the SELECT * FROM Customer query, you most likely do not see the masks. Indeed you surely have permission UNMASK. We will create a new user without this permission and execute the query with his identity:

SQL

CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON Customer TO TestUser;

EXECUTE AS USER = 'TestUser';
SELECT * FROM Customer;
REVERT;

The data is hidden. Now add the UNMASK permission to the user and look at the difference:

SQL

GRANT UNMASK TO TestUser;

EXECUTE AS USER = 'TestUser';
SELECT * FROM Customer;
REVERT;

This time the user can see the unmasked data.

#Conclusion

This new feature makes it easy to improve data privacy without changing the content of the database. According to the rights of the user, SQL Server will anonymize the data automatically according to the specified rules.

Do you have a question or a suggestion about this post? Contact me!

Follow me:

Enjoy this blog?

💖 Sponsor on GitHub