Protecting Patients through Background Checks

Data Sanitization

Background Check System

IT Technical Design Guide

Version 01

1/29/2015


Table of Contents

1 Overview

2 Microsoft AntiXSS Library 2.2

2.1 Bcs.Sanitization Project

2.2 Bcs.Sanitization.Test Project

2.3 Sanitization Layers

3 Future Sanitization Enhancements


1 Overview

The purpose of this document is to describe the approach the Background Check System (BCS) uses to inspect and scrub input and output data to protect against malicious user input. This is achieved by using the Microsoft AntiXSS Library version 2.2 to inspect and sanitize all data input into the BCS and output from the BCS. This sanitization is performed in addition to the HTML encoding performed by the Microsoft ASP.NET MVC Framework before rendering HTML to the user.

2 Microsoft AntiXSS Library 2.2

The Microsoft AntiXSS Library version 2.2 is the current sanitization library used in the BCS. For more information regarding the Microsoft AntiXSS Library, please visit the Information Security pages on the Microsoft Developer Network website at: http://msdn.microsoft.com/en-us/security/aa973814.

2.1 Bcs.Sanitization Project

The Bcs.Sanitization project contains several sanitization items:

· HtmlSanitizationLibrary Reference: The component of the Microsoft AntiXSS Library implementing the sanitizer.

· Unsanitary Attribute: Marks a property as do-not-sanitize. It should be used sparingly and is typically reserved for data that is known to be XML or HTML from trusted sources.

· Sanitizer Class: Responsible for inspecting text strings and removing or encoding any suspect substrings.


The Sanitizer class contains two important sanitization methods:

· SanitizeRef Method: Performs the actual string sanitization. This method contains the only direct dependency on the Microsoft AntiXSS Library, which makes the sanitizer easy to replace. Because the BCS sanitizes all string data with this library, not just HTML, workarounds are required to keep sanitized strings from being changed inappropriately and to allow certain characters to remain unaltered.

· Sanitize<T> Method: Inspects a given object for string properties and sanitizes each string it encounters. This method uses reflection to search the object deeply, including nested objects, for strings to sanitize.
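The relationship between the two methods can be sketched as follows. This is a minimal, language-neutral illustration written in Python, not the BCS C# code: the regular expressions are a toy stand-in for the Microsoft AntiXSS Library, the `unsanitary` field metadata is a hypothetical analogue of the Unsanitary attribute, and the `User` and `Address` types are invented for the example.

```python
import re
from dataclasses import dataclass, field, fields, is_dataclass

# Toy stand-in for the library call; NOT the AntiXSS algorithm.
_SCRIPT = re.compile(r"<\s*script\b.*?<\s*/\s*script\s*>", re.IGNORECASE | re.DOTALL)
_TAG = re.compile(r"<[^>]*>")


def sanitize_ref(text: str) -> str:
    """Single point of dependency on the sanitization library (assumed design)."""
    return _TAG.sub("", _SCRIPT.sub("", text))


def sanitize(obj):
    """Return obj with every reachable string sanitized.

    Lists, dicts, and dataclass instances are updated in place. Fields whose
    metadata carries unsanitary=True are skipped entirely.
    """
    if isinstance(obj, str):
        return sanitize_ref(obj)
    if isinstance(obj, list):
        for i, item in enumerate(obj):
            obj[i] = sanitize(item)
        return obj
    if isinstance(obj, dict):
        for key in obj:
            obj[key] = sanitize(obj[key])
        return obj
    if is_dataclass(obj) and not isinstance(obj, type):
        for f in fields(obj):
            if f.metadata.get("unsanitary"):
                continue  # exempted, like a property marked Unsanitary
            setattr(obj, f.name, sanitize(getattr(obj, f.name)))
        return obj
    return obj  # numbers, None, and other non-string values pass through


@dataclass
class Address:  # hypothetical type for illustration
    city: str


@dataclass
class User:  # hypothetical type for illustration
    name: str
    address: Address
    tags: list
    # Trusted markup that must not be altered.
    profile_xml: str = field(default="", metadata={"unsanitary": True})
```

Keeping the library call inside one small function means that replacing the sanitizer only requires changing `sanitize_ref`; the object-walking logic is unaffected.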

2.2 Bcs.Sanitization.Test Project

The Bcs.Sanitization.Test project contains various unit tests that demonstrate the sanitizer functionality, as well as some of its shortcomings. Since .NET strings are immutable, there are certain limitations in how the Sanitizer class can accomplish its goals.
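This document does not enumerate the specific shortcomings. The following Python sketch illustrates the general kind of limitation that immutable strings impose, on the assumption that the BCS sanitizer faces an analogous constraint: a string passed on its own cannot be changed for the caller, whereas a string held in a mutable container can be replaced in place. All function names are hypothetical.

```python
import re

_TAG = re.compile(r"<[^>]*>")


def sanitize_text(text: str) -> str:
    """Toy stand-in sanitizer: returns a new string with tags removed."""
    return _TAG.sub("", text)


def try_sanitize_argument(text: str) -> None:
    """Rebinds the local name only; the caller's string is unaffected."""
    text = sanitize_text(text)


def sanitize_in_place(items: list) -> None:
    """Works because the list is mutable: each string element is replaced."""
    for i, item in enumerate(items):
        if isinstance(item, str):
            items[i] = sanitize_text(item)
```

A caller holding a bare string must therefore use the returned value (or, in .NET, pass the string by reference), while objects and lists can be sanitized through the reference the caller already holds.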


2.3 Sanitization Layers

Each BCS module contains a sanitization layer that is responsible for applying the sanitizer to input and output data. These sanitization layers are implemented in projects named “Bcs.*.Sani”, which apply the Sanitizer.Sanitize method to all strings and objects. (Note the “Bcs.Bio.Sani” project below.)

The code snippet in the image below is an example of the sanitizer being applied to the UserSave method. The input parameters user and username are sanitized before being logged, validated, and saved. The return of the method is also sanitized before being returned to the calling client. This approach allows BCS to fully sanitize the majority of string data without relying on developers to implement it explicitly on a case-by-case basis.
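The layering pattern described above can also be expressed as a wrapper around the inner method. The Python sketch below is an analogue only, not the BCS code: the `sanitized` decorator, the toy `sanitize` function, the `audit_log` list, and the body of `user_save` are all assumptions made for illustration.

```python
import functools
import re

_TAG = re.compile(r"<[^>]*>")


def sanitize(value):
    """Toy stand-in for Sanitizer.Sanitize: strips tags from strings, lists, dicts."""
    if isinstance(value, str):
        return _TAG.sub("", value)
    if isinstance(value, list):
        return [sanitize(v) for v in value]
    if isinstance(value, dict):
        return {k: sanitize(v) for k, v in value.items()}
    return value


def sanitized(func):
    """Sanitize every argument before the call and the result after it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        clean_args = [sanitize(a) for a in args]
        clean_kwargs = {k: sanitize(v) for k, v in kwargs.items()}
        return sanitize(func(*clean_args, **clean_kwargs))
    return wrapper


audit_log = []  # hypothetical log sink


@sanitized
def user_save(user: dict, username: str) -> dict:
    """Hypothetical inner method: logs, validates, and 'saves' the user."""
    audit_log.append(f"saving {username}")
    if not username:
        raise ValueError("username is required")
    return {"saved": True, "user": user, "note": "<i>stored</i>"}
```

Because the wrapper runs before logging and validation, the inner method never sees unsanitized input, and developers of the inner method do not need to call the sanitizer themselves.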

3 Future Sanitization Enhancements

Future sanitization enhancements could include upgrading the Microsoft AntiXSS Library, switching to a new sanitization component, and/or changing debug builds to report exceptions when un-sanitizable collections are encountered.

· The Unsanitary attribute that we currently use is an all-or-nothing approach to sanitization, mostly due to the nature of the Microsoft AntiXSS Library. Ideally, we would prefer the ability to sanitize strings while having certain characters and strings allowed without sanitization. For example, it would be nice to have the ability to sanitize a string of XML where the values for XML elements and attributes are sanitized, but the rest of the XML markup is not. In this instance, a new sanitization component might be considered.


· Additionally, since many IEnumerable types are immutable or can only be iterated once, IEnumerable<string> collections are not sanitizable (unless they also implement IList<string>). Care must be taken to ensure that interface methods do not accept IEnumerable<string> parameters; any such methods should be refactored to accept IList<string>. To help catch these cases, debug builds could be coded to report exceptions when un-sanitizable collections are encountered.

· If BCS continues to use the Microsoft AntiXSS Library for sanitization, an update to version 4.3.0 should be considered, because it contains several improvements over the sanitizer currently used in the BCS. Version 4.3.0 is available on the CodePlex website at: https://wpl.codeplex.com/releases/view/122988.
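The debug-build check suggested for un-sanitizable collections could take a shape like the following. This is a Python sketch of the idea only: a generator plays the role of a one-pass IEnumerable<string>, a list plays the role of IList<string>, and the `DEBUG` flag, the exception type, and the function names are hypothetical.

```python
import re
from collections.abc import Iterable, MutableSequence

DEBUG = True  # hypothetical switch standing in for a debug build

_TAG = re.compile(r"<[^>]*>")


class UnsanitizableCollectionError(TypeError):
    """Reported when a string collection cannot be sanitized in place."""


def strip_tags(text: str) -> str:
    """Toy stand-in for the real sanitizer."""
    return _TAG.sub("", text)


def sanitize_collection(items):
    """Sanitize strings in a mutable sequence; flag anything else in debug mode."""
    if isinstance(items, MutableSequence):
        for i, item in enumerate(items):
            if isinstance(item, str):
                items[i] = strip_tags(item)
        return items
    if isinstance(items, Iterable) and not isinstance(items, (str, bytes)):
        # Generators can be consumed only once and tuples are immutable,
        # so neither can be rewritten in place.
        if DEBUG:
            raise UnsanitizableCollectionError(
                f"cannot sanitize {type(items).__name__} in place; pass a list instead"
            )
    return items
```

Failing loudly in debug builds surfaces the offending method signature during development, while release builds would continue to pass the collection through unchanged.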

Although upgrading to a new version of the Microsoft AntiXSS Library or switching to a new sanitization component requires very little development effort in the BCS, there are certain risks involved that should be considered before making a decision:

· The primary risk is that certain strings that get hashed (such as user passwords and security question answers) could be sanitized differently and no longer match as expected after an upgrade or modification to the sanitization component. As such, a reasonable amount of regression testing would be required to ensure that results continue to be as expected.

· Additionally, initial benchmarks indicated that adding sanitization to the BCS resulted in an 8% performance penalty. System performance would therefore need to be measured after an upgrade or modification to the sanitization component, and any negative impact would need to be remedied or weighed against the benefits of the change.
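The hashing risk can be checked mechanically by running a set of representative inputs through both the current and the candidate sanitizer and comparing the resulting hashes. The Python sketch below shows the shape of such a regression check; both sanitizer functions are toy stand-ins, and SHA-256 is used only to keep the example short, not as a statement about how the BCS hashes passwords.

```python
import hashlib
import html
import re

_TAG = re.compile(r"<[^>]*>")


def sanitize_current(text: str) -> str:
    """Toy stand-in for the current sanitizer: removes tags."""
    return _TAG.sub("", text)


def sanitize_candidate(text: str) -> str:
    """Toy stand-in for a replacement sanitizer: encodes instead of removing."""
    return html.escape(text, quote=False)


def digest(text: str) -> str:
    """Illustrative hash only (assumption); real systems use a password hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_hash_regressions(samples, old, new):
    """Return the inputs whose hash would change under the new sanitizer."""
    return [s for s in samples if digest(old(s)) != digest(new(s))]
```

Any input returned by `find_hash_regressions` identifies stored hashes that would stop matching after the change, which helps scope the regression testing before a decision is made.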


Version 01 | 1/29/2015 | IT Technical Design Guide | Audience: State IT Staff