Automating Input Validation

Input validation has always been tricky to implement in data-driven applications, especially web applications, where validation can be implemented in multiple places, i.e. on the client and on the server. From a security perspective you want to implement validation on the server side, because the client is user controlled and thus untrusted. As for the validation itself, there are multiple ways to do it depending on the technology. The complexity is compounded by the variety of input sources, such as text boxes and request variables, each of which may require multiple validations. Some validations also depend on values from other input sources.

Implementing input validation is always a time-consuming development task, so I propose an automated way to perform it. I break it down into the following subtasks.

  1. Monitor input to learn the sources of input
  2. Use machine learning to understand data type and domain of the input
  3. Create a validation definition for each input based on machine learning
  4. Monitor the inputs against the above definitions and take a configurable action

The key to the success of this method is monitoring the application's inputs and learning each input's type and domain. The system should include some kind of threshold at which learning ends and validation of inputs starts. Learning can also run during user acceptance testing or the preproduction phase of the application. In production, the resulting validation list is then used to validate the inputs.
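To make the learning phase and its threshold concrete, here is a minimal sketch in Python. It is only an illustration: a simple "most specific character class that fits every sample" heuristic stands in for the machine learning step, and the names InputLearner, CANDIDATE_CLASSES and threshold are all hypothetical, not part of any existing library.

```python
import re
from collections import defaultdict

# Hypothetical candidate character classes, ordered from most to least specific.
CANDIDATE_CLASSES = [
    "[0-9]",
    "[A-Za-z]",
    "[A-Za-z0-9]",
    "[A-Za-z0-9 .,'-]",
]


class InputLearner:
    """Collects sample values per input key and derives a regex definition.

    This is a heuristic stand-in for the learning step, not real machine learning.
    """

    def __init__(self, threshold=3):
        # Assumption: learning for a key ends after `threshold` observed samples.
        self.threshold = threshold
        self.samples = defaultdict(list)

    def observe(self, key, value):
        """Record one observed value for an input (e.g. a control ID)."""
        self.samples[key].append(value)

    def is_ready(self, key):
        """True once enough samples exist to switch from learning to validating."""
        return len(self.samples.get(key, [])) >= self.threshold

    def definition(self, key):
        """Return a regex string for the key, or None if none can be derived yet."""
        if not self.is_ready(key):
            return None
        values = self.samples[key]
        for char_class in CANDIDATE_CLASSES:
            # Pick the first (most specific) class that matches every sample.
            if all(re.fullmatch(char_class + "+", v) for v in values):
                max_len = max(len(v) for v in values)
                return f"{char_class}{{1,{max_len}}}"
        return None  # No candidate fits; this input needs a manual definition.
```

With the samples "42", "7" and "1234" observed for a key such as "age", definition("age") returns "[0-9]{1,4}". Capping the length at the longest observed sample is a deliberately strict choice for this sketch; a real system would probably want some slack or a much larger sample.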

You can break this down into two independent, executable systems: a learning system and a validation system. The validation system can be easily implemented using HTTP Modules and regular expressions. The regular expressions are the output generated by the learning system. Control IDs and request names could be used as keys for storing the regular expressions. Depending on the configuration, the module can validate the input and either replace malicious characters with non-malicious ones or reject the input.
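The validation side could look roughly like the sketch below. It is written in Python for brevity rather than as an actual HTTP Module, and every name in it (Definition, InputValidator, the action and replacement settings) is hypothetical. It assumes each stored definition carries both the full regular expression and the single-character class it allows, so that the "replace" action knows which characters to substitute.

```python
import re
from dataclasses import dataclass


@dataclass
class Definition:
    pattern: str  # full regex the whole value must match, e.g. "[0-9]{1,3}"
    allowed: str  # regex matching one allowed character, e.g. "[0-9]"


class InputValidator:
    """Validates inputs against stored definitions keyed by control ID or request name."""

    def __init__(self, definitions, action="reject", replacement="_"):
        if action not in ("reject", "replace"):
            raise ValueError("action must be 'reject' or 'replace'")
        self.definitions = definitions
        self.action = action
        self.replacement = replacement

    def validate(self, key, value):
        """Return the accepted (possibly cleaned) value, or raise ValueError."""
        definition = self.definitions.get(key)
        if definition is None:
            # Assumption: inputs without a learned definition pass through unchanged.
            return value
        if re.fullmatch(definition.pattern, value):
            return value
        if self.action == "replace":
            # Swap every character outside the allowed class for the replacement.
            cleaned = "".join(
                c if re.fullmatch(definition.allowed, c) else self.replacement
                for c in value
            )
            # Accept the cleaned value only if it now satisfies the full pattern.
            if re.fullmatch(definition.pattern, cleaned):
                return cleaned
        raise ValueError(f"input {key!r} failed validation")
```

For example, with Definition("[0-9]{1,3}", "[0-9]") stored under "age" and the validator configured with action="replace" and replacement="0", the value "4<2" comes back as "402", while a value that is still too long after cleaning is rejected. Note that the replacement character must itself be allowed by the definition, otherwise the cleaned value can never pass.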

The latest release of the Security Runtime Engine (SRE) can provide a lot of reusable components for the validation system. It already provides configuration, logging and an extensible module framework. IPageInspector in particular is interesting, as it provides a way to inspect page controls and check input data. SRE is thus a perfect framework for this prototype. In the coming days I will try to prototype this idea and present it to you guys.

- Anil RV