Friday, September 11, 2009

CoreEx.Common.Validation



When working with objects we often find the need to validate their contents before e.g. persisting them to a data store.

I have been putting this off for a long time now and I decided it was time to take a look at the options available.

Before we start, let's define some requirements for the validation library.

  • Objects should be validated without the need to reference the validation library itself.

  • It must be possible to configure validation without the use of XML.

  • It must be possible to add new validation rules to an object without the need to recompile.
 
  • It should be possible to validate a single property.

  • And last, but not least, it must be easy to use.


These may seem like modest requirements, but in fact they rule out some of the existing validation frameworks.
Some of you may already be using the Validation Application Block from the Enterprise Library.

That basically leaves you with two options when it comes to configuration.

1. Using Attributes

2. External configuration using an XML format specifically designed to describe validation rules.

As this violates at least two of our requirements, we must investigate further.

The beauty of Lambda Expressions

I started to read this amazing article, which has been resting in my reading backlog for a long time now.

The author uses lambda expressions to specify the validation rules, and I immediately thought: what a great idea!

Let the configuration be the code itself.

For what are attributes and XML configuration if not just another way of specifying what code to execute?
 
This must be it. Let's start defining some interfaces for our new library.

Imagine if we could write something like

(c => c.CustomerID.Length <= 5, "CustomerID cannot exceed five characters")

First we need something that represents a validation rule.
   

/// <summary>
/// Represents a validation rule.
/// </summary>
/// <typeparam name="T">The type that this <see cref="IValidationRule{T}"/> applies to.</typeparam>
public interface IValidationRule<T>
{
    /// <summary>
    /// Gets or sets the message used when the rule is broken.
    /// </summary>
    string Message { get; set; }

    /// <summary>
    /// Gets or sets the function delegate that points to the validating code.
    /// </summary>
    Func<T, bool> Rule { get; set; }
}


The validation rule is specific to the target type and now we just need some way of adding new validation rules.

/// <summary>
/// Represents a set of validation rules.
/// </summary>
/// <typeparam name="T">The target type that the validation rules apply to.</typeparam>
public interface IValidationRules<T>
{
    /// <summary>
    /// Adds a new validation rule.
    /// </summary>
    /// <param name="rule">The <see cref="Expression{TDelegate}"/> that contains the validating code.</param>
    /// <param name="message">The message to be used when a validation rule is broken.</param>
    void AddRule(Expression<Func<T, bool>> rule, string message);

    /// <summary>
    /// Gets all the <see cref="IValidationRule{T}"/> instances for the target type.
    /// </summary>
    /// <returns>An <see cref="IEnumerable{T}"/> that contains <see cref="IValidationRule{T}"/> instances.</returns>
    IEnumerable<IValidationRule<T>> GetRules();

    /// <summary>
    /// Gets all the <see cref="IValidationRule{T}"/> instances that apply to the target type and <paramref name="propertyName"/>.
    /// </summary>
    /// <param name="propertyName">The name of the property for which to retrieve the validation rules.</param>
    /// <returns>An <see cref="IEnumerable{T}"/> that contains <see cref="IValidationRule{T}"/> instances.</returns>
    IEnumerable<IValidationRule<T>> GetRules(string propertyName);
}

   
 
We can now add new validation rules and return them as a whole or only the rules specific to one property.
You might notice that there is no overload of AddRule that allows us to specify the target property.
 
How is it then possible to retrieve validation rules that apply to e.g. CustomerID?
 

Expression<Func<T,bool>> vs Func<T,bool>

Lambda expressions are truly powerful, but their real potential shows when they are wrapped in an Expression<TDelegate>.
 
Expression trees can be thought of as code that has not yet been compiled. They are similar to an AST (abstract syntax tree).
 
We can inspect and even change expression trees before they are compiled and executed.
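To make the difference concrete, here is a minimal sketch of looking at a rule as data before turning it into executable code. The Customer class and the ExpressionTreeDemo helper are made up for this illustration and are not part of the library.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical target type, used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
}

public static class ExpressionTreeDemo
{
    // The lambda arrives as a tree, so we can look at its root node
    // and its parameter without executing anything.
    public static string Describe(Expression<Func<Customer, bool>> rule)
    {
        return rule.Body.NodeType + " over parameter '" + rule.Parameters[0].Name + "'";
    }

    // Only when we call Compile() does the tree become a delegate we can invoke.
    public static bool Run(Expression<Func<Customer, bool>> rule, Customer customer)
    {
        Func<Customer, bool> compiled = rule.Compile();
        return compiled(customer);
    }
}
```

Calling Describe with c => c.CustomerID.Length <= 5 reports a LessThanOrEqual node, which is exactly the kind of information a plain Func<T, bool> would never give us.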
 
I'm not saying here that compiled code cannot be altered, but that is quite a different story which we will look into later.
 
So let's just get back to expression trees.
 
So how is that going to help us with regards to the validation library?
 
Since the requirements state that we must be able to validate any given property, we need some way to
associate the expression with the target property.
 
The general rule here is: if the expression references a property that belongs to the target type, we associate the expression with that property.
 
Example
 

(c => c.CustomerID.Length <= 5, "CustomerID cannot exceed five characters")

 

 

The CustomerID property is declared by the Customer class and we can associate the expression accordingly.
 
But how do we inspect the expression to determine the target properties?
 

The ExpressionVisitor

Working with expression trees takes some getting used to.
 
Luckily for us, there is a very useful class available when we want to inspect or rewrite an expression tree.
 
The ExpressionVisitor class is actually not directly available to us as a public class, but you can find an implementation of it in the MSDN documentation.
 
If you ever get to browse the source code of a LINQ provider, you are very likely to see this class used all over.

Its purpose is basically to let us visit each node of the expression tree (the visitor pattern) and to reconstruct the tree for us if we make any changes to it.
 
Why do we want to make changes to the expression tree? Well, we will see in just a minute.
 
For now we just want to retrieve all target properties based on the lambda expression that makes up the validating code.
 
We go ahead and define an interface like this:

/// <summary>
/// Represents a class that is capable of determining the set of properties
/// that a validation expression applies to.
/// </summary>
public interface ITargetPropertyResolver
{
    /// <summary>
    /// Tries to determine the target property or properties that the validation <paramref name="expression"/> applies to.
    /// </summary>
    /// <param name="expression">The validation expression</param>
    /// <returns>An <see cref="IEnumerable{T}"/> that contains the target properties.</returns>
    IEnumerable<PropertyInfo> ResolveFrom(Expression expression);
}


And the implementation looks like this:

/// <summary>
/// Determines the properties that a validation expression should be associated with.
/// </summary>
[Implements(typeof(ITargetPropertyResolver))]
public class TargetPropertyResolver : ExpressionVisitor, ITargetPropertyResolver
{
    private readonly IList<PropertyInfo> _targetProperties = new List<PropertyInfo>();

    private Type _validationTargetType;

    /// <summary>
    /// Tries to determine the target property or properties that the validation <paramref name="expression"/> applies to.
    /// </summary>
    /// <param name="expression">The validation expression</param>
    /// <returns>An <see cref="IEnumerable{T}"/> that contains the target properties.</returns>
    public IEnumerable<PropertyInfo> ResolveFrom(Expression expression)
    {
        // Start from a clean slate so that properties found by an earlier call
        // on the same instance do not leak into this result.
        _targetProperties.Clear();

        SetValidationTargetType(expression);

        Visit(expression);

        // Return a copy so the result stays valid if the resolver is used again.
        return new List<PropertyInfo>(_targetProperties);
    }

    /// <summary>
    /// Determines if the <paramref name="m"/> is considered a target property for the validation expression.
    /// </summary>
    /// <param name="m">The currently accessed member.</param>
    /// <returns><see cref="Expression"/></returns>
    protected override Expression VisitMemberAccess(MemberExpression m)
    {
        if (IsTargetProperty(m))
            RegisterMemberAsExpressionTarget(m);

        return base.VisitMemberAccess(m);
    }

    /// <summary>
    /// Sets the validation target type based on the validation expression.
    /// </summary>
    /// <param name="expression">The validation expression.</param>
    private void SetValidationTargetType(Expression expression)
    {
        _validationTargetType = ((LambdaExpression)expression).Parameters[0].Type;
    }

    /// <summary>
    /// Determines if the <paramref name="memberExpression"/> is a property declared by the target type.
    /// </summary>
    /// <param name="memberExpression">The <see cref="MemberExpression"/> that is currently being visited.</param>
    /// <returns><b>True</b> if the <paramref name="memberExpression"/> is a property and is declared by the target type.</returns>
    private bool IsTargetProperty(MemberExpression memberExpression)
    {
        return (memberExpression.Member.DeclaringType == _validationTargetType) && memberExpression.Member is PropertyInfo;
    }

    private void RegisterMemberAsExpressionTarget(MemberExpression memberExpression)
    {
        _targetProperties.Add((PropertyInfo)memberExpression.Member);
    }
}



As we can see from the code, the visitor class overrides the VisitMemberAccess method, which is called every time a member is accessed in the lambda expression.
 
Then we basically just check whether the property is a target property by determining if it is declared by the target type (e.g. Customer).
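For readers on .NET 4 or later, where ExpressionVisitor is a public class in System.Linq.Expressions and the method to override is named VisitMember, the same idea can be sketched as follows. The Customer class and the PropertyCollector name are assumptions made for this illustration, and the sketch skips duplicates, which the resolver above does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical target type, used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
    public string ContactName { get; set; }
}

// Collects the properties of T that a rule touches (hypothetical helper).
public class PropertyCollector : ExpressionVisitor
{
    private readonly List<PropertyInfo> _found = new List<PropertyInfo>();
    private Type _targetType;

    public IList<PropertyInfo> Collect<T>(Expression<Func<T, bool>> rule)
    {
        _found.Clear();
        _targetType = typeof(T);
        Visit(rule);
        return new List<PropertyInfo>(_found);
    }

    // Called for every member access in the tree, e.g. c.CustomerID and .Length.
    protected override Expression VisitMember(MemberExpression node)
    {
        var property = node.Member as PropertyInfo;
        if (property != null && property.DeclaringType == _targetType && !_found.Contains(property))
            _found.Add(property);

        // Keep walking so nested accesses such as c.CustomerID.Length are seen too.
        return base.VisitMember(node);
    }
}
```

For a rule such as c => c.CustomerID.Length <= 5 && c.ContactName != null the collector picks up CustomerID and ContactName, while Length is ignored because it is declared by string and not by Customer.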
 
So now we have our expression and we also know its target property/properties.
 
Didn't I mention something about rewriting the expression tree?
 

Coerce those Null values

Let's take another look at our little example

(c => c.CustomerID.Length <= 5, "CustomerID cannot exceed five characters")

Does anybody spot something potentially wrong with this code? No?

If it is any consolation, neither did I until my unit test blew up in my face proclaiming a NullReferenceException somewhere.

Then it finally hit me. What happens if the target property is null? How could I have missed that?

And just when this library was all good and ready to jump into production.

Then I started to think about how to deal with this problem.

We could of course check the property value before executing the lambda expression, but then again the expression could target several properties and we would have to check them all in order to safely execute the expression.

We could also require the lambda expression to include a check for null, but that seems a little intrusive and certainly violates the ease-of-use requirement.
 
If only we could make sure we had a value in place when the check is executed.

What we are basically looking for is a way of turning this

(c => c.CustomerID.Length <= 5, "CustomerID cannot exceed five characters")

into this

(c => (c.CustomerID ?? "").Length <= 5, "CustomerID cannot exceed five characters")



That should take care of business, right?

But what if we actually wanted to test for null in addition to the string length?

(c => c.CustomerID.Length <= 5 && c.CustomerID != null, "CustomerID cannot be null or exceed five characters")



So what we need to do is coerce those null values into their default values unless an explicit check for null is intended.

That means that we need to turn it into this

(c => (c.CustomerID ?? "").Length <= 5 && c.CustomerID != null, "CustomerID cannot be null or exceed five characters")



And since we are still working with expression trees, we can make the necessary changes by doing an expression tree rewrite.

Next we go ahead and define an interface to do the rewrite.

/// <summary>
/// Represents a class that is capable of rewriting the expression tree in such
/// a way that member expressions don't return <c>null</c> values unless an explicit
/// check for <c>null</c> is intended.
/// </summary>
public interface IRuleRewriter
{
    Expression<Func<T, bool>> Rewrite<T>(Expression<Func<T, bool>> ruleExpression, IEnumerable<PropertyInfo> targetProperties);
}




And the implementation is once again based on the ExpressionVisitor class.

[Implements(typeof(IRuleRewriter))]
public class RuleRewriter : ExpressionVisitor, IRuleRewriter
{
    private readonly IList<MemberExpression> _excludeList = new List<MemberExpression>();
    private IEnumerable<PropertyInfo> _targetProperties;

    #region IRuleRewriter Members

    /// <summary>
    /// Rewrites the <paramref name="ruleExpression"/> by replacing any <see cref="MemberExpression"/> with a <see cref="ConditionalExpression"/>
    /// where the member is not explicitly checked for <c>null</c>.
    /// </summary>
    /// <typeparam name="T">The target type to validate.</typeparam>
    /// <param name="ruleExpression">The validation expression.</param>
    /// <param name="targetProperties">The target properties for the <paramref name="ruleExpression"/> that are candidates for rewriting.</param>
    /// <returns><see cref="Expression{TDelegate}"/></returns>
    public Expression<Func<T, bool>> Rewrite<T>(Expression<Func<T, bool>> ruleExpression,
                                                IEnumerable<PropertyInfo> targetProperties)
    {
        // Forget exclusions recorded by a previous rewrite on this instance.
        _excludeList.Clear();
        _targetProperties = targetProperties;
        return (Expression<Func<T, bool>>) Visit(ruleExpression);
    }

    #endregion

    /// <summary>
    /// Checks whether a <see cref="MemberExpression"/> is part of an explicit check for <c>null</c>.
    /// </summary>
    /// <param name="binaryExpression">The currently visited <see cref="BinaryExpression"/></param>
    /// <returns><see cref="BinaryExpression"/></returns>
    protected override Expression VisitBinary(BinaryExpression binaryExpression)
    {
        if (IsExplicitCheckForNull(binaryExpression))
            ExcludeFromMemberRewrite(binaryExpression);
        return base.VisitBinary(binaryExpression);
    }

    /// <summary>
    /// Replaces the <see cref="MemberExpression"/> with a <see cref="ConditionalExpression"/>
    /// that returns the default value if the member yields null.
    /// </summary>
    /// <param name="memberExpression">The currently visited <see cref="MemberExpression"/></param>
    /// <returns>A <see cref="ConditionalExpression"/> if a rewrite has been performed, otherwise the original <see cref="MemberExpression"/></returns>
    protected override Expression VisitMemberAccess(MemberExpression memberExpression)
    {
        if (ShouldRewriteMemberExpression(memberExpression))
            return RewriteMemberExpression(memberExpression);

        return base.VisitMemberAccess(memberExpression);
    }

    private bool ShouldRewriteMemberExpression(MemberExpression memberExpression)
    {
        if (!IsTargetProperty(memberExpression) || IsMemberExcluded(memberExpression))
            return false;
        return MemberIsReferenceTypeOrNullableValueType(memberExpression);
    }

    // IsNullableType, GetNonNullableType and GetDefaultValue are extension methods
    // on Type that are not shown in this post.
    private bool MemberIsReferenceTypeOrNullableValueType(MemberExpression memberExpression)
    {
        var propertyInfo = (PropertyInfo) memberExpression.Member;
        return !propertyInfo.PropertyType.IsValueType || propertyInfo.PropertyType.IsNullableType();
    }

    private Expression RewriteMemberExpression(MemberExpression memberExpression)
    {
        var propertyInfo = (PropertyInfo) memberExpression.Member;
        return Expression.Condition(GetTestExpression(memberExpression), GetDefaultValueExpression(propertyInfo),
                                    memberExpression);
    }

    private static BinaryExpression GetTestExpression(MemberExpression memberExpression)
    {
        return Expression.Equal(memberExpression, Expression.Constant(null));
    }

    private static UnaryExpression GetDefaultValueExpression(PropertyInfo propertyInfo)
    {
        return Expression.Convert(Expression.Constant(GetDefaultValue(propertyInfo.PropertyType)),
                                  propertyInfo.PropertyType);
    }

    private void ExcludeFromMemberRewrite(BinaryExpression binaryExpression)
    {
        if (IsTargetProperty(binaryExpression.Left))
            _excludeList.Add((MemberExpression) binaryExpression.Left);
        if (IsTargetProperty(binaryExpression.Right))
            _excludeList.Add((MemberExpression) binaryExpression.Right);
    }

    private bool IsMemberExcluded(MemberExpression memberExpression)
    {
        return _excludeList.Contains(memberExpression);
    }

    private bool IsExplicitCheckForNull(BinaryExpression binaryExpression)
    {
        if (IsNullConstant(binaryExpression.Left) && IsTargetProperty(binaryExpression.Right))
            return true;
        if (IsNullConstant(binaryExpression.Right) && IsTargetProperty(binaryExpression.Left))
            return true;
        return false;
    }

    private static bool IsNullConstant(Expression expression)
    {
        var constantExpression = (expression as ConstantExpression);
        if (constantExpression == null)
            return false;
        return (constantExpression.Value == null);
    }

    private bool IsTargetProperty(Expression expression)
    {
        var memberExpression = (expression as MemberExpression);
        if (memberExpression == null)
            return false;
        // Use "as" rather than a cast: the member may be a field
        // (e.g. a captured local variable), which a cast would reject with an exception.
        var propertyInfo = memberExpression.Member as PropertyInfo;
        if (propertyInfo == null)
            return false;
        return _targetProperties.Contains(propertyInfo);
    }

    /// <summary>
    /// Gets the default value to be used as a surrogate for the supplied <c>null</c> value.
    /// </summary>
    /// <param name="type">The target type.</param>
    /// <returns>The default value as determined by the underlying value type.</returns>
    private static object GetDefaultValue(Type type)
    {
        //We need some special handling for the string data type
        //since it is a reference type, but behaves like a value type.
        if (type == typeof (string))
            return string.Empty;

        return type.GetNonNullableType().GetDefaultValue();
    }
}



What we are doing here is replacing the MemberExpression with a ConditionalExpression that returns the default value (based on the property type) if the property yields null.
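To see the effect of such a rewrite in isolation, here is a much smaller sketch of the same technique. It is not the RuleRewriter above: it assumes the public ExpressionVisitor from .NET 4 or later, only handles string properties, and uses Expression.Coalesce (the ?? operator) instead of a ConditionalExpression. The Customer class and the NullCoalescingRewriter name are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical target type, used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
}

public class NullCoalescingRewriter : ExpressionVisitor
{
    // Member accesses that take part in an explicit "== null" / "!= null" check.
    private readonly HashSet<MemberExpression> _nullChecked = new HashSet<MemberExpression>();

    public Expression<Func<T, bool>> Rewrite<T>(Expression<Func<T, bool>> rule)
    {
        _nullChecked.Clear();
        return (Expression<Func<T, bool>>)Visit(rule);
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        bool isEquality = node.NodeType == ExpressionType.Equal || node.NodeType == ExpressionType.NotEqual;
        if (isEquality)
        {
            // Remember the member before its children are visited, so it is left alone.
            if (IsNull(node.Right) && node.Left is MemberExpression left) _nullChecked.Add(left);
            if (IsNull(node.Left) && node.Right is MemberExpression right) _nullChecked.Add(right);
        }
        return base.VisitBinary(node);
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        bool isStringProperty = node.Member is PropertyInfo p && p.PropertyType == typeof(string);
        if (isStringProperty && !_nullChecked.Contains(node))
            return Expression.Coalesce(node, Expression.Constant(string.Empty)); // member ?? ""
        return base.VisitMember(node);
    }

    private static bool IsNull(Expression e)
    {
        // Look through conversions the compiler may have wrapped around the null literal.
        while (e.NodeType == ExpressionType.Convert) e = ((UnaryExpression)e).Operand;
        return e is ConstantExpression c && c.Value == null;
    }
}
```

After the rewrite, a rule like c => c.CustomerID.Length <= 5 can be compiled and executed against a customer whose CustomerID is null without throwing, while the c.CustomerID != null part of a combined rule still sees the real null value.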



Configuring your application

We have already stated that the validation target should not have to reference the validation library itself.

So where and how can we add new rules to our objects?

One simple solution would be to retrieve the IValidationRules<T> instance and start adding new rules.

Example

var rules = _serviceContainer.GetService<IValidationRules<Customer>>();

rules.AddRule(c => c.ContactName != null, "ContactName cannot be null");

But that code would have to reside somewhere, right?

In order to really make life easy when adding new rules, take a look at the IRuleInjector<T> interface.

    /// <summary>
    /// Represents a class that is capable of injecting additional validation rules
    /// that should apply to the target type.
    /// </summary>
    /// <typeparam name="T">The target type to validate.</typeparam>
    public interface IRuleInjector<T>
    {
        /// <summary>
        /// Allows additional validation rules to be added to the <see cref="IValidationRules{T}"/> instance.
        /// </summary>
        /// <param name="validationRules">The <see cref="IValidationRules{T}"/> instance
        /// that contains the validation rules for the target type.</param>
        void Inject(IValidationRules<T> validationRules);
    }


All we need to do is to implement the interface and make sure that it is located in an assembly loaded by the service container.

Example

    [Implements(typeof(IRuleInjector<IOrderDetail>))]
    public class SampleRuleInjector : IRuleInjector<IOrderDetail>
    {
        public void Inject(IValidationRules<IOrderDetail> validationRules)
        {
            validationRules.AddRule(od => (decimal)od.Discount < od.UnitPrice,
                "Discount must be less than the unit price");
        }
    }

This also means that we can add new rules without ever recompiling the application.


Validation in action

Now we pretty much have all the expressions in place and it is time to actually validate something.

That something may be an object where we validate everything or an object where we want to validate just one property.

Having the ability to validate just one property comes in very handy when using this library in conjunction with the IDataErrorInfo interface.


Using the code

We have already looked at how we add new validation rules to a target type, and it's time to see how we can execute the actual validation.

To validate the whole object:

var validator = _serviceContainer.GetService<IValidator<Customer>>();
string result = validator.Validate(someCustomer);

To validate one single property:

var validator = _serviceContainer.GetService<IPropertyValidator<Customer>>();
string result = validator.Validate(someCustomer, "CustomerID");
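What happens behind those two calls is not shown in this post, so the following is only a guess at the general shape: run each rule against the target and collect the messages of the ones that are broken. SimpleRule, SimpleValidator and the Customer class are hypothetical names, the property name is passed in by hand rather than resolved from the expression tree, and joining the messages with line breaks is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical target type, used only for this illustration.
public class Customer
{
    public string CustomerID { get; set; }
    public string ContactName { get; set; }
}

// Hypothetical stand-in for a rule that already knows its target property.
public class SimpleRule<T>
{
    public string PropertyName { get; set; }
    public string Message { get; set; }
    public Func<T, bool> Rule { get; set; }
}

public class SimpleValidator<T>
{
    private readonly List<SimpleRule<T>> _rules = new List<SimpleRule<T>>();

    public void AddRule(string propertyName, Func<T, bool> rule, string message)
    {
        _rules.Add(new SimpleRule<T> { PropertyName = propertyName, Rule = rule, Message = message });
    }

    // Validates the whole object: every rule is evaluated.
    public string Validate(T target)
    {
        return CollectMessages(_rules, target);
    }

    // Validates one property: only the rules registered for it are evaluated.
    public string Validate(T target, string propertyName)
    {
        return CollectMessages(_rules.Where(r => r.PropertyName == propertyName), target);
    }

    private static string CollectMessages(IEnumerable<SimpleRule<T>> rules, T target)
    {
        // A rule is broken when its delegate returns false.
        var broken = rules.Where(r => !r.Rule(target)).Select(r => r.Message);
        return string.Join(Environment.NewLine, broken.ToArray());
    }
}
```

An empty string then means that the object (or the property) is valid, which matches what IDataErrorInfo expects from its indexer.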


The IDataErrorInfo interface

As we can see, the IPropertyValidator<T> fits very well with implementing the IDataErrorInfo interface.

This interface has to be implemented by the binding target, so how can we do that without having to reference the validation library?

It is also very related to the UI and the data binding mechanism so it does not really belong in our objects either. What a puzzle.



A Proxy to the rescue

I sometimes get asked why I have my domain objects implement an interface, and this is one of the reasons.

By using interfaces for our domain objects, we can create proxies for them, and those proxies can implement any additional interface.

Let's stick with the customer object (Northwind) and see what it looks like.

The interface

    public interface ICustomer
    {
        string CustomerID { get; set; }
        string CompanyName { get; set; }
        string ContactName { get; set; }
        string ContactTitle { get; set; }
        string Address { get; set; }
        string City { get; set; }
        string Region { get; set; }
        string PostalCode { get; set; }
        string Country { get; set; }
        string Phone { get; set; }
        string Fax { get; set; }
        IList<ICustomerCustomerDemo> CustomerCustomerDemo { get; }
        IList<IOrder> Orders { get; }
    }


When asking the service container for an ICustomer instance, the only requirement for the concrete class is that it implements ICustomer.

But it can also implement additional interfaces.

The LinFu framework has a very powerful proxy library and we can actually use that to have our domain objects implement IDataErrorInfo at runtime.

For those of you wondering what a proxy really is, we can sum it all up by saying that it sits between the calling code and the actual implementation.

You can read more about them here.

First we need some type of interceptor that gets called when calls are being made to the proxy instance.

    [Implements(typeof(IInterceptor),LifecycleType.OncePerThread, ServiceName = "DataErrorInfoInterceptor")]
    public class SampleDataErrorInfoInterceptor : IInterceptor, IInitialize
    {
        private IServiceContainer _serviceContainer;

        private object _actualTarget;

        private Type _targetType;

        public SampleDataErrorInfoInterceptor(object actualTarget, Type targetType)
        {
            _actualTarget = actualTarget;
            _targetType = targetType;
        }

        public object Intercept(IInvocationInfo info)
        {
            if (info.TargetMethod.Name == "get_Item")
            {                                
                var propertyValidatorType = typeof (IPropertyValidator<>).MakeGenericType(_targetType);
                var propertyValidator = _serviceContainer.GetService(propertyValidatorType);
                var result = propertyValidatorType.DynamicInvoke(propertyValidator,"Validate", new[] {_actualTarget, info.Arguments[0]});
                return result;
            }
            return info.TargetMethod.DynamicInvoke(_actualTarget, info.Arguments);
        }

        public void Initialize(IServiceContainer source)
        {
            _serviceContainer = source;
        }
    }

The interceptor forwards calls to IDataErrorInfo.Item[] to the IPropertyValidator<T> that knows how to validate the target type.
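If you want to try the get_Item interception without LinFu, the same dispatch can be sketched with System.Reflection.DispatchProxy, which ships with .NET Core and later. This is a swapped-in technique and not what the interceptor above uses: the proxy here only implements IDataErrorInfo, not the domain interface, and DataErrorInfoProxy, DataErrorInfoProxyFactory and the validateProperty callback are names invented for the illustration.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

public class DataErrorInfoProxy : DispatchProxy
{
    // Hypothetical hook: takes a property name and returns the error message,
    // or an empty string when the property is valid.
    public Func<string, string> ValidateProperty { get; set; }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // The indexer of IDataErrorInfo is compiled to a method named get_Item.
        if (targetMethod.Name == "get_Item")
            return ValidateProperty((string)args[0]);

        // The only other member is the Error property (get_Error).
        return string.Empty;
    }
}

public static class DataErrorInfoProxyFactory
{
    public static IDataErrorInfo Create(Func<string, string> validateProperty)
    {
        // The generated proxy type derives from DataErrorInfoProxy and implements IDataErrorInfo.
        IDataErrorInfo proxy = DispatchProxy.Create<IDataErrorInfo, DataErrorInfoProxy>();
        ((DataErrorInfoProxy)(object)proxy).ValidateProperty = validateProperty;
        return proxy;
    }
}
```

A data binding engine that asks the proxy for proxy["CustomerID"] ends up in Invoke, just as calls to the LinFu proxy end up in Intercept.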

The last thing we need to do is make sure that a proxy is returned when a request for an ICustomer instance is made.

Using an IPostProcessor enables us to inspect the service instance and return our proxy instance.

    [PostProcessor]
    public class SampleDomainModelPostProcessor : IPostProcessor
    {
        private IProxyFactory _proxyFactory;
        
        public void PostProcess(IServiceRequestResult result)
        {            
            if (result.ServiceType.Namespace.Contains("DomainModel"))
            {
                var proxyFactory = CreateProxyFactory(result.Container);
                var interceptor = result.Container.GetService<IInterceptor>("DataErrorInfoInterceptor",result.OriginalResult,result.ServiceType);
                var proxy = proxyFactory.CreateProxy(result.ServiceType, interceptor, typeof (IDataErrorInfo));
                result.ActualResult = proxy;
            }
        }

        private IProxyFactory CreateProxyFactory(IServiceContainer serviceContainer)
        {
            if (_proxyFactory == null)
                _proxyFactory = serviceContainer.GetService<IProxyFactory>();
            return _proxyFactory;
        }        
    }


The proxy now implements both the ICustomer and the IDataErrorInfo interface.

        [Test]
        public void ShouldImplementIDataErrorInfo()
        {
            var customer = _serviceContainer.GetService<ICustomer>();
            Assert.IsTrue(typeof(IDataErrorInfo).IsAssignableFrom(customer.GetType()));
        }


As a result of this we have now added transparent validation to our domain objects and all the plumbing
needed to visualize this in the UI has been abstracted away from the object itself.