Thursday, August 15, 2013

The fallacy of the single composition root?

Single Composition Root

This article discusses DI (dependency injection) and the pattern of having one single unique composition root.

Composition roots

When we talk about composition roots, we usually talk about a single unique location where services are registered with the IoC container. This is then executed at the application entry level.

This ensures that only the composition root needs to have a reference to the IoC container and we can make the switch to another container without going through the whole application and its assemblies. Life is good and everyone is happy.

Now let me tell you a sad story about an application using this exact approach and how we ended up rethinking the whole single composition root idea.

As with most projects, they start out small and performance problems are something that only exists in galaxies far far away.

In order to save time during development, we decided that we could just scan all the assemblies and register services using some sort of convention instead of explicitly register each service. This worked out great and developers did not have to think about anything else than to ensure that their assembly was present in the bin folder. We also executed the same composition root during unit testing which meant that we did integration tests with the same configuration as in production (This is actually a good thing).
Man, those were happy times...for a while.

First sign of trouble

One day I got an e-mail from one of my colleagues complaining about increased startup time. It also took a long time just to execute a single unit test. The application was only half way completed and we were already faced with a serious performance problem. As it turned out, the application had grown from your typical "10 assemblies greenfield project", into something that now consisted of 100 assemblies where each of these assemblies represented services that could potentially be requested from the container. It now took 10 seconds to start the application and 10 seconds to start a single unit test. Happy times were going bye-bye.

Note: Not discussing whether the container should be used in unit tests. Think of integration tests.

What's the problem

It did not took us long to realize that the problem lied within the fact that we now scanned 100 assemblies during the startup of the application. Even for unit tests that only really needed 5 of those assemblies to run, we loaded all 100 assemblies into the application domain. Of course we needed to, we did not know which 5 of those 100 assemblies that was needed for the test.
As for the application that was a desktop application, we created a really fancy splash screen to amuse the end user while all this assembly loading was going on in the background.
It did not take long for the amusement to wear off among the users.

No more assembly scanning

Having identified the assembly scanning as the root of our problems, we decided that each service should be explicitly registered with the container. This required a little more work from the developers, but that was okay. After all, we needed to fix this problem before it really got out of hand.

container.ScanAllAssemblies()  

to this

container.Register<IFoo, Foo>();
container.Register<IBar, Bar>();
--followed by hundreds more

We were now faced with another issue. The assembly that contained the composition root, now had to reference all the other 99 assemblies so that the composition root could register the services. The fact that we now needed the scrollbar just to browse the references list should have made us think twice. Sadly it did not.

The problem remains

So after a major refactoring process to explicitly register services and abandon the assembly scanning, we were still seeing a significant delay during startup although it had improved a little. The improvement was relative meaning that the startup time kept increasing as more assemblies got added to the application. The improvement came from not having to do reflection on all the types in each assembly.

So how could it be that we still saw this behavior?

To answer this, we need to look a little closer into how assembly loading work in the CLR (Common Language Runtime).

The CLR tries to minimize the number of assemblies that goes into the application domain by only loading the assemblies that are actually needed. So when does an assembly get loaded?

It gets loaded when code that uses types from that assembly is executed. To be even more precise, it gets loaded when the code is compiled by the Just-In-Time compiler.

This is actually a very simple and clever approach because the number of assemblies loaded is now dependent on the actual path of execution.

Now lets review the code from our composition root.

container.Register<IFoo, Foo>();
container.Register<IBar, Bar>();
--followed by hundreds more

What we essentially have here is a method that bypasses all this cleverness enforced by the CLR.

Because we now have a method that references types from all those 100 assemblies, we force the CLR to load all the assemblies into the application domain. As bad as this in release builds, this is even worse during debug since all the assemblies and their symbol files needs to be loaded too. Starting the application or running a unit tests still provided a nice opportunity to check the latest news on Facebook.
Happy times now seem ages ago.

Laziness is good

As opposed to life in general, in computer science, laziness is really a good thing. Doing things only when we really have to makes for highly optimized applications in terms of performance and memory footprint.

Being an DI framework author, I'm going to explain how to deal with this problem using the LightInject service container. Although the remaining examples will target this container, I'm confident that similar solutions could be applied using other DI frameworks.

In LightInject there is an interface called ICompositionRoot that serves the purpose of registering services with the service container.

The interface is rather simple and looks like this:

/// <summary>
/// Represents a class that acts as a composition root for an <see cref="IServiceRegistry"/> instance.
/// </summary>
internal interface ICompositionRoot
{
    /// <summary>
    /// Composes services by adding services to the <paramref name="serviceRegistry"/>.
    /// </summary>
    /// <param name="serviceRegistry">The target <see cref="IServiceRegistry"/>.</param>
    void Compose(IServiceRegistry serviceRegistry);
}

When an assembly is scanned, LightInject will check to see if the assembly contains implementations of this interface and execute the Compose method. If no composition root is found, services will be registered using a set of conventions (not covered here).

This implies that we are back to scanning assemblies and we already stated that did not work out to well, didn't we?

Well, it's all a matter of how and when.

Lazy Registration

HOW

How we scan an assembly is important with regards to performance when registering services. We have already said that LightInject will use the composition root if found within the target assembly. This is essentially done by inspecting all types and check to see if one or more types implements the ICompositionRoot interface. While that certainly works, it is far from optimal as this could also lead to unnecessary assemblies to be loaded into the application domain.

The solution is to provide a little metadata at the assembly level to help the assembly scanner in the right direction.

/// <summary>
/// Used at the assembly level to describe the composition root(s) for the target assembly.
/// </summary>
[AttributeUsage(AttributeTargets.Assembly, AllowMultiple = true)]
internal class CompositionRootTypeAttribute : Attribute
{
    /// <summary>
    /// Initializes a new instance of the <see cref="CompositionRootTypeAttribute"/> class.
    /// </summary>
    /// <param name="compositionRootType">A <see cref="Type"/> that implements the <see cref="ICompositionRoot"/> interface.</param>
    public CompositionRootTypeAttribute(Type compositionRootType)
    {
        CompositionRootType = compositionRootType;
    }

    /// <summary>
    /// Gets the <see cref="Type"/> that implements the <see cref="ICompositionRoot"/> interface.
    /// </summary>
    public Type CompositionRootType { get; private set; }
}

The assembly scanner will first check the assembly for this attribute and execute the composition roots according to the type information from the attribute.

This saves us from inspecting each and every type in the assembly and provides the most efficient way of locating the composition root.

To summarize how this works:

  1. Check for CompositionRootTypeAttribute. If found, execute composition root(s).
  2. Check all types for implementations of the ICompositionRoot interface. If found execute composition root(s).
  3. Register types according to a set of conventions.

WHEN

When the container encounters a service that cannot be resolved, it will scan the assembly that contains the requested service. After the assembly has been scanned, the container will continue to try resolving the requested service. If the service is still not found, the container will throw an exception as usual.
This allows us to implement multiple lazy compositions roots and hence minimize the number of assemblies loaded into the application domain at any given time during the lifespan of the application.

Example

This example shows a simple console application that consists of the application itself and two additional assemblies.

FooAssembly

public interface IFoo {}

public class Foo : IFoo {}

public CompositionRoot : ICompositionRoot
{
    public void Compose(IServiceRegistry registry)
    {
        registery.Register<IFoo, Foo>();
    }
}  

BarAssembly

public interface IBar {}

public class Bar : IBar {}

public CompositionRoot : ICompositionRoot
{
    public void Compose(IServiceRegistry registry)
    {
        registery.Register<IBar, Bar>();
    }
}  

As we can see each assembly has its own composition root and takes care of registering services from the containing assembly.

Further both the FooAssembly and the BarAssembly will have the CompositionRootTypeAttribute defined at the assembly level.

[assembly: CompositionRootType(typeof(CompositionRoot)]

The console application now only references the FooAssembly and the composition root within the FooAssembly will be lazily executed once the IFoo instance is requested.

The Main method of our console application would look something like this.

static void Main(string[] args)
{
    var container = new ServiceContainer();                
    IFoo foo = container.GetInstance<IFoo>();                
} 

The FooAssembly is loaded into the application domain when the Main method is executed (compiled by the J.I.T), while the BarAssembly is left unloaded until we execute code that uses types from this assembly.

What's the downside?

There is always at least two sides to every story and even this has some obvious downsides.

First and foremost it means that every assembly that contains a composition root, would also need to have a reference to the assembly that contains the ICompositionRoot interface. This is a container specific interface and should ideally only be referenced at the "true" composition root (application entry).

It's all a matter of give and take while this certainly creates more references to the service container library, it is still far from the downsides of implementing the service locator pattern which basically means references to the container everywhere in our code.

Secondly, we encourage developers to explicitly registers their services in the appropriate composition root where we in the past got away with plain old assembly scanning.

While this might push more plumbing work over to the developers, it also becomes less "magic" and might become easier to debug. If the service is unable to resolved, it is probable just missing that one line of code that registers the service with the container. It's as simple as that really.

Update 8/19/2013 2:12:56 PM

The article has been updated to emphasize the meaning of having "distributed" composition roots.

Feedback

I would appreciate any feedback whether you think this looks cool or the worst idea ever :)

mail: bernhard.richter@gmail.com

twitter: @bernhardrichter