Extensible Query with Specification Pattern

Writing finder methods for data repositories is problematic. The following is an example of a pretty simplistic customer-search screen:
Email Address: [________]
Name: [_______]
Age: from [__] to [__] years old
Is terminated: [ ]Yes [ ]No
{Search}

Now how are we going to implement the data query behind this search? I’ll go through several alternatives. (Jump to the solution if you couldn’t care less.) Let’s start with the simplest approach.

1. Repository Finder Methods

IList<Customer> result = customerRepository.SearchByBlaBlaBloodyBla(email, name, ageFrom, ageTo, isTerminated);

Let me give more context about what I’m working on. I’m developing an application that is intended to be an extensible vertical CRM framework. Clients will provide their own new screens and functionality by writing plugin implementations in their own separate assemblies, without requiring modification of the framework (Open-Closed Principle).
The approach you just saw above tightly couples the repository API to a specific UI design. Any change to the design of the search screen will require the repository API to be reworked. And should we create one repository method for each search screen? Furthermore, when a new search screen or report is plugged into the system by adding a new plugin, we need to somehow extend the data repository API to cover each of those specific screen scenarios. This is not an easily extensible architecture.
On the upside, this approach is simple and very easy to mock for unit tests. When flexibility is not an issue, I would go for this approach.

2. Lambda-Expression Repository

IList<Customer> result = repository.FindAll<Customer>(
	x => x.EmailAddress == emailAddress && x.IsTerminated == isTerminated);  // and so on

This code uses the repository API from FluentNHibernate. I like this API because we only have one single general-purpose Repository. It decouples the repository completely from any specific UI design. However, I’m not comfortable with leaking naked Linq outside of the repository. Exposing Linq to other layers scatters the database concern all across the application. Let’s consider what happens if we decide to refactor the IsTerminated property, currently implemented as a column in the DB, into C# code, say:

public bool IsTerminated { get { return this.TerminationDate != null; } }

The earlier Linq statement (possibly scattered all over the place) will start to fail, since the Linq provider can no longer translate the now-unmapped IsTerminated property into a correct SQL where clause.

3. Pipe and Filter Pattern

IQueryable<Customer> result = repository.All<Customer>().WithEmail(emailAddress).AgeAbove(ageFrom).AgeBelow(ageTo);
if(isTerminated != null)
	result = isTerminated.Value ? result.IsTerminated() : result.IsNotTerminated();

Or in this case, that should be wrapped into:

IList<Customer> result = repository.All<Customer>().WithBlaBloodyBla(emailAddress, name, ageFrom, ageTo, isTerminated).List();

This approach leverages fluent IQueryable and extension methods. It still exposes IQueryable, which leaks the database concern outside the repository, but it’s much better since the query is properly encapsulated behind easy-to-read and maintainable extension methods.
In the above example with the isTerminated check, it’s obvious that this approach does pretty well at handling dynamic queries that are very difficult to express using the previous lambda-expression approach. But the flexibility is pretty limited. To be specific, you can only chain multiple filters in an AND relationship.
Another problem with this approach, which is actually the main reason I steer away from approaches #2, #3, and #4, is unit-testing. Yes, it is very easy to unit-test each of the filters independently, but it is extremely difficult to mock out those filters to unit-test the services that depend on them. I’ll describe the problem in the next approach.

4. Specification pattern

IList<Customer> result = repository.FindAll<Customer>(new WithinAgeCustomerSpecification(ageFrom, ageTo));

My preferred solution is largely derived from the specification pattern, and I’ll give extra attention to this approach later. Long story short, IMO this approach is best since it doesn’t leak any data concern or Linq outside the repository. It also separates the responsibilities of loading/saving domain entities (repository) and querying (specification). I’ll start with the problem.
As mentioned, it’s very easy to unit-test each of the specifications using the infamous in-memory/sqlite repository testing. But it’s incredibly difficult to unit-test the UI controller and application layer that use the specification.
Just to give a concrete illustration, this is how I would write the unit test had I used approach #1. (Simplified to search by age only.)

customerRepository.Expect(x => x.SearchByAgeBetween(20, 30)).Return(stubCustomers);
ViewResult view = customerSearchController.Search(20, 30); 
Assert.That(view.Model, Is.EqualTo(stubCustomers));

But does anyone have a suggestion for how I could test the following controller (simplified)?

public class CustomerController: Controller
{
	IRepository repository; //Injected
	
	public ActionResult Search(int? ageFrom, int? ageTo)
	{
		var customers = repository.Query(new WithinAgeCustomerSpecification(ageFrom, ageTo));
		return View("search", customers);
	}
}

That little call to “new WithinAgeCustomerSpecification(..)” makes it virtually impossible to mock the specification and take it out of the test’s concern. Linq and extension methods in approaches #2 and #3 certainly don’t help.
Why do we care about mocking the specification? Because, mind you again, testing queries _IS_ painful! It’s tedious to set up stub data and verify the query result. Each of the specifications already has this kind of unit test of its own, and we certainly _DO_NOT_ want to repeat those tests in the controller. For the sake of discussion, this is what the unit tests for the controller would look like using an unmocked Specification.

ShouldLoadSearchView();
CanLoadCustomerByEmailAddressToViewData();
CanLoadCustomerByNameToViewData();
CanLoadCustomerByAgeToViewData();
CanLoadCustomerByNameAndEmailAddressToViewData();
// etc etc

Each test case deals with tedious in-memory/sqlite stub data. I don’t even understand why I need to care about data and Sqlite to unit-test the UI/application layer. It just doesn’t make sense.
And guess what the unit tests for the specification look like.

CanSearchByEmailAddress();
CanSearchByName();
CanSearchByAge();
CanSearchByNameAndEmailAddress();
// etc etc

That’s right, duplication. Not to mention tediously data-driven. Generally, you want to avoid tests that involve data and queries. For comparison, this is what the tests for the controller would look like with a mocked specification.

ShouldLoadSearchView();
ShouldSearchCustomerUsingCorrectSpecification();
ShouldLoadSearchResultToViewData();

Yes, that’s all we care about: “the controller should use the correct specification to search for customers”. We don’t care whether the specification actually does what it claims to do. That’s for the specification’s own tests to verify.

Solution

By wrapping Specifications into a factory, we decouple the controller from Specification implementation.

public class CustomerController: Controller
{
	IRepository repository; //Injected
	ICustomerSpecFactory whereCustomer; //Injected
	
	public ActionResult Search(int? ageFrom, int? ageTo)
	{
		var customers = repository.Query(whereCustomer.HasAgeBetween(ageFrom, ageTo));
		return View("search", customers);
	}
}

Unit-testing is a breeze.

whereCustomer.Expect(x => x.HasAgeBetween(20, 30))
	.Return(stubSpec = MockRepository.GenerateStub<Specification<Customer>>());
customerRepository.Expect(x => x.Query(stubSpec)).Return(stubCustomers);

var view = (ViewResult) customerController.Search(20, 30); // EXECUTE CONTROLLER ACTION
Assert.That(view.Model, Is.EqualTo(stubCustomers));
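To make the interaction being verified explicit, here is a minimal sketch of the same test written with hand-rolled fakes instead of a mocking library. Everything in it is an assumption made for illustration: the shapes of ISpecification<T>, IRepository and ICustomerSpecFactory are guessed from the snippets above, and CustomerSearch is a stand-in for the controller action with the MVC dependency stripped out.

```csharp
using System;
using System.Collections.Generic;

public class Customer { }

// Assumed shapes of the abstractions used by the controller above.
public interface ISpecification<T> { }
public interface IRepository
{
	IList<Customer> Query(ISpecification<Customer> specification);
}
public interface ICustomerSpecFactory
{
	ISpecification<Customer> HasAgeBetween(int? ageFrom, int? ageTo);
}

// Hypothetical stand-in for CustomerController.Search, without ASP.NET MVC.
public class CustomerSearch
{
	private readonly IRepository repository;
	private readonly ICustomerSpecFactory whereCustomer;

	public CustomerSearch(IRepository repository, ICustomerSpecFactory whereCustomer)
	{
		this.repository = repository;
		this.whereCustomer = whereCustomer;
	}

	public IList<Customer> Search(int? ageFrom, int? ageTo)
	{
		return repository.Query(whereCustomer.HasAgeBetween(ageFrom, ageTo));
	}
}

// Fake factory: records its arguments and hands back a marker specification.
public class FakeSpecFactory : ICustomerSpecFactory
{
	private sealed class MarkerSpec : ISpecification<Customer> { }

	public readonly ISpecification<Customer> Spec = new MarkerSpec();
	public int? ReceivedFrom;
	public int? ReceivedTo;

	public ISpecification<Customer> HasAgeBetween(int? ageFrom, int? ageTo)
	{
		ReceivedFrom = ageFrom;
		ReceivedTo = ageTo;
		return Spec;
	}
}

// Fake repository: records the specification it was queried with.
public class FakeRepository : IRepository
{
	public readonly IList<Customer> StubCustomers = new List<Customer> { new Customer() };
	public ISpecification<Customer> ReceivedSpec;

	public IList<Customer> Query(ISpecification<Customer> specification)
	{
		ReceivedSpec = specification;
		return StubCustomers;
	}
}
```

The test then only checks three things: the factory was asked for the 20 to 30 range, the repository received exactly the specification the factory produced, and the stub customers came back. No customer data or database is involved.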

EDIT: I posted a better way to write unit-test for this

I actually like it a lot! The specifications are also amazingly flexible to mix and match. E.g.:

repository.Query((whereCustomer.HasAgeBetween(20, 30) || whereCustomer.LivesIn("Berlin")) && !whereCustomer.IsVIP());

And they’re still testable and mock-friendly. Now that we know I like this approach, let’s take a look at the implementation of the specification pattern.

public class CustomerQuery: ICustomerQuery
{
	public ISpecification<Customer> MatchesUserSearchFilters(string email, string name, int? ageFrom, int? ageTo, bool? isTerminated)
	{
		// Start from "always true" and AND in only the filters the user filled in
		var result = Specification<Customer>.TRUE;
		if(email != null)
			result &= new Specification<Customer>(x => x.EmailAddress.ToLower() == email.ToLower());
		if(name != null)
			result &= new Specification<Customer>(x => x.Name.ToLower().Contains(name.ToLower()));
		if(ageFrom != null)
			result &= IsOlderThan(ageFrom.Value);
		if(ageTo != null)
			result &= !IsOlderThan(ageTo.Value);
		if(isTerminated != null)
			result &= isTerminated.Value ? IsTerminated() : !IsTerminated();
		return result;
	}
	public ISpecification<Customer> IsOlderThan(int yearOld) {/*..*/}
	public ISpecification<Customer> IsTerminated() {/*..*/}
}

Unlike Linq criteria, Specification plays incredibly well with building dynamic queries! And that’s not even the best part. These Specifications are not mere DB queries. Write once, use everywhere. They can also be used for object filtering or validation.

var terminatedCustomers = customerList.FindAll(whereCustomer.IsTerminated()); 

Or:

Validate(customer, !whereCustomer.IsTerminated())
	.Message("The customer had been terminated. Please enter an active customer");

Code for sample Specification API can be found in Ritesh Rao’s post.
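For readers who just want the gist of how such composable specifications can work, here is a minimal sketch. It is a hypothetical illustration, not Ritesh Rao’s API: the Predicate property, IsSatisfiedBy and the nested ParameterSwap visitor are names I made up. It wraps an expression tree, overloads &, | and !, and declares the true/false operators so that C# also accepts && and ||.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical minimal specification type, for illustration only.
public class Specification<T>
{
	// The expression can be handed to a Linq provider, e.g. query.Where(spec.Predicate)
	public Expression<Func<T, bool>> Predicate { get; }

	public Specification(Expression<Func<T, bool>> predicate)
	{
		Predicate = predicate;
	}

	// In-memory evaluation, usable for object filtering or validation
	public bool IsSatisfiedBy(T candidate)
	{
		return Predicate.Compile()(candidate);
	}

	public static Specification<T> operator &(Specification<T> left, Specification<T> right)
	{
		return Combine(left, right, Expression.AndAlso);
	}

	public static Specification<T> operator |(Specification<T> left, Specification<T> right)
	{
		return Combine(left, right, Expression.OrElse);
	}

	public static Specification<T> operator !(Specification<T> spec)
	{
		var negated = Expression.Not(spec.Predicate.Body);
		return new Specification<T>(
			Expression.Lambda<Func<T, bool>>(negated, spec.Predicate.Parameters));
	}

	// Both always return false, so "a && b" evaluates as "a & b"
	// and "a || b" evaluates as "a | b" (no short-circuiting).
	public static bool operator true(Specification<T> spec) { return false; }
	public static bool operator false(Specification<T> spec) { return false; }

	private static Specification<T> Combine(Specification<T> left, Specification<T> right,
		Func<Expression, Expression, BinaryExpression> merge)
	{
		// Both lambdas must share one parameter, so rebind the right body to the left's
		var parameter = left.Predicate.Parameters[0];
		var rightBody = new ParameterSwap(right.Predicate.Parameters[0], parameter)
			.Visit(right.Predicate.Body);
		return new Specification<T>(
			Expression.Lambda<Func<T, bool>>(merge(left.Predicate.Body, rightBody), parameter));
	}

	private sealed class ParameterSwap : ExpressionVisitor
	{
		private readonly ParameterExpression source;
		private readonly ParameterExpression target;

		public ParameterSwap(ParameterExpression source, ParameterExpression target)
		{
			this.source = source;
			this.target = target;
		}

		protected override Expression VisitParameter(ParameterExpression node)
		{
			return node == source ? target : base.VisitParameter(node);
		}
	}
}
```

With this sketch, the same composed specification can be passed to an IQueryable through Where(spec.Predicate) or applied to in-memory objects through IsSatisfiedBy, which is the “write once, use everywhere” property described above.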

Where are we?

Oh yes, our initial objective: plugging in new screens and functionality in Open-Closed Principle fashion. Not a problem. The queries can live in a separate assembly, and it’s easy for clients to introduce their own set of Specifications that meet the querying needs of their plugins.
This approach also gives us the liberty to override the specification implementation, e.g. to comply with a specific persistence technology or database structure. Say, if Customer.Name is implemented as FIRST_NAME and LAST_NAME columns. Overriding the Specification implementation is not possible with the “new” keyword or the extension-method approach, since the application is then tightly coupled to a specific Specification implementation.
This allows clients to extend the domain entities with their business-specific properties and persistence structure.

Mocking Helper for Pipe and Filter

In the previous post, I was questioning the mockability of query filters in Rob Conery’s Pipe and Filter pattern. I left the post open with a hint about one possible approach to do it, hence this post.

So I have written a simple helper class, QueryableMock, specifically to address this issue, and it turned out to be much more complex than I expected. The idea is not to really mock away the filter implementation, but instead to record the expression tree that the filters generate. During playback, the SUT runs the filter methods again, we capture the resulting expression tree again, and finally compare it with the previously recorded one. If they match, we can safely say that the SUT has passed the test by running exactly the expected filters.
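The reason recording works at all is that chaining filters on an IQueryable does not execute anything; it only grows an expression tree that can be inspected afterwards. The following minimal sketch shows that with plain Linq. The Customer class with City and Age properties and the two filter bodies are assumptions made up for this illustration, not the code from the attached source.

```csharp
using System;
using System.Linq;

// Hypothetical entity and filters, just enough to build a query
public class Customer
{
	public string City { get; set; }
	public int Age { get; set; }
}

public static class CustomerFilters
{
	public static IQueryable<Customer> LivesInCity(this IQueryable<Customer> query, string city)
	{
		return query.Where(c => c.City == city);
	}

	public static IQueryable<Customer> YoungerThan(this IQueryable<Customer> query, int age)
	{
		return query.Where(c => c.Age < age);
	}
}

public static class Program
{
	public static void Main()
	{
		var source = Enumerable.Empty<Customer>().AsQueryable();

		// Running the real filters only builds up nested Where(...) calls
		var recorded = source.LivesInCity("Tazmania").YoungerThan(18);

		// Nothing has been enumerated yet; the "recording" is this expression tree
		Console.WriteLine(recorded.Expression);
	}
}
```

A recording mock can keep hold of recorded.Expression and later compare it against the expression the code under test produces.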

So just to remind ourselves, let’s take a look at the code we want to test from the last post: a Spammer class that sends spam emails to all Tazmanian teens.

public void SendTazmanianTeenSpam(string spamMessage)
{
	var targets = this.customerRepository.All()
		.LivesInCity("Tazmania")
		.YoungerThan(18);

	foreach(var customer in targets)
		emailSender.Send(customer.Email.Address, spamMessage);
}

So let’s check out how we write the unit-test using our helper QueryableMock class.

[Test]
public void CanSendSpamToTazmanianTeen()
{
	var targetStubs = new List<Customer>
	{
		new Customer(1) {Email = new Email("test@email.com")},
		new Customer(2) {Email = new Email("anotherTest@email.com")}
	};

	var customerQueryMock = new QueryableMock<Customer>();
	customerQueryMock.Expect()
		.LivesInCity("Tazmania")
		.YoungerThan(18);
	customerQueryMock.StubFinalResult(targetStubs);

	using (mockery.Record())
	{
		Expect.Call(customerRepository.All()).Return(customerQueryMock);
		emailSender.Send("test@email.com", spamMessage);
		emailSender.Send("anotherTest@email.com", spamMessage);
	}

	using (mockery.Playback())
	{
		spammer.SendTazmanianTeenSpam(spamMessage);
	}
}

Look closer at the expectation setup:

customerQueryMock.Expect()
	.LivesInCity("Tazmania")
	.YoungerThan(18);
customerQueryMock.StubFinalResult(targetStubs);

What it does is build the expected filtration expression. It does so by executing the real filter methods and recording the final Queryable Expression. This recorded Expression is later compared with the Expression produced when the SUT runs: spammer.SendTazmanianTeenSpam(spamMessage);

UNDER THE HOOD
That is all we need to do in the unit test. Here’s the relevant bit of the code behind QueryableMock.

public class QueryableMock<T> : QueryableProxy<T>
{
	// ... Other stuffs ...
	protected override IEnumerator GetFinalEnumerator()
	{
		VerifyExpression();
		return stubbedFinalEnumerable.GetEnumerator();
	}
	public void VerifyExpression()
	{
		if(!ExpressionComparer.CompareExpression(
			Provider.QueriedExpression, recorderQuery.Provider.QueriedExpression))
			throw new Exception("Query expression does not match expectation");
	}
}

The idea so far is pretty straightforward: when the SUT is about to execute (enumerate) the query, QueryableMock calls VerifyExpression to compare the filter expression against the expected expression.
The biggest challenge here is comparing two expression trees for equality, i.e. how to verify that x=>x+1 is equal to x=>x+1. I was quite surprised that there is no easy way of doing it. Life is not as simple as calling Expression.Equals(). So I wrote a quick brute-force comparison in the ExpressionComparer class. You can check the details in the attached source code. If you know of an easier way to do it, please do let me know.
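To give a feel for what such a brute-force comparison involves, here is a minimal sketch of a structural comparer. It is not the ExpressionComparer from the attached source; the class name ExpressionStructure and its rules are my own assumptions, and it only handles a handful of node kinds. Constants and captured local variables are compared by value, because two separately written lambdas never share the same closure object.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical structural comparer covering only common node kinds
public static class ExpressionStructure
{
	public static bool AreEqual(Expression a, Expression b)
	{
		if (a == null || b == null) return a == b;

		// Constants and captured variables: compare the values they evaluate to
		if (TryEvaluate(a, out var valueA) && TryEvaluate(b, out var valueB))
			return object.Equals(valueA, valueB);

		if (a.NodeType != b.NodeType || a.Type != b.Type) return false;

		switch (a)
		{
			case LambdaExpression la when b is LambdaExpression lb:
				return la.Parameters.Count == lb.Parameters.Count
					&& AreEqual(la.Body, lb.Body);
			case ParameterExpression _:
				// Simplification: parameters match by type only (names and positions ignored)
				return true;
			case BinaryExpression ba when b is BinaryExpression bb:
				return ba.Method == bb.Method
					&& AreEqual(ba.Left, bb.Left)
					&& AreEqual(ba.Right, bb.Right);
			case UnaryExpression ua when b is UnaryExpression ub:
				return ua.Method == ub.Method && AreEqual(ua.Operand, ub.Operand);
			case MemberExpression ma when b is MemberExpression mb:
				return ma.Member == mb.Member && AreEqual(ma.Expression, mb.Expression);
			case MethodCallExpression ca when b is MethodCallExpression cb:
				return ca.Method == cb.Method
					&& AreEqual(ca.Object, cb.Object)
					&& ca.Arguments.Count == cb.Arguments.Count
					&& ca.Arguments.Zip(cb.Arguments, (x, y) => AreEqual(x, y)).All(same => same);
			default:
				// Node kinds this sketch does not know about are treated as unequal
				return false;
		}
	}

	// Resolves plain constants and fields read from a closure object
	private static bool TryEvaluate(Expression expression, out object value)
	{
		if (expression is ConstantExpression constant)
		{
			value = constant.Value;
			return true;
		}
		if (expression is MemberExpression member
			&& member.Expression is ConstantExpression owner
			&& member.Member is FieldInfo field)
		{
			value = field.GetValue(owner.Value);
			return true;
		}
		value = null;
		return false;
	}
}
```

Under these rules x=>x+1 and y=>y+1 compare as equal, while x=>x+2 does not. A production-grade comparer would also need to track parameter identity and cover the remaining node types (conditionals, object construction, invocations and so on).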

EVEN FURTHER
Finally, this QueryableMock utility is very specifically intended for unit-testing around the Pipe and Filter pattern. It does a very limited thing, but so far it works very well for my need (namely, to mock query filters). Another interesting scenario that is also covered is:

customerQueryMock.Expect()
	.LivesInCity("New York")
	.SelectEmailAddresses()
	.InDomain("gmail.com");
customerQueryMock.StubFinalResult(emailList);

Note that the filtration switches from IQueryable<Customer> to IQueryable<Email> halfway through.

You may argue that TypeMock can provide a much more elegant solution for this. It uses bytecode instrumentation and can therefore intercept even static and extension methods. But I think the question is not really whether I can do it or not. If I cannot easily test a piece of code without relying on bytecode magic, I will question whether the code design is even worth doing at all.

Admittedly, I am not 100% comfortable with this approach of mocking. Let me know what you think.

SOURCE CODE
You can download the full source-code of this post here to see what on earth I have been talking about.

How Do You Test Pipe and Filter?

In his notable MVC Storefront series, Rob Conery brought up a very interesting pattern called Pipe and Filter. My first reaction was a bit of anxiety about whether it might hurt testability. Before I start with the problem, I think this pattern deserves a few introductory words, just in case you have been living under a rock for the last 12 months and haven’t checked Rob Conery’s posts.

Pipe and Filter is a pattern recently made popular by Rob’s MVC Storefront episodes. Also known as Filtration pattern, it is a very nice trick to produce a highly fluent Linq2Sql data repository (although, well, it’s not quite repository anymore).

Basically, instead of having several specific-purpose querying filters on the repository interface like this:

var list = customerRepository
	.FindByStateAndAgeBelow("Tazmania", 20);

We can now use a far more fluent and flexible syntax through chained filtering statements like this:

var list = customerRepository.All()
	.WithState("Tazmania")
	.WithAgeBelow(20).List();

The magic behind it is .Net 3.5’s Extension Methods.

public static class CustomerFilter
{
	// Extension methods must be static methods on a static class
	public static IQueryable<Customer> WithState(this IQueryable<Customer> query, string state)
	{
		return from cus in query 
			where cus.HomeAddress.State == state
			select cus;
	}
	public static IQueryable<Customer> WithAgeBelow(this IQueryable<Customer> query, int age)
	{
		// Anyone younger than `age` was born after this date
		var minDob = DateTime.Now.AddYears(-age);
		return from cus in query 
			where cus.BirthDate > minDob 
			select cus;
	}
}

Testing both of these filters is easy. Just create a collection of customer objects, execute the filter, then go ahead and check the result. Here is the unit-test code for the WithState filter.

// Stub Customer List
var list = new List<Customer>()
{
	new Customer() {HomeAddress = new Address() {State = "Illinois"}},
	new Customer() {HomeAddress = new Address() {State = "Tazmania"}},
	new Customer() {HomeAddress = new Address() {State = "NSW"}},
	new Customer() {HomeAddress = new Address() {State = "Tazmania"}}
};
var query = list.AsQueryable();

// Execute
var filtered = query.WithState("Tazmania").ToList();

// Verify
Assert.That(filtered.Count, Is.EqualTo(2));
Assert.IsTrue(filtered.Contains(list[1]));
Assert.That(filtered.Contains(list[3]));

Now imagine I work on an ambitious evil project, where I have a small method in my business logic that sends spam emails to all Tazmanian teens.

public void SendAdvertisement(string message)
{
	foreach(var customer in 
		customerRepository.All()
		.WithState("Tazmania").WithAgeBelow(20))
	{
		emailSender.Send(customer.EmailAddress, message);
	}
}

The question now is, how do we write a unit test to verify this logic?

Had we used customerRepository.FindByStateAndAgeBelow("Tazmania", 20), we would be able to just mock away the call to customerRepository.FindByStateAndAgeBelow() (hence, behavior or interaction-based verification), and we would be sorted.

But here, the problem with Pipe and Filter pattern is the fact that it uses Extension Methods, which are essentially static methods! (And we all know, static methods are the villains in TDD world). They are not mockable, and we have to deal with the real filter implementations.

True, I could just use the same state-based verification approach that we used above in the WithState unit test: stub up a list of customer objects, then verify the emailSender calls based on the expected filtered customers. But this is really gross.
I don’t want to care about the internal behavior of the filters here. As a matter of fact, each filter already has its own unit test (we have written one above), and I don’t want to repeat myself here. All I really want to care about is whether my business logic makes the correct calls to the correct filters and uses the filtration result as our spam targets.

I want to gather how you write unit tests for the Pipe and Filter pattern, and how you mock out filter logic from your business-logic tests. I have a quick thought in mind about writing a Rhino Mocks helper to cater to this scenario. I have yet to try it out, and I will write about it in the next post as soon as I do. But first, I would like to hear what other people think about this. Any comments?