Writing finder methods for data repositories is problematic. The following is an example of pretty simplistic customer-search screen:
Email Address: [________]
Name: [_______]
Age: from [__] to [__] year old
Is terminated: [ ]Yes [ ]No
{Search}
Now how are we going to implement data query behind this search? I’ll go through several alternatives. (Jump to solution if you can’t care less). Let’s start from the simplest approach.
1. Repository Finder Methods
IList<Customer> result = customerRepository.SearchByBlaBlaBloodyBla(email, name, ageFrom, ageTo, isTerminated);
Let me give more context about what I’m working on. I’m developing an application that is intended to be an extensible vertical CRM framework. The client will provide their own new screens/functionalities by writing specific plugin implementation in their own separate assembly, without requiring modification on the framework (Open-Closed Principle).
The approach you just saw above tightly couples the repository API with specific UI design. Any change with the design of the search screen will require the repository API to be reworked. And should we create one repository method for each search screen? Furthermore, when a new search screen or report functionality is plugged into the system by adding new plugin, we need to somehow extend the data repository API to cover each of those specific screen scenarios. This is not an easily extensible architecture.
The upside, this approach is simple and very easy to mock for unit-test. When flexibility is not an issue, I would go for this approach.
2. Lambda-Expression Repository
IList<Customer> result = repository.FindAll<Customer>( x => x.EmailAddress == emailAddress && x.IsTerminated == isTerminated); // and so on
This code uses repository API from FluentNHibernate. I like this API because we only have one single general-purpose Repository. It decouples the repository completely from specific UI design. However I’m not comfortably about leaking naked Linq outside of repository. Exposing Linq to other layer will scatter the database concern all across application. Let’s consider what happens if we decides to refactor IsTerminated property that is currently implemented as a column in DB into C# code, say:
public bool IsTerminated {get {return this.TerminationDate != null; }};
The earlier Linq statement (possibly scattered all over the place) will start to fail since Linq is unable to map IsTerminated property into a correct SQL where clause.
3. Pipe and Filter Pattern
IQueryable<Customer> result = repository.All<Customer>().WithEmail(emailAddress).AgeAbove(ageFrom).AgeBelow(ageTo); if(isTerminated != null) result = (isTerminated)? result.IsTerminated(): result.IsNotTerminated();
Or in this case, that should be wrapped into:
IList<Customer> result = repository.All<Customer>().WithBlaBloodyBla(emailAddress, name, ageFrom, ageTo, isTerminated).List();
This approach leverages fluent IQueryable and extenssion-methods. It still exposes IQueryable which leaks database concern outside repository, but it’s much better since the query is properly encapsulated behind easy-to-read and maintainable extenssion methods.
In the above example with isTerminated check, it’s obvious that this approach is doing pretty well in handling dynamic query that is very difficult to express using previous lambda-expression approach. But the flexibility is pretty limited, or to be specific, you can only chain multiple filters in an AND relationship.
Another problem of this approach, which is actually the main reason I steer away from approaches #2 #3 #4, is on unit-testing. Yes it is very easy to unit-test each of the filter independently, but it is extremely difficult to mock out those filters to unit-test services that depends on it. I’ll describe the problem in the next approach.
4. Specification pattern
IList<Customer> result = repository.FindAll<Customer>(new WithinAgeCustomerSpecification(ageFrom, ageTo));
My prefered solution is largely derived from specification pattern, I’ll give extra highlight to this approach later. Long story short, IMO this approach is best since it doesn’t leak any data-concern and linq to outside repository. It also separates the responsiblities between loading/saving domain entity (repository) and querying (specification). I’ll start with the problem.
As mentioned, it’s very easy to unit-test each of the specification using the infamous in-memory/sqlite repository testing. But it’s incredibly difficult to unit-test the UI controller and application layer that uses the specification.
Just to give a concrete illustration, this is how I write unit-test had I used the approach #1. (Simplified to search age only)
customerRepository.Expect(x => x.SearchByAgeBetween(20, 30)).Return(stubCustomers); ViewResult view = customerSearchController.Search(20, 30); Assert.That(view.Model, Is.EqualTo(stubCustomers);
But anyone has suggestion how I could test the following controller (simplified)?
public class CustomerController: Controller { IRepository repository; //Injected public ActionResult Search(int? ageFrom, int? ageTo) { var customers = repository.Query(new WithinAgeCustomerSpecification(ageFrom, ageTo)); return View("search", customers); } }
That little call to “new WithinAgeCustomerSpecification(..)” makes it virtually impossible to mock the specification and take it out from the test concern. Linq and Extension method in approach #2 and #3 certainly don’t help.
Why do we care to mock the specification? Because, mind you again, testing queries _IS_ painful! It’s tedious to setup stub-data and verify query result. Each of the specification has had this kind of unit-test themselves, and we certainly _DO_NOT_ want to repeat the test in the controller. For sake of discussion, this is how the unit-test for the controller would look like using unmocked Specification.
ShouldLoadSearchView(); CanLoadCustomerByEmailAddressToViewData(); CanLoadCustomerByNameToViewData(); CanLoadCustomerByAgeToViewData(); CanLoadCustomerByNameAndEmailAddressToViewData(); // etc etc
Each test-case deals with tedious in-memory/sqlite stub data. I don’t even understand why I need to care about data and Sqlite to unit-test UI/Application layer. It just doesn’t make sense.
And guess how the unit-test for the specification looks like.
CanSearchByEmailAddress(); CanSearchByName(); CanSearchByAge(); CanSearchByNameAndEmailAddress(); // etc etc
That’s right, duplication. Not to mention tediously data-driven. Generally, you want to avoid testing that involves data and query. For comparisson, this is how the test for the controller would look like with mocked specification.
ShouldLoadSearchView(); ShouldSearchCustomerUsingCorrectSpecification(); ShouldLoadSearchResultToViewData();
Yes, that’s all we care: “controller should use correct specification to search the customer”. We don’t care if the specification actually does what it claims it does. That’s for other developers to care.
Solution
By wrapping Specifications into a factory, we decouple the controller from Specification implementation.
public class CustomerController: Controller { IRepository repository; //Injected ICustomerSpecFactory whereCustomer; //Injected public ActionResult Search(int? ageFrom, int? ageTo) { var customers = repository.Query(whereCustomer.HasAgeBetween(ageFrom, ageTo)); return View("search", customers); } }
Unit-test is a breeze.
whereCustomer.Expect(x => x.HasAgeBetween(20, 30)) .Return(stubSpec = MockRepository.Generate<Specification<Customer>>); customerRepository.Expect(x => x.Query(stubSpec)).Return(stubCustomers); var view = customerController.Search(20, 30); // EXECUTE CONTROLLER ACTION Assert.That(view.Model, Is.EqualTo(stubCustomer);
EDIT: I posted a better way to write unit-test for this
I actually like it a lot! The specification is also amazingly flexible to mix and play. E.g.:
repository.Query((whereCustomer.HasAgeBetween(20, 30) || whereCustomer.LivesIn("Berlin")) && !whereCustomer.IsVIP());
And they’re still testable and mock-friendly. Now that we know I like this approach, let’s take a look on the implementation of specificatin pattern.
public class CustomerQuery: ICustomerQuery { public ISpecification<Customer> MathesUserSearchFilters(string email, string name, int? ageFrom, int? ageTo, bool isTerminated) { var result = Specification<Customer>.TRUE; if(email != null) result &= new Specification<Customer>(x => x.email.ToLower() == email.ToLower()); if(name != null) result &= new Specification<Customer>(x => x.Name.ToLower().Contains(name.ToLower()); if(ageFrom != null) result &= IsOlderThan(ageFrom.Value); if(ageTo != null) result &= !IsOlderThan(ageTo.Value); if(isTerminated != null) result &= (isTerminated)?IsTerminated():!IsTerminated(); return result; } public ISpecification<Customer> IsOlderThan(int yearOld) {/*..*/} public ISpecification<Customer> IsTerminated() {/*..*/} }
Unlike Linq criteria, Specification plays incredibly well with building dynamic query! And that’s not the best part yet. These Specifications are not mere DB queries. Write once, use it everywhere. It can be used for object-filtering or validation.
var terminatedCustomers = customerList.FindAll(whereCustomer.IsTerminated());
Or:
Validate(customer, !whereCustomer.IsTerminated()) .Message("The customer had been terminated. Please enter an active customer");
Code for sample Specification API can be found in Ritesh Rao’s post.
Where are we?
Oh yes, our initial objective: plugging in new screen/functionality in Open-Closed Principle fashion. Not a problem. The query can live in separate assembly, and it’s easy for the client to introduce their own set of Specifications that meets their querying needs for their plugins.
This approach also gives us the liberty to override the specification implementation, e.g. to comply with specific persistence technology, or database-structure. Say, if Customer.Name is implemented as FIRST_NAME and LAST_NAME columns. Overriding Specification implementation is not possible with “new” keyword or extension method approach, since the application is tightly coupled to specific Specification implementation.
This allows clients to extend the domain entity with their business-specific properties and persistence-structure.