Structuring Unit-Tests, My Way

Phil Haack wrote recently about how he structures his unit-tests (which he stole from NuGet.org’s Drew Miller). I thought I would respond with a post on how I structure my unit-tests. I found Phil’s post interesting because of its similarity to the way I usually structure my unit-tests, at least in spirit. Phil brought up some points there with which I strongly agree:

  1. One flat test-class per target-class is too convoluted. You will soon drown in the complexities of your tests.
  2. This giant test-class needs to be broken down into nested-classes.
  3. This structure keeps tests organized and grouped together, so you can collapse method-bodies (CTRL+M, CTRL+O), or use the object-browser, to quickly view the list of test hierarchies within your class.

Now the difference. For me, the main objective of unit-test structure is not just general tidiness, but mainly to address the repetitive nature of writing unit-tests, such as setting up slightly varying contexts between test-cases, while keeping the unit-test code readable and terse.

Phil groups his tests based on target methods. I.e. as opposed to one test-class per target-class, now we have one test-class per target-method. That definitely makes things cleaner. But I usually take it even further.

I think a method is still too big to be captured by a single test-class. One method can be used in a variety of different contexts and scenarios. Testing all these different contexts and scenarios within one single test-class often leads to tedious, repetitive setup code in the unit-test methods.

So instead of adopting “one test-class per target-method”, I structure my unit-tests in a “one test-class per scenario” manner. I’m so far very happy with this approach, not just because it improves readability, but mainly because it greatly promotes reusability (of test-code) and DRYness (as in Don’t Repeat Yourself).

Now, how do we define these test-classes? I divide my test-classes into 2 kinds:

  • GIVEN test class. This test-class defines some background context of your test-story. In short, a GIVEN test class usually has an Arrange() setup method.
  • WHEN test class. A WHEN class is an action. It usually has an Act() method, which is where you actually invoke your target-method.

So what are these Act() and Arrange() methods? Well, they’re both what you would usually call Setup() methods, but I break them into 2 different kinds. Why? Because I want to make sure all Arrange() methods run before all Act() methods. This semantic is consistent with the AAA (Arrange-Act-Assert) syntax, which many mock-frameworks follow. So this structuring feels very natural.
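To make the ordering rule concrete, here is a minimal sketch of how a runner could enforce it with plain reflection. It is not the custom runner mentioned in this post: the Arrange and Act annotations, the run helper, and the sample story classes are all hypothetical names invented for this example, and a real runner would go on to execute the @Test methods of the innermost class.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ArrangeActOrderSketch {
    // Hypothetical marker annotations, defined here only for this sketch.
    @Retention(RetentionPolicy.RUNTIME) @interface Arrange {}
    @Retention(RetentionPolicy.RUNTIME) @interface Act {}

    static final List<String> log = new ArrayList<>();

    // A tiny story: GIVEN > WHEN > GIVEN, each level logging when it runs.
    static class Given_a_context {
        @Arrange void arrange() { log.add("arrange:Given_a_context"); }

        class When_something {
            @Act void act() { log.add("act:When_something"); }

            class Given_more_context {
                @Arrange void arrange() { log.add("arrange:Given_more_context"); }
            }
        }
    }

    // Instantiates the chain of nested classes, outermost first.
    static List<Object> instantiateChain(Class<?> innermost) throws ReflectiveOperationException {
        Deque<Class<?>> classes = new ArrayDeque<>();
        for (Class<?> c = innermost; c != null; c = c.getDeclaringClass()) {
            classes.addFirst(c);
            if (Modifier.isStatic(c.getModifiers())) break; // static nested class: no outer instance needed
        }
        List<Object> chain = new ArrayList<>();
        Object outer = null;
        for (Class<?> c : classes) {
            // An inner class constructor takes its enclosing instance as a hidden first argument.
            Constructor<?> ctor = (outer == null)
                    ? c.getDeclaredConstructor()
                    : c.getDeclaredConstructor(outer.getClass());
            ctor.setAccessible(true);
            outer = (outer == null) ? ctor.newInstance() : ctor.newInstance(outer);
            chain.add(outer);
        }
        return chain;
    }

    // Invokes every method carrying the marker, walking from outermost to innermost instance.
    static void invokeAll(List<Object> chain, Class<? extends Annotation> marker)
            throws ReflectiveOperationException {
        for (Object instance : chain) {
            for (Method m : instance.getClass().getDeclaredMethods()) {
                if (m.isAnnotationPresent(marker)) {
                    m.setAccessible(true);
                    m.invoke(instance);
                }
            }
        }
    }

    static List<String> run(Class<?> innermost) {
        log.clear();
        try {
            List<Object> chain = instantiateChain(innermost);
            invokeAll(chain, Arrange.class); // every level's Arrange first...
            invokeAll(chain, Act.class);     // ...then every level's Act
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(run(Given_a_context.When_something.Given_more_context.class));
    }
}
```

Running main prints [arrange:Given_a_context, arrange:Given_more_context, act:When_something]: the inner GIVEN is arranged before the outer WHEN acts, even though the WHEN class sits above it in the hierarchy.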

Ok let’s cut to the chase now. In our first example, we write some unit-tests where one context (GIVEN) is followed by different actions (WHENs).

GIVEN an_empty_shopping_cart:

  • should_have_no_item
  • WHEN added_12_productA:
    • should_only_have_1_item
    • the_item_should_be_of_productA
    • item_quantity_should_be_12
    • WHEN added_5_productB:
      • should_now_have_2_items
      • first_item_remain_intact
      • second_item_should_be_of_productB
      • productB_quantity_should_be_5
      • WHEN cleared:
        • should_now_have_no_item
      • WHEN set_productB_quantity_to_0:
        • should_now_only_have_productA_left
    • WHEN added_5_more_productA:
      • should_still_have_1_item
      • item_quantity_should_now_be_17

(* This tree hierarchy is how the tests will actually look on NUnit/Resharper test runner)

Remember, a WHEN contains an Act() method. In this example, you’re basically just following one action after another. This is the case where you have a single context in which you perform a chain of different actions, with each step having its own set of tests (assertions).

The code looks like the following. (I use Java because it has a really nice feature called instance-scoped nested-class, which C# does not have. This feature means that a nested-class instance has access to the instance of its outer-class.)

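public class Given_an_empty_shopping_cart{
   @Mock Customer customer;
   @Mock Product productA;   // Product type is assumed; stubbed in the act() methods below
   @Mock Product productB;
   ShoppingCart cart;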

   @Arrange void arrange(){
      cart = new ShoppingCart(customer);
   }

   @Test void should_have_no_item(){
      assertTrue(cart.isEmpty());
   }

   public class When_added_12_productA{
      @Act void act(){
         // code to setup stub for productA
         cart.add(productA, 12);
      }

      @Test void should_only_have_1_item(){
         assertEquals(1, cart.getItems().size());
      }
      @Test void the_item_should_be_of_productA(){
         // assert code
      }
      @Test void item_quantity_should_be_12(){
         // assert code
      }

      public class When_added_5_productB{
         @Act void act(){
            // code to setup stub for productB
            cart.add(productB, 5);
         }

         @Test void should_have_2_items(){
            // assert code
         }
         @Test void first_item_should_remain_intact(){
            the_item_should_be_of_productA();
            item_quantity_should_be_12();
         }
         @Test void second_item_should_be_of_productB(){
            // assert code
         }
         @Test void productB_quantity_should_be_5(){
            // assert code
         }

         public class When_cleared{
            @Act void act(){
               cart.clear();
            }

            @Test void should_have_no_item(){
               assertTrue(cart.isEmpty());
            }
         }

         public class When_set_productB_quantity_to_0{
            @Act void act(){
               cart.setQuantity(productB, 0);
            }

            @Test void should_now_only_have_productA_left(){
               should_only_have_1_item();
               the_item_should_be_of_productA();
               item_quantity_should_be_12();
            }
         }
      }

      public class When_added_5_more_productA{
         @Act void act(){
            cart.add(productA, 5);
         }
         @Test void should_still_have_1_item(){
            assertEquals(1, cart.getItems().size());
         }
         @Test void item_quantity_should_now_be_17(){
            // assert code
         }
      }
   }
}

(Yep, sorry if that was long. I typed it all in just in case you’re curious what the code looks like.)
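The reason every nested class above can read and modify cart is Java's inner-class mechanism, which is easier to see in isolation. A minimal sketch, assuming a made-up Cart class standing in for ShoppingCart and calling act() by hand instead of through a runner:

```java
import java.util.ArrayList;
import java.util.List;

class InnerClassAccessSketch {
    // Hypothetical stand-in for the ShoppingCart used in the post.
    static class Cart {
        private final List<String> items = new ArrayList<>();
        void add(String item) { items.add(item); }
        int size() { return items.size(); }
    }

    static class Given_an_empty_cart {
        final Cart cart = new Cart();

        // Non-static nested (inner) class: each instance is bound to one outer instance.
        class When_added_an_item {
            void act() { cart.add("productA"); }     // uses the outer instance's field directly
            int itemCount() { return cart.size(); }
        }
    }

    // Returns the item count seen by the inner object after it has acted on the outer one.
    static int demo() {
        Given_an_empty_cart given = new Given_an_empty_cart();
        Given_an_empty_cart.When_added_an_item when = given.new When_added_an_item();
        when.act();
        return when.itemCount();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 1
    }
}
```

The `given.new When_added_an_item()` expression is what ties the inner object to one particular outer object. A test runner for this style has to do the equivalent for each level of nesting.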

So by grouping each test-scenario into its own story class, you promote explicitness and DRYness. We write the context and the action only once, and use them all the way down the hierarchy. You may also notice I’m reusing my unit-tests inside first_item_should_remain_intact() and should_now_only_have_productA_left(). (Additionally, in practice I also leverage a lot of inheritance to reuse a set of unit-tests across multiple contexts.)

And this is what the actual source-code looks like in the editor.

(Screenshot: Code Outline View)

So that was an example of writing multiple WHENs to create a chain of different actions.

There is also the reverse. You can write your tests where you have one action (WHEN) applied to different contexts (GIVENs):

(Continuing on the shopping-cart example)

  • WHEN estimating_local_shipping_fee_at_50cents_per_kg:  // -> when the action happens!
    • GIVEN product_is_3kgs:
      • GIVEN customer_is_standard_member:
        • shipping_should_be_18bucks
      • GIVEN customer_is_premium_member:
        • should_be_free_shipping
    • GIVEN product_is_heavier_than_4kgs:
      • GIVEN customer_is_standard_member:
        • should_only_charge_for_4kgs__ie_24bucks
      • GIVEN customer_is_premium_member:
        • should_charge_a_flat_1dollar_surcharge

(* This tree hierarchy is how the tests will actually look on NUnit/Resharper test runner)

In this example, you are performing one single action (estimating local shipping charge of the shopping-cart), but the expectation of this one single operation may vary, depending on the contexts. (E.g. the weight of the products in the cart, and the type of the customer).

/* continued from previous code example */
public class When_added_12_productA{
   @Act void act(){
      // 12 units, matching the class name and the "12 items" arithmetic below
      cart.add(productA, 12);
   }

   public class When_estimating_local_shipping_fee_at_50cents_per_kg{
      Money cost;
      @Act void act(){
         when(product.getShippingRate())
            .thenReturn(ShippingRate.perKg(Money.local(0.50)));

         cart.setDeliveryAddress(somewhereLocal);
         cost = cart.calculateShippingCost();
      }

      public class Given_product_is_3kgs{
         @Arrange void setup(){
            when(product.getWeight()).thenReturn(Weight.kg(3));
         }

         public class Given_customer_is_standard_member{
            @Arrange void setup(){
               when(customer.isPremium()).thenReturn(false);
            }

            @Test void shipping_should_be_18bucks(){
               // 12items x 3kgs x 50c
               assertEquals(Money.local(12 * 3 * 0.50), cost);
            }
         }
         public class Given_customer_is_premium_member{
            @Arrange void setup(){
               when(customer.isPremium()).thenReturn(true);
            }

            @Test void should_be_free_shipping(){
               assertEquals(Money.zero(), cost);
            }
         }
      }

      public class Given_product_is_heavier_than_4kgs{
         @Arrange void setup(){
            when(product.getWeight()).thenReturn(Weight.kg(10));
         }

         public class Given_customer_is_standard_member{
            @Arrange void setup(){
               when(customer.isPremium()).thenReturn(false);
            }

            @Test void should_only_charge_for_4kgs__ie_24bucks(){
               assertEquals(Money.local(12 * 4 * 0.50), cost);
            }
         }

         public class Given_customer_is_premium_member{
            @Arrange void setup(){
               when(customer.isPremium()).thenReturn(true);
            }

            @Test void should_charge_a_flat_1dollar_surcharge(){
               assertEquals(Money.local(1), cost);
            }
         }
      }
   }
}

(Screenshot: Unit-Test Hierarchy in Eclipse, Outline View)

This is where the distinction between @Act and @Arrange comes in handy. In this case, we are able to define the background story (using @Arrange methods) before the @Act is performed (i.e. invoking shipping calculation). By grouping the tests by these different contexts, we promote DRYness. We define our actions once (in WHEN classes), and reuse them all the way down through multiple different contexts (by applying GIVENs through the hierarchy).
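For reference, the four expectations in this example imply a pricing rule roughly like the one sketched here. This is only a guess reconstructed from the assertions (12 items, 50c per kg, a 4kg cap per item, a flat $1 surcharge); the real calculateShippingCost() is not shown in this post, and the estimateCents name, its parameters, and the treatment of exactly 4kg as within the cap are assumptions.

```java
class ShippingRuleSketch {
    static final int WEIGHT_CAP_KG = 4;               // per-item cap implied by "only charge for 4kgs"
    static final long PREMIUM_SURCHARGE_CENTS = 100;  // flat surcharge for premium members over the cap

    /** Estimated local shipping cost in whole cents, to avoid floating-point rounding. */
    static long estimateCents(int quantity, int weightKgPerItem, long rateCentsPerKg, boolean premiumMember) {
        boolean overCap = weightKgPerItem > WEIGHT_CAP_KG;
        if (premiumMember) {
            // Premium members ship free, unless the item is over the cap.
            return overCap ? PREMIUM_SURCHARGE_CENTS : 0;
        }
        // Standard members pay per kg, but never for more than the cap.
        int chargeableKg = Math.min(weightKgPerItem, WEIGHT_CAP_KG);
        return quantity * chargeableKg * rateCentsPerKg;
    }

    public static void main(String[] args) {
        System.out.println(estimateCents(12, 3, 50, false));  // 1800 -> shipping_should_be_18bucks
        System.out.println(estimateCents(12, 3, 50, true));   // 0    -> should_be_free_shipping
        System.out.println(estimateCents(12, 10, 50, false)); // 2400 -> should_only_charge_for_4kgs__ie_24bucks
        System.out.println(estimateCents(12, 10, 50, true));  // 100  -> should_charge_a_flat_1dollar_surcharge
    }
}
```

Each leaf of the GIVEN tree corresponds to one combination of the weightKgPerItem and premiumMember arguments, which is why the nested contexts cover the rule without repeating the action.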

Also, notice that we have a couple of repetitive classes: Given_customer_is_standard_member and Given_customer_is_premium_member. These are good candidates for an extract-superclass refactoring to further sanitize the test-code, especially if they’re used in many contexts.
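One possible shape for that refactoring is sketched below, assuming a hand-rolled CustomerStub instead of the Mockito mock and plain arrange() calls instead of the @Arrange annotation. The StandardMemberContext and PremiumMemberContext names are invented for this example, and whether a given runner picks up setup methods inherited this way is also an assumption.

```java
class ExtractSuperclassSketch {
    // Hand-rolled stand-in for the mocked Customer (hypothetical).
    static class CustomerStub {
        boolean premium;
    }

    static class When_estimating_shipping {
        final CustomerStub customer = new CustomerStub();

        // Extracted superclasses: each membership context is set up in exactly one place.
        abstract class StandardMemberContext {
            void arrange() { customer.premium = false; }
        }
        abstract class PremiumMemberContext {
            void arrange() { customer.premium = true; }
        }

        class Given_product_is_3kgs {
            // Only the tests specific to this weight would live in these classes.
            class Given_customer_is_standard_member extends StandardMemberContext {}
            class Given_customer_is_premium_member extends PremiumMemberContext {}
        }

        class Given_product_is_heavier_than_4kgs {
            class Given_customer_is_standard_member extends StandardMemberContext {}
            class Given_customer_is_premium_member extends PremiumMemberContext {}
        }
    }

    // Arranges the heavy + premium context and reports what the customer stub now says.
    static boolean premiumAfterArrange() {
        When_estimating_shipping when = new When_estimating_shipping();
        When_estimating_shipping.Given_product_is_heavier_than_4kgs heavy =
                when.new Given_product_is_heavier_than_4kgs();
        heavy.new Given_customer_is_premium_member().arrange();
        return when.customer.premium;
    }

    public static void main(String[] args) {
        System.out.println(premiumAfterArrange()); // true
    }
}
```

The subclasses keep their descriptive names in the test tree, while the stubbing itself is written once per membership type.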

We have seen tests with multiple WHENs, and ones with multiple GIVENs. You can mix and match them in various combinations. They will feel very natural, helping eliminate brittle/fragile unit-test code. And they will look really nice on your test-runner too 😉

Source Code

To run the test example, you will need a custom JUnitTestRunner that I have written, which is a pretty small class. I will make the source-code available somewhere on GitHub, and update this post. So watch this space. Also, the code example here uses Java (because of its instance-scoped nested-class), but the pattern can also be achieved using lambda syntax, for example with MSpec.

Summary

Flat unit-test structure can get really convoluted. Grouping your unit-tests into nested-classes can improve readability. Phil showed a great example of one way to group these tests. However, the test itself is still written using the conventional pattern: you write each test-method with the whole context-action-assertion code (arrange-act-assert). This leads to massively repetitive code.

By breaking arrange and act into separate GIVEN and WHEN classes, this code only needs to be written once, and is reused throughout your class hierarchy. It greatly promotes DRYness, and avoids fragile unit-tests.

This is a technique I borrow from BDD. Related post by me from a couple years back: TDD, BDD Done Right.

4 thoughts on “Structuring Unit-Tests, My Way”

  1. I really like this idea. While I don’t terribly care about how the tests are structured (so long as they’re there, a shortcut lets you find any method name pretty easily), I do care about constantly repeating setup. We have this problem with the tests in the project I’m working on. Every scenario needs a slightly different setup, and we’ve basically ended up having an Arrange, Act, and Assert in every single test. It’s annoying.

    We’re in C# using NUnit. I don’t think I can exactly implement your idea because the nested classes won’t have access to parent class properties (which is why you said you’re in Java-land for this). Any ideas how I can structure our tests to be more about testing each setup instead of testing each method?

    1. Sorry Louis, I somehow forgot to write a reply.
      I agree. Repetition has always been the primary issue I had when writing every test scenario as one flat method. Test scenarios are simply too huge to fit into each method, and inevitably lead to repetitive setup code.
      Regarding the structure itself, I actually find it very useful. Not so much to navigate through the tests, but more to keep my test-names lean and relevant. I.e., you’re not ending up with dozens of tests with repetitive-sounding names like “Given_foo_that_is_blah_and_blah_when_blah_done_kazaam_then_bar_should_be_yada_yada()”, with all their slightly different variances. Hierarchical structure keeps them concise and non-repetitive.

      You can achieve similar structure in C# (NUnit) by using static scope, which is the trick I’ve been using myself. I.e. you declare your unit-test variables in static fields, and your Arrange/Act methods as static methods. These static fields and methods are implicitly accessible from their nested-classes. Apart from that, the rest is pretty much identical to the Java code.

      PS: Another trick that has been used (by Scott Bellware with his SpecUnit) is inheritance (i.e. having nested-tests inherit manually from their outer-tests). I used to adopt this pattern, but it feels noisy and gets annoying pretty quickly. I now personally prefer the static-scoped approach, for its cleaner Java-like feel with the least possible friction.

  2. Great post Hendry.

    I would like to follow this approach whenever there’s a need to group tests.

    Q) I see that you said “To run the test example, you will need a custom JUnitTestRunner than I have written, which is a pretty small class.” Could you upload the same?

    Q2) Just as shown in the example, I would like to keep the common setUp() in the parent class and make inner classes extend the outer. However, part of me warns me about the anti-pattern: “Inheritance for code reuse”. Let me know what you think.

  3. I pretty much like the idea. We are thinking of moving towards this approach, grouping unit tests based on context, as we have a cleaner way of doing that in Ruby.
    Did you push the JUnit runner to Git somewhere?
