SheepAop Code Review with NDepend (Part I)

Now that SheepAop has reached its first milestone, I think it's time to stop for a moment and review the code to discover potential code-smells before I engage in any further development.
I'll use NDepend for today's code-review exercise, and I'll be doing it live: reviewing the code, refactoring, and writing down every step in this post as I go along.

Dependency Graph

clip_image002[4]

The very first thing I did was to check the dependency graph to get an overall understanding of the structure of the code. To my surprise, the dependencies did not immediately match the picture I had in my head.

A large part of the graph still fits very well with what I expected:

  1. There are some helper classes and exceptions down at the bottom, containing general-purpose classes that streamline our contact with the outside world (.NET collections, Mono.Cecil, etc.). These helpers are common and should be completely agnostic of anything SheepAop, which the graph confirms: there is no line coming out of these modules.
  2. SheepAop.Aspects contains a set of out-of-the-box sample implementations of SheepAop aspects. They're not part of the framework, so I don't expect anything in SheepAop to reference them. The graph agrees.
  3. SAQL is a DSL layer on top of the SheepAop API. The core machinery of SheepAop should be completely agnostic of SAQL. The graph looks fine: all Saql namespaces appear to be left alone by the rest of the town.
  4. SheepAop is designed so that making new Advice implementations is easy, because Advice is the one part of SheepAop that will change (and be added to) the most. For that reason, all advice classes should be external to the core so that it's easy to plug in new ones. The graph seems to agree: there are currently only 2 advices in SheepAop, AroundAdvice and LifecycleAdvice, and the graph shows no arrow pointing in their direction. (Except for the unexpected arrow from Pointcuts to AroundAdvising, which we'll explore below.) A rough sketch of what "pluggable" means here follows this list.
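
What follows is purely an illustrative sketch, not SheepAop's real API: only the type names IAdvice, AdviceBase and AroundAdvice come from this post, every member signature is invented, and the namespaces reflect where these types end up after the refactoring later in the post. The point is simply that the core only ever sees the advice abstraction, so a new advice type can be plugged in without touching the core.

    // Purely illustrative; member signatures are invented for this sketch.
    namespace SheepAop.Core
    {
        public interface IAdvice
        {
            void Apply(object joinPoint); // hypothetical signature
        }

        public abstract class AdviceBase : IAdvice
        {
            public abstract void Apply(object joinPoint);
        }
    }

    namespace SheepAop.AroundAdvising
    {
        // Lives outside the core; adding it requires no change to SheepAop.Core.
        public class AroundAdvice : SheepAop.Core.AdviceBase
        {
            public override void Apply(object joinPoint)
            {
                // run custom behaviour around the intercepted join-point
            }
        }
    }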

Potential Problems

There are, however, a few things I was not entirely pleased with. I'll list them all as notes for now, even though I'm not sure whether they will turn out to be code-smells. We will investigate each of them later on.

  1. I was surprised to see the SheepAop.Compile namespace assume such a central role in the whole architecture. Its responsibility is supposed to be limited to the compilation part of SheepAop. It should depend on other SheepAop components to do this task, but conversely the other components should NOT have much interest in the compilation business. The graph, however, shows that almost every part of SheepAop depends on the compiler, which doesn't seem right.
  2. As mentioned above (in point #4), SheepAop Advice classes should be pluggable modules, so that adding new ones is cheap and easy. I'm quite curious why the SheepAop.Pointcuts namespace has any hard reference to AroundAdvice.
  3. I also did not expect to see SheepAop.Core depend on the Runtime component and Attributes. Attributes is supposed to provide users with a friendly interface for interacting with the core SheepAop API. What could the attributes possibly be doing that makes them so critical that even the core engine has to depend on them?
  4. Like Advices, Lifecycles need to be pluggable so that it's easy to add new ones (I'm still developing some as we speak), and so that users can make their own custom Lifecycles and plug them into SheepAop. I was hoping not to see any hard reference from the Core to Lifecycles if I could help it. We'll see what we can do.
  5. (After the first refactoring below) We have a cyclic dependency between Core <-> Pointcut. We'll investigate this. Probably Pointcut should really be part of the Core, considering how central a role it plays in SheepAop.

Investigation

Let’s now investigate each of them. I’ll start from the first one.

1. Everything -> SheepAop.Compiler

There are currently several dependencies toward the compiler that we want to get rid of. I'll start with the most worrying one: the circular dependency between the Compiler and the Core.

clip_image003[4]

The ones I need to look out for are the references INTO the Compiler (greens and blacks). The answer seems quite obvious here: AdviceBase and AspectDefinition are defined as Compiler components, but they shouldn't be; they have little to do with compilation. Let's try moving these classes into the Core namespace. If they really are part of the Compiler, even more dependencies toward the Compiler will appear, coming from these classes. I'll also need to bring IAdvice and IWeaver along since they are depended upon. So let's try this refactoring.

clip_image005[4]

Great! The dependencies are now all heading in one direction, and all the unit-tests still pass, so this one is sorted. Let's take a look at our next dependency on the Compiler.

Wait, they’re all gone!

clip_image006[4]

No more circular dependency on the Compile components (orange). It turns out that those 4 classes we just refactored were the culprits behind all the heavy dependencies on the Compiler. As shown by the green blocks, the only components that now depend on our Compiler are our 2 Advices: AroundAdvice and LifecycleAdvice. That sounds perfectly correct, and the picture is starting to take shape.

2. Pointcuts -> AroundAdvising

We want all Advices (AroundAdvice in this case) to be decoupled from the framework, because Advice is the one part of SheepAop that will change (and be added to) the most, so it has to be easy to plug in new Advices.

Same deal: I click on the line between Pointcuts and AroundAdvising to view the matrix. (It's quite a tall picture, so I've cut out the irrelevant bits.)

clip_image008

The green squares are the ones that shouldn’t be there. I wonder what they are. Let’s check the first one: PropertyPointcut.

clip_image010

That's a method I found in the class: a leftover I inherited from the old version of the SheepAop infrastructure (when Advice was still tightly coupled to the system). I don't think it contributes anything to the new SheepAop infrastructure. I checked the other green squares and they all seem to contain the same leftover garbage. Let's get rid of them and run the unit-tests to make sure nothing breaks.

clip_image011

Great, all the greens are cleared. All unit-tests still pass.

clip_image013

Now AroundAdvice is completely decoupled from SheepAop. The graph (centered on AroundAdvising) shows no green, meaning that nothing in the whole of SheepAop depends on it (except our public Attributes). So this has been another success.

3.A SheepAop.Core <-> SheepAop.Runtime

There are several components that SheepAop.Core depends on when it shouldn't. Let's start with the Runtime namespace: the SheepAop Core should not depend on its Runtime components.

clip_image015

The green ones should not be there. They all seem to point to 2 method delegates (AdviceInvoker and AdviceCallback at the top), all of them coming from Joinpoint classes. That makes me feel those 2 delegates are an integral part of the SheepAop Joinpoint system; maybe I should put them together with the Joinpoints. In fact, I'm starting to feel that these objects warrant their own separate namespace ("Joinpoint", instead of piggybacking on the "Core" namespace). But first things first.
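
Just to illustrate (the signatures here are guesses; only the names AdviceInvoker, AdviceCallback and the Joinpoint concept come from this post), the two delegates are exactly the kind of thing that naturally sits next to the join-point type they revolve around:

    // Illustrative sketch only -- the real SheepAop signatures may differ.
    public delegate object AdviceCallback();                    // proceed with the intercepted call
    public delegate object AdviceInvoker(JoinPoint joinPoint);  // invoke an advice for a given join-point

    // Runtime snapshot of an intercepted call (placeholder for the real Joinpoint classes).
    public class JoinPoint
    {
    }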

clip_image017

Great. Now let's split all those Joinpoint classes into a separate namespace, and hope that the resulting namespaces won't end up in a cyclic dependency.

clip_image019

Wow, this is unexpected. Not only did Core and Joinpoint not end up in a cyclic dependency, there is in fact no dependency between them at all! They are completely unrelated, so splitting them into separate namespaces really was a good decision. But now it makes me wonder about the name: what's with "SheepAop.Core.Joinpoint" if it's not related to the Core? Since Runtime is the biggest dependent of this namespace, I think I should merge it into Runtime. After all, a join-point is SheepAop's representation model for describing runtime snapshots. Ironically, it turns out that the 2 delegates we pulled out of the Runtime namespace were in the right place after all.

clip_image021

OK, we’re good. Let’s move on.

3.B SheepAop.Core <-> SheepAop.Attributes

As explained at the start of this post, Attributes is a facade API for users to interact with the SheepAop framework, and it should sit right at the edge, together with SAQL. So why does the Core depend on it?

image
Aha, a black one. SingletonAttribute extends LifecycleAttributeBase, but LifecycleAttributeBase has a static CurrentInstance that defaults to SingletonAttribute. A cycle.

image

Generally I have no problem with this kind of cyclic dependency, because it does not indicate a code smell. But in this particular case, LifecycleAttributeBase needs to be refactored into an ILifecycleProvider interface anyway, which will fit better as a Core component. In general, attributes are intended to be used as the user API; it doesn't feel right to see one as part of the Core engine. So yeah, I'll refactor this, and hope it also solves the circular dependency.
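
To make the cycle (and the planned fix) concrete, here is a rough sketch; only the names SingletonAttribute, LifecycleAttributeBase, CurrentInstance and ILifecycleProvider come from this post, and the member shapes are assumptions:

    // Before: the base attribute knows about its own derived class, so we get
    // base -> derived -> base, i.e. the black square in the matrix.
    public abstract class LifecycleAttributeBase : System.Attribute
    {
        public static LifecycleAttributeBase CurrentInstance = new SingletonAttribute();
    }

    public class SingletonAttribute : LifecycleAttributeBase
    {
    }

    // After: the Core depends only on an abstraction it owns, and the attributes
    // (the user-facing API) implement it from the outside.
    public interface ILifecycleProvider
    {
        object GetAspectInstance(System.Type aspectType); // hypothetical member
    }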

image

It does. That was quite a big refactoring, but all the unit-tests still pass, so we're good. We're in a much better state after this refactoring.
As shown by the green blocks, the only component that knows anything about SheepAop Attributes (orange) is now the Compiler, which sounds exactly right. (SheepAop.Aspects, as mentioned earlier, is just a sample implementation of SheepAop, so just ignore it; it's not part of the framework.)

Notice that this refactoring has also cleared the cyclic dependency between the Core and Lifecycle, so we inadvertently solved the first half of task #4.

4.A SheepAop.Core.Lifecycle <-> SheepAop.Core

(Done)

4.B SheepAop.Lifecycle <-> SheepAop.Runtime

To reiterate, we want Lifecycle to be decoupled from the framework so that it's easy to introduce new ones (some are being developed as we speak), and so that users can add their own custom ones.

image

Ah, this one is easy. IMayHaveAspect is a very specific spare part of the PerThis lifecycle, so it really belongs in the Lifecycle namespace. Let's move it.

image

Ah, that looks much better now.

5. SheepAop.Core <-> SheepAop.Pointcuts

I have a strong feeling that Pointcut should be merged into the Core. But let’s see.

image

Ah, it looks like the Core only knows Pointcut through the IPointcut interface and PointcutBase. I think I can move IPointcut into the Core, and refactor the reference to PointcutBase to use the interface instead. (It's generally a good idea to program against interfaces anyway.)
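
Roughly, the change looks like this (a sketch; IPointcut, PointcutBase and AspectDefinition are real type names from this post, but the members are invented):

    namespace SheepAop.Core
    {
        public interface IPointcut
        {
            bool Matches(object target); // hypothetical signature
        }

        public class AspectDefinition
        {
            // Before: public PointcutBase Pointcut;  -- a hard Core -> Pointcuts reference
            // After:  the Core only talks to the abstraction it now owns.
            public IPointcut Pointcut { get; set; }
        }
    }

    namespace SheepAop.Pointcuts
    {
        public abstract class PointcutBase : SheepAop.Core.IPointcut
        {
            public abstract bool Matches(object target);
        }
    }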

image

Sweet. The cyclic traffic is gone, and the dependency becomes unidirectional: SheepAop.Pointcut –> SheepAop.Core, which feels quite right. So this is our final result:

image

There is no longer any direct cyclic dependency in our code, but I can't convincingly tell whether there are any indirect circular dependencies in the graph: the graph is messy.
Now that the relationships between the components have changed substantially, I think we should rearrange the graph to better reflect the new shape of our architecture. (I'll also hide helper classes and exceptions from the graph.)

image

The modules are now arranged so that all dependencies flow from top to bottom, and the structure of the architecture is now clearly visible in the graph. As you can see, all arrows point downwards. In simple terms: there is no possible circular dependency between our modules, direct or indirect.

Next Step

I think that's it for today; I'm calling it a night. We have achieved quite a good result so far. Tomorrow (or whenever I get time), I will continue the review with NDepend metrics and CQL code-inspections. I can already see some warnings on my NDepend panels indicating problems detected by the NDepend code-inspector, but I'm too tired for the day, so I'll save that for the next post.

All refactored code has been checked back in to the SheepAop repository. You can check it out if you want to explore the result of today’s refactoring.


NDepend for Continuous Quality

Full Disclosure: Patrick Smacchia (lead dev of NDepend) offered me an NDepend Professional license so that I could use it and blog about it. I gladly accepted. Now I'm going to tell you how much it rocks, but not because I have to: I had been interested in NDepend for quite some time. It's a great tool for promoting good application code and architecture, something I'm quite fond of, and one I would have been using anyway if the project I'm working on were on .NET.

What is NDepend?

In case you haven't been within reach of the geeky hype signal recently and have never heard of NDepend, it is a tool that reads your source code and statically analyses whether the code meets certain design guidelines and quality standards. "Statically" means it examines your code without actually running it. NDepend is often used in the code-review process during development to easily identify spots in the code that could be improved.

If that sounds like FxCop to you, yes they share the same goal in principle, but they are different in 2 ways:

  1. FxCop works on fine-grained details: it inspects each instruction or coding construct and checks how it complies with the framework design guidelines. NDepend works more from a high-level application-architecture viewpoint. It visualizes the code in terms of dependencies, complexity, and other design metrics as quantifiable numbers, giving you a big-picture idea of the state of the code and helping you spot smelly areas. You can then define thresholds for an acceptable level of code quality. Further down this post, this difference will become crystal clear.
  2. FxCop gives you a set of rules you can use, but there is no easy way to express your own: adding a new rule involves writing a new add-in to be plugged into FxCop. The primary strength of NDepend is its CQL (Code Query Language), a DSL that allows you to express your own custom rules and evaluate them on the fly as you type.

Using NDepend

NDepend makes me wish my current project at work were written in C#. It is in fact written in Java and, incidentally, developed using the big-ball-of-mud approach; the code has degraded into one of the worst codebases I have worked on. It would have been the perfect place to use NDepend and refactor the big mess out loud, but alas, NDepend doesn't deal with Java. The good news is that NDepend has just announced XDepend, an NDepend for Java. But that's for another post.

For our purposes, I've picked the codebase of an open-source .NET project I am quite familiar with: CodeCampServer, a web application developed using Domain-Driven Design and Test-Driven Development on the MVC platform. It closely matches my idea of good application architecture.

The UI of Visual NDepend feels very much like the Visual Studio environment.

image  image

Getting NDepend to analyse assemblies is incredibly easy! Literally a couple of intuitive clicks and, before I know it, I am presented with a fantastic amount of detail about the project.

Dependency Graph

The dependency graph is the best way to understand the architecture of the code.

image

You can navigate deeper to get a more intimate understanding of the code. Since CodeCampServer is built using Domain-Driven Design, the domain-model layer is most likely to be our primary interest in understanding the application, particularly since it was developed by other people.

DependencyGraphSnapshot

What we are looking at is a direct X-ray scan of the CodeCampServer.Model.Domain namespace, and it says one thing: the code is impressive! It doesn't take much to notice that CodeCampServer is built on solid Domain-Driven Design: the code almost reads like a business document. None of the classes within this namespace has any reference to geeky technical details or infrastructure components. Indeed, the dependency graph of this code directly reflects the business models and their associations, which is readily consumable not only by developers but also by business people. A big A+ for CodeCampServer.

The dependency graph can also be presented in the form of a matrix.

image

Three rules for reading this matrix:

  • Green: the number of methods in the namespace on the left header that directly use members of the namespace on the top header
  • Blue: the number of methods in the namespace on the left header that are directly used by members of the namespace on the top header
  • Black: the number of circular direct dependencies between the namespaces on the left and top headers

As such, the following matrix cell generates this English description: 42 methods of the assembly CodeCampServer.Website are using 61 methods of the assembly CodeCampServer.Model.

image image

So, back to the previous complete matrix: it shows that the CodeCampServer.Model.Domain namespace does not reference any other namespace (all horizontal boxes are blue). To put it another way, the domain-model is the core of this application (see Onion Architecture). All very DDD-ish so far.

What I find interesting is that the data-access namespace (CodeCampServer.DataAccess) makes *zero* direct communication with CodeCampServer.Model.Domain. It means that none of the repositories communicates directly with the domain entities at all. This is impressive! It might seem counter-intuitive at first: how can those repositories create instances of domain entities from the DB, and save the state of domain entities to the DB, without ever touching any method, property, or constructor of a domain entity?

CodeCampServer uses NHibernate as its ORM, which completely decouples data-access concerns from the application code. Repositories in the data-access layer simply delegate all CRUD operations on domain entities to NHibernate, which handles the creation and persistence of the entities. The repository itself makes zero direct communication with the domain entities.
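
To make that concrete, here is a minimal sketch (not CodeCampServer's actual code; the entity and member shapes are invented) of a repository that delegates everything to NHibernate's ISession and never touches a single member of the entity:

    using System;
    using NHibernate;

    // Stand-in for a domain entity; the repository below never calls any of its members.
    public class Conference
    {
    }

    public class ConferenceRepository
    {
        private readonly ISession _session; // NHibernate session, typically injected

        public ConferenceRepository(ISession session)
        {
            _session = session;
        }

        public Conference GetById(Guid id)
        {
            // NHibernate materialises the entity through its mappings; the repository
            // only refers to the type, not to its methods, properties or constructors.
            return _session.Get<Conference>(id);
        }

        public void Save(Conference conference)
        {
            // Persistence is delegated wholesale to NHibernate.
            _session.SaveOrUpdate(conference);
        }
    }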

Overall, CodeCampServer has a very good dependency matrix, but you will be surprised how many unexpected smells you can find in your own application. You might find that many things are not written the way you architected them. If one of your developers takes a shortcut and, say, calls security-checking from a domain entity, you will notice an unexpected reference in the dependency matrix (Domain->Security). You can even set up an automatic alert when that happens, e.g. when the domain layer has any reference to another namespace. More on that later.

Circular Dependency

This is what we normally need to be alerted about. CodeCampServer.Website.Impl has a circular reference with CodeCampServer.Website. A reference from CodeCampServer.Website into a member of a more specific namespace like CodeCampServer.Website.Impl signals some kind of smell.

You can navigate into the detail of the reference by clicking on the box. It turns out that CodeCampServer.Website.Globals makes a reference to CodeCampServer.Website.Impl.RouteConfigurator.

image image

The reference is apparently caused by the Windsor IoC configuration in the Globals.RegisterMvcTypes() method.
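
The actual code is in the screenshot below, but the essence is a type-only registration; roughly something like this hypothetical Castle Windsor sketch (not CodeCampServer's real code):

    using Castle.MicroKernel.Registration;
    using Castle.Windsor;

    // Stand-in for CodeCampServer.Website.Impl.RouteConfigurator.
    public class RouteConfigurator
    {
    }

    public static class ContainerSetup
    {
        public static void RegisterMvcTypes(IWindsorContainer container)
        {
            // Only the *type* RouteConfigurator is mentioned here; no method on it
            // is ever called, which is why NDepend reports 0 on the reference.
            container.Register(Component.For<RouteConfigurator>());
        }
    }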

image

Since Globals only references RouteConfigurator by its class, and does not communicate with it in any way, NDepend indicates 0 on the reference. These 2 components are entangled intentionally, so this analysis concludes there is no design problem. However, this is a good example of using NDepend's feedback to discover potential design smells in many parts of an application. For example, look at the following dependency matrix for the Spring.NET code:

image

The black areas indicate that those namespaces are tightly entangled together. Tight coupling means that all 6 namespaces can only be understood together, which adds complexity to understanding and maintaining the code. Of course, tight coupling also means that they can only be used together; i.e. to use the dependency-injection capability of Spring.Core, you will need to carry Spring.Validation along as baggage.

Metric

NDepend presents overview information about our code in various metrics. For example, here is the "Lines of Code" metric for CodeCampServer.

MetricTreemapSnapshot

CodeCampServer is written with TDD, so unsurprisingly unit-tests take up the majority of the code lines, followed by integration-tests. The biggest portion of the application code is in the Website layer. And here is my favorite part: just like a street map, you can zoom deep into this map, from the top assembly level down to classes and individual methods! In fact, you can use your mouse scroll to zoom almost anything in the NDepend user interface.

Now let's see the map for the Cyclomatic Complexity metric. The more if/else/switch branches in your code, the higher its Cyclomatic Complexity.

MetricTreemapSnapshot

CodeCampServer is developed using a Domain-Driven Design approach, so when it comes to complexity and decision-making, the Model layer takes up most of the space. Unit-tests take up just about the same amount.
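
As a reminder of what drives this metric, here is a toy example (not CodeCampServer code): the method below has a cyclomatic complexity of 4, i.e. one base path plus three decision points.

    public static class ComplexityDemo
    {
        // 1 (base path) + 3 decision points (if, else if, while) = cyclomatic complexity of 4
        public static int Classify(int value)
        {
            if (value < 0)        // decision point 1
                return -1;
            else if (value == 0)  // decision point 2
                return 0;

            int digits = 0;
            while (value > 0)     // decision point 3
            {
                value /= 10;
                digits++;
            }
            return digits;
        }
    }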

Code Query Language

I can't repeat this enough: CQL is brilliant! It gives you the power to gather just about any information you need about your source code. CQL is very similar to SQL, but instead of querying a database, you are querying your source-code. You can use CQL to quickly search for particular areas of your code that are of interest in terms of code quality.

The information you can query with CQL is comprehensive: NDepend provides 82 code metrics, giving access to the vast range of information you need. For example:

  • Which methods have been refactored since the last release and are not thoroughly covered by unit-tests?
    SELECT METHODS WHERE CodeWasChanged AND PercentageCoverage < 100
  • Which public methods could be declared as private?
    SELECT METHODS WHERE IsPublic AND CouldBePrivate
  • What are the 10 most complex methods?
    SELECT TOP 10 METHODS ORDER BY CyclomaticComplexity DESC

Code Inspection

So far we have been talking about various statistics and analyses, but we haven't seen the part that matters most in a static-analysis tool: code inspection. You can set up custom coding guidelines using CQL, let NDepend regularly inspect your codebase against the project standard, and have it issue warnings when somebody violates a rule.

Let’s say, Architectural Guidelines:

  • DDD Value Objects have to be immutable
    WARN IF Count > 0 IN SELECT TYPES WHERE !IsImmutable AND HasAttribute "ValueObject"
  • The Web layer should not have a direct dependency on the Data layer
    WARN IF Count > 0 IN SELECT METHODS FROM NAMESPACES "Web" WHERE IsDirectlyUsing "Data"
  • Repositories should not flush the NHibernate session; it violates the unit-of-work pattern
    WARN IF Count > 0 IN SELECT METHODS FROM NAMESPACES "Data" WHERE IsUsing "NHibernate.ISession.Flush()"
  • All public methods of Services have to be surrounded by a database transaction
    WARN IF Count > 0 IN SELECT METHODS FROM ASSEMBLIES "Services" WHERE IsPublic AND !HasAttribute "Transactional"

Quality Standards:

  • Methods that are way too long, and obviously unreadable
    WARN IF Count > 0 IN SELECT METHODS WHERE NbILInstructions > 200 ORDER BY NbILInstructions DESC
  • Methods that have too many parameters are smelly
    WARN IF Count > 0 IN SELECT METHODS WHERE NbParameters > 5 ORDER BY NbParameters DESC
  • Classes that have too many methods; chances are they violate the Single Responsibility Principle
    WARN IF Count > 0 IN SELECT TYPES WHERE NbMethods > 20 ORDER BY LCOM DESC
  • Classes that overuse inheritance; consider the favour-composition-over-inheritance principle
    WARN IF Count > 0 IN SELECT TYPES WHERE DepthOfInheritance > 6 ORDER BY DepthOfInheritance DESC

Continuous Quality

Continuous quality is the ultimate goal of code-analysis tools. You can hook NDepend into your project's continuous integration server.

Hooking NDepend to TeamCity allows it to continuously monitor the quality of the project based on the design rules you specify, and to ring the alarm when somebody checks in code that violates the house design guidelines. This gives immediate feedback to the developers so they can fix the code before it degrades the codebase. Yes, we "fix" code NOT only when something is broken, but also when the code does not meet a certain quality threshold! It's often difficult to get that across to some people.

This ensures not only that all code being checked in can actually be built into an application that satisfies all the requirements (unit-tests), but also that the codebase continuously meets an acceptable level of quality.

TeamCity is a really nice development dashboard that provides a great deal of information about the current state of the application and its historical progression, and NDepend supplies quantitative numbers for various quality metrics that you can track over time. You can even use NDepend to analyse the differences between 2 versions of the source code and their implications for the metrics, and observe how the metrics evolve over time. In general, this will give you a trend looking more or less like this:

image

At the start of the project, the code looks good: methods are short, classes are well designed, loose coupling is maintained. Over time, as complexity grows, methods get stuffed with if/else and nonsense code, classes become entangled… the code starts to decay. When the code gets bad enough that the heat goes beyond an acceptable level or violates certain coding guidelines, NDepend starts complaining, screaming for refactoring.

The developers start cleaning up their mess and refactoring the code while it is still reasonably manageable. The heat cools down a bit, and quality gets better. As development progresses, the heat keeps jumping up, and each time NDepend gives you immediate feedback, allowing you to refactor the code before it becomes too difficult. The overall code quality stays constant over time.

Without continuous inspection of code quality, the diagram would look like this:

image

The code decays steadily as it grows, becoming less and less manageable, with the intention of refactoring it when things feel critical. But before you know it, the code is already too difficult to refactor. Making a single change is totally painful, and people end up throwing more mud into it every day until the code turns into a big ball of mud, and the only thing on your mind every day is to scrap the code and move on to the next project. And I know that feeling all too well 😛

Report

When it comes to code-analysis reports, NDepend is top notch. There is a ridiculously massive amount of detail about your code, from trivial lines-of-code, test-coverage, and comments, to dependencies and warnings. You can even compare 2 versions of the source-code, and all of it can be tracked in historical order within TeamCity. All these details can be overwhelming; luckily the report is presented with a great high-level summary at the top, followed by increasingly detailed information as you walk down. Here is the big-picture summary of CodeCampServer's codebase:

image

"Stable" and "abstract" do not in themselves mean anything good or bad about the code; they are just facts about it.

  • A stable package is one that a lot of components are using, i.e. it has a lot of reasons NOT to change.
  • Abstractness indicates the ratio of abstract types (interfaces and abstract classes) to the total number of types within the package. An abstract package gives more flexibility for changing the code.

Good quality code is a balance between these 2 properties. It’s about finding equilibrium between 2 extreme characteristics:

  • Painful: the package is used by a lot of components but lacks abstractness, hence it is painful to change
  • Useless: the package is used by no one but has a lot of abstractness, hence it is over-engineered
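
For the curious, this chart is (as far as I can tell) based on Robert C. Martin's package metrics, which NDepend computes per assembly:

  • Instability: I = Ce / (Ce + Ca), where Ce is the number of efferent (outgoing) couplings and Ca the number of afferent (incoming) couplings; I = 0 is maximally stable, I = 1 maximally unstable
  • Abstractness: A = number of abstract types / total number of types
  • Distance from the main sequence: D = |A + I - 1|; packages near D = 0 balance the two properties, while D close to 1 pushes them into the "painful" or "useless" corners of the diagram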

The website and data-access layers of CodeCampServer are infrastructure implementations, and the application is structured so that nothing depends on them; thus they're located at the edge of instability in the diagram. The core of the application, the Model layer, is by nature stable: it cannot easily be changed without affecting the rest of the application. Luckily it is highly decoupled (using service interfaces and repository interfaces), so it has higher abstractness, although not enough to make it into the green area.

Summary

Weaknesses

There are, however, some areas where NDepend still falls short.

  1. CQL is not as powerful as you might expect. The most significant limitation is the absence of any notion of subquery, join, or any other means of traversing relationships between components. Queries that seem pretty simple are actually impossible to express in the current version of CQL:
    • Select all methods in classes that implement IDisposable
    • Any method that constructs any class from the "Services" namespace (rule: services should be injected by IoC)
    • Any member in the "Services" namespace that has an instance field of a type from the "Domain.Model" namespace (rule: DDD services should be stateless)
    • All namespaces that refer to another namespace that makes a reference in the opposite direction (i.e. a circular reference)
  2. Continuous Integration support is a bit weak. In fact, there is no direct support for TeamCity: constraint violations can't be fed into the TeamCity alert system. And while TeamCity can store NDepend's historical reports as artefacts, there is no easy way to follow a progression summary of the stats, e.g. the growth of code size or dependencies (other than collecting all the reports ourselves), the way TeamCity does with its code-coverage graphs. But maybe I'm just too demanding…
  3. 82 metrics is impressive, and combined with CQL it makes NDepend far superior to other code-analysis tools in terms of ease of customisation. However, I haven't found any means to actually extend the metrics (or the constraints). Even though FxCop's customisability is very limited, it is at least fairly easy to write your own plugin and extend the tool yourself.

Having said that, NDepend is a really impressive tool. Life for architects and application designers has never been easier: they no longer need to deal only with whiteboards and CASE tools, because NDepend lets them work directly with the actual code in real time, looking at it from an architectural viewpoint. It provides a great deal of flexibility with its brilliant CQL, and it is very intuitive; it took me almost no time to get going and do interesting things.