Day 1 – Pex and Moles

Today was the first day of my .NET 4.0 training, and I would like to share the highlights of each day. Well, at least the highlights for me. So this is the first day of five.

Introductions and first days are always slow, and this one was no exception. There were some basic introductory things for people who have never come into contact with electricity. But a good 5 hours in, something was presented that caught my eye: Pex and Moles.

Pex and Moles are actually two separate pieces of software:

  • Pex automatically generates test suites with high code coverage.
  • Moles allows you to replace any .NET method with a delegate.

I am still unsure about the real day-to-day value that Moles will offer me. This is not because the software is not up to par with what I would use, but because I mostly work on code I can change and refactor, so I rarely need such a tool. But more on that later on.

First you will need a copy of Pex and Moles. So here is the download link. You can even find a build for the express version (non-commercial). After you have the file, just let the installer do its work and tolerate the 2 to 3 times your focus will be stolen (it is worth it).

Next you need some code to let Pex have fun with. I quickly wrote a little class with one method. Here it is:

using System;

namespace PexAndMoles
{
    public class Calculator
    {
        public int Add(int one, int two)
        {
            if(one == 0 || two == 0)
                throw new ArgumentException();

            if(one < two)
                throw new ArgumentException();

            return one + two;
        }
    }
}

There is a reason for all those ifs in there: they exist solely to give Pex something to work on 🙂

So to get started just right-click on the method you want to “work on”. You should see something like this.

Run PEX

Pex will ask you which testing framework it should use. You can choose from all the major testing frameworks. But to keep it simple I chose to stick with MSTest.

Select testing framework

After a short wait, during which you are tempted by a “follow us on Facebook” link, the results are presented, and if you are lucky (depending on the code complexity) Pex will find all major test scenarios for your method.

In my case this is what it came up with.

Pex results

And those are all the test scenarios I wanted (or even expected).

Now that you have your tests you want to keep them for later (most probably for some sort of regression testing). Pex can help you there too. If you select all created “results” a “Promote…” button will appear. When pressed, it adds the “results” as unit tests to your testing project, or creates a new one and adds them there.

The code generated is confusing at worst and funny at best. It is not the go-to example of good/clean code. But it is auto-generated and can be regenerated if future changes break the tests. The naming convention is “acceptable”. Before I rant too much here is the code generated:

namespace PexAndMoles
{
    [TestClass]
    [PexClass(typeof(Calculator))]
    [PexAllowedExceptionFromTypeUnderTest(typeof(ArgumentException), AcceptExceptionSubtypes = true)]
    [PexAllowedExceptionFromTypeUnderTest(typeof(InvalidOperationException))]
    public partial class CalculatorTest
    {
        [PexMethod]
        public int Add(
            [PexAssumeUnderTest]Calculator target,
            int one,
            int two
        )
        {
            int result = target.Add(one, two);
            return result;
            // TODO: add assertions to method CalculatorTest.Add(Calculator, Int32, Int32)
        }
        [TestMethod]
        [ExpectedException(typeof(ArgumentException))]
        public void AddThrowsArgumentException547()
        {
            int i;
            Calculator s0 = new Calculator();
            i = this.Add(s0, 0, 0);
        }
        [TestMethod]
        [ExpectedException(typeof(ArgumentException))]
        public void AddThrowsArgumentException81()
        {
            int i;
            Calculator s0 = new Calculator();
            i = this.Add(s0, 1, 0);
        }
        [TestMethod]
        public void Add520()
        {
            int i;
            Calculator s0 = new Calculator();
            i = this.Add(s0, 1, 1);
            Assert.AreEqual<int>(2, i);
            Assert.IsNotNull((object)s0);
        }
        [TestMethod]
        [ExpectedException(typeof(ArgumentException))]
        public void AddThrowsArgumentException470()
        {
            int i;
            Calculator s0 = new Calculator();
            i = this.Add(s0, 2, 3);
        }
    }
}

And that is Pex in a nutshell. At least that is what I was able to find out about it in the half day I spent with it.

I know that Moles was not mentioned here, but I would like to spend some more time with it before writing about it in more detail. Besides, the post is long enough.

And that is all the time I have today.

Thx for your time.

Validating a validator

If you take your precious free time and build a validation framework from the ground up you want the architecture to be perfect, or as close to perfection as possible. So you are bound to run into some problems on the way there. The problem that I was struggling with a while ago was the question of how to validate the validator.

The problem became painfully obvious when I wrote the first System.String validator that required some external configuration. To illustrate my little dilemma here is the validator code.

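// Assumed alias: "reg" points at the framework Regex, because the validator class below is also named Regex
using reg = System.Text.RegularExpressions.Regex;

namespace valy.Validators.String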
{
 public class Regex : BaseValidator<string>
 {
   public Regex(IEnumerable<IValidator<string>> validationParts) : base(validationParts){}
   public Regex(IValidatorConfiguration configuration) : base(configuration){}

   protected override IValidationResult DoValidate(string objectToValidate)
   {
     if (reg.IsMatch(objectToValidate, RegularExpression))
       return Pass();
     return Fail(ValidationFailReasons.Invalid);
   }

   protected override void CheckParameters()
   {
     Require(this, v => v.RegularExpression, s => !string.IsNullOrEmpty(s));
   }

   public string RegularExpression { get; set; }
 }
}

For now try to ignore the existence of the CheckParameters method.

In this special case I have to validate that a valid non-empty regular expression is given to the validator. You could do it in the validation body with a simple if statement, but then it would be the responsibility of every validator to do its internal validation and communicate possible violations to the outside world. This would lead to an inconsistent framework in no time at all. What I wanted was a way for the validator to define a set of rules that have to be met before the actual validation can happen.

This does not give us complete protection. In the above example the object to validate can still be null. But that is not a problem of the validator anymore.

But let's return to the method that you should have ignored until now. The CheckParameters method's purpose in life is to define a set of rules that ensure that the internal state of the validator is consistent. This means that if the method “passes”, the validator is ready to use and will not throw any bogus exceptions when invoked.

The core of this internal validation scheme is the Require method implemented in the base validator. Using this simple building block you can put together relatively complex validation schemes, as shown in the next code sample.

protected override void CheckParameters()
{
 Require(this, v => v.MaxLenght, i => i > MinLenght);
 Require(this, v => v.MinLenght, i => i < MaxLenght);
 Require(this, v => v.MaxLenght, i => i > 0);
}

The complexity of the example above will not win you any Nobel prizes, but it will ensure that your range validator cannot be given an invalid range. And this works for me (until now)!

This is not perfect and the thing that is annoying me the most is the fact that the current object has to be passed as a parameter to the function. This is necessary because the Require method is defined on the parent class and therefore has to somehow link to the actual overriding class. But this is just a minor hiccup. There are other potential problems that are not causing me any headaches yet, so they will get fixed when they start to hurt.

All in all this scheme works and I am quite happy with it. The last part I would like to show you is the implementation of the Require method.

protected void Require<TValidator, TParam>(TValidator validator, Expression<Func<TValidator, TParam>> property, Expression<Predicate<TParam>> predicate)
{
 var propertyValue = property.Compile().Invoke(validator);
 if (!predicate.Compile().Invoke(propertyValue))
   throw new ValidatorInitializationException(
             string.Format("Validation initialization failed on predicate : {0} for member : {1}",
             predicate.Body, property.Body),
             GetType());
}

As you can see the implementation is nothing special. Just two expressions that are compiled and evaluated at runtime, and if the predicate fails a ValidatorInitializationException is thrown. Really there is nothing more to say about this topic.

I hope that this helped someone out there, and if someone out there sees something wrong with this, write a comment and tell me about it.
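To see what a broken requirement looks like from the caller's side, here is a minimal self-contained sketch. RangeValidatorSketch and the stripped-down exception class are stand-ins I made up for this example; the real valy base validator does more than this.

```csharp
using System;
using System.Linq.Expressions;

// Simplified stand-in for the exception used above (assumption: the real
// one also stores the validator type, as the GetType() argument suggests).
public class ValidatorInitializationException : Exception
{
    public ValidatorInitializationException(string message, Type validatorType)
        : base(message)
    {
        ValidatorType = validatorType;
    }

    public Type ValidatorType { get; private set; }
}

// Hypothetical validator that only carries the state needed for the demo.
public class RangeValidatorSketch
{
    public int MinLength { get; set; }
    public int MaxLength { get; set; }

    // Same idea as CheckParameters above: throws on the first broken rule.
    public void CheckParameters()
    {
        Require(this, v => v.MaxLength, i => i > 0);
        Require(this, v => v.MaxLength, i => i > MinLength);
    }

    protected void Require<TValidator, TParam>(
        TValidator validator,
        Expression<Func<TValidator, TParam>> property,
        Expression<Predicate<TParam>> predicate)
    {
        // Compile both expression trees and run them against the validator.
        TParam value = property.Compile().Invoke(validator);
        if (!predicate.Compile().Invoke(value))
            throw new ValidatorInitializationException(
                string.Format("Validation initialization failed on predicate : {0} for member : {1}",
                    predicate.Body, property.Body),
                GetType());
    }
}
```

Calling CheckParameters on an instance with MinLength = 10 and MaxLength = 5 throws, and because the rules are passed as expression trees the message contains the text of the failing predicate and of the member it was applied to.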

Configuration entropy

The term software entropy has already been coined, and its definition can be found here. I would like you to take into consideration the term configuration entropy.

Because of the increased complexity and forever changing requirements the way we write software has changed. In the past we could try and “one up” each other by coding the same piece of logic in every project we worked on. This option has become impractical and dangerous. Dangerous because we tended to write half-baked software that was just not up to the challenge. And to be honest I do not want to write a data access layer (DAL) every time; I just want to solve problems in my problem domain and not in all the others that happen to be in my way.

To make a long story short: we are having more and more configuration in our applications. If you work in a compiled language you will notice quite quickly that once a piece of code leaves the safe area of your test environment it usually develops some problems. And usually that results in a recompile and redeploy… So this is quite a big incentive to put more “stuff” into configuration files.

So what does this mean? It means that you will reach a point where the configuration becomes more complex than your actual code. The unit testing world has already shown that unit tests are part of our code and that they deserve the same tender attention and care as the code itself. The same goes for all of that configuration. If you do not constantly look after all the configuration out there you will get overrun. And when this happens you will have a BIG problem.

When you can’t tell where those mysterious values are coming from then my friend you are really at the heart of the configuration entropy zone. And getting this situation under control will require quite some effort. So we should avoid it at any cost.

In my experience the best way to do this is to divide and conquer. This is achieved by following these steps:

  1. Ignore the existence of the appSettings section in the app.config/web.config file
  2. Configure once, use often
  3. Write custom configuration sections!! <- How this is done later on
  4. Lose the idea that all configurations are stored in one file
  5. All configuration that can be put in a separate file should be put in a separate file
  6. Do not place configuration in code <- only partially true

Ignore the existence of the appSettings section

The developers in Redmond were so nice as to give us a common and easy-to-use place to put all our application configuration. This sounds great but is really a curse. Why? Because after some time you will notice that it gets hard to manage. You will spend more cognitive effort trying to remember what the configuration key is called and validating that the configuration value is really what you expect. In the configuration API every value is a string by default, so there is no easy way to determine whether the value given is really a number or just a bogus string.

And from my personal experience this section tends to get big (100+ entries) and environment management gets harder and harder!
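To make the typing problem concrete, here is a small sketch of the checking you end up writing by hand for every numeric setting. The NameValueCollection parameter stands in for ConfigurationManager.AppSettings (which is of that type), and the helper name and keys are made up for the example.

```csharp
using System;
using System.Collections.Specialized;
using System.Globalization;

public static class AppSettingsSketch
{
    // Reads an integer setting and fails loudly when the key is missing
    // or the value is not a number. "settings" stands in for
    // ConfigurationManager.AppSettings, which is also a NameValueCollection.
    public static int ReadInt(NameValueCollection settings, string key)
    {
        string raw = settings[key];   // null when the key does not exist
        int value;
        if (raw == null ||
            !int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out value))
        {
            throw new FormatException(
                string.Format("Setting '{0}' is missing or not an integer: '{1}'", key, raw));
        }
        return value;
    }
}
```

Multiply this by a hundred keys and a handful of value types and you see where the cognitive effort goes.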

Write custom configuration sections

Because we ignore the appSettings configuration section we need a suitable substitute. And luckily we have one! They are called custom configuration sections. Their main advantage is that they are type safe. Remember the type validation you had to do with the appSettings?

This is the right time to note that there are two ways of doing this:

  1. The “old” way:
    In the old days the custom configuration manager just contained an event handler in which you could/should parse an XML node that contained your configuration section. This is not the preferred way but still gets the job done.
  2. The “new” way:
    Today we have a more elegant solution to the problem. Now you have a typed interface to the configuration values in the XML file. Sadly this means sticking to some limitations of this implementation.

What you want to provide is a custom configuration section for different configuration groups. So for instance all configuration values that have something to do with the configuration of the price calculation logic should be in a configuration group called “Price” or something like this.

A detailed introduction into the world of custom configuration sections will follow in one of the future posts.
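Until then, here is a minimal sketch of the “new” way, based on the System.Configuration classes. The section name, the MyApp namespace and both properties are invented for the example, so treat it as an illustration under those assumptions and not as a finished design.

```csharp
using System.Configuration;

// Hypothetical "price" section. Expected XML (all names are illustrative):
//
//   <configSections>
//     <section name="price" type="MyApp.PriceSection, MyApp" />
//   </configSections>
//   <price vatRate="0.2" roundingDigits="2" />
namespace MyApp
{
    public class PriceSection : ConfigurationSection
    {
        // Typed property: the framework converts the attribute text to a double.
        [ConfigurationProperty("vatRate", DefaultValue = 0.2, IsRequired = false)]
        public double VatRate
        {
            get { return (double)this["vatRate"]; }
            set { this["vatRate"] = value; }
        }

        // The validator attribute rejects values outside the given range.
        [ConfigurationProperty("roundingDigits", DefaultValue = 2, IsRequired = false)]
        [IntegerValidator(MinValue = 0, MaxValue = 6)]
        public int RoundingDigits
        {
            get { return (int)this["roundingDigits"]; }
            set { this["roundingDigits"] = value; }
        }

        // Typed access; returns null when the section is not declared.
        public static PriceSection Load()
        {
            return ConfigurationManager.GetSection("price") as PriceSection;
        }
    }
}
```

The point is that the calling code asks for PriceSection.Load().VatRate and gets a double, instead of fishing a string out of appSettings and parsing it.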

Lose the idea that all configurations are stored in one file

In today's flood of configuration values it is almost impossible to store all configuration values in one big file. But still this happens, and the file in question is usually the web.config or the app.config file. Why? Because they are automagically created by the project wizard. This is not bad for some cases, but if the configuration of your application gets really heavy you will feel the burn. Maintenance will become a nuisance because the file will be just too big to manage.

So what do we do about this? The simplest and most logical solution is to use an algorithm that we all learned in school: “Divide and conquer”. By this I mean that you should put all configuration that can live in its own file into its own file. The prime candidates for this are log4net and Spring.NET. Both can be put in their own configuration files quite easily.

What do you gain by this?

  1. The configuration does not clog your app.config or web.config file anymore, leaving room for more vital configuration to be there
  2. When you need to change the configuration (for example having different logging configurations for test and live) you do not have to manipulate a complex XML file but just replace a file which is much more easily done
  3. You have different components in different configuration files, thus eliminating the need to feverishly scroll and look for the right section of the configuration file.

I cannot put enough emphasis on this issue. Try to believe me that this is a good thing. Some people may complain that the configuration just got scattered all over the place and that refactoring is now harder to do because of the configuration dependencies. But in the long run the separation will improve your coding skills and help you keep the configuration under control while the project grows and evolves.

Do not put configuration in code

This is only partially true. Why? Because this rule does not apply to interpreted languages like Ruby, Python, PHP (reluctantly calling it a language). Do not be misled by this. Configuration entropy is as big an issue on the other side of the river as it is here.

So why does the rule not apply to the interpreted languages? Because the whole code is one big configuration file. Some will scream now and call me a stupid moron. But think about it for a minute. In my world the “definition” of a configuration is: a non-compiled part of the application with which the behavior of a routine can be changed without a recompile. And because interpreted languages are not compiled we can change anything in the source code and continue.

But this is where the problems begin. Because there is virtually no penalty for changing the source code itself, configuration tends to get scattered all around the code, making it hard to manage the changes made to it. And setting up a clean environment configuration gets quite frustrating when you have to alter multiple files to get the job done. That is why even in those languages I suggest that configuration should be kept in as few places as possible.

Returning to the world of compiled languages, the picture changes. If we stick to my definition of a configuration then all configuration that is stuck in code cannot be changed without recompiling the application, making it quite useless.

The final words

So after this short trip into the topic of configuration entropy I would like you to remember the following:

  • Configuration is part of your code, so treat it as such
  • Divide and conquer, do not put all your eggs in one basket
  • Do not leave configurations in compiled code

Validation as it should be in my eyes

public void Foo(string parameter1, string parameter2)
{
	// ArgumentException takes the message first and the parameter name second
	if(string.IsNullOrEmpty(parameter1))
		throw new ArgumentException("Parameter1 should not be null or empty", "parameter1");
	if(string.IsNullOrEmpty(parameter2))
		throw new ArgumentException("Parameter2 should not be null or empty", "parameter2");

	// Actual method logic
}

Seems familiar? Well it should. I bet that in your current project there is at least one method that looks like this. Do you think that this is a good way of validating input parameters? The honest answer is that this approach is not wrong but it has its shortcomings:

  • If the validation logic changes you will have to recompile the solution
  • The validation message is hard-coded
  • If you would like to reuse the validation logic you have a major refactoring session ahead of you
  • There is bound to be inconsistency in the validation logic across different methods

The main question here is not what is wrong with this approach but how to make it better.

In my experience as a software developer I have seen many implementations of what could be called a validation framework and used many that are already out there. Some are good, very good, but lack some basic functionality. What I expect from a validation framework is the following:

  1. Uphold the DBC principle
  2. Have a way of defining the validation logic outside the compiled solution source
  3. Support IOC
  4. Uphold the DRY principle
  5. Extendability
  6. Optional: Provide a fluent interface for validation setup

DBC

Or design by contract is the α and Ω of a good validation framework. If you want to validate something you have to know what to validate. Some years ago Bertrand Meyer developed this concept for his little language called Eiffel. The idea is to write software that doesn’t do more or less than it claims to do. To validate these claims we have the following “tools”:

  • Preconditions: Every routine places a certain set of preconditions that have to be met for the routine to execute.
  • Postconditions: Every routine does something to the world it runs in. And this is the mechanism we use to validate that the effects of the routine are as promised.
  • Class invariants: Basically describes a set of rules that ensure that the object is in a valid state, from the caller's perspective.

If an object and its methods adhere to the principles of DBC all callers know:

  • What conditions (states of parameters) must be provided for the method to execute.
  • That if the preconditions are met the method will execute.
  • That the method will finish. It is something that we take for granted, but the existence of a postcondition ensures that the method will conclude with a positive or negative result.
  • If the method finishes successfully you know exactly what you get.
  • After the method is done executing the object will be in a valid state.

And if you follow DBC to the letter you will find out that the calling routine is responsible for the preconditions of the called routine. That is something that is hard to realize in the .NET world but is still a nice idea.
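To make the three “tools” a bit more tangible, here is a small hand-rolled sketch. The Account class is made up for this post, and the checks are written as plain if statements so that each kind of contract stays visible; a real framework would take this boilerplate off your hands.

```csharp
using System;

// Hypothetical account class with hand-written contract checks.
public class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance)
    {
        if (openingBalance < 0)
            throw new ArgumentOutOfRangeException("openingBalance");
        Balance = openingBalance;
        CheckInvariant();
    }

    public void Withdraw(decimal amount)
    {
        // Preconditions: the caller is responsible for meeting these.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException("amount", "Amount must be positive");
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds");

        decimal before = Balance;
        Balance -= amount;

        // Postcondition: the effect is exactly what was promised.
        if (Balance != before - amount)
            throw new InvalidOperationException("Postcondition violated");

        CheckInvariant();
    }

    // Class invariant: the balance never goes negative.
    private void CheckInvariant()
    {
        if (Balance < 0)
            throw new InvalidOperationException("Invariant violated: negative balance");
    }
}
```

A caller that respects the preconditions knows that Withdraw will finish, that the balance drops by exactly the requested amount, and that the account is still in a valid state afterwards.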

Non compiled configuration

Sometimes you have the luxury of deciding what is valid and what is not; you can say that the user name may not be shorter than 5 characters. But at the same time user names may not be longer than 255 characters. Why is that? Because of the maximum capacity of the database column. And this is my point. In my experience 80% of validation constraints are imposed on the system from outside: databases, business rules, … So the last thing you want is to have the validation logic compiled into the code. For example:

public class User
{
  [RequiredField(Message= "A user name has to be provided")]
  [StringLengthRange(MinLength = 5, MaxLength = 255, Message = "The user name must be between 5 and 255 characters long")]
  public string UserName { get; set; }

  [RequiredField(ResourceKey = "Account/UserNameErrorMessage")]
  [StringLengthRange(MinLength = 8, MaxLength = 255, Message = "The password must be between 8 and 255 characters long")]
  public string Password { get; set; }
}

Let's say that because of popular demand the maximum user name length is extended from 255 to 500 characters. Now you have to open the source code, make your changes, compile and redeploy it to the customer. That's a lot of work for a simple task. Not to speak of a nasty side effect.

Using attributes implies that not only the validation logic is hardcoded but the error messages too. So what do you do if you want to provide a multi-language application? Any ideas? Well, to put it in plain words: because you rely on some auto-magic mechanism to validate your objects and produce the error messages on the front-end, you just ran into a major problem!

And for those of you who are thinking of using the built-in resource capabilities, let me tell you that you can’t. Why?

  1. You do not have any clean access to the culture information of the user (the one using the application)
  2. You get a compiler error if you use anything except types and constant strings in attributes

So what I want is an external uncompiled, we could call it dynamic, way of configuring the logic. The most popular option is an external XML configuration, like with log4net and NHibernate.

What I want is something like the following:

<validators>
  <validator name="UserNameValidator" base_type="System.string">
    <validator ref="StringNullOrEmpty" />
    <validator ref="Regex">
      <regex>^([a-zA-Z0-9]{6,15})$</regex>
    </validator>
  </validator>
  <validator name="PasswordValidator" base_type="System.string">
    <validator ref="StringNullOrEmpty" />
    <validator ref="Regex">
      <regex>^\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*$</regex>
    </validator>
  </validator>
</validators>

<object_validation type="Foo.Person, Foo">
  <invariants>
    <member name="UserName" type="System.string">
      <validator ref="UserNameValidator" />
    </member>
    <member name="Password" type="System.string">
      <validator ref="PasswordValidator" />
    </member>
  </invariants>
  <preconditions>
    <property name="UserName" direction="set">
      <validator ref="UserNameValidator" />
    </property>
    <method name="SetPassword">
      <parameter name="password">
        <validator ref="PasswordValidator" />
      </parameter>
    </method>
  </preconditions>
</object_validation>

Not perfect but it’s a start. Defining the validation like this brings some advantages:

  1. You do not have to change your code to change your validation logic
  2. You can actually reuse validation logic
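Just to show that such a configuration is not hard to consume, here is a rough sketch that turns the validators fragment above into callable checks. XmlValidatorSketch is a name I made up, only the two rule kinds from the example are handled, and this is not how valy will necessarily do it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class XmlValidatorSketch
{
    // Builds one predicate per named <validator> inside a <validators> root.
    // Only "StringNullOrEmpty" and "Regex" references are understood here.
    public static Dictionary<string, Func<string, bool>> Load(string xml)
    {
        var result = new Dictionary<string, Func<string, bool>>();
        foreach (XElement named in XDocument.Parse(xml).Root.Elements("validator"))
        {
            var rules = new List<Func<string, bool>>();
            foreach (XElement part in named.Elements("validator"))
            {
                string kind = (string)part.Attribute("ref");
                if (kind == "StringNullOrEmpty")
                {
                    rules.Add(s => !string.IsNullOrEmpty(s));
                }
                else if (kind == "Regex")
                {
                    var regex = new Regex((string)part.Element("regex"));
                    rules.Add(s => s != null && regex.IsMatch(s));
                }
                else
                {
                    throw new NotSupportedException("Unknown validator: " + kind);
                }
            }
            // A value is valid only if every referenced rule passes.
            result[(string)named.Attribute("name")] = s => rules.All(r => r(s));
        }
        return result;
    }
}
```

Changing the allowed user name length now means editing the regular expression in the XML file, and the same named validator can be referenced from as many members as you like.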

IOC

Why do we all love and use log4net and hate other frameworks? I think that one reason is that it just works. But the one I want to point out is the fact that log4net does not tightly couple with your code! If you want to test your module and ignore the logging, you can. And the same is to be expected from the validation. I do not want my code to get “infected” by a third-party framework. I want to put it in a separate part of the application and talk to it when I want and not when the validation thinks that I should talk to it.

In achieving this goal IOC is more of a guideline. If you design your software as if it used an IOC framework you will get exactly the loosely coupled framework you want. And as an added bonus using it with an IOC framework will be a breeze.
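What I mean by designing as if an IOC framework were in use can be sketched in a few lines. All the names below (IValueValidator, AlwaysValid, RegistrationService) are invented for this example and do not come from any existing framework.

```csharp
using System;

// Hypothetical abstraction; the application only ever sees this interface.
public interface IValueValidator<T>
{
    bool IsValid(T value);
}

// Stand-in for tests that do not care about validation at all.
public class AlwaysValid<T> : IValueValidator<T>
{
    public bool IsValid(T value) { return true; }
}

public class NotEmptyValidator : IValueValidator<string>
{
    public bool IsValid(string value) { return !string.IsNullOrEmpty(value); }
}

public class RegistrationService
{
    private readonly IValueValidator<string> userNameValidator;

    // The validator is handed in from outside (constructor injection),
    // so this class never references a concrete validation framework.
    public RegistrationService(IValueValidator<string> userNameValidator)
    {
        if (userNameValidator == null)
            throw new ArgumentNullException("userNameValidator");
        this.userNameValidator = userNameValidator;
    }

    public bool Register(string userName)
    {
        return userNameValidator.IsValid(userName);
    }
}
```

In production you pass in the real validator (by hand or through a container), and in a unit test you pass in AlwaysValid and forget that validation exists, just like you can with logging.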

It is not my intention to explain the IOC principle here and list all the IOC frameworks out there. So if you are interested in it here is a nice starting point.

DRY

Don’t repeat yourself! Three words that should guide your every key stroke when writing software!

If I am to use a validation framework I want to define all of my validation logic there. ALL OF IT. I do not want to have some additional code somewhere where it could cause trouble later. So if the framework forces me to write an XML configuration I want it to hold all the validation logic!

Basically like with log4net. I define how to log in the XML configuration; later I just say that I want to log something and my log messages land where they should.

So if there has to be some validation code written it should be generated out of the XML configuration.

Extendability

There is no framework out there that could cover all the bases. There is no way that your software could handle everything that people want it to do. So I want it to be extendable! I want to validate my custom object with some fancy rule that no one in their right mind would think about. I should be able to make it happen, with minimal effort.

So that is something that is quite important.

Fluent interface

I know that this breaks some earlier requirements. But for sheer usability there is nothing better. And to be honest, as a software developer I know that it is not a hard thing to do. So give me something like

public ValidatorCollection SetUpValidator()
{
  var validators = new ValidatorCollection();

  var stringValidator = new Validator<User>()
                                 .For(u => u.UserName)
                                 .NotNull()
                                 .NotEmpty()
                                 .LongerThan(8)
                                 .ShorterThan(255)
                                 .Matches("^([a-zA-Z0-9]{6,15})$")
                                 .Message("UserName must...");
  // ...

  validators.Add(stringValidator);

  // ...

  return validators;
}

It breaks my non-compiled principle and potentially the DRY principle, but it is damn easy to use. And I hate it for that. How can something so wrong be so good to use?

This feature is optional but a really, really nice to have.
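To back up the claim that this is not a hard thing to do, here is a bare-bones sketch of such a fluent validator for a single string member. Every name in it is hypothetical, and it leaves out everything a real implementation would need (result objects, non-string members, localized messages).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical fluent validator for one string member of T.
public class Validator<T>
{
    private readonly List<Func<string, bool>> rules = new List<Func<string, bool>>();
    private Func<T, string> member;

    public string ErrorMessage { get; private set; }

    // Every method returns the same instance so that the calls can be chained.
    public Validator<T> For(Func<T, string> selector) { member = selector; return this; }
    public Validator<T> NotNull() { return AddRule(s => s != null); }
    public Validator<T> NotEmpty() { return AddRule(s => s != ""); }
    public Validator<T> LongerThan(int length) { return AddRule(s => s != null && s.Length > length); }
    public Validator<T> ShorterThan(int length) { return AddRule(s => s != null && s.Length < length); }
    public Validator<T> Matches(string pattern) { return AddRule(s => s != null && Regex.IsMatch(s, pattern)); }
    public Validator<T> Message(string message) { ErrorMessage = message; return this; }

    // True when the selected member passes every rule added so far.
    // For(...) has to be called before this method.
    public bool IsValid(T instance)
    {
        string value = member(instance);
        return rules.All(rule => rule(value));
    }

    private Validator<T> AddRule(Func<string, bool> rule)
    {
        rules.Add(rule);
        return this;
    }
}

public class User
{
    public string UserName { get; set; }
}
```

The whole trick is that each method stores a rule and returns the validator itself, which is what makes the chained style possible.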

What now?

Well that is a good question! To be honest I spent quite some time looking at validation frameworks and none of them fulfills all of the points above. So I decided to do something radical.

I will build my own and share it with the world!

I would like to provide something comparable with log4net. Easy to use but immensely powerful! I even spent 5 minutes creating a logo 🙂

valy logo

This is still more of a brain child of mine than a clear plan of what I would like to do. So please do leave a comment and tell me what you would like to see. Or even better, join!

You can find the project's homepage here.

As I mentioned earlier this is still in the phase of figuring out what to do, so please be gentle.

Thanks for the support up front!