Soft stuff @ bjarte.com

Freaky coincidence

Just sat down to draw some vectorized mountaintops in Illustrator to practice my design skills. I turned on Spotify and the first song started. It was Jeremy by Pearl Jam. It starts

“At home drawing pictures of mountaintops…”

Freaky :p

January 10, 2010

An abstraction on top of NHibernate

Prepare for a random thought > 140 chars:

I just read Ayende’s post where he argues why we should not put an abstraction on top of NHibernate.

The reason to do such a thing would be to avoid coupling your code to the persistence layer. This in itself is not a bad idea until you take a deeper look at what is required to make a useful abstraction.

A simple persistence interface on top of NHibernate would make it easy to change the persistence layer, but:

  1. What happens when you need other features from NHibernate? Will you add them to the abstraction layer? Not a very solid abstraction, really.
  2. NHibernate is already an abstraction on top of a relational database. Do you want to add another one?

The term “Leaky abstraction” comes to mind.

January 8, 2010

Taking your game to the next level

Experience is the best way to learn, especially if you fail. However, you should not be reinventing the wheel and making mistakes people have made before you.  Use the knowledge on the web for what it is worth, and stand on the shoulders of giants as you take the next step to become a better developer.

I have been learning new things about the business of software, project management and software architecture. I have read a lot of books, listened to hours and hours of podcasts, watched screencasts and been to conferences. Finding the best content among all available sources is hard. In this post I will share some highlights, hoping that I can help you in your search for the best content.

The business of software

I was listening to the StackOverflow podcast when they were interviewing Jason Calacanis. I had never heard about this guy but he was apparently an internet celebrity and serial entrepreneur in the web space. He had just started a show “This week in startups” and I checked it out.

In the show he interviews guests on how to start and run a successful internet business. He has a very interesting personality which makes this show a must watch. I listen to many shows, like “This week in tech”, “This week in Google”, “Herding Code”, “Elegant code cast”, “Software engineering radio”, “HanselMinutes”, “StackOverflow” and many others, but this show is always the highlight of the week.

Since I started watching this show I have really been getting a taste for the business side of things. It turns out it is as complicated and fun as creating software. You need to be passionate, hard working and even a little bit smart to be a good entrepreneur.

Lately I have been reading all sorts of books on the subject. One of the most intriguing books I have read lately is “Purple Cow” by Seth Godin. He talks about how you need to create remarkable products that people want to talk about. The traditional model of marketing is not working anymore because people are getting overloaded with information and it’s hard or even impossible to make your product stand out. Read the book or get it at audible.com. It is worth it. Audible, by the way, is great.

Right now I am reading a book called “Predictably Irrational”. People act very irrationally when it comes to money. The funny thing about it is that you can actually predict their irrational behavior. In this book the author gives a unique insight into the human psyche. It’s fun to read. I recommend this one as well.

Software design and architecture

Software design and architecture has been one of my interests for a long time and this year is no different. The Norwegian Developer Conference was packed with great speakers presenting on parallel tracks. Despite having a choice between great speakers I always found myself listening to Udi Dahan’s presentations. Udi talks about SOA not only from a technical perspective but in the intersection between business and software. He has a great way of keeping things simple but not simplistic.

I have been following Greg Young’s blog on CodeBetter for quite a while, and was thrilled when I got the opportunity to attend his class on the CQRS architecture in Bergen. The CQRS architecture works great together with domain-driven design to create and manage complex domain models. It’s worth reading up on.

The best software design book I have read this year is “Patterns of Enterprise Application Architecture”.  The material is presented in a clear and concise manner and the book is a great read. No wonder it has cult status in the developer community.

Another great book is the free book “Getting Real” by 37signals. It talks about every aspect of developing web software. The big takeaway from reading this one is “Keep it simple”. It’s separated into several small essays. Reading an essay takes 5 minutes and every one contains gems of knowledge.

The process of developing software

Through my work I have seen many interesting takes on how you should develop software, some good, some bad. There is no “one true way” of developing software. I believe that your choice of process greatly depends on the culture in your company and the physical location of the people involved. Distributed teams have many limitations and should be avoided if possible.

Agile and Lean software development is interesting. I have been reading the Poppendieck books on Lean and some of the more business-oriented books by Womack and Jones. However, the highlight of the year is “The Goal”. It is written as a novel, which makes the subject even more interesting. This book introduces “the theory of constraints”, which is a very interesting concept and a “must know” for every developer (and business person).

Seeing the big picture

Learning about business, process and software creates synergies. Each of the areas affects the others and you start seeing the bigger picture. This will make you a better developer, or as in my case a great one ;)

December 28, 2009

The roles of software development

Building software requires people with a certain skill set. To build a successful product you need a person with a clear vision of where the product is going. This person is the most important member of the team. Without him (or her) you will not be successful. This person, the champion, can guide the development towards the goal. The champion knows what features to include and what features not to include in the product. His clear vision will stop feature creep and lengthy unproductive discussions. The champion listens to others but has the final word. Democracy is not the way to go if you want to build remarkable software.

Developing a quality product is difficult, and a champion is only one piece of the puzzle. To build a great software product, you need great software. You won’t fool anybody by putting lipstick on a pig. This is a key aspect missed by a lot of companies.

To build great software you need a person who knows how to build software. It sounds obvious, doesn’t it? This person, let’s call him a developer lead, should be responsible for the development team. He is responsible for ensuring that sound software engineering practices are followed throughout the project and that the software quality is up to par. This is not a simple job.

Ideally the champion and the developer lead are the same person. The reason is simple: communication. Communicating the vision is hard, and fewer people means simpler communication. If you want to develop software effectively you need to make trade-offs between the features you would like to have and the time it takes to implement them. Often you can make small adjustments to your requirements and gain a lot in terms of development time and better technical solutions. This translates to faster time to market and eventually more revenue. Identifying these trade-offs is difficult, but if the champion knows the technical aspects of software development this problem disappears.

Having one person doing the work of a champion and a developer lead can be a recipe for failure if this one person fails at any of the two roles. A common mistake is to put a domain expert to the task of leading the development team, and thinking that if you put enough developers on the team the software will build itself.

I have seen many products become mediocre due to the lack of a champion or a dev. lead. The roles need not be explicit, but they need to be present in the team in some form.

What kind of people do you think are required to develop remarkable software?

November 30, 2009

HTML 5 looks nice

I had a look at HTML 5 the other day. The Chrome browser supports a lot of the features, and you can check them out at the link below if you have Chrome installed. To me it looks like Silverlight and Flash could be in trouble in the future. The first question that hit me was how long it will take before the majority of browsers support it. According to Microsoft it might take 5-10 years, but I’m not so sure about that considering the speed at which the web is developing.

Check it out at http://www.chromeexperiments.com/

November 19, 2009

A simple exception handling strategy

Time to scribble down some of the things I don’t like seeing when reading C# code.

try {
    something();
}
catch(Exception e){
    //doNothing
}

and

try {
    something();
}
catch(Exception e){
    logSomething();
    //do nothing
}

and

try {
    something();
}
catch(Exception e){
    logSomething();
    return null;
}

and

try {
    something();
}
catch(Exception e){
    //throw base Exception class
    throw new Exception("Something didn't happen");
}

All of these are exception handling anti-patterns and the reason should be obvious: the original error is either swallowed or stripped of its details. Still, I see them all the time. I might even have written some code like this back in the day.

One reason not to write software this way is that your system will have side effects that are impossible to spot. Also, I’m guessing the users of the software won’t like strange things happening in the UI.

To avoid problems like this I usually use a combination of custom exceptions and a back stop. When I see the possibility of an exception occurring in my code and it makes sense to create a more specific exception, I do just that. Remember to keep the original exception as an inner exception of the new one. To prevent exceptions from bubbling up into the user interface I have a back stop where I catch all exceptions before they are exposed to the user. Since I am throwing custom exceptions in my code I can reason about the exceptions and give the user the proper feedback. Sometimes the exceptions require more attention than just giving feedback to the user. In these cases my exception might become an event that is picked up by the appropriate handler.

I guess this is the common way to handle exceptions, but who knows. Do you have any other strategy?
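Here is a minimal sketch of that strategy. It is written in Ruby rather than C#, where the automatically recorded `cause` plays the role of the inner exception. The names `UserNotFoundError`, `load_user` and `back_stop` are made up for illustration and are not taken from any real code base.

```ruby
# Hypothetical custom exception: more specific than StandardError.
class UserNotFoundError < StandardError; end

# Low-level code fails with a generic error and we translate it into a
# specific one. Raising inside `rescue` keeps the original error
# available as `cause` (Ruby's counterpart of an inner exception).
def load_user(users, name)
  users.fetch(name)
rescue KeyError
  raise UserNotFoundError, "No user named #{name}"
end

# The back stop: the single place where exceptions are caught before
# they reach the user. Known exceptions get proper feedback, anything
# else gets a generic message.
def back_stop
  yield
rescue UserNotFoundError => e
  "Sorry: #{e.message} (caused by #{e.cause.class})"
rescue StandardError => e
  "Something unexpected happened (#{e.class})"
end

message = back_stop { load_user({ "bjartn" => 1 }, "someone_else") }
puts message
```

The point of the sketch is that the code close to the failure only adds meaning to the exception, while the decision about what the user gets to see lives in one place.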

November 14, 2009

Cucumber and the Gherkin language

If you are able to read this entire post I will owe you a beer ;)

The other day I was at a Norwegian .NET User Group (NNUG) meeting in Bergen. Aslak Hellesøy was showing off his Cucumber tool. I will give an uber-quick recap, skipping a lot of detail and correct terminology.

Cucumber is a mature BDD testing framework written in Ruby. When using Cucumber you describe your features in English (or the language of your choice). Cucumber will parse the feature and generate test templates (a.k.a. step definitions) that the developer then completes.

Feature

# language: en
Feature: Division
  In order to avoid silly mistakes
  Cashiers must be able to calculate a fraction
 
  Scenario: Regular numbers
    Given I have entered 3 into the calculator
    And I have entered 2 into the calculator
    When I press divide
    Then the result should be 1.5 on the screen

The language used to describe a certain feature must follow a set of structural rules. The language is developed for Cucumber and is called Gherkin.

Step definitions

Given a feature you would use Cucumber to generate a template for writing your tests, called step definitions.

Before do
  #todo
end
 
After do
end
 
Given /I have entered (\d+) into the calculator/ do |n|
  #todo
end
 
When /I press (\w+)/ do |op|
  #todo
end
 
Then /the result should be (.*) on the screen/ do |result|
  #todo
end

This example is in Ruby but you can generate the step definitions in other languages as well. Some languages seem to have good support, others not so much.

When you have generated your step definitions and filled out the missing code you can run the feature and see if your code is meeting the business specifications.
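To make that concrete without requiring Cucumber itself, here is a toy stand-in written in plain Ruby. It only mimics the idea of matching a step line against a regular expression and calling the attached block. The `step` and `run_step` helpers and the stack-based calculator are assumptions for illustration, not Cucumber's API and not how Cucumber is implemented.

```ruby
# Toy stand-in for Cucumber, for illustration only: a list of
# [pattern, block] pairs and a runner that matches each step line.
STEPS = []

def step(pattern, &block)
  STEPS << [pattern, block]
end

# Run one step line: strip the keyword, find the first matching
# definition and call it with the captured groups.
def run_step(line)
  text = line.strip.sub(/\A(Given|When|Then|And)\s+/, "")
  STEPS.each do |pattern, block|
    match = pattern.match(text)
    return block.call(*match.captures) if match
  end
  raise "Undefined step: #{text}"
end

# The step definitions from above, filled in for a hypothetical
# stack-based calculator.
stack = []
result = nil

step(/I have entered (\d+) into the calculator/) { |n| stack.push(n.to_f) }
step(/I press (\w+)/) do |op|
  b = stack.pop
  a = stack.pop
  result = a / b if op == "divide"
end
step(/the result should be (.*) on the screen/) do |expected|
  raise "Expected #{expected}, got #{result}" unless result == expected.to_f
  :passed
end

scenario = <<~FEATURE
  Given I have entered 3 into the calculator
  And I have entered 2 into the calculator
  When I press divide
  Then the result should be 1.5 on the screen
FEATURE

outcomes = scenario.lines.map { |line| run_step(line) }
puts outcomes.last
```

If the calculator logic were wrong, the last step would raise and the scenario would fail, which is the feedback loop the real tool gives you.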

Cucumber is pretty sweet and it’s a shame it’s not mature enough for use with the .NET platform. Also, there is too much friction installing it and getting it to run properly.

Digging deeper - Parsing

I’m not the kind of guy that needs to know every technology to feel complete. There is just way too much out there. When it comes to theory around software, though, I don’t like to be in the dark. Things like patterns, architecture, processes and the business surrounding the software are very interesting to me.

The question that caught my attention in the Cucumber presentation was

How do you take a feature and build test templates from it?

One thing is for sure, you are not using “find and replace” :) I felt a gaping hole in my knowledge and needed to do some research. I was not interested in the technical aspects, but in the theory behind it. I often like to start at the bottom and work my way up. I knew I had to do some reading about parsing, but I did not quite know where to begin. After some googling I found some nasty papers. I quickly realized I was missing the required background, so I needed to find my old book from the university called “Introduction to the Theory of Computation” by Michael Sipser. The book is theoretical but still not very hard to read for someone who has read similar books before. I am now about to give a summary of my research in plain English, so that you don’t have to read the book.

Warning: I might be skipping some details and simplifying.

Languages

A language is basically a set of strings. A given string can be in the language or not. Looking back at Cucumber you can say that a feature (the string) is in the Gherkin language if it complies with the rules of the Gherkin language. But how do you describe these rules that decide if a given string is in the language or not? Also, if the feature is in the language, how do you break it down into its components so that you can reason about the semantics, i.e. the meaning, of the feature?

There are different ways to describe languages. Depending on the method you choose, you may or may not be able to describe a given language. Let me clarify.

Regular languages

A regular language is a language that can be described by using a regular expression.

If you want to describe the language consisting of all strings starting with “a”, then continuing with one or more “b”s and ending with an “a”, i.e. the set {aba, abba, abbba, ...}, you can describe this language with the regular expression “ab+a”.

Now you might think that you can describe any set of strings using regular expressions, but that is not the case. The language consisting of all strings that start with some number of “a”s and continue with the same number of “b”s cannot be described by a regular expression. (More formally you would describe this language as B = {a^n b^n | n >= 0}.)
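As a quick illustration, membership in this language can be tested with an anchored regular expression. The helper name `in_language?` is made up for this sketch.

```ruby
# Membership test for the language described by ab+a.
# \A and \z anchor the pattern so the whole string must match.
AB_PLUS_A = /\Aab+a\z/

def in_language?(string)
  AB_PLUS_A.match?(string)
end

%w[aba abba abbba aa ab abab].each do |s|
  puts "#{s}: #{in_language?(s)}"
end
```

Without the anchors the pattern would also accept any longer string that merely contains a match, which is not the same language.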
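What makes B hard is that you have to count without an upper bound, and a regular expression in the formal sense cannot do that. Ordinary code has no such problem, as this small sketch shows (`in_b?` is a hypothetical helper name):

```ruby
# Recognizer for B = { a^n b^n | n >= 0 }: count the leading "a"s,
# then require exactly that many "b"s and nothing else.
def in_b?(string)
  n = string[/\Aa*/].length
  string == ("a" * n) + ("b" * n)
end

["", "ab", "aabb", "aab", "abab", "ba"].each do |s|
  puts "#{s.inspect}: #{in_b?(s)}"
end
```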

Context free languages

Context free languages are a superset of regular languages. Some of these languages cannot be described using regular expressions. You describe these languages using a set of rules (also known as productions). Let’s look at the following rules.

A => 0A1
A => B
B => #

For a given rule, the left side of the arrow contains a variable (upper case letter). The right side contains a string of variables and terminals. In this case the terminals are 0, 1 and # and the variables are A and B.

To build a string in the language using these rules, follow these steps:

  1. Write down the start variable. It is the variable on the left-hand side of the top rule unless specified otherwise.
  2. Find a variable that is written down and a rule that starts with that variable. Replace the written down variable with the right-hand side of that rule.
  3. Repeat step 2 until no variables remain.

Using the grammar above you can generate the string 000#111. The generation would go something like this, starting with the first rule.

A => 0A1 => 00A11 => 000A111 => 000B111 => 000#111

Actually using these rules you could generate any string starting with any number of zeros, then a hash sign, and then ending with the same number of ones. This language cannot be described using a regular expression.
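The derivation can be replayed mechanically. The following sketch stores the rules in a hash and applies them as plain string substitutions. The `RULES` layout and the `derive` helper are assumptions made for this example.

```ruby
# The grammar from above: each variable maps to its right-hand sides.
RULES = {
  "A" => ["0A1", "B"],
  "B" => ["#"]
}

# Derive a string by applying A => 0A1 `n` times, then A => B, then
# B => #. Returns every intermediate step of the derivation.
def derive(n)
  steps = ["A"]
  n.times { steps << steps.last.sub("A", RULES["A"][0]) }
  steps << steps.last.sub("A", RULES["A"][1])
  steps << steps.last.sub("B", RULES["B"][0])
  steps
end

puts derive(3).join(" => ")
```

Calling `derive` with a different `n` gives a different string from the same language, always with as many zeros as ones.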

A parse tree

Building the string 000#111 above can be represented as a parse tree. Written with nested brackets, where each variable is followed by its children, it looks like this:

A[ 0 A[ 0 A[ 0 A[ B[ # ] ] 1 ] 1 ] 1 ]

A parse tree decomposes a string in the language into its separate components. Now it starts getting interesting. If I somehow could create a parse tree from a feature in the Gherkin language I could do all kinds of sexy stuff with it. Cool!

Ambiguity

The problem with context free languages is that in some cases you can have different parse trees giving you the same string in the language. In other words, applying the rules in a different way will give you the same string. If I am using the parse tree to reason about the meaning of the string, two different parse trees would give two different meanings to the same string. Not a good thing.

Parsing expression grammar

A grammar is another word for the rules describing the language, i.e. a grammar is a set of rules. A parsing expression grammar (PEG) is very similar to the context free grammars I described above. The big difference is that the languages described using a PEG are not ambiguous. For a given string in a language there is only one parse tree. Sweeeeeeeeeeet. Now we’re talking :) (And no, I’m not a nerd)

It turns out that Gherkin is described using a PEG. Cool huh? In Ruby there is a tool called Treetop that takes the rules of the grammar as input. If you give it a string from the language, i.e. a feature in this case, you will be able to generate a parse tree for this string.

This is the point I am at in my research currently. I have also found a .NET tool that is able to create a parse tree of a given string if you supply the rules to it. My next step is to steal the rules of the Gherkin language and move them over to this tool, and see if I am able to build a parse tree for a Cucumber feature using this tool. That would be cool.

Also I need to look at the theory of how you actually build a parse tree for a string. Not by starting  with the rules and building a string, but the other way around.
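One simple technique for going from string to tree is recursive descent: write one method per variable and let each method consume the input it is responsible for. Below is a hand-written sketch for the small 0/1/# grammar from earlier. It is only meant to show the idea and is not necessarily how Treetop or Cucumber do it. The `Parser` class and the nested-array tree format are assumptions for this example.

```ruby
# Recursive-descent parser for the grammar
#   A => 0A1 | B
#   B => #
# It returns a parse tree as nested arrays, or raises on bad input.
class Parser
  def initialize(input)
    @chars = input.chars
    @pos = 0
  end

  def parse
    tree = parse_a
    raise ArgumentError, "Unexpected input at #{@pos}" unless @pos == @chars.length
    tree
  end

  private

  # A => 0A1 if the next character is "0", otherwise A => B.
  def parse_a
    if @chars[@pos] == "0"
      expect("0")
      inner = parse_a
      expect("1")
      [:A, "0", inner, "1"]
    else
      [:A, parse_b]
    end
  end

  # B => #
  def parse_b
    expect("#")
    [:B, "#"]
  end

  # Consume one character or fail with the position of the mismatch.
  def expect(char)
    raise ArgumentError, "Expected #{char} at #{@pos}" unless @chars[@pos] == char
    @pos += 1
  end
end

tree = Parser.new("000#111").parse
p tree
```

Because the next character always tells the parser which rule to use, there is never more than one way to build the tree for this grammar.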

Did this make sense? Am I on the right track? Please tell :)

Till next time.

October 31, 2009

Set based validation in the CQRS Architecture

Today we’ll get our hands dirty.

If you are following the CQRS design pattern you might run into trouble when it comes to set based validation. Consider the task of registering a new user. How would you in your domain figure out if that user name is unique? You can obviously not query your event store, because the event store is not built for that purpose.

Querying the reporting database

The solution is actually querying the reporting database on the read side to figure out if the user name is unique. There is (in my opinion) a little problem with this approach. By querying this way you are making the read side responsible for knowing domain concerns. You are effectively making the read side a part of the domain’s bounded context and adding more responsibilities to it. Before we introduced set based validation the separation was clean and nice. It is not that clean any more.

You could keep the design clean by having a separate database with the only purpose of answering set based queries and keep this database within the bounded context of the domain. This would keep the design cleaner (perhaps), but it would not be a practical choice in most scenarios.

Another oops

In a scalable architecture you are publishing events asynchronously. They are picked up by the read side and applied to the reporting database through the denormalizer. Consider the situation where the UserCreatedEvent(UserName=”BjartN”) is in the denormalizer queue, and the domain is instructed to add yet another user named “BjartN”. The reporting database knows nothing of the first event, resulting in the domain publishing the same event twice. When the second event hits the reporting database the entire thing blows up, because the user already exists. What to do next? The event has already occurred (twice) and has been published to many other systems. Basically the events have occurred and you need to deal with it. The only way to fix it is to issue a compensating action. This could be issued by the denormalizer to the domain as a command.

Be consistent if you can

Dealing with these kinds of concurrency issues is not something you want to do for fun. If you have the option to publish the event and save to the reporting database in a single transaction (i.e. synchronously) you should consider that.
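The race is easy to reproduce in a toy model. In the sketch below an array stands in for the reporting database and another array for the denormalizer queue. Everything here, including the `resolve_duplicate_user` command, is a hypothetical name chosen for illustration.

```ruby
# Toy model of the race described above. All names are hypothetical.
reporting_db = []        # user names known to the read side
denormalizer_queue = []  # events published but not yet applied
published = []           # everything the domain has published
compensations = []       # commands sent back to the domain

# The domain validates uniqueness against the (possibly stale) read side.
register = lambda do |user_name|
  return :rejected if reporting_db.include?(user_name)
  event = { type: :user_created, user_name: user_name }
  published << event
  denormalizer_queue << event
  :accepted
end

# The denormalizer applies queued events. A duplicate cannot be undone,
# so it answers with a compensating command instead of blowing up.
drain = lambda do
  until denormalizer_queue.empty?
    event = denormalizer_queue.shift
    if reporting_db.include?(event[:user_name])
      compensations << { type: :resolve_duplicate_user, user_name: event[:user_name] }
    else
      reporting_db << event[:user_name]
    end
  end
end

first = register.call("BjartN")
second = register.call("BjartN")  # queue not drained yet: stale read
drain.call
third = register.call("BjartN")   # read side has caught up

puts [first, second, third].inspect
puts compensations.length
```

The second registration slips through only because the queue has not been drained yet. Once the read side has caught up, the same request is rejected.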

October 27, 2009

Why would you store the entire history of the domain?

I got this great question in one of my comments:

Modeling the entire history of the domain seems to only be valuable if the domain requires it as part of your domain? I guess the biggest concern with a pattern like this would be how you would effectively deal with different versions of code that have existed at different points along the path. The report one would generate from 10 releases ago might not be the same if you are reapplying events as you go. This could certainly have business implications too, how do you suggest dealing with this?

If the application of an event changes, do you simply create new events and never modify old ones?

I will try to answer some of it:

Storing the entire history

It’s not so much the domain that usually requires you to keep the entire history, but you keep it for auditing. If you only keep your current state in your database, how do you know that the data is correct? Anything could have happened to keep it from being correct, like bugs and evil people :) With an add-only model you even have the possibility of writing the events to write-once storage. Now it is impossible to manipulate the data. If you write a bug that creates false events, you would later execute compensating actions to undo what you did. Remember that an event is something that has happened (past tense) so it makes no sense to change it in any way.

There is another good reason to keep the history as well. When saving events you are not saving data to your current data model, but you are actually storing all the user behaviour. If you only store the current state of the domain, the reporting you can do on this data is limited. The information stored in an add-only model is much richer. You can do reports on things you didn’t even think of when you created the application.

Versioning

If your events change you would create a new version of that event, and keep the old ones. To keep your domain code from being bloated with handling of all versions of events you would basically introduce a component that converts your events from previous to newer versions, and then apply them on the domain. Remember that events are things that actually happened in your domain, so in most cases the information in deprecated events is valuable.
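Such a converter component could look roughly like the sketch below. The event layout (a hash with `:type`, `:version` and `:data`) and the example change from a single name field to first and last name are assumptions invented for this illustration.

```ruby
# Hypothetical events stored as hashes with an explicit version.
# Version 1 of UserCreated had a single name field, version 2
# splits it into first and last name.
UPCONVERTERS = {
  ["UserCreated", 1] => lambda { |event|
    first_name, last_name = event[:data][:name].split(" ", 2)
    { type: "UserCreated", version: 2,
      data: { first_name: first_name, last_name: last_name.to_s } }
  }
}

# Convert an event step by step until no converter applies, i.e.
# until it is at the newest version the domain understands.
def upconvert(event)
  converter = UPCONVERTERS[[event[:type], event[:version]]]
  converter ? upconvert(converter.call(event)) : event
end

old = { type: "UserCreated", version: 1, data: { name: "Bjarte N" } }
p upconvert(old)
```

The stored event is never modified. The conversion produces a new value on the way from the event storage to the domain, so the domain only needs to handle the latest version.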

That said, there are of course situations where you would choose different architectural patterns.

October 23, 2009

Creating an event storage

The event storage is basically where you persist all the events your domain is publishing. The only thing that makes it a little bit tricky to implement is the fact that you need to worry about concurrency. Concurrency is tricky, but not that tricky.

Event storages can be created using various technologies. According to Greg the best way is to write the events straight to disk, circumventing the file system. Of course then you’ll need to write your own indexing as well. That’s just crazy stuff and I’m not digging into those APIs any time soon :p In this example I’ll be showing an event storage for a SQL database. We need to be able to store our serialized events and keep track of what aggregate root published the event. Also we need to maintain the concurrency version of the aggregate root.

Database

First let’s look at the database schema

EventProviders

EventProviderId (uniqueidentifier)
Type (nvarchar)
VersionNumber (int)

Events

FkEventProviderId (uniqueidentifier)
DateTime (datetime)
Data (varbinary)

The name “event provider” refers in this case to the aggregate root. Each event provider has an id, a type name and a version number. The version number is used to do optimistic locking, making sure we don’t update an aggregate root that has been changed in the meantime. The version number also serves another task; the version number is always equal to the number of events published by the aggregate root.

Now let’s look at the events table. Each event has a reference to the event provider that published the event. We also store a serialized version of the event and the time it was persisted. Notice we have no “eventId”. We don’t need one.

Event storage

We will access the event storage using the IEventStorage interface.

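using System;
using System.Collections;

public interface IEventStorage
{
    // Returns every event published by the given event provider (aggregate root).
    IEnumerable GetAllEventsForEventProvider(Guid id);
    // Persists the provider's unsaved events and updates its version.
    void Save(IEventProvider provider);
}

To save an aggregate root we pass a reference to the IEventProvider interface (implemented by the aggregate root) to the event storage.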

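public interface IEventProvider
{
    // Events published since the aggregate root was last saved.
    IEnumerable GetChanges();
    // Clears the pending events (presumably once they have been saved).
    void ClearChanges();
    Guid Id { get; set; }
    // Concurrency version used for optimistic locking.
    int Version { get; set; }
}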

The Save method

To save the IEventProvider (i.e. Aggregate Root) we need to do the following:

  • Start transaction
    • Get the event provider from the database
    • If the event provider does not exist in the database, create it
    • If the event provider exists, check that the database version matches the current version of the aggregate root. If not throw a concurrency exception.
    • Save all events the event provider has published after previous version
    • Update the version number in the event provider. The version number is now the previous version number plus the number of events published since previous version.
  • End transaction
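The steps above can be sketched in a few lines. The version below is an in-memory stand-in written in Ruby and is not the C# implementation from the included project: a hash and an array replace the two tables, and a Mutex replaces the database transaction. `InMemoryEventStorage`, `ConcurrencyError` and the `Aggregate` struct are hypothetical names.

```ruby
class ConcurrencyError < StandardError; end

# In-memory stand-in for the two tables: provider id => version,
# plus an append-only list of events. A Mutex plays the role of the
# database transaction.
class InMemoryEventStorage
  def initialize
    @versions = {}
    @events = []
    @lock = Mutex.new
  end

  def all_events_for(provider_id)
    @events.select { |e| e[:provider_id] == provider_id }.map { |e| e[:data] }
  end

  # `provider` is expected to respond to id, version, version=,
  # changes and clear_changes (mirroring IEventProvider).
  def save(provider)
    @lock.synchronize do
      # Get the provider's stored version, creating it at 0 if missing.
      stored = @versions.fetch(provider.id) { @versions[provider.id] = 0 }
      unless stored == provider.version
        raise ConcurrencyError, "Expected version #{provider.version}, found #{stored}"
      end
      provider.changes.each do |data|
        @events << { provider_id: provider.id, time: Time.now, data: data }
      end
      @versions[provider.id] = stored + provider.changes.length
      provider.version = @versions[provider.id]
      provider.clear_changes
    end
  end
end

# Minimal hypothetical aggregate root used to exercise the storage.
Aggregate = Struct.new(:id, :version, :changes) do
  def clear_changes
    changes.clear
  end
end

storage = InMemoryEventStorage.new
user = Aggregate.new("user-1", 0, ["UserCreated"])
storage.save(user)

stale = Aggregate.new("user-1", 0, ["UserRenamed"]) # loaded before the save
begin
  storage.save(stale)
rescue ConcurrencyError => e
  puts e.message
end
```

The second save fails because the stale copy still carries version 0 while the storage has moved on to version 1, which is exactly the optimistic locking check from the list.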

The implementation is pretty straightforward. As you see, all the complexity in this operation comes from the fact that we need to check for concurrency violations.

Included project

I am including a project with an implementation of the event store and some other things like the repository and the aggregate root base class. This was written quickly at the DDD Course by me and some other peeps, and is far from production ready and it has no UI. However I think it’s better to post this (crappy) code than to post nothing at all. The idea is that you can have some code to look at. There are some integration tests you can run if you set up the database. If you want to use an implementation like this, you can get some speed optimizations by using stored procedures. It’s evil, but still.

Some things to keep in mind

I have not talked about implementing snapshots to improve performance but it is really not that hard. I would use the Memento Pattern and not store the actual aggregate root. From there on it is just plain sailing (“plankekjøring” in Norwegian) ;) Mark Nijhof is publishing an extensive example at some point in the near or far future.
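As a rough idea of what the memento approach could look like, here is a small Ruby sketch. The `Account` aggregate, its events and the `Memento` struct are invented for this example and are not part of the included project.

```ruby
# Hypothetical aggregate root that can hand out and restore a memento.
class Account
  Memento = Struct.new(:balance, :version)

  attr_reader :balance, :version

  def initialize
    @balance = 0
    @version = 0
  end

  # Applying an event changes state and bumps the version.
  def apply(event)
    @balance += event[:amount]
    @version += 1
  end

  # The memento holds only the state needed to restore the aggregate.
  def create_memento
    Memento.new(@balance, @version)
  end

  def restore(memento)
    @balance = memento.balance
    @version = memento.version
  end
end

events = [{ amount: 100 }, { amount: -30 }, { amount: 5 }]

# Take a snapshot after the first two events.
original = Account.new
events.first(2).each { |e| original.apply(e) }
snapshot = original.create_memento

# Later: rebuild from the snapshot plus only the events after it.
rebuilt = Account.new
rebuilt.restore(snapshot)
events.drop(snapshot.version).each { |e| rebuilt.apply(e) }

puts rebuilt.balance
```

Since the version number equals the number of events published, the snapshot's version tells you how many stored events can be skipped when loading.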

October 21, 2009