Saturday, May 7, 2011

The “4+1” View Model of Software Architecture

In November 1995, while working as lead software architect at Hughes Aircraft of Canada, Philippe Kruchten published a paper entitled "Architectural Blueprints—The “4+1” View Model of Software Architecture". The intent was to come up with a mechanism to separate the different aspects of a software system into different views of the system. Why? Because different stakeholders have different interests in a software system. Some aspects of a system are relevant to the Developers; others are relevant to System administrators. The Developers want to know about things like classes; System administrators want to know about deployment, hardware and network configurations and don't care about classes. Similar points can be made for Testers, Project Managers and Customers. Kruchten thought it made sense to decompose architecture into distinct views so each stakeholder could get the parts they wanted. In total there were 5 views in his approach but he decided to call it 4 + 1. We'll discuss why it's called 4 + 1 later! But first, let's have a look at each of the different views.

  • The logical view
    This contains information about the various parts of the system. In UML the logical view is modelled using Class, Object, State machine and Interaction diagrams (e.g. Sequence diagrams). Its relevance is mainly to developers.
  • The process view
    This describes the concurrent processes within the system. It encompasses some non-functional requirements such as performance and
    availability. In UML, Activity diagrams - which can be used to model concurrent behaviour - are used to model the process view.
  • The development view
    The development view focusses on software modules and subsystems. In UML, Package and Component diagrams are used to model the development view.
  • The physical view
    The physical view describes the physical deployment of the system. For example, how many nodes are used and what is deployed on what node. Thus, the physical view concerns some non-functional requirements such as scalability and availability. In UML,
    Deployment diagrams are used to model the physical view.
  • The use case view
    This view describes the functionality of the system from the perspective of the outside world. It contains diagrams describing what the system is supposed to do from a black box perspective. This view typically contains Use Case diagrams. All other views use this view to guide them.
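
The mapping between views and UML diagram types in the list above can be written down as a small lookup. This is only a sketch to summarise the list: the enum `ArchitecturalView` and the method `viewFor` are hypothetical names made up for illustration, and the diagram types are just the ones mentioned in this post, not a complete catalogue.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical enum summarising the view-to-diagram mapping described in the list above.
enum ArchitecturalView {
    LOGICAL(List.of("Class", "Object", "State machine", "Interaction")),
    PROCESS(List.of("Activity")),
    DEVELOPMENT(List.of("Package", "Component")),
    PHYSICAL(List.of("Deployment")),
    USE_CASE(List.of("Use Case"));

    private final List<String> diagramTypes;

    ArchitecturalView(List<String> diagramTypes) {
        this.diagramTypes = diagramTypes;
    }

    List<String> diagramTypes() {
        return diagramTypes;
    }

    // Finds the view a diagram type belongs to, ignoring case; empty if the type is not listed.
    static Optional<ArchitecturalView> viewFor(String diagramType) {
        for (ArchitecturalView view : values()) {
            for (String type : view.diagramTypes) {
                if (type.equalsIgnoreCase(diagramType)) {
                    return Optional.of(view);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(viewFor("Deployment")); // Optional[PHYSICAL]
        System.out.println(viewFor("Package"));    // Optional[DEVELOPMENT]
    }
}
```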

Why is it called the 4 + 1 instead of just 5?
Well, this is because of the special significance the use case view has. When all other views are finished, it's effectively redundant. However, the other views would not be possible without it. It details the high-level requirements of the system. The other views detail how those requirements are realised.

4 + 1 came before UML
It's important to remember the 4 + 1 approach was put forward two years before UML, which did not appear in its first guise until 1997. UML is how most enterprise architectures are modelled and the 4 + 1 approach is still relevant to UML today. UML 2.0 has 13 different types of diagrams and each diagram type can be categorised into one of the 4 + 1 views. UML is 4 + 1 friendly!

So is it important?
The 4 + 1 approach isn't just about satisfying different stakeholders. It also makes modelling easier because it makes the model easier to organise. A typical project will contain numerous diagrams of the various types. For example, a project may contain a few hundred sequence diagrams and several class diagrams. Grouping diagrams of similar type and purpose puts the emphasis on separating concerns. Sure isn't it just the same with Java? Grouping Java classes of similar purpose and related responsibilities into packages means organisation is better. Similarly, grouping different components into different jar files means organisation is better. Modelling tools will usually support the 4 + 1 approach and this means projects will have templates for how to split the various types of diagrams. In a company, when projects follow industry standard templates, again it means things are better organised.

The 4 + 1 approach also provides a way for architects to be able to prioritise modelling concerns. It is rare that a project will have enough time to model every single diagram possible for an architecture. Architects can prioritise different views. For example, for a business domain intensive project it would make sense to prioritise the logical view. In a project with high concurrency and complex timing it would make sense to ensure the process view gets ample time. Similarly, the 4 + 1 approach makes it possible for stakeholders to get the parts of the model that are relevant to them.


  1. Architectural Blueprints—The “4+1” View Model of Software Architecture by Philippe Kruchten (paper)

  2. Learning UML 2.0 by Russ Miles & Kim Hamilton. O'Reilly

Thursday, May 5, 2011

Character Encoding

Character encoding is something that's important but can be very confusing to people. Let's go back to the beginning... Originally, nearly all character encoding was done in ASCII: 7 bits to represent 128 characters. This was fine until software became more popular and demand to support more and more international languages grew. Cue Unicode to solve this problem. The latest version of Unicode (version 6.0) includes support for more than 109,000 different characters! Wow! But here's where it gets confusing. Unicode can be implemented using different character encodings. For example: UTF-8, UTF-16 and UTF-32.

So what's the difference? Well, UTF-8 uses one byte per character by default and only uses more (up to 4) when it has to. UTF-16 uses 2 bytes (or 4 when it has to) and UTF-32 always uses 4 bytes.
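
To see those sizes for yourself, here is a minimal Java sketch that encodes a few sample characters in each encoding and prints the byte counts. The class name `EncodingSizes` and its helper methods are made up for illustration. One assumption to flag: UTF-32 is not one of the constants in `StandardCharsets`, so the sketch looks up "UTF-32BE" by name and assumes your JVM provides it. The big-endian variants are used so that no byte order mark is added to the counts.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Assumption: this JVM ships a UTF-32BE charset (it is not a StandardCharsets constant).
    static final Charset UTF_32BE = Charset.forName("UTF-32BE");

    static int utf8Bytes(String s)  { return s.getBytes(StandardCharsets.UTF_8).length; }
    static int utf16Bytes(String s) { return s.getBytes(StandardCharsets.UTF_16BE).length; }
    static int utf32Bytes(String s) { return s.getBytes(UTF_32BE).length; }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // ASCII letter:         1 / 2 / 4 bytes
            "\u20AC",       // euro symbol:          3 / 2 / 4 bytes
            "\u4E2D",       // a CJK character:      3 / 2 / 4 bytes
            "\uD83D\uDE00"  // an emoji (U+1F600):   4 / 4 / 4 bytes
        };
        for (String s : samples) {
            System.out.printf("U+%04X utf8=%d utf16=%d utf32=%d%n",
                    s.codePointAt(0), utf8Bytes(s), utf16Bytes(s), utf32Bytes(s));
        }
    }
}
```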

Ok so you're thinking what's this all about? Well, UTF-8 is trying to be smart: use as few bytes as possible and thereby minimise footprint. This makes it beneficial for things like web pages, which have a lot of simple text in markup. UTF-8 also represents ASCII characters as they are. Recall ASCII uses 7 bits per character. UTF-8 maps those characters exactly how they appear in ASCII and then uses the leftover bit (remember 8 bits in a byte :-)) to indicate if this character is using multiple bytes or just a single byte. Cool.
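
The leftover-bit trick described above can be checked with a few lines of Java. This is just a sketch: the class `Utf8LeadBit` and its two methods are hypothetical names. It tests whether the top bit of each UTF-8 byte is set, and whether plain ASCII text comes out byte-for-byte identical in both encodings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8LeadBit {
    // True when every UTF-8 byte has its top bit clear, i.e. each character fits in one byte.
    static boolean isSingleByteOnly(String text) {
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0x80) != 0) {
                return false; // top bit set: this byte is part of a multi-byte character
            }
        }
        return true;
    }

    // True when encoding as UTF-8 and as ASCII produces exactly the same bytes.
    static boolean sameBytesAsAscii(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.UTF_8),
                text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(isSingleByteOnly("hello"));   // true
        System.out.println(sameBytesAsAscii("hello"));   // true
        System.out.println(isSingleByteOnly("\u20AC"));  // false, the euro symbol needs several bytes
    }
}
```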

This is all very nice until you get to a situation where you need to support a more complex range of characters. Suppose your architecture has a lot of text in an Asian language for example, none of which are ASCII characters. This can still be supported by UTF-8 but UTF-8 may end up using 3 bytes to encode a character in some cases when UTF-16 will do it in 2. In this case, UTF-16 will give you a smaller footprint.
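
Here is a rough way to compare the footprint yourself. It is a sketch only: `FootprintComparison` is a made-up class and the sample strings are arbitrary. It measures three CJK characters on their own and then the same characters wrapped in ASCII markup, which hints at why the answer depends on how much of your content is really non-ASCII.

```java
import java.nio.charset.StandardCharsets;

class FootprintComparison {
    static int utf8Size(String text)  { return text.getBytes(StandardCharsets.UTF_8).length; }
    // Big-endian variant so no byte order mark is counted.
    static int utf16Size(String text) { return text.getBytes(StandardCharsets.UTF_16BE).length; }

    public static void main(String[] args) {
        String plain = "\u65E5\u672C\u8A9E";     // three CJK characters, no ASCII at all
        String marked = "<p>" + plain + "</p>";  // the same text inside ASCII markup

        // Pure CJK text: UTF-8 needs 9 bytes, UTF-16 needs 6.
        System.out.println("plain:  utf8=" + utf8Size(plain) + " utf16=" + utf16Size(plain));
        // With markup: UTF-8 needs 16 bytes, UTF-16 needs 20.
        System.out.println("marked: utf8=" + utf8Size(marked) + " utf16=" + utf16Size(marked));
    }
}
```

So UTF-16 wins for the raw text, but the markup around it tilts things back towards UTF-8 in this small sample.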

So when to use UTF-32? Well, UTF-16 is variable as well. Each character is either 2 or 4 bytes, but usually 2. This is for a similar reason to why UTF-8 is variable. UTF-32 is never variable and is always 4 bytes per character. This means when you get your file size you can work out exactly how many characters there are, or when you know the number of characters you can work out exactly what size your file is. This characteristic might suit some applications.
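
The size arithmetic above is simple enough to sketch in Java. The class `FixedWidthCount` and its methods are hypothetical, the "UTF-32BE" charset is looked up by name on the assumption that your JVM provides it, and the sums assume the data carries no byte order mark (a BOM would add 4 bytes to a UTF-32 file).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FixedWidthCount {
    // Assumption: this JVM ships a UTF-32BE charset (it is not a StandardCharsets constant).
    static final Charset UTF_32BE = Charset.forName("UTF-32BE");

    // Predicted UTF-32 size: always 4 bytes per character (code point).
    static long utf32Size(String text) {
        return 4L * text.codePointCount(0, text.length());
    }

    // Number of characters in UTF-32 data of the given size (no byte order mark assumed).
    static long charactersInUtf32(long byteLength) {
        return byteLength / 4;
    }

    public static void main(String[] args) {
        String text = "a\u20AC\uD83D\uDE00"; // 'a', the euro symbol and one emoji: 3 characters
        int utf32Length = text.getBytes(UTF_32BE).length;
        int utf16Length = text.getBytes(StandardCharsets.UTF_16BE).length;

        System.out.println("UTF-32 bytes: " + utf32Length + ", predicted: " + utf32Size(text));
        System.out.println("characters recovered from size: " + charactersInUtf32(utf32Length));
        // UTF-16 takes 2 + 2 + 4 bytes here, so the size alone does not tell you the count.
        System.out.println("UTF-16 bytes: " + utf16Length);
    }
}
```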

Ok, so you're back in Java land putting together your web architecture. Which character encoding should you use? Well, you gotta use Unicode or else you'll run into problems when something as simple as a euro symbol comes along. You should use UTF-8 by default as this will give you the smallest footprint, unless you are supporting a huge number of unusual characters, which is unlikely. Finally, you gotta be consistent. You gotta make everything UTF-8. This includes:

  • All html pages

  • All JSPs and JavaScript files

    <%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>


  • Your JVM (i.e. your app / web server)
    Check with your vendor
  • Your database.
    Check with your vendor

Otherwise you may see a euro symbol on your web page but you'll store it as gobbly gunk in your DB.
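
To get a feel for what that gunk looks like, here is a small Java sketch that writes a euro symbol as UTF-8 and then reads the bytes back with the wrong encoding. The class `MojibakeDemo` and the method `roundTrip` are made up for illustration, and ISO-8859-1 is just one example of a mismatched encoding, not a claim about what your server or database actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encodes text with one charset and decodes the resulting bytes with another.
    static String roundTrip(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";

        // Consistent: written and read as UTF-8, the euro symbol survives.
        String good = roundTrip(euro, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(good.equals(euro)); // true

        // Inconsistent: the three UTF-8 bytes are read back as three separate characters.
        String bad = roundTrip(euro, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(bad.equals(euro));  // false
        System.out.println(bad.length());      // 3
    }
}
```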

Have fun.