Sunday, June 29, 2014

Book Review: REST API Design Rulebook (Mark Masse)

RESTful style architectures are becoming more and more ubiquitous. There are many reasons for this:
  • Web Tiers are full of JavaScript yearning to make nice simple AJAX requests
  • The obvious shortcomings of SOAP-style strong contracts
  • They are a nice alternative to ESBs for integrating heterogeneous architectures
However, like the MVC design pattern or the Agile software methodology, while many projects may claim to be using the RESTful architectural approach, everyone is doing it differently.

In 2010, Martin Fowler - in his excellent blog - discussed the Richardson Maturity Model. This model provides a very good categorisation technique to assess the degree of RESTfulness based on how many core REST principles are being used. Although that same model gets a reference in Mark Massé's REST API Design Rulebook, Massé's book goes into much more low-level detail about various REST best practices.

For example:
  • Negative caching: adding cache headers not just to positive responses but also to 3xx and 4xx responses. This is something that may not be obvious, but it could be a good performance / scalability boost depending on the nature of your application, user usage patterns etc.
  • How to version your URIs, your representational forms and your resource objects
  • Using a consistent form to represent link relations
In addition, there is an abundance of other ideas and guidelines, some pretty simple but nonetheless important:
  • Don't end URIs with a slash
  • Don't use underscores in URI paths
  • Put "api" in the domain part of your rest resource path, e.g. 
https://api.dropbox.com/1/oauth/request_token
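The negative caching point above deserves a tiny illustration. Below is a minimal, framework-agnostic JavaScript sketch of picking a Cache-Control value per status class; the max-age numbers are made up for the example, not recommendations:

```javascript
// Hypothetical helper: choose a Cache-Control header value per status class.
// Negative responses (3xx/4xx) get a short lifetime so repeated requests for
// missing or moved resources don't keep hitting the origin.
function cacheControlFor(status) {
    if (status >= 200 && status < 300) {
        return 'max-age=3600';   // positive responses: cache for an hour
    }
    if (status >= 300 && status < 500) {
        return 'max-age=60';     // negative caching: redirects and client
                                 // errors cached briefly to absorb repeats
    }
    return 'no-store';           // 5xx: don't cache server errors
}

console.log(cacheControlFor(200)); // max-age=3600
console.log(cacheControlFor(404)); // max-age=60
```

Whether the negative lifetimes should be seconds or hours depends entirely on how often those resources come back into existence.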
The main reason why I think it is good to have a book like this is that when a development team is trying to use a REST-style architecture, disagreements and misunderstandings will inevitably happen. For example, the proverbial: 'Should a URI end in a plural or singular noun?'
It is always good to be able to reference a respected industry resource to prevent rabbit holes appearing and eating into your valuable project time.

Furthermore, there are some really quick and easy things discussed in the book that you can do to make a much better REST API. For example:
  • Adding an ETag HTTP header to that shopping cart resource as items go in and out of it.
  • Using query fields to generate partial responses and using the ! for an excludes option.
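The partial response idea is easy to sketch. The function below is a hypothetical illustration (the fields/`!` convention follows the book's suggestion, but the helper and its names are mine):

```javascript
// Hypothetical sketch of partial responses. A "fields" query value lists the
// attributes to include; a leading "!" flips the list to an exclusion list.
//   fields=id,items  -> include only id and items
//   fields=!owner    -> everything except owner
function partialResponse(resource, fieldsParam) {
    var fields = fieldsParam.split(',');
    var excluding = fields.every(function (f) { return f.charAt(0) === '!'; });
    var result = {};
    Object.keys(resource).forEach(function (key) {
        var keep = excluding
            ? fields.indexOf('!' + key) === -1   // keep unless explicitly excluded
            : fields.indexOf(key) !== -1;        // keep only if explicitly listed
        if (keep) { result[key] = resource[key]; }
    });
    return result;
}

var cart = { id: 7, items: ['ball'], owner: 'alice' };
console.log(partialResponse(cart, 'id,items')); // only id and items survive
console.log(partialResponse(cart, '!owner'));   // everything except owner
```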
Now for some constructive criticism. Firstly, I don't think there will ever be complete consistency in REST approaches. Some of the so-called best practices could be argued to be just subjective or nice-to-haves. They are not something that is going to make a big difference to either the functional or non-functional aspects of your architecture. Some of the industry leaders not only take different approaches in their REST APIs, but they are also sometimes doing the opposite of what Massé suggests. For example, Massé suggests not including a file extension type in your REST URLs (see Chapter 2), but (at the time of writing) Twitter do just that (the URI is: https://api.twitter.com/1.1/statuses/mentions_timeline.json).

Furthermore, in a lot of projects you will be writing a REST API purely to serve JSON back to a web client. This is quite different to a public-facing API on the scale of Twitter, Amazon etc. Therefore you need to ask yourself if you really need to put in the time investment to adhere to every single REST best practice, instead of just doing what makes sense for your project. Some obviously make sense; some could be argued to be overkill: for example, making sure you are sending back the correct HTTP code when you know the client never even checks if it is a 403 or 405.

I do think this book is written more as if you were designing a public-facing API. If you find yourself in that situation you should definitely, definitely, definitely be considering everything Massé is saying in the book. But note, the emphasis is on considering, which doesn't always mean adhering or adopting.

The book does a very good job of covering best practices that a lot of developers just wouldn't think of (setting response headers such as Content-Length, setting the Location header in responses for newly created resources, how to support HTTP 1.0 clients when you are trying to discourage caching) and is definitely worth a read, but you really have to think about what makes sense for your project. As stated, some of the suggestions are quick wins; for others you have to assess the cost and the benefit. Following the core REST principles (statelessness, a good resource model, a uniform interface etc.) is what is really important; after that, the challenge is figuring out what works best for each specific project and how to make the most of your time. That will vary from project to project and will depend on project scope, scale etc. A good architectural decision should always consider the various options but make the appropriate decision for the appropriate project.
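To make those quick wins concrete, here is a hypothetical JavaScript sketch of the response metadata for a newly created resource; the header names are standard HTTP, but the helper function and URLs are invented for illustration:

```javascript
// Hypothetical helper building the response for a newly created resource:
// a 201 status, a Location header pointing at the new resource, and an
// explicit Content-Length.
function createdResponse(baseUri, id, bodyJson) {
    return {
        status: 201,
        headers: {
            'Location': baseUri + '/' + id,                // where the new resource lives
            'Content-Type': 'application/json',
            'Content-Length': Buffer.byteLength(bodyJson)  // byte length, not char count
        },
        body: bodyJson
    };
}

var r = createdResponse('https://api.example.com/carts', 42, '{"items":[]}');
console.log(r.headers['Location']); // https://api.example.com/carts/42
```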

Until the next time, take care of yourselves.







Tuesday, June 17, 2014

MongoDB and Grails

So recently, I had a requirement to store unstructured JSON data that was coming back from a web service. The web service was returning various soccer teams from around the world. Amongst the data for most of the teams was a list of the soccer players who were part of the team. Some of the teams had 12 players, some had 20, some had even more than 20. The players had their own attributes, some easy to predict, some impossible. In the entire data structure, the only attribute that I knew would definitely be coming back was the team's teamname. After that, it depended on each team.
{
   "teams": [{
       "teamname":"Kung fu pirates",
       "founded":1962,
       "players": [
          {"name": "Robbie Fowler", "age": 56},
          {"name": "Larry David", "age": 55}
          ...
        ]},
        { 
        "teamname":"Climate change observers",
        "founded":1942,
        "players": [
          {"name": "Jim Carrey", "age": 26},
          {"name": "Carl Craig", "age": 35}
          ...
        ]},
        ...
   ]

}
There are several different ways to store this data. I decided to go for MongoDB. Main reasons:
  • I wanted to store the data in as close as possible format to the JSON responses I was getting back from the web service. This would mean, less code, less bugs, less hassle.
  • I wanted something that had a low learning curve, had good documentation and good industry support (stackoverflow threads, blog posts etc)
  • Something that had a grails plugin that was documented, had footfall and looked like it was maintained
  • Features such as text stemming were nice-to-haves. Some support would have been nice, but it didn't need to be cutting edge.
  • Would support good JSON search facilities, indexing, etc
MongoDB ticked all the boxes. So this is how I got it all working. After installing MongoDB as per Mongo's instructions, along with the MongoDB Grails plugin, it was time to write some code. Now here's the neat part: there was hardly any code. I created a domain object for the Team.
class Team implements Serializable {

    static mapWith = "mongo"

    static constraints = {
    }

    static mapping = {
        teamname index: true
    }

    String teamname

    List players
    static embedded = ['players']
}
Regarding the Team domain object:
  • The first point to make about the Team domain object is that I didn't even need to create it. The reason I did use this approach was so that I could use GORM-style APIs such as Team.find() if I wanted to.
  • Players are just a List of objects. I didn't bother creating a Player object. I liked the idea of always ensuring the players for a team were in a List data structure, but I didn't see the need to type anything further.
  • The players are marked as embedded. This means the team and players are stored in a single denormalised data structure. This allows - amongst other things - the ability to retrieve and manipulate the team data in a single database operation.
  • I marked the teamname as an index.
  • I marked the domain object as
    static mapWith = "mongo"
    This means that if I was also using another persistence solution with GORM (PostgreSQL, MySQL, etc.) I am telling GORM that this Team domain class is only for Mongo - keep your relational hands off it. See here for info. Note: this is a good reminder that GORM is a higher level of abstraction than Hibernate. It is possible to have a GORM object that doesn't go near Hibernate at all and instead goes to a NoSQL store.
You'll note in the JSON that there are team attributes, such as founded, that haven't been explicitly declared in the Team class. This is where Groovy and NoSQL play really well with each other. We can use some of the meta programming features of Groovy to dynamically add attributes to the Team domain object.
private List importTeams(int page) {
    def rs = restClient.get("teams") // invoke web service
    List teams = rs.responseData.teams.collect { teamResponse ->
        Team team = new Team(teamname: teamResponse.teamname)
        team.save() // save is needed before we can dynamically add attributes
        teamResponse.each { key, value ->
            team["$key"] = value
        }
        team.save() // we need the second save to ensure the dynamic attributes get saved
        return team
    }
    log.info("importTeams(), teams=${teams}")
    teams
}
Ok, so the main points in our importTeams() method
  • After getting our JSON response we run a collect function on the teams array. This will create the Team domain objects.
  • We use some meta programming to dynamically add any attribute that comes back in the JSON team structure to the Team object. Note: we have to invoke save() first to be able to dynamically add attributes that are not declared in the Team domain class. We also have to invoke save() again to ensure the dynamically added attributes are persisted. This may change in future versions of the MongoDB plugin, but it is what I had to do to get it working (I was using MongoDB plugin version 3.0.1).
So what's next? Write some queries. Ok, so two choices here. First, you can use the dynamic finders and criteria queries with GORM, thanks to the MongoDB plugin. But I didn't do this. Why? I wanted to write the queries as close as possible to how they would be written in the Mongo console. There were a number of reasons for this:
  • A leaky abstraction is inevitable here. Sooner or later you are going to have to write a query that GORM won't handle very well. Better to approach this head on.
  • I wanted to be able to run the queries in the Mongo console first, check explain plans if I needed to, and then use the same query in my code. It is easier to do this if I write the query directly, without having to worry about what GORM is going to do.
The general format of queries is:
teams = Team.collection.find(queryMap) // where queryMap is a map of fields and the various values you are searching for. 
Ok, some examples of queries...
Team.collection.find(["teamname": "hicks"]) // Find the team named hicks
Team.collection.find(["teamname": "hicks", "players.name": "Robbie Fowler"]) // As above, but the team must also have a Robbie Fowler
Team.collection.find(["players.name": "Robbie Fowler"]) // Any team that has a Robbie Fowler
Team.collection.find(["teamname": "hicks", "players.name": "Robbie Fowler"], ['players.$': 1]) // As above, but returns the matching player only
Team.collection.find(["teamname": ~/ick/]) // Match on the regular expression /ick/, i.e. any team whose name contains the text ick
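The "players.name" dot notation reaches into the embedded player documents. As a rough, toy illustration of the matching semantics in plain JavaScript (this is not the driver's actual code, just a sketch of the idea):

```javascript
// Toy illustration of how a Mongo-style query map with dot notation matches a
// document with an embedded array (players inside a team). Not real driver code.
function matches(doc, query) {
    return Object.keys(query).every(function (path) {
        var parts = path.split('.');   // "players.name" -> ["players", "name"]
        if (parts.length === 1) {
            return doc[parts[0]] === query[path];
        }
        // Dot notation into an embedded array: the document matches if ANY
        // embedded element has the requested value.
        return (doc[parts[0]] || []).some(function (embedded) {
            return embedded[parts[1]] === query[path];
        });
    });
}

var team = { teamname: 'hicks', players: [{ name: 'Robbie Fowler', age: 56 }] };
console.log(matches(team, { 'teamname': 'hicks' }));               // true
console.log(matches(team, { 'players.name': 'Robbie Fowler' }));   // true
console.log(matches(team, { 'players.name': 'Larry David' }));     // false
```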
Anything else? Yeah sure. I wanted to connect to a Mongo instance on my own machine when in development but to a Mongo machine on a dedicated server in other environments (CI, stage, production). To do this, I updated my DataSource.groovy as:
environments {
    development {
        grails {
            mongo {
                host = "localhost"
                port = 27017
                username = "test"
                password = "test"
                databaseName = "mydb"
            }
        }
        dataSource {
            dbCreate = "create-drop" // one of 'create', 'create-drop', 'update', 'validate', ''
            url = "jdbc:h2:mem:devDb;MVCC=TRUE;LOCK_TIMEOUT=10000"
        }
    }

    
    ci {
        println("In bamboo environment")
        grails {
            mongo {
                host = "10.157.192.99"
                port = 27017
                username = "shop"
                password = "shop"
                databaseName = "tony"
            }
        }
        dataSource {
            dbCreate = "create-drop" // one of 'create', 'create-drop', 'update', 'validate', ''
            url = "jdbc:h2:mem:devDb;MVCC=TRUE;LOCK_TIMEOUT=10000"
        }
    }
}
You'll see I have configured two datasources: MongoDB and a relational one (an in-memory H2 database here, but it could just as easily be PostgreSQL). I am not advocating using both MongoDB and a relational database, just pointing out that it is possible. The other point is that the MongoDB configuration always goes under grails { mongo { ... } }.

Ok this is a simple introductory post, I will try to post up something more sophisticated soon. Until the next time, take care of yourselves.


Saturday, May 31, 2014

JavaScript Performance tip using a Closure

Here is a simple JavaScript performance tip using a closure that is so quick to explain you have no reason not to keep reading. Consider this: you want a function which will tell you the position on a rugby team for a certain number. Those of us who are familiar with the sport will know that in rugby, the various positions always have the same numbers.
  1. Loose head
  2. Hooker
  3. Tight head
  4. Lock
  5. Lock
  6. Flanker (blind side)
  7. Flanker (open side)
  8. No. 8
  9. Scrum half
  10. Out half
  11. Left wing
  12. Inside Centre
  13. Outside centre
  14. Right wing
  15. Full back
So for your first attempt you define a positions array in the global namespace and then a function to resolve the position for a given number.
var positions = ['No position', 'Loose head', 'Hooker', 'Tight head', 'Lock', 'Lock', 'Flanker', 'Flanker', 'No. 8', 'Scrum half', 'Out half', 'Left wing', 'Inside centre', 'Outside centre', 'Right wing', 'Full back'];

var getPosition = function(n) {
   return positions[n];
}
Everyone starts laughing at you at code review time. You get comments such as:
"Don't pollute the name space, re-write please."
"Did you even bother using JsLint?"
"Do you want fries with that?"

For your second attempt, you put the positions array into a function as a local variable.
var getPosition = function(n) {
    var positions = ['No position', 'Loose head', 'Hooker', 'Tight head', 'Lock', 'Lock', 'Flanker', 'Flanker', 'No. 8', 'Scrum half', 'Out half', 'Left wing', 'Inside centre', 'Outside centre', 'Right wing', 'Full back'];   // positions is a local variable
    return positions[n];
}
Back at the second code review:
"Well it's better, but you do realise that every time that function is called, the array is allocated?"
"But you said, don't pollute the global name space"
"Look, we have standards here: don't pollute the global name space and don't incur performance costs when you don't need to"
"Ok"

You leave the code review despondent and crestfallen. Back at your coffee-stained desk, with your headphones on low volume, you keep hearing the word closure, closure, closure from a few of the JavaScript nerds sitting near you. That evening, you are having difficulty getting to sleep. Closure, closure, closure. The word won't leave your head. Closure, closure, closure. Yeah, you know that closures offer a way to encapsulate code in JavaScript, but this isn't about encapsulation. Wait a sec! Closures close over free variables and make them accessible to the lexical scope. Hmmm... This means you could do something like...
var getPosition = function () {
    var positions = ['No position', 'Loose head', 'Hooker', 'Tight head', 'Lock', 'Lock', 'Flanker', 'Flanker', 'No. 8', 'Scrum half', 'Out half', 'Left wing', 'Inside centre', 'Outside centre', 'Right wing', 'Full back'];   // positions is a local variable
    return function(n) {
        return positions[n];
    }
}()

console.log(getPosition(4));
What's going on here?
  • The inner function "closes over" the positions array
  • The outer function returns the inner function, which is then assigned to the variable getPosition
  • The outer function is self-invoking. This means the declaration, invocation and assignment all happen at the same time.
  • Because it is the outer function which allocates the array, and because the outer function is only invoked once, the array is only allocated once.
  • Because the inner function has access to what it "closes over" even after the outer function has executed, the inner function will have access to the positions array (which is only allocated once).
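If you want to convince yourself that the allocation really only happens once, you can instrument the outer function with a throwaway counter (added purely for demonstration):

```javascript
var allocations = 0;

var getPosition = function () {
    allocations += 1;  // counts how many times the array is (re)allocated
    var positions = ['No position', 'Loose head', 'Hooker', 'Tight head',
                     'Lock', 'Lock', 'Flanker', 'Flanker', 'No. 8', 'Scrum half',
                     'Out half', 'Left wing', 'Inside centre', 'Outside centre',
                     'Right wing', 'Full back'];
    return function (n) {
        return positions[n];
    };
}();

getPosition(4);
getPosition(10);
getPosition(15);
console.log(allocations); // 1 -- the outer function ran exactly once
```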
So you return to code review a little more sanguine. You get one comment:
"Third time lucky"

You might be thinking: it doesn't take that long to allocate an array in JavaScript, is this not overkill? Well, it depends. Does your application need to squeeze out every nanosecond? Is the amount of data you need to initialise a lot more than 15 elements in an array? These are pertinent questions of course. But the technique is a handy one to know, because encapsulation is clearly not the only advantage of closures.

Saturday, May 24, 2014

Grails tip: refactoring your URLs

On the current project I am working on, we use a lot of integration tests. For you non-Grails users out there, integration tests exercise your controller APIs, your services and any persistence that might happen, all very neatly. The only slice of the cake they don't test from a back-end perspective is your Grails filters, for which you'd need something like a functional test. In the Grails world, controller APIs are mapped to URL requests in the URLMappings.groovy file. This is just a simple Groovy DSL for configuring which HTTP requests go to which controllers.

For example:

class UrlMappings {
    static mappings = {
        "/sports/rugby/ball"(controller: "rugbyBall", action: [POST: "createBall", DELETE: "removeBall", GET: "getBall"])
        ...

So in the above example, the HTTP request /sports/rugby/ball will go to the RugbyBallController, and to the methods createBall(), removeBall() or getBall() depending on whether the request is a POST, DELETE or GET. Now suppose you have your project all set up to serve up the CRUD operations for the rugby ball, and after a few hectic sprints some software entropy creeps in and you need to refactor your controller APIs. But before you race ahead and do that, your project manager looks you in the eye and says:

"You must support all existing APIs as clients are using them".

This is how refactoring generally works in the real world when things go into production. There is always a phase of supporting both the old and the new, deprecating the old, and then, when everyone is happy, removing it. Anyway, you begin by updating your URLMappings.groovy
class UrlMappings {
    static mappings = {
        // Old APIs
        "/sports/rugby/ball"(controller: "rugbyBall", action: [POST: "oldCreateBall", DELETE: "oldRemoveBall", GET: "oldGetBall"])
        ...

        // New APIs
        "/sports/rugby/v2/ball"(controller: "rugbyBall", action: [POST: "createBall", DELETE: "removeBall", GET: "getBall"])
        ...

The URLMappings.groovy shows the old and the new. The old APIs are going to controller methods that you have renamed. Clients using these APIs are not impacted because they only send HTTP requests; they do not know which controllers are behind these endpoints. The old APIs already have really good integration tests, and our project manager has mandated that the new APIs must have similar quality integration tests before they go anywhere near pre-production.
def "test getting the ball"() {
    setup: "Set up the RugbyBallController"
       //...
    when:
       //...
       rugbyBallController.oldGetBall();
       rugbyBall = JSON.parse(rugbyBallController.response.contentAsString)
    then:
       rugbyBall.isOval();
Mr. Project Manager says: "I want all these new tests added by Friday or you are not going for a pint after work. You need a quick way to get your integration tests done." Thinking about that cool lager and its quenching effect on the back of your throat, you remember Groovy's excellent support for invoking methods dynamically, where you can specify the name of the method as a variable.
   myObject."$myMethod"()  // myMethod is a Groovy String variable.
In the above code snippet, myMethod is a variable that corresponds to the name of the method you want to invoke on myObject. "$myMethod" means: evaluate the variable myMethod (which of course will be the method name); the () of course just invokes the method. The Eureka moment happens when you remember that the old and new APIs will return the exact same JSON. All you need to do is run the same test twice, once for the old code and once for the new. Since you are using the Spock framework for your integration tests, that's easily achieved using a where block.
def "test getting the ball"(String method) {
    setup: "Set up the RugbyBallController"
       //...
    when:
       //...
       rugbyBallController."$method"();
       rugbyBall = JSON.parse(rugbyBallController.response.contentAsString)
    then:
       rugbyBall.isOval();
    where:
       method << ["oldGetBall", "getBall"]
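As an aside for front-end readers: JavaScript has a direct analogue of this trick, since bracket notation can look up a method by a name held in a variable. The controller object and its return values below are invented purely for illustration:

```javascript
// JavaScript analogue of Groovy's myObject."$myMethod"() trick: bracket
// notation resolves a method name held in a variable at runtime.
// The controller object and its methods are made up for illustration.
var rugbyBallController = {
    oldGetBall: function () { return '{"shape": "oval", "api": "old"}'; },
    getBall:    function () { return '{"shape": "oval", "api": "new"}'; }
};

['oldGetBall', 'getBall'].forEach(function (method) {
    var response = rugbyBallController[method]();  // dynamic dispatch by name
    console.log(method + ' -> ' + response);
});
```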
Happy days. Now go off and drink that lager.