Sunday, June 9, 2019

Defining a Resource

Fielding's dissertation describes a Resource as:

"Any information that can be named" ... "a document or image, a temporal service (e.g. “today’s weather in Los Angeles”), a collection of other resources, a non-virtual object (e.g. a person), and
so on. In other words, any concept that might be the target of an author’s hypertext
reference must fit within the definition of a resource. A resource is a conceptual mapping
to a set of entities, not the entity that corresponds to the mapping at any particular point in
time."

Defining a Resource is both a Science and an Art. It requires both Domain knowledge and API Architectural skills. The points below serve as a checklist which may help you determine the shape of your Resource, what data it should contain and how it should be presented to consumers of your API.

The Resource must contain a Business Description

  • The business description should be 3 - 4 sentences in simple prose explaining what the Resource is.
  • A developer with a moderate knowledge of your system should be able to understand the description.
  • Any caveats of the Resource should be made clear.

The Resource should be useful on its own


This is similar to the maxim used when defining the boundary of a micro-service: a micro-service should be useful on its own. Similarly, a Resource should be useful on its own.

For example, instead of:
/street-address/{id}

RESPONSE

{
    "street1": "String",
    "street2": "String"
}
and
/address-extra/{id}

RESPONSE 

{
    "city": "String",
    "country": "String"
}
It should be:
/address/{id}

RESPONSE

{
    "street1": "String",
    "street2": "String",
    "city": "String",
    "country": "String"
}
If a Resource is not useful on its own and always necessitates a subsequent request, code will inevitably become more complex and there will be a performance impact incurred by the second request.

Use an Appropriate Noun


Use of a simple noun over a compound noun is preferred. For example, Address is better than AddressInfo or AddressDetail. This is a general rule; there will always be exceptions.

If using multiple Resources to represent different views of the same data, for example Address and AddressDetail, use the simple noun (e.g. Address) first. Then, if the second representation is more detailed, use ResourceNameDetail; if it is less detailed, use ResourceNameSummary (see the sketch after this list). For example, suppose there is a requirement to introduce an Address type Resource:
  1. Address is introduced first
  2. If a subsequent view of Address is needed that is more detailed, the new Resource should be called AddressDetail
  3. If a subsequent view of Address is needed that is less detailed, the new Resource should be called AddressSummary
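
Under this convention, the family of Resources would sit at URIs along these lines (hypothetical paths, sketched for illustration):

/addresses/{id}
/address-details/{id}
/address-summaries/{id}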

If it is only used in a Read, does it need to be a Resource?


If a Resource is only ever used in a Read request and never a Write (Create, Partial Update, Full Update, Delete, ...) request, it is questionable whether it needs to be defined as a Resource with its own URI. It could just be added to the parent payload. If there is a concern that the parent payload then becomes too complex, the parent could provide a sparse query, where the client can decide per API request what it wants returned.
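
For example, a sparse query might look like this (the fields parameter here is a hypothetical illustration, not a standard):

GET /teams/4676?fields=name,division

RESPONSE

{
    "name": "Liverpool",
    "division": "Division 1"
}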

Resources should conform to the uniform interface


The uniform interface is a very important part of good API design. It is not just about using the standard verbs for different requests but also about ensuring the data shape is consistent.

If creates, reads, updates, deletes etc. are done in a consistent way, code is more consistent, reusable and maintainable.

This means:
GET /addresses/{id}
and
GET /addresses
must return the same address data structure to represent an Address.
GET /addresses/{id}
RESPONSE
{
    "id":"546",
    "street1": "String",
    "street2": "String",
    "city": "String",
    "country": "String"
}
and
GET /addresses

RESPONSE
{
    "elements": [
         {
              "id":"546",
              "street1": "String",
              "street2": "String",
              "city": "String",
              "country": "String"
         },
         ...
     ]
}
Similarly, for write payloads, the Data Structure should be the same.  So, a partial update to change street1 would be:

PATCH /addresses/{id}
REQUEST

{
    "street1": "Walkview"
}

RESPONSE
{
    "id":"546",
    "street1": "Walkview",
    "street2": "Meadowbrook",
    "city": "Dublin",
    "country": "Ireland"
}
and not something like
PATCH /addresses/{id}
REQUEST

{
    "newStreet1Value": "Walkview"
}

From a Resource perspective, the data structure must be consistent. A different data structure means a different Resource, which should be named differently and have its own path.
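
The same consistency applies to a full update. A sketch, assuming the API also supports PUT (an endpoint not shown elsewhere in this post):

PUT /addresses/{id}
REQUEST

{
    "street1": "Walkview",
    "street2": "Meadowbrook",
    "city": "Dublin",
    "country": "Ireland"
}

RESPONSE

{
    "id":"546",
    "street1": "Walkview",
    "street2": "Meadowbrook",
    "city": "Dublin",
    "country": "Ireland"
}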

Don't expose everything


If your DB model is quite sophisticated, you can be sure that not all attributes need to be exposed at the API level. Some fields may only be persisted for back-office processing and should never make it to any UI.

When adding an attribute to a Resource, consider:

  • only include fields that you are sure the client is interested in
  • if you are not sure, leave the attribute out. It is a much smaller problem to add an attribute later on than to remove an attribute that has already been exposed (see the sketch below).
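
For illustration, suppose the underlying addresses table also carries internal columns (hypothetical names) such as created_by_batch_job and sync_status. The Resource simply omits them:

GET /addresses/546

RESPONSE

{
    "id": "546",
    "street1": "Walkview",
    "street2": "Meadowbrook",
    "city": "Dublin",
    "country": "Ireland"
}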

API Models shouldn't blindly mirror DB Relational Models or OO Models


In database modelling, approaches such as normalising data or collapsing inheritance hierarchies are used. In Object Oriented design, techniques such as polymorphism and inheritance hierarchies are used to promote things like code reuse and to reduce coupling.

Resource modelling does not have to follow these techniques. The consumer of an API doesn't care if the data is all in one table or normalised over multiple tables. In general, the API should return data in a format that is easy to use and does not require much additional mapping by the client before it can become useful.
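
For example, even if person and address live in separate normalised tables (a hypothetical schema), the API can still return them together as one easy-to-use payload:

GET /persons/12

RESPONSE

{
    "id": "12",
    "name": "Tony",
    "address": {
        "street1": "Walkview",
        "city": "Dublin",
        "country": "Ireland"
    }
}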

Use Hierarchical data to Avoid repetition


One of the advantages of hierarchical data over flat formats such as CSV is that it provides a mechanism to avoid repetition.  For example, consider a flat data structure which contains a list of persons and what team they are in.  In CSV this is:

team, firstname, lastname
Liverpool, Mo, Salah
Liverpool, Andy, Robertson

In JSON this could be:
{
    "team": "Liverpool",
    "players": [
        {
            "firstName":"Mo",
            "lastName":"Salah"
        },
        {
            "firstName":"Andy",
            "lastName":"Roberston"
        },
         ...
     ]
}

Use Hierarchical Data to Make context clear


Another advantage of hierarchical data is that it helps provide context. To understand a flat data structure, you need to know the query that generated the data. For example, consider a set of rows that contain a date range.

name, fromDate, toDate, holidays
Tony, 2018-01-01, 2018-02-02, true
Tony, 2018-02-03, 2018-03-01, false

You could assume that a new row appears whenever there is a change in Tony being on holidays. But what if there is another column?

name, fromDate, toDate, holidays, sick
Tony, 2018-01-01, 2018-02-02, true, false
Tony, 2018-02-03, 2018-03-01, false, true

Are the date ranges corresponding to holidays, sickness or both?

If we get more data back, maybe it becomes clearer...
name, fromDate, toDate, holidays, sick,
Tony, 2018-01-01, 2018-02-02, true, false
Tony, 2018-02-03, 2018-03-01, false, true
Tony, 2018-03-02, 2018-04-01, false, false
Now it looks like it's sickness that the date range corresponds to, and it's only a coincidence that it lines up with a holiday period. However, when we get more data back, this theory also fails.
name, fromDate, toDate, holidays, sick,
Tony, 2018-01-01, 2018-02-02, true, false
Tony, 2018-02-03, 2018-03-01, false, true
Tony, 2018-03-02, 2018-04-01, false, false
Tony, 2018-04-02, 2018-05-01, true, false

It gets even more complicated when we just don't have some information. For example:
name, fromDate, toDate, holidays, sick,
Tony, 2018-01-01, 2018-02-02, true, false
Tony, 2018-02-03, 2018-03-01, false, true
Tony, 2018-03-02, 2018-04-01, false, false
Tony, 2018-04-02, 2018-05-01, true, false
Tony, 2018-05-02, 2018-06-01, null, false
Tony, 2018-06-02, 2018-07-01, null, false
Tony, 2018-07-02, 2018-07-08, true, false
Tony, 2018-07-08, 2018-07-09, true, null

The limitation of flat data structures is not only the lack of normalisation but also that they can only go so far in making the data self-describing.

When it isn't clear what data means, it is inevitable that processing the data will be buggy.

We could represent the same person data in hierarchical format as:
{
    "name":"tony",
    "holidays": [
         {
            "fromDate":"2018-01-01",
            "toDate":"2018-02-02"
         },
         {
             "fromDate":"2018-04-02",
             "toDate":"2018-05-01"
         }, 
         {
             "fromDate":"2018-07-02",
             "toDate":"2018-07-09"
         }
     ],
     "sick": [ 
         {
             "fromDate":"2018-02-03",
             "toDate":"2018-03-01"
         }
     ]
}
Now the data is much more self-describing and it is clear when a date range is for a holiday and when it is for a sick period.

Resource Relationships

Resources on their own only describe themselves. A Resource model describes relationships between Resources.  This will give an indication of:
  • Dependencies between Resources: what Resources are needed for a particular Resource to exist, or what is impacted when a particular Resource is updated or deleted.
  • Data navigation: a large domain model is much easier to understand and follow if navigational and directional sense is provided to consumers of the model, especially when navigation across (loosely connected Resources) can be differentiated from navigation down (strongly connected Resources).
Hypermedia links aren't only used to achieve HATEOAS. Resources that describe what they are linked to using hypermedia links provide a very powerful mechanism for expressing the Resource model. Advantages include:
  • A large domain model is split into more manageable pieces. Typically, users are only interested in a particular part of the model. When Resources self-describe their own relationships, a large complex model is split up into more digestible chunks and users get the information they need more quickly.
  • The Resource model is self-describing and kept in sync with code. Everything is co-located.
Make clear Parent-Child relationships
A Child Resource describes its Parent through hierarchical URL namespacing. A Parent Resource that has children of one or many types should make this clear by providing links to the children. For example, if a Team Resource has Player child Resources, the Team payload should make this clear.

REQUEST
https://api.server.com/teams/4676
RESPONSE

{
    "id":"34533",
    ...,
    "_links": {
          "self":"https://api.server.com/teams/4676",
          "players":"https://api.server.com/teams/4676/players"
    }
}

Make clear Peer relationships

This is similar to above, except it is for Resources that exist in a different hierarchical namespace. So for example, suppose the team is in Division 1. A link should be included in the team's division attribute.
REQUEST
https://api.server.com/teams/4676

RESPONSE
{
    "id":"34533",
    "division": {
        "name":"Division 1",
        "_links": {
              "self":"https://api.server.com/divisions/1"
        }
     },
     ..., 
    "_links": {
        "self":"https://api.server.com/teams/4676",
        "players":"https://api.server.com/teams/4676/players"
    }
}

Make clear Links to Other Representations

If data is modelled as multiple Resources which represent different views of the same data, the Resources should also include links to each other.
REQUEST
https://api.server.com/teams/4676

RESPONSE
{
    "id":"34533",
    "division": {
        "name":"Division 1",
        "_links": {
              "self":"https://api.server.com/divisions/1"
        }
     },
     ..., 
    "_links": {
        "self":"https://api.server.com/teams/4676",
        "players":"https://api.server.com/teams/4676/players",
        "teamDetails":"https://api.server.com/teamDetails/4676"
    }
}

Wednesday, December 19, 2018

What's the case for your API?

Disclaimer: In pure REST, APIs are opaque and the URL should be whatever was sent as a link in the response to a previous request. But I'm not talking pure REST; I'm talking more pragmatic APIs, which involve some concepts from REST as well as general API best practices.

When writing an API, it starts simple. You identify the obvious resources and end up with endpoints such as:
/api.mycompany.com/tweet

Eventually, your API will have to capture more sophisticated concepts and model more complex resources that cannot be expressed in short single nouns.  Some real world examples include:
  • Enabling request validation via a Request Validator resource (AWS API Gateway API)
  • Performing a customer search via a Customer Search resource (Google Customer Search API)
  • Running powerful checks against code via a Check Runs resource (Github API)
In English grammar, nouns that are really two nouns joined in some way are called compound nouns, and compound nouns follow one of three patterns:
  1. All one word: haircut, toothpaste
  2. Two words: rain forest, ice cream
  3. Hyphenated: self-esteem, brother-in-law
In the API world there are different options to choose from, but for consistency it is better that your APIs just pick one approach and stick to it. So firstly, what are the options for compound nouns from an API perspective?

Camel Case


Camel case is the practice of joining the words and marking each word boundary with a capital letter. There are two variations:
  1. Initial upper case (also known as Pascal case) is where the first letter is also a capital, for example: IceCream. Pascal case is popular in programming languages for naming classes, e.g. in Java.
  2. Initial lower case is where the initial letter is always lower case, for example: iceCream. This approach is popular in programming languages (again, Java is a good example) for naming variables. When people say camel case, they are usually referring to the initial lower case format.

Kebab Case

In Kebab Case, the individual words are separated by hyphens: ice cream is expressed as ice-cream. This approach is used in the Lisp programming language and in lots of URLs (for example, every blog post on www.blogger.com, e.g. http://dublintech.blogspot.com/2018/08/oauth-20-authorisation-code-grant.html). The observant amongst you will note that the word "dash" is sometimes used in technical references instead of "hyphen". So, what's the difference? In English grammar, the hyphen is the thing used to join two words to make one, whereas the dash is the thing usually used to add some sort of stylistic emphasis to the end of a sentence, such as: "I might have an interesting point here - you never know".

In programming, we don't care whether the term is hyphen or dash. They are used interchangeably and mean the same thing.

The kebab case approach became popular in Web URIs because search engines knew that the hyphen meant separate words and could index the URI properly.  This convention used by search engines meant hyphens became a de facto standard for URIs.

Snake Case

In this approach, an underscore is used to separate words.  Ice cream becomes ice_cream. This approach is used in Python and Ruby for anything other than a class name or static constant.

Join words

In this approach the words are just joined. There is no -, no _ and no capitalisation of anything. This is not a popular approach with developers because it's difficult to read.

APIs

Should we use camelCase, kebab-case or snake_case in an API? Well, unfortunately, Mr. Fielding's dissertation did not go into such detail. So what are people actually doing? And is the approach used consistent across the API's URL and the JSON body? Let's take a look.

AWS

AWS have different API styles for different services. The API Gateway REST API reference shows that the JSON payload uses camel case but the URL uses no separator; it's just:
/restapis/{id}/requestvalidators/{requestvalidatorId}

Google

Surprise, surprise, Google also have lots of APIs. The Google Custom Search API is similar to the AWS API Gateway API. The compound noun in the URL is just the one word and the JSON body is camel case.

The Google Gmail API has camel case in the request body and in some URLs, for example the forwarding addresses API.

The Google YouTube API sometimes uses kebab case in the URL (e.g. yt-analytics) but in other cases uses a single word (e.g. youtubepartner). The JSON payload is camel case.

Github

The GitHub API is a good reminder that, if possible, you should avoid this issue altogether by avoiding compound nouns; it does this using some creative namespacing.

However, with some more rooting around you'll find a compound noun such as check run expressed using kebab case in the URL, with the JSON body using snake case.

Stripe

Stripe use snake case in the URL and in the JSON body. For example, the PaymentIntents API:

 https://api.stripe.com/v1/payment_intents 

and JSON body...
{
  "id": "pi_Aabcxyz01aDfoo",
  "object": "payment_intent",
  "allowed_source_types": [
    "card"
  ],
  "amount": 1099,
  "amount_capturable": 1000,
  ...
}

Paypal

PayPal have more compound nouns than the other APIs checked. For resources such as billing agreement, the API will use kebab case in the URL but then use snake case in the JSON payloads.

Twitter

Twitter use snake case in the URL e.g. /saved_searches/ and snake case in the JSON payloads.

Facebook

Facebook's Graph API tends to avoid resource naming in URLs, and in JSON bodies it uses snake case.

By this stage, you should be getting a little bit confused. So let's recap via the table below.

API                    URL               JSON Body
AWS API Gateway        No separator      camelCase
Facebook Graph API     N/A               snake_case
Github                 snake and kebab   snake_case
Google Custom Search   No separator      camelCase
Google Gmail           camelCase         camelCase
LinkedIn               camelCase         camelCase
PayPal                 kebab-case        snake_case
Stripe                 snake_case        snake_case
Twitter                snake_case        snake_case


Everyone is different, what should I do?

So there is a lack of consistency across the industry. However, there are points worth making:
  1. In general, compound nouns are best avoided. In all the APIs checked (except PayPal), they appear in under 5% of the APIs. This means developers don't get upset when their favourite approach is not used.
  2. The only Web API in the selection above that had more than 5% of its APIs using compound nouns was PayPal, and they went for kebab-case in URIs.
  3. kebab-case is never used in any JSON body, even though the syntax is allowed. So what drives this trend? It's more than likely because JavaScript Web UIs are possibly the most popular clients invoking APIs, the most popular back-end language serving APIs is Java, and neither of those dudes allows the - in any of their declarations.

Is there anyone else saying anything?


In the excellent REST API Design Rulebook, industry expert Mark Massé suggests:
  1. Make your APIs lower case, which rules out camel case
  2. Use kebab case when expressing compound terms in your URIs
  3. Avoid using underscores as they may not display well, since some browsers render hyperlinks with an underline added

Make a decision

  1. Avoid compound nouns if you can.  This isn't always possible. Sticking to a ubiquitous language is important and helpful.  If you have a complex business application you will have lots of compound nouns. 
  2. If you can't avoid compound nouns and more than 5% of your APIs are going to involve compound nouns, use kebab case for your URIs. Why? Because if you have a complex business domain, it's not only developers you need to think about. Lots of BAs, Product Architects and curious Managers will also be looking at your APIs. Kebab-case is the easiest for everyone to read.
  3. For the JSON body, I think it is okay to use camelCase because this is the easiest to map back to JavaScript and Java code. It is also a recommendation from Google to use camelCase in JSON.
  4. If you have to use camelCase in your URIs, consider using the initial capital approach, as URIs are supposed to be marking Resources, not attributes. Resources are more analogous to Java classes, which also use the initial capital format, whereas JSON payload attributes are analogous to Java attributes, which use initial lower case (see the sketch after this list).
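
A sketch of that combination, using a hypothetical endpoint:

GET /CheckRuns/42

RESPONSE

{
    "checkRunId": "42",
    "conclusion": "success"
}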
Until the next time, take care of yourselves.



Monday, November 19, 2018

Technical Debt will kill your Agile Dreams

Warning: this blog post is a bit of a rant; if you want something more technical there are plenty of other articles on this blog and many others!
Technical debt is a concept used in software engineering to express the additional complexity that is added to a project by technical decisions that result in inferior solutions being chosen because they can be delivered quicker. It is analogous to financial debt. Borrow money for some gain, incur debt, but - and there's always a but - you have to pay the debt off. Otherwise, with interest, the debt will grow and you will eventually be bankrupt. Similarly, it is okay to incur some technical debt. If you spend too long looking for the perfect solution, your customers will have moved over to someone else. At the same time, if you incur too much technical debt and you are not meeting your repayments, then just as interest compounds unpaid financial debt, the complexity and entropy will increase with time and eventually stagnate your product. Something that should really only take one month to deliver suddenly takes 3 months, then 6 months, then there's only a few people in the entire company who can do it in 6 months, then a few of them leave, then...
Where the analogy breaks down is that all too often technical debt is also used to express that there has just been some bad engineering. Bad engineering decisions are in a different category to ones that were tactically made with full knowledge that the short-term priority was worth it. When it's clear that such a decision was, in fact, a tactical decision, it is much easier to convince people that refactoring needs to happen and the debt has to be paid off. Unfortunately, when the term is used as a polite way of saying bad engineering, it's unlikely there is any repayment strategy in place and it is even harder to create one because first, you need to convince people there is some bad engineering, then you need to convince people it is causing problems, then you have to have the ability to think of a better approach, cost it and then convince the various stakeholders the investment is worth it.  It is like trying to win 5 matches in a row away from home when the odds are against you.  
So, what are the signs that your technical debt (irrespective of whether it is intentional or accidental) is just too high and you could be filing for bankruptcy soon?
Well, think about this. In the Agile world, we want to develop, build and release in quick cycles, get rapid feedback from the customer and go again. This is only possible if there is an abundance of high quality, well engineered automated tests that run fast and provide confidence that a change, no matter what it is, has not broken anything. It doesn't matter if the breakdown of the tests is 68% unit tests and 32% integration tests, or 91% unit tests and 9% integration tests; the tests have to run fast and they have to provide confidence to all stakeholders. Otherwise releasing will be a pain and it will not be possible to release regularly. That means being Agile and getting all the benefits of it will be very difficult, no matter how good your backlog grooming sessions are.
What usually makes it difficult for developers to write good tests? Well, it's usually technical debt. It doesn't matter if it is intentional or accidental.
Now, there are all sorts of tools that will measure technical debt and put it up on a nice looking SonarQube board, but usually these tools only pick up the trivial stuff - removing an unused import etc. Who cares? Such trivial stuff isn't going to slow anyone down. The real technical debt problems are the ones that slow people down: they make it harder to make changes, fix bugs, add functionality and do it all quickly with confidence. Why? Because they have made things much harder to test. Sadly, these technical problems are usually not something an IDE, PMD or Checkstyle will prompt you to fix. They are generally much deeper in nature - towards the architectural end of the spectrum. Some examples:
  • Lack of data encapsulation and immutability leading to huge cyclomatic complexity, unpredictable code paths and difficulty in predicting impacts.
  • Impedance mismatches
  • Lack of modularity and too much coupling
  • Lack of or bad application of patterns
  • A proprietary language introduced with no IDE support, no mechanism to easily unit test or debug code. Zero support from Stackoverflow.
  • Spawning of threads and asynchronous call paths when there are better approaches which would be much easier to test
This is where the analogy between technical debt and financial debt breaks down. When you have architectural debt, you have big problems. When you have a HashMap used where an ArrayList would have made more sense, you don't. Financial debt doesn't have such a critical distinction. Unless we say the debt is due to a friendly and sympathetic bank you have a good relationship with, or it's debt due to some lunatic loan shark who will call around to your house with a baseball bat.
So, if you realize you are approaching your credit limit, what do you do? Firstly, you need to get confidence in your tests. They need to test key functionality and they need to be maintainable. That is more important than speed. If you can't get confidence, you can't ship. There is not much use in tests that run in 5 minutes if no-one has any confidence the functionality you deliver will actually work. Secondly, once you have confidence in the tests (even if they are ugly end-to-end tests), you need to get them to run fast. This is where you can start refactoring - the end-to-end tests should facilitate refactoring of parts of the call paths; there may be obvious sprouts, for example, and this should help you move towards a classical test pyramid.
Thirdly, you now need to understand why the code makes it so difficult to achieve high-quality tests. Too much coupling, bad exception handling, bad decomposition: it's probably a long list. Understanding why your code is difficult to test is a key architectural and engineering skill, as it requires the ability not just to understand the complexity but to know how to reduce it. Reducing the debt should then provide a pathway to make the code easier to test and thus make good tests run fast. Achieving this means you are winning the battle against eventual project stagnation.
So, the last part of the rant. There is a growing problem with the application of Agile which means we can very easily end up with faux Agile. The various Agile books, courses and blogs will detail things like story points, burn downs, stand ups. That stuff is all good, but there is not enough focus on technical excellence. Without technical excellence you get inevitable architectural problems and code that is difficult to test, irrespective of your best efforts at story pointing, backlog grooming, sticking yellow post-its up on walls, having arguments over the definition of done, doing your show and tells. All that stuff is nice and beneficial, but in comparison to technical excellence it is almost superficial. It is something people can easily see, but on its own it never captures the complexity of technical debt. It is the ratio of technical debt to technical excellence that determines whether you can write testable code easily and thus be able to deliver in regular short iterations, which is the goal of Agile. Isn't it?
Lastly, something we all say to our kids when they try any sport: "If you are losing kiddo, never give up, play to the end of the match".
Technical Debt is natural; it happens in every project in the world. Your job is to keep it in check and, when it gets too high, to be able to innovate your way around it and eventually lessen it. That's such an essential characteristic of a good Architect that it is also a handy interview question.
"Mr. Candidate, describe some Technical Debt you have experienced, describe the effects of it and what strategies you put in place to either work around it or deal with it?"
Or even, 
"Mr. Candidate here are some examples of Technical Debt, how would you prioritise which ones to deal with? Could you describe some better solutions and how would you convince people they are worth it?"
The quality of detail in the answer you get back is very likely to indicate the level of pragmatism, experience and innovation skills of the Architect / Engineer.
Until the next time, take care of yourselves.

Sunday, August 5, 2018

OAUTH 2.0 Authorisation Code Grant

OAuth 2.0 provides a number of security flows (or grant types) to allow an application to access a user's data in another application. In this blog, we will look at the OAuth 2.0 Authorisation Code Grant.

Firstly, a number of definitions:
  • Client: The application the user is currently interacting with.  For example, let's assume a fictitious funky blogging site: www.myfunkyblog.com.  The client wants to communicate with another application and retrieve something about the user from there.  For example, their favourite photo!  Let's assume the fictitious megaphotosharing.com as the service the client wishes to access. 
  • Client ID: this is an ID which identifies the client. It can be passed around openly in Web URLs etc.
  • Client secret ID:  A secret ID that only the client knows. This is kept server side and will be used in requests to the application for which access is sought.   It cannot be passed around in Web URLs. 
  • Resource owner: This is usually the human who is using the client application. The resource owner has data in another application (e.g. megaphotosharing.com) that the client (myfunkyblog.com) wishes to access. The goal is to facilitate that sharing without the need for the Resource owner, aka the human, to ever pass their megaphotosharing.com password to myfunkyblog.com. Note: the resource owner doesn't have to be a human but, interestingly, according to the OAuth spec, when it is a human it can also be referred to as the end-user.
  • Resource Server: hosts the protected resources of the resource owner that the client is interested in. So this is the megaphotosharing.com server that has the Resource owner photos the myfunkyblog.com is interested in.
  • Authorisation Server: the server that issues a token to myfunkyblog.com after the resource owner has successfully authenticated and allowed myfunkyblog.com to obtain some of its megaphotosharing.com data. Sometimes the Authorisation Server and the Resource Server are effectively the same, but they don't have to be.
  • Access token: a special type of token the megaphotosharing.com authorisation server gives myfunkyblog.com to access the protected resources. It will contain scope, lifetime and other access attributes.

Use case

So the use case is that the client (myfunkyblog.com) wants to access information about the resource owner (the human) from another application: megaphotosharing.com.

Client Registration

The first thing the client must do is register with the service (megaphotosharing.com), providing its name, website etc. The service will return a secret client code. The client keeps this private and is responsible for ensuring only it knows it. Usually, it will encrypt and persist it somewhere on the back end. The client will also receive a client ID. Unlike the client secret, this is public and can be passed around in URLs etc.
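
A hypothetical registration response (the field names are illustrative) might look like:

{
    "client_id": "CLIENT_ID",
    "client_secret": "CLIENT_SECRET"
}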

Flow

OK, so now the actual flow. The user is browsing around myfunkyblog.com and accesses a part of the site where myfunkyblog.com wants to know what the end-user's favourite photo is.

A pop-up screen appears to end-user.

This has URL:

https://megaphotosharing.com/oauth/authorize?response_type=code&client_id=CLIENT_ID&redirect_uri=CALLBACK_URL&scope=read

Key parts of this URL:
  • megaphotosharing.com: This is the domain for the authorisation server
  • response_type=code: Required parameter by which the client informs the authorisation server of the desired grant type. An alternative value would be "token", which is for the Implicit flow. "code" means the client wants an authorisation code, which will be returned after the resource owner logs in. This authorisation code will be used in a subsequent request by the Client.
  • client_id: Required parameter, to identify the client.  Remember this is public and
    can be passed to and from a web browser.
  • redirect_uri: This is an optional parameter.  It enables the client to dynamically specify the URL the auth server should redirect to.  In some flows, this isn't needed as there is only one redirect URI and this is registered by the client with the service during client registration.
  • scope:  This is an optional parameter.  It specifies the level of access that the application is requesting. In this case it is just a read.  The auth server uses this to inform the user / resource owner what the client is trying to do.
The user then logs into megaphotosharing.com, which tells the user what the Client wants to do. If the user selects OK, megaphotosharing.com redirects to the redirect URI passed up.

https://myfunkyblog.com/callback?code=212132kjhkhj

Notice how the client id is passed over the web in a URL and the authorisation code is passed back over the web.

The client, then uses the returned authorization code, its client id, client secret and the grant type to make a POST request Server to Server to get an Access Token.  This happens all on the back end.

https://megaphotosharing.com/v1/oauth/token?client_id=CLIENT_ID&client_secret=CLIENT_SECRET&grant_type=authorization_code&code=212132kjhkhj&redirect_uri=CALLBACK_URL

Notes:

  • client id and client secret identify the client. This is a back-end request and hence it is okay to pass the client_secret (which would obviously never be passed to or from the browser).
  • grant_type: This must be set to authorization_code, as it indicates the Authorisation Code Grant. Remember, the grant type indicates the flow the client is using (the server can also use it to determine what types of flows are available). If the client was using the Client Credentials Grant, this value would be "client_credentials". If the client was using the Resource Owner Password Credentials Grant, the value would be "password".
  • code: 212132kjhkhj - the actual authorisation code that was returned from the initial authorisation request to the authorisation server. This is required.
  • redirect_uri: if the redirect_uri was included in the authorisation request this value must be the same as the value used in that request.
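
Note that the OAuth 2.0 spec actually has the client send these parameters in the body of the POST rather than in the URL. For illustration, the same token request expressed as a raw HTTP request:

POST /v1/oauth/token HTTP/1.1
Host: megaphotosharing.com
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code&code=212132kjhkhj&client_id=CLIENT_ID&client_secret=CLIENT_SECRET&redirect_uri=CALLBACK_URL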


The client then receives back an access token. Something like this:

{"access_token":"ACCESS_TOKEN","token_type":"bearer","expires_in":2592000,"refresh_token":"REFRESH_TOKEN","scope":"read","uid":1001013121222}

It will now use this to access some of the resource owner's resource data.

So what's the big deal?

  • There are obviously big advantages for users in not having to tell one website their password for another site.
  • Reduces the number of passwords a user needs to remember
  • Allows richer websites by allowing disparate applications to talk to each other. 

Why do people find it confusing?

There are a number of reasons why people find OAuth 2.0 confusing. 
  • There are a few different flows or grants.  The Authorisation Code Grant is just one.  Sometimes when you google explanations for OAuth 2.0 you get explanations for different grants without making it clear what is and isn't being explained. Hence why I put Authorisation Code Grant in the title.  
  • Terminology.  I'll just speak for myself.  But if I am reading quickly, I am likely to:
    •  confuse "Client" with the end-user
    • get confused between the Resource Server and Authorisation Server
  • Consistency. A lot of places implement OAuth 2.0, or something very similar to OAuth, but will refer to things differently along the way. For example, go to quora.com and try to log in with Google. You are taken to:
    https://accounts.google.com/signin/oauth/oauthchooseaccount?client_id=917071888555.apps.googleusercontent.com&as=rdWeinbqWJbt6ChoW2f3Fg&destination=https%3A%2F%2Fwww.quora.com&approval_state=!ChRyQlhnbEYzai1xQTliNlNmTEVmNRIfZ3doM2hlRVIycGdiMEVBN1JaNXdOM085MERXLVVCWQ%E2%88%99ANKMe1QAAAAAW2i2to0SOyO2_w3k3O4gjwUKQLGNmZ2h&oauthgdpr=1&xsrfsig=AHgIfE8EzSxvWfzyxou0dwLDxv4GhD6e5g&flowName=GeneralOAuthFlow
    
    There's no response_type in that URL.
  • OAuth is an authorisation spec. It is usually used with an authentication spec like OpenID Connect, but that is actually a separate spec.