Sunday, August 5, 2018

OAuth 2.0 Authorisation Code Grant

OAuth 2.0 provides a number of security flows (or grant types) to allow an application to access a user's data in another application.  In this blog, we will look at one
OAuth 2.0 grant: the Authorisation Code Grant.

Firstly, a number of definitions:
  • Client: The application the user is currently interacting with.  For example, let's assume a fictitious funky blogging site: www.myfunkyblog.com.  The client wants to communicate with another application and retrieve something about the user from there.  For example, their favourite photo!  Let's assume the fictitious megaphotosharing.com as the service the client wishes to access. 
  • Client ID: this is an ID which identifies the client.  It can be passed around openly in Web URLs etc.
  • Client secret:  A secret that only the client knows. This is kept server side and will be used in requests to the application for which access is sought.   It must not be passed around in Web URLs. 
  • Resource owner: This is usually the human who is using the client application. The resource owner has data in another application (e.g. megaphotosharing.com) that the client (myfunkyblog.com) wishes to access. The goal is to facilitate that sharing without the resource owner (aka the human) ever having to pass their megaphotosharing.com password to myfunkyblog.com.  Note: the resource owner doesn't have to be a human, but according to the OAuth spec, when it is a human it can also be referred to as the end-user. 
  • Resource Server: hosts the protected resources of the resource owner that the client is interested in. So this is the megaphotosharing.com server that holds the resource owner's photos that myfunkyblog.com is interested in.
  • Authorisation Server: the server that issues a token to myfunkyblog.com after the resource owner has successfully authenticated and allowed myfunkyblog.com to obtain some of their megaphotosharing.com data.  Sometimes the Authorisation Server and the Resource Server are effectively the same, but they don't have to be. 
  • Access token: a special type of token that the megaphotosharing.com authorisation server gives myfunkyblog.com to access the protected resources. It carries a scope, lifetime and other access attributes.

Use case

So the use case is that the client (myfunkyblog.com) wants to access information about
the resource owner (the human) from another application: megaphotosharing.com.

Client Registration

The first thing the client must do is register with the service (megaphotosharing.com),
providing its name, website etc.  The service will return a client secret.
The client keeps this private and is responsible for ensuring only it knows it.  Usually,
it will encrypt and persist it somewhere on the client's backend.  The client will
also receive a client ID from the service.  Unlike the client secret, this is public and can be passed around
in URLs etc.

Flow

Ok so now the actual flow.    The user is browsing around myfunkyblog.com and accesses a part of the site where myfunkyblog.com wants to know what the end-user's favourite photo is.

A pop-up screen appears to the end-user.

This has URL:

https://megaphotosharing.com/oauth/authorize?response_type=code&client_id=CLIENT_ID&redirect_uri=CALLBACK_URL&scope=read

Key parts of this URL:
  • megaphotosharing.com: This is the domain for the authorisation server
  • response_type=code: Required parameter that lets the client tell the authorization server the desired grant type. An alternative value would be "token", which is for the implicit flow.
    "code" means the client wants an authorization code, which will be returned after
    the resource owner logs in.  This authorization code will be used in a subsequent request by the client. 
  • client_id: Required parameter, to identify the client.  Remember this is public and
    can be passed to and from a web browser.
  • redirect_uri: This is an optional parameter.  It enables the client to dynamically specify the URL the auth server should redirect to.  In some flows, this isn't needed as there is only one redirect URI and this is registered by the client with the service during client registration.
  • scope:  This is an optional parameter.  It specifies the level of access that the application is requesting. In this case it is just a read.  The auth server uses this to inform the user / resource owner what the client is trying to do.
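
As a rough sketch, a client could assemble that URL in Java along these lines. The class name AuthorizationUrlSketch and the callback URL are made up for illustration, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class AuthorizationUrlSketch {

    // Percent-encode a query parameter value (this overload needs Java 10+).
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Build the URL the end-user's browser is sent to on the authorisation server.
    static String build(String clientId, String redirectUri, String scope) {
        return "https://megaphotosharing.com/oauth/authorize"
                + "?response_type=code"
                + "&client_id=" + enc(clientId)
                + "&redirect_uri=" + enc(redirectUri)
                + "&scope=" + enc(scope);
    }

    public static void main(String[] args) {
        // CLIENT_ID and the callback URL are hypothetical placeholders.
        System.out.println(build("CLIENT_ID", "https://myfunkyblog.com/callback", "read"));
    }
}
```

Encoding the redirect URI matters because it contains ":" and "/" characters that would otherwise be ambiguous inside a query string.
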

The user then logs into megaphotosharing.com, which tells the user what the client wants to do.  If the user selects OK, megaphotosharing.com redirects to the redirect URI that was passed up, with the authorisation code appended:

https://myfunkyblog.com/callback?code=212132kjhkhj

Notice how the client id is passed over the web in a URL and the authorisation code is passed back over the web.

The client then uses the returned authorization code, its client ID, client secret and the grant type to make a server-to-server POST request to get an access token.  This all happens on the back end.

https://megaphotosharing.com/v1/oauth/token?client_id=CLIENT_ID&client_secret=CLIENT_SECRET&grant_type=authorization_code&code=212132kjhkhj&redirect_uri=CALLBACK_URL

Notes:

  • client id and client secret identify the client.  This is a back-end request and hence it is okay to pass the client_secret (which would obviously never pass to or from the browser).
  • grant_type: This must be set to authorization_code, as it indicates the Authorisation Code Grant.  Remember, the grant type indicates the flow the client is using (the server can also use it to indicate which types of flows are available). If the client was using the Client Credentials Grant, this value would be "client_credentials".  If the client was using the Resource Owner Password Credentials Grant, the value would be "password".
  • code: 212132kjhkhj - The actual authorisation code that was returned by the authorisation server in response to the initial authorisation request. This is required.
  • redirect_uri: if the redirect_uri was included in the authorisation request this value must be the same as the value used in that request.
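
One detail worth noting: RFC 6749 describes these parameters as being sent form-encoded (application/x-www-form-urlencoded) in the body of the POST rather than in the URL itself. A minimal sketch of building that body in Java follows. The class and method names are hypothetical, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class TokenRequestSketch {

    // Form-encode the parameters for the server-to-server token request.
    static String formBody(String clientId, String clientSecret, String code, String redirectUri) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("grant_type", "authorization_code");
        params.put("code", code);
        params.put("redirect_uri", redirectUri);
        params.put("client_id", clientId);
        params.put("client_secret", clientSecret);
        return params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // All values here are placeholders taken from the example above.
        System.out.println(formBody("CLIENT_ID", "CLIENT_SECRET", "212132kjhkhj",
                "https://myfunkyblog.com/callback"));
    }
}
```

The resulting string would be sent as the POST body over HTTPS to the token endpoint, so the client secret never appears in a URL or a browser.
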


The client then receives back an access token. Something like this:

{"access_token":"ACCESS_TOKEN","token_type":"bearer","expires_in":2592000,"refresh_token":"REFRESH_TOKEN","scope":"read","uid":1001013121222}

It will now use this to access some of the resource owner's resource data.
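
To make that a little more concrete, here is a small Java sketch of what the client might do with the response. It assumes the token is presented as a bearer token in the Authorization header (the response above has token_type "bearer"), and the class and method names are invented for illustration.

```java
import java.time.Duration;
import java.time.Instant;

class AccessTokenUseSketch {

    // Value of the Authorization header sent to the resource server.
    static String authorizationHeader(String accessToken) {
        return "Bearer " + accessToken;
    }

    // Moment the token stops being valid, given when it was issued and expires_in (seconds).
    static Instant expiresAt(Instant issuedAt, long expiresInSeconds) {
        return issuedAt.plusSeconds(expiresInSeconds);
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + authorizationHeader("ACCESS_TOKEN"));
        // 2592000 is the expires_in value from the sample response: 30 days.
        System.out.println(Duration.ofSeconds(2592000).toDays() + " days");
        // The issue time below is an assumed example value.
        System.out.println(expiresAt(Instant.parse("2018-08-05T00:00:00Z"), 2592000));
    }
}
```

Once the token has expired, the client would use the refresh token to obtain a new access token rather than sending the user through the whole flow again.
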

So what's the big deal?

  • There are obviously big advantages for users in not having to tell one website their password for another site. 
  • Reduces the number of passwords a user needs to remember
  • Allows richer websites by allowing disparate applications to talk to each other. 

Why do people find it confusing?

There are a number of reasons why people find OAuth 2.0 confusing. 
  • There are a few different flows or grants.  The Authorisation Code Grant is just one.  Sometimes when you google explanations for OAuth 2.0 you get explanations for different grants without making it clear what is and isn't being explained. Hence why I put Authorisation Code Grant in the title.  
  • Terminology.  I'll just speak for myself.  But if I am reading quickly, I am likely to:
    •  confuse "Client" with the end-user
    • get confused between the Resource Server and Authorisation Server
  • Consistency.  A lot of places implement OAuth 2.0 or something very similar to OAuth but will refer to things differently along the way.  For example, go to quora.com and try to log in with Google.  You are taken to: 
    https://accounts.google.com/signin/oauth/oauthchooseaccount?client_id=917071888555.apps.googleusercontent.com&as=rdWeinbqWJbt6ChoW2f3Fg&destination=https%3A%2F%2Fwww.quora.com&approval_state=!ChRyQlhnbEYzai1xQTliNlNmTEVmNRIfZ3doM2hlRVIycGdiMEVBN1JaNXdOM085MERXLVVCWQ%E2%88%99ANKMe1QAAAAAW2i2to0SOyO2_w3k3O4gjwUKQLGNmZ2h&oauthgdpr=1&xsrfsig=AHgIfE8EzSxvWfzyxou0dwLDxv4GhD6e5g&flowName=GeneralOAuthFlow
    
    There's no response_type in that URL.
  • OAuth is an authorisation spec.  It is usually used with an authentication spec like OpenID Connect, but that is actually a separate spec. 

Friday, July 27, 2018

Java Lambda Streams and Groovy Closures Comparisons

This blog post will look at some common operations on a List data structure and compare the Java 8/9 and Groovy syntax.  So firstly, the data structure.  It's just a simple rugby player who has a name and a rating.

Java

class RugbyPlayer {
    private String name;
    private Integer rating;
    
    RugbyPlayer(String name, Integer rating) {
        this.name = name;
        this.rating = rating;
    }

    public String toString() {
        return name + "," + rating;
    }
        
    public String getName() {
        return name;
    }
        
    public Integer getRating() {
        return rating;
    }
}

//...
//...
List<RugbyPlayer> players = Arrays.asList(
    new RugbyPlayer("Tadgh Furlong", 9),
    new RugbyPlayer("Bundee AKi", 7),
    new RugbyPlayer("Rory Best", 8),
    new RugbyPlayer("Jacob StockDale", 8)
);

Groovy

import groovy.transform.ToString

@ToString
class RugbyPlayer {
    String name
    Integer rating
}
//...
//...
List<RugbyPlayer> players = [
    new RugbyPlayer(name: "Tadgh Furlong", rating: 9),
    new RugbyPlayer(name: "Bundee AKi", rating: 7),
    new RugbyPlayer(name: "Rory Best", rating: 8),
    new RugbyPlayer(name: "Jacob StockDale", rating: 8)
]

Find a specific record

Java

// Find Tadgh Furlong
Optional<RugbyPlayer> result = players.stream()
    .filter(player -> player.getName().indexOf("Tadgh")  >= 0)
    .findFirst();      
String outputMessage = result.isPresent() ? result.get().toString() : "not found";
System.out.println(outputMessage);

Groovy

println players.find{it.name.indexOf("Tadgh") >= 0}

Comments

  • The Java lambda has just one parameter: player.  This doesn't need to be typed as its type can be inferred.  Note: If there were two parameters in the parameter list, parentheses would be needed around the parameter list.
  • In Java, a stream must be created from the List first before the functional operation can be applied.  
  • A lambda is then passed to filter(), and findFirst() is invoked on the filtered stream, which returns an Optional.
  • The lambda definition doesn't need a return statement.  It also doesn't need {} braces or a semi-colon to complete a Java statement.  However, you can use {} if you want to, and if you do, you must include the ; and the return statement.  Note: if your lambda is more than one line, you don't get a choice, you must use {}.   It is recommended best practice to keep lambdas short and to just one line. 
  • Java 8 supports fluent APIs for pipeline Stream operations.  This is also supported in Groovy Collection operations.
  • In Java a player variable is specified for the Lambda.  The Groovy closure doesn't need to specify a variable.  It can just use "it" which is the implicit reference to the parameter (similar to _ in Scala).  
  • The Java filter API takes a parameter of type Predicate.   A functional interface is one that can be used as the assignment target for a lambda expression or method reference.  Predicate is a type of functional interface.  Its one abstract method is: boolean test(T t).    In this case, in the lambda, the player corresponds to t.  The body definition should evaluate to true or false; in our case player.getName().indexOf("Tadgh") >= 0 will always evaluate to true or false.  True corresponds to a match. 
  • Java 8 has other types of Functional Interfaces:
    • Function – it takes one argument and returns a result
    • Consumer – it takes one argument and returns no result (represents a side effect)
    • Supplier – it takes no argument and returns a result
    • Predicate – it takes one argument and returns a boolean
    • BiFunction – it takes two arguments and returns a result
    • BinaryOperator – it is similar to a BiFunction, taking two arguments and returning a result. The two arguments and the result are all of the same types
    • UnaryOperator – it is similar to a Function, taking a single argument and returning a result of the same type
  • Java 8 can infer the type for the lambda input parameters. Note: if you have to specify the parameter type, the declaration must be in parentheses, which adds further verbosity.
  • Groovy can println directly.  No System.out needed, and no need for parentheses around the argument.
  • Like Java, Groovy doesn't need the return statement.  However, this isn't just for closures, in Groovy it extends to every method.    Whatever is evaluated as the last line is automatically returned. 
  • Groovy closures are not tied to a typed functional interface such as Predicate.  This can mean if you forget to ensure your last expression is an appropriate boolean expression, you get unexpected results and bugs at runtime.
  • The arrow operator is used in both Groovy and Java to mean effectively the same thing: separating the parameter list from the body definition. In Groovy it is only needed if you need to declare the parameters (when the default it doesn't suffice). Note: In Scala, => is used.
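
To see the functional interfaces listed above side by side, here is a small Java sketch that assigns a lambda or method reference to each one. The class name and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

class FunctionalInterfaceSketch {

    static List<String> demo() {
        List<String> out = new ArrayList<>();

        Predicate<String> isTadgh = name -> name.indexOf("Tadgh") >= 0; // one argument, boolean result
        Function<String, Integer> length = String::length;              // one argument, one result
        Supplier<String> fallback = () -> "not found";                  // no argument, one result
        Consumer<String> sink = out::add;                               // one argument, no result (side effect)
        BiFunction<String, Integer, String> describe =
                (name, rating) -> name + "," + rating;                  // two arguments, one result
        BinaryOperator<Integer> max = Integer::max;                     // two arguments and result of the same type
        UnaryOperator<String> shout = String::toUpperCase;              // one argument and result of the same type

        sink.accept(String.valueOf(isTadgh.test("Tadgh Furlong")));
        sink.accept(String.valueOf(length.apply("Rory Best")));
        sink.accept(fallback.get());
        sink.accept(describe.apply("Rory Best", 8));
        sink.accept(String.valueOf(max.apply(7, 9)));
        sink.accept(shout.apply("yippee"));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [true, 9, not found, Rory Best,8, 9, YIPPEE]
    }
}
```
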

 

Find specific records

 

Java

// Find all players with a rating of 8 or more
List<RugbyPlayer> ratedPlayers = players.stream()
    .filter(player -> player.getRating() >= 8)
    .collect(Collectors.toList());
ratedPlayers.forEach(System.out::println);

Groovy

println players.findAll{it.rating >= 8}

Comments

  • In the Java version, the Iterable object ratedPlayers has its forEach method invoked.   This method takes a functional interface of type Consumer (see the Javadoc for Iterable.forEach).  Consumer models a function which takes an input parameter but returns nothing; it is void.  
  • In Java, stream.filter() will return another stream. Stream.collect() is one of Java 8's Stream terminal methods. It performs a mutable reduction (fold) on the data elements held inside the stream instance returned by the filter method.  
  • Collectors.toList() returns a Collector which collects all Stream elements into a List.
  • When using the toList() collector, you can't assume the type of List that will be used.  If you want more control you need to use toCollection().  For example: 
    • .collect(Collectors.toCollection(LinkedList::new))
  • Note: We could have omitted the .collect() operation and invoked forEach straight on the stream.   This would make the Java code shorter.  
players.stream()
   .filter(player -> player.getRating() >= 8)
   .forEach(System.out::println);
  • System.out::println is a method reference - a new feature in Java 8.   It is syntactic sugar to reduce the verbosity of some lambdas.  This is essentially saying: for every element in ratedPlayers, execute System.out.println, passing in the current element as a parameter.
  • Again less syntax from Groovy.  The function can operate on the collection, there is no need to create a Stream.  
  • We could have just printed the entire list in the Java sample, but heck I wanted to demo forEach and method reference.
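
As a quick illustration of the toCollection() point above, here is a self-contained sketch. It uses plain Integer ratings rather than RugbyPlayer objects to keep it short, and the class and method names are invented.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;

class ToCollectionSketch {

    // Keep ratings of 8 or more, explicitly collecting into a LinkedList.
    static List<Integer> topRatings(List<Integer> ratings) {
        return ratings.stream()
                .filter(rating -> rating >= 8)
                .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        List<Integer> result = topRatings(Arrays.asList(9, 7, 8, 8));
        System.out.println(result);                       // [9, 8, 8]
        System.out.println(result instanceof LinkedList); // true
    }
}
```
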

 

Map from object type to another

Java

// Map the Rugby players to just names. 
// Note the way we convert the list to a stream and then back again to a list using the collect API. 
System.out.println("Names only...");
List<String> playerNames = players
    .stream()
    .map(player -> player.getName())
    .collect(Collectors.toList());
playerNames.forEach(System.out::println);

Groovy:

println players.collect{it.name}

Comments

  • A stream must be created first before executing the lambda.  Then the collect() method is invoked on the Stream - this is needed to convert it back to a List. This makes the code more verbose. 
  • That said, if all you are doing is printing the list, you can just do...
    players.stream()
       .map(player -> player.getName())
       .forEach(System.out::println);
    

 

Perform a Reduction calculation

Java

System.out.println("Max player rating only...");
Optional<Integer> maxRatingOptional = players.
   stream()
   .map(RugbyPlayer::getRating)
   .reduce(Integer::max);
String maxRating = maxRatingOptional.isPresent() ? maxRatingOptional.get().toString() : "No max";
System.out.println("Max rating=" + maxRating);

Groovy

def here = players.inject(null){ 
    max, it -> 
        it.rating > max?.rating ? it : max
} 

Comments

  • In the Java version, the reduce operation is invoked on the Stream.  There are three different versions of this method.   In this version, no initial value is specified, meaning an Optional type is returned.  The input parameter is of type BinaryOperator.  Because BinaryOperator is a functional interface, a lambda expression or method reference can be used to specify its value.  In this case, the method reference Integer::max is used.
  • The null-safe operator is used in the Groovy inject closure so that the first comparison will work.
  • In Java, it is possible to avoid the isPresent check on the Optional by just doing...
    String maxRating = players.stream()
       .map(RugbyPlayer::getRating)
       .reduce(Integer::max)
       .map(Objects::toString)
       .orElse("No max");
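
Two of the reduce variants can be compared in a short sketch: the one-argument form that returns an Optional, and the form that takes an identity value and so always returns a result. The class and method names are hypothetical, and plain Integer ratings stand in for the RugbyPlayer list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

class ReduceSketch {

    // No identity value: the result is an Optional, empty for an empty stream.
    static Optional<Integer> maxOptional(List<Integer> ratings) {
        return ratings.stream().reduce(Integer::max);
    }

    // With an identity value: the result is a plain Integer, never absent.
    static Integer maxWithIdentity(List<Integer> ratings) {
        return ratings.stream().reduce(Integer.MIN_VALUE, Integer::max);
    }

    // Optional.map and orElse replace the isPresent check.
    static String describeMax(List<Integer> ratings) {
        return maxOptional(ratings).map(Objects::toString).orElse("No max");
    }

    public static void main(String[] args) {
        List<Integer> ratings = Arrays.asList(9, 7, 8, 8);
        System.out.println(describeMax(ratings));                           // 9
        System.out.println(describeMax(Collections.<Integer>emptyList())); // No max
        System.out.println(maxWithIdentity(ratings));                       // 9
    }
}
```
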
    

Summary

  • Groovy is still far more terse
  • However, some of the function operations in Java are lazily run.  For example map() and filter(), which are considered intermediate.  Intermediate operations produce another Stream.  They won't execute unless a terminal function, e.g. forEach, collect or reduce, is invoked on the stream.  Terminal functions are value or side-effect producing. 
  • Intermediate operations can either be stateless or stateful.  Stateless operations like map() or filter() can operate on elements independently. Stateful operations like distinct() or sorted() may incorporate data from previously seen elements.  
  • The elements of a stream are only visited once during the life of a stream. Like an Iterator, a new stream must be generated to revisit the same elements of the source.
  • In Java, processing streams lazily allows for two performance efficiencies:
    • fusing of multiple operations to minimise passes of the data
    • avoiding examination of the data unless it is necessary.  A stream may have an infinite number of elements, but a findFirst() or limit() operation will mean that only data that needs to be checked will be.
  • This may make the code more verbose in cases, but it also means it can be more performant.
  • Groovy also offers some lazy functions. 
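
The laziness described above can be observed with peek(). The sketch below (class and method names are made up) records which elements are actually examined, and shows an infinite stream being bounded by limit().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamSketch {

    // Without a terminal operation, the intermediate operations never run.
    static int examinedWithoutTerminalOp(List<Integer> ratings) {
        List<Integer> examined = new ArrayList<>();
        ratings.stream().peek(examined::add).filter(rating -> rating >= 8);
        return examined.size();
    }

    // Elements that were examined before findFirst() found a match and stopped.
    static List<Integer> examinedBeforeFirstMatch(List<Integer> ratings, int minimum) {
        List<Integer> examined = new ArrayList<>();
        ratings.stream()
                .peek(examined::add)
                .filter(rating -> rating >= minimum)
                .findFirst();
        return examined;
    }

    // An infinite stream is safe as long as a short-circuiting operation bounds it.
    static List<Integer> firstEvenNumbers(int howMany) {
        return Stream.iterate(1, n -> n + 1)
                .filter(n -> n % 2 == 0)
                .limit(howMany)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ratings = Arrays.asList(7, 9, 8, 8);
        System.out.println(examinedWithoutTerminalOp(ratings));   // 0
        System.out.println(examinedBeforeFirstMatch(ratings, 8)); // [7, 9]
        System.out.println(firstEvenNumbers(3));                  // [2, 4, 6]
    }
}
```
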
Full Java code here. Full Groovy code here.

Saturday, May 12, 2018

Scala Syntax: 7 points

A few years back I dipped into some Scala as a hobby language. Recently, in order to get a quick overview of Spark I did the 'Big Data Analysis with Scala and Spark' from Coursera. It's a great course. But, one aspect I found challenging was just getting my head around Scala syntax again. Some of it, yeah the basic stuff can be counter-intuitive depending on your perspective.

1. Method / Function Definition

Typing on the right rather than the left. Consider this simple function definition:
def sayHello(param: String): String = {
    "Hello" + param
}
Javaholics will note:
  • The return type is specified at the end of the method definition, rather than the beginning. 
  • The type of the parameter is specified after the parameter name rather than before. 
  • Before the function body there is a = 
  • There are two colons (:), one between the parameter and the type and one before the return type.

2.  Unit

Google "Unit" and you will quickly be told that Unit is Scala's version of Java's void.  But Java’s void is a keyword.  Scala’s Unit is a final class which only has one value: () - which is like an alias for no information. Unit indicates a method returns nothing and therefore has side effects, something we don't want to do much of in Scala. So is that counter-intuitive? No.
But here is what I find is. If a function has no return type in the function definition and no equals, it means Unit is implicitly the return type (this "procedure syntax" is deprecated as of Scala 2.13 and dropped in Scala 3). Example:
def procedure {
    println("This String is not returned")
}

procedure: ()Unit
Big deal? Of course not. But what about:
def procedure {
     "This String is not returned"
}
Expect the String to be returned? It won't be. How about this?
def addNumbers(a: Integer, b: Integer) {
    return a + b
}
This will give a compile warning:
:12: warning: enclosing method addNumbers has result type Unit: return value discarded
return a + b
It will compile, but nothing will be returned. And this:
def addNumbers(a: Integer, b: Integer) {
    a + b
}
will give no compile warning and will also return nothing.

3.  Underscore

In anonymous Scala functions, _ is like Groovy's it. In Groovy, to multiply all numbers between 1 and 5 by 2, we can do:
(1..5).collect {it * 2}
In Scala we can do:
(1 to 5).map{_*2}
However, in Scala, the second time _ is referenced, it refers to the second parameter:
val ns = List(1, 2, 3, 4)
val s0 = ns.foldLeft (0) (_+_) //10

4. Passing anonymous functions. 

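Pass a single anonymous function and you can use curly braces in place of the parentheses. Pass two and you need the parentheses, with each function optionally wrapped in its own curly braces.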
def compose(g:R=>R, h:R=>R) = (x:R) => g(h(x)) 
val f = compose({_*2}, {_-1})

5. Arity-0 

When a method has no arguments (arity-0), the parentheses can be omitted in invocation:
size()
...
size  // do it like this 
But this technique should never be used when the method has side effects. So,
queue.size // ok
println // not ok, do println()

6. Declare parameter types

Function definitions / method definitions have to declare parameter types but function literals don’t.
def addNumbers(a, b): Number {
:1: error: ':' expected but ',' found.

7. Ternary Operator

There is no ternary operator in Scala. There is one in Java, Groovy and JavaScript, and Python 2.5 added support for a conditional expression. Instead you can do if / else on one line, and since if / else is an expression you can return a value. For example, in Java we would do:
(eurovision.winner == "Ireland") ? "Yippee" : "It's a fix"
In Scala, it's:
if (eurovision.winner == "Ireland") "Yippee" else "It's a fix"

Friday, May 11, 2018

And some more REST tips

In previous blog posts I have covered some ideas and tips for achieving a REST architecture. In this post, I cover a few more ideas and tips.

Caching

  • Caching is a big part of Roy Fielding's original REST dissertation.  See section 5.1.4 
  • Strategies include validation (client checks it has the latest version) and expiration (client assumes it has the latest version until a specified time)
  • Expiration:
    • Expires header tells client when resource is going to expire. The value 0 means avoid caching
    • Cache-Control
      • Use max-age directive to specify how long response should be considered valid for; s-maxage for shared caches
      • Can also be used in requests: no-cache means revalidate the response with the server
  • Validation
    • ETag - unique version of resource. Used in conjunction with the If-None-Match request header
    • Last-Modified - tells client when resource last changed
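
The ETag validation flow can be sketched as a simple decision in Java. This is a deliberately simplified illustration: the class and method names are invented, and it ignores weak ETags, lists of ETags and the "*" wildcard that If-None-Match may also carry.

```java
class ETagValidationSketch {

    // Status the server would return for a conditional GET.
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch != null && ifNoneMatch.equals(currentEtag)) {
            return 304; // Not Modified: the client's cached copy is still current
        }
        return 200; // send the full representation along with the current ETag
    }

    public static void main(String[] args) {
        System.out.println(statusFor("\"v2\"", "\"v2\"")); // 304
        System.out.println(statusFor("\"v1\"", "\"v2\"")); // 200
        System.out.println(statusFor(null, "\"v2\""));     // 200
    }
}
```
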

Controller APIs

  • When something doesn't fit neatly into a CRUD operation, consider a Controller API

Handling Dates

  • Use ISO-8601 for your dates - better for natural sorting, handles timezone, locale neutral, support from most programming languages
  • Accept any timezone as anyone in the world may call your API
  • Store in UTC, not in your server's timezone.  There should be no offset when persisted.
  • Return in UTC.  Allow the client to adjust to its timezone as necessary
  • Don't use time if you don't need it.  If Date only suffices, only persist Date. This means, timezone complexity goes away. 
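
With java.time, the "accept any timezone, store in UTC" advice might look something like the sketch below. The class name, the sample timestamps and the Europe/Dublin display zone are assumptions for illustration.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;

class UtcDateSketch {

    // Accept an ISO-8601 timestamp with any offset and normalise it to UTC for storage.
    static Instant toUtc(String iso8601) {
        return OffsetDateTime.parse(iso8601).toInstant();
    }

    public static void main(String[] args) {
        Instant stored = toUtc("2018-05-11T10:30:00+01:00");
        System.out.println(stored); // 2018-05-11T09:30:00Z
        // The client converts to its own zone only for display.
        System.out.println(stored.atZone(ZoneId.of("Europe/Dublin")));
    }
}
```

Two callers in different timezones who mean the same moment end up with the same stored value, which is what makes sorting and comparison straightforward.
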

HEAD

Headers

  • Always return what headers are useful.  Consider: 
    • Content-Type
    • Content-Length
    • Last-Modified
    • ETag
    • Location

Hypermedia (advantages)

  • Less coupling
  • Consistent format for links => cleaner client code
  • Developer productivity: API's easier to navigate 
  • Makes it easier to introduce services in a more granular way
  • Code easier to debug - messages always have the URL that created them via the self link

Hypermedia (choices)

  • HAL - reduces Address coupling 
  • SIREN - reduces Address and Actions coupling
  • Collection+JSON (CJ) - reduces Address, Action and Object coupling

Idempotent

  • Can be called several times and leave the resource in the same state as a single call
  • OPTIONS, GET, HEAD, PUT and DELETE are all idempotent
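
A toy in-memory model in Java shows the difference between an idempotent PUT or DELETE and a non-idempotent POST. Everything here (class, methods, the photo and comment data) is hypothetical and only meant to illustrate the definition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdempotencySketch {
    private final Map<String, String> photosById = new HashMap<>();
    private final List<String> comments = new ArrayList<>();

    // PUT-style: replaces the resource at a known id; repeating it changes nothing further.
    void put(String id, String photo) {
        photosById.put(id, photo);
    }

    // DELETE-style: after the first call the resource is gone; repeating it changes nothing further.
    void delete(String id) {
        photosById.remove(id);
    }

    // POST-style: creates a new resource on every call, so it is not idempotent.
    void post(String comment) {
        comments.add(comment);
    }

    int photoCount() {
        return photosById.size();
    }

    int commentCount() {
        return comments.size();
    }

    public static void main(String[] args) {
        IdempotencySketch server = new IdempotencySketch();
        server.put("1", "sunset.jpg");
        server.put("1", "sunset.jpg");
        System.out.println(server.photoCount());   // 1
        server.post("Nice photo");
        server.post("Nice photo");
        System.out.println(server.commentCount()); // 2
    }
}
```
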

Long Running Requests

  • Some operations take a long time.  In such cases, consider returning a 202 with the location field set to a URL the client can poll to check for operation progress.

Method not allowed

  • If an API only supports GET, it should return a 405 for any PUT, POST, DELETEs etc
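
A minimal Java sketch of that rule for a hypothetical read-only resource is shown below. The class and method names are invented. A 405 response is accompanied by an Allow header listing the methods the resource does support.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class MethodNotAllowedSketch {

    // Methods supported by a hypothetical read-only resource.
    static final Set<String> ALLOWED = new LinkedHashSet<>(Arrays.asList("GET", "HEAD", "OPTIONS"));

    // 200 for a supported method, 405 for anything else.
    static int statusFor(String method) {
        return ALLOWED.contains(method) ? 200 : 405;
    }

    // Value for the Allow header that accompanies a 405 response.
    static String allowHeader() {
        return String.join(", ", ALLOWED);
    }

    public static void main(String[] args) {
        System.out.println(statusFor("GET"));         // 200
        System.out.println(statusFor("PUT"));         // 405
        System.out.println("Allow: " + allowHeader()); // Allow: GET, HEAD, OPTIONS
    }
}
```
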

Must Ignore Principle

  • Clients should ignore data they are not interested in. This makes it much easier for APIs to be backwards compatible.  If an API returns extra data and some clients aren't expecting it, they will just ignore it. 

Not acceptable

  • When a resource doesn't support a specific media type, it should return 406 (see Masse, Rule: 406 (“Not Acceptable”) must be used when the requested media type cannot be served)

OPTIONS

  • OPTIONS should return what actions are available on a resource

Partial Update

  • Handle partial updates with  PATCH

Query

  • The query component of a URI should be used to filter collections

Resource Creation

  • When a Resource has been successfully created a 201 should be returned 
  • The location header should indicate the URL to get the Resource. 

Safe

  • Actions are considered safe if they do not modify resources
  • OPTIONS, GET and HEAD are safe

Self link

  • Response bodies should always include a self link - the URL that was used to return the resource. 

Singular or Plural?

  • Use Singular for Singular Document type resource  - when there can only be one.  For example: /humans/12343343/head
  • Otherwise plural