Dublin Tech: 2017

Wednesday, November 15, 2017

More Fail early - Java 8

Fail fast, or fail early, is a software engineering concept that tries to prevent complex problems by stopping execution as soon as something that shouldn't happen, happens. In a previous blog post and presentation I go into more detail about the merits of this approach. In this blog post I will just detail another use of this idea in Java 8.

In Java, the Iterators returned by Collection classes such as ArrayList, HashSet and Vector are fail fast. This means that if you add() to or remove() from the underlying data structure while iterating over it, you get a ConcurrentModificationException. Let's see:

import java.util.ArrayList;
import java.util.List;
import static java.util.Arrays.asList;

List<Integer> ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));

for (Integer i: ints) {
    // some code
    ints.add(57);  // the next iteration throws java.util.ConcurrentModificationException
}
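To see the behaviour without crashing the program, here is a minimal, self-contained sketch. The class and method names are made up for illustration. It catches the exception and also shows one possible way around it, which is to collect the changes and apply them after the loop.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

import static java.util.Arrays.asList;

public class FailFastDemo {

    // Returns true if adding to the list inside a for-each loop triggered the exception.
    public static boolean addInsideLoopFails() {
        List<Integer> ints = new ArrayList<>(asList(1, 2, 3));
        try {
            for (Integer i : ints) {
                ints.add(57); // change the list while its iterator is active
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // raised by the iterator when the loop asks for the next element
        }
    }

    // One way around it: collect the additions and apply them after the loop.
    public static List<Integer> addAfterLoop() {
        List<Integer> ints = new ArrayList<>(asList(1, 2, 3));
        List<Integer> pending = new ArrayList<>();
        for (Integer i : ints) {
            pending.add(i * 10);
        }
        ints.addAll(pending); // no iterator is active here, so this is safe
        return ints;
    }

    public static void main(String[] args) {
        System.out.println(addInsideLoopFails());
        System.out.println(addAfterLoop());
    }
}
```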

As of Java 8u20, the Collections.sort() API is also fail fast when used on an ArrayList. This means you can't invoke it inside an iteration either. For example:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import static java.util.Arrays.asList;

List<Integer> ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));

for (Integer i: ints) {
    // some code
    Collections.sort(ints); // the next iteration throws java.util.ConcurrentModificationException
}

This makes sense. Iterating over a data structure and sorting it during the iteration is not only counterintuitive but also likely to lead to unpredictable results. Now, you can get away with this and not get the exception if you have a break immediately after the sort invocation.

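import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import static java.util.Arrays.asList;

List<Integer> ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));

for (Integer i: ints) {
    // some code
    Collections.sort(ints); // no exception here: the break leaves the loop before the iterator is used again
    break;
}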
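Why does the break help? A small sketch with an explicit Iterator makes it visible. The class and method names are invented for this example, and the result assumes Java 8u20 or later: the sort call itself returns normally, and it is the iterator's next() that notices the change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

import static java.util.Arrays.asList;

public class SortDuringIterationDemo {

    // Reports which call raised the exception: "sort", "next" or "none".
    public static String whereItFails() {
        List<Integer> ints = new ArrayList<>(asList(3, 1, 2));
        Iterator<Integer> it = ints.iterator();
        it.next(); // first pass of the loop

        try {
            Collections.sort(ints); // completes normally
        } catch (ConcurrentModificationException e) {
            return "sort";
        }

        try {
            it.next(); // second pass: the iterator detects that the list changed underneath it
        } catch (ConcurrentModificationException e) {
            return "next";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(whereItFails());
    }
}
```

With a break straight after the sort, the loop never asks the iterator for another element, so nothing gets the chance to throw.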

But that's hardly great code. Try to avoid old-school iteration and use lambdas when you can. If you are stuck, just do the sort outside the iteration:

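import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import static java.util.Arrays.asList;

List<Integer> ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));
Collections.sort(ints); // sort first, while no iterator is active

for (Integer i: ints) {
    // some code
}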

or use a data structure that keeps its elements sorted as you add them, such as a TreeSet.
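As a rough sketch of both suggestions (the class and method names are just for illustration): a stream can hand back a sorted copy without touching the source list, and a TreeSet keeps its elements ordered on insertion. Be aware that a TreeSet also silently drops duplicates, which may or may not be what you want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

import static java.util.Arrays.asList;

public class SortedAlternatives {

    // Streams sort a copy of the elements and leave the source list untouched.
    public static List<Integer> sortedCopy(List<Integer> ints) {
        return ints.stream().sorted().collect(Collectors.toList());
    }

    // A TreeSet orders elements as they are added, but it also removes duplicates.
    public static List<Integer> viaTreeSet(List<Integer> ints) {
        return new ArrayList<>(new TreeSet<>(ints));
    }

    public static void main(String[] args) {
        List<Integer> ints = asList(3, 1, 2, 1);
        System.out.println(sortedCopy(ints));
        System.out.println(viaTreeSet(ints));
    }
}
```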

This new behaviour of the Collections.sort() API came in Java 8 release 20. It is worth having a look at the specific section that details the change in the API:
"
Area: core-libs/java.util.collections
Synopsis: Collection.sort defers now defers to List.sort
Previously Collection.sort copied the elements of the list to sort into an array, sorted that array, then updated list, in place, with those elements in the array, and the default method List.sort deferred to Collection.sort. This was a non-optimal arrangement.
From 8u20 release onwards Collection.sort defers to List.sort. This means, for example, existing code that calls Collection.sort with an instance of ArrayList will now use the optimal sort implemented by ArrayList.
"

The catch is that ArrayList's own sort implementation increments the list's internal modification count, so any iterator that is in flight fails the next time it is used. I think it would have helped if Oracle had been a little more explicit here about how this change could cause runtime problems. Considering that everybody uses the Collections framework, if an API that previously didn't throw an exception can now throw one in the same situation (bad code and all that it is), the release notes should make it easier for developers to find that out.

Thursday, October 5, 2017

Book Review: RESTful Web Clients

RESTful Web Clients is written by guru Mike Amundsen who, amongst other things, co-authored RESTful Web APIs with REST guru Leonard Richardson and Sam Ruby.

The book's primary focus is on the hypermedia aspect of REST, particularly from the client's perspective. As Roy Fielding detailed in this famous blog post, "if the engine of application state (and hence the API) is not being driven by hypertext, then it cannot be RESTful and cannot be a REST API". Let's face it, we have all seen APIs purporting to be REST with no hypermedia whatsoever and lots of coupling between client and server. Some of this is just down to basic ignorance and some of it is probably down to misunderstanding the Richardson Maturity Model.

Rather than begin with a summary of Fielding's dissertation like most material on REST, this book begins with details of a simple web application that uses JSON RPC APIs. From this simple example Amundsen shows that while the JSON RPC approach functionally works, it results in a lot of coupling between client and server. If the APIs need to change, that will be difficult to do because the clients, with all their hardcoded contracts, will be impacted. And we know software does need to change from time to time, right?

Amundsen distills the coupling with the JSON RPC approach into three distinct types which can be considered and assessed individually:

Objects - the JSON objects that appear in API responses. Clients need to be able to understand them to handle even a simple response to a GET request
Addresses - the URLs clients need to know to invoke requests
Actions - the methods and arguments for all non-trivial operations. Again, clients need to know these before invoking requests.

With the coupling clearly demonstrated, the scene is nicely set to move on to one of the key advantages of a REST-style architecture: reducing coupling through hypermedia.

To explain this advantage, Amundsen again uses specific examples, starting with the JSON hypermedia type HAL. Using HAL reduces the Address coupling, and the book gives examples of how generic response handling can be written on the client to take advantage of this decoupling. However, HAL doesn't solve everything. Without a custom extension there is still coupling to the JSON Objects and to the possible Actions available to the client. A workaround for this is given and I would highly recommend anyone considering using HAL to read Chapter 4.
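To make the Address decoupling a little more concrete, here is a minimal sketch of my own (not code from the book). It assumes the HAL response has already been parsed into nested Maps by whatever JSON library you use. The URLs, class name and method names are made up. The client only knows link relation names such as "next" and looks the URL up in the response's _links section instead of hardcoding it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class HalLinkSketch {

    // Looks up a link by its relation name in an already-parsed HAL document.
    // Simplification: HAL also allows a relation to hold an array of links; that case is ignored here.
    @SuppressWarnings("unchecked")
    public static Optional<String> hrefFor(Map<String, Object> halDocument, String rel) {
        Object links = halDocument.get("_links");
        if (!(links instanceof Map)) {
            return Optional.empty();
        }
        Object link = ((Map<String, Object>) links).get(rel);
        if (!(link instanceof Map)) {
            return Optional.empty();
        }
        Object href = ((Map<String, Object>) link).get("href");
        return href instanceof String ? Optional.of((String) href) : Optional.empty();
    }

    // Builds a small stand-in for a parsed response; the URLs are hypothetical.
    public static Map<String, Object> sampleDocument() {
        Map<String, Object> self = new HashMap<>();
        self.put("href", "/tasks");
        Map<String, Object> next = new HashMap<>();
        next.put("href", "/tasks?page=2");
        Map<String, Object> links = new HashMap<>();
        links.put("self", self);
        links.put("next", next);
        Map<String, Object> doc = new HashMap<>();
        doc.put("_links", links);
        return doc;
    }

    public static void main(String[] args) {
        // The client follows "next" without knowing how the server builds paging URLs.
        System.out.println(hrefFor(sampleDocument(), "next").orElse("no such link"));
    }
}
```

If the server later changes its paging URLs, a client written this way keeps working as long as the relation name stays the same.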

Next up is another JSON hypermedia type known as Siren. Kevin Swiber designed Siren and registered it with IANA in 2012.

Siren splits response entities into:

class - this is an array, the values of which indicate what the current resource represents e.g. Customer, Person
properties - set of name-value pairs
entities - a list of linked and representational sub entities
actions - contains a set of valid operations of the associated entity and how to invoke those operations including a list of fields which match HTML5 input types (hidden, text, number). This is something not in HAL that helps reduce client-server coupling further
links - links to other resources. Each link has class, href, rel, title and type properties

Siren reduces coupling to Addresses and Actions, however it does not reduce coupling to Objects. There is no meta-data specification for the class type meaning the client has to hardcode the structure of the object somewhere. Like HAL it is possible to create a custom extension but this is not part of the Siren specification.
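As a rough illustration of the Action decoupling (again my own sketch, not the book's code), assume a Siren action has already been parsed into a small Java object with its name, method, href and fields. The class names, the sample action and its values are hypothetical. The client builds the request purely from what the action advertises.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SirenActionSketch {

    // One input field advertised by an action, e.g. name "quantity", type "number".
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // The parts of a Siren action this sketch cares about.
    static final class Action {
        final String name;
        final String method;
        final String href;
        final List<Field> fields;

        Action(String name, String method, String href, List<Field> fields) {
            this.name = name;
            this.method = method;
            this.href = href;
            this.fields = fields;
        }
    }

    // Renders "METHOD href body" using only the fields the action advertises.
    // Simplification: values are not URL-encoded.
    public static String buildRequest(Action action, Map<String, String> values) {
        StringBuilder body = new StringBuilder();
        for (Field field : action.fields) {
            String value = values.get(field.name);
            if (value == null) {
                continue; // the caller supplied nothing for this field
            }
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(field.name).append('=').append(value);
        }
        return action.method + " " + action.href + " " + body;
    }

    // A made-up action and some made-up input values.
    public static String sampleRequest() {
        Action addItem = new Action("add-item", "POST", "/orders/42/items",
                Arrays.asList(new Field("productCode", "text"), new Field("quantity", "number")));
        Map<String, String> values = new LinkedHashMap<>();
        values.put("quantity", "2");
        values.put("productCode", "ABC123");
        values.put("colour", "black"); // not advertised by the action, so it is ignored
        return buildRequest(addItem, values);
    }

    public static void main(String[] args) {
        System.out.println(sampleRequest());
    }
}
```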

The third hypermedia type detailed is the Collection+JSON format (Cj). Interestingly, this format was designed by the author himself. The basic elements of a Cj message are:

Links - Similar to HAL and Siren links
Items - Similar to HAL and Siren properties, and also includes metadata about the properties
Queries - Information on how to construct various reads (HTTP GETs)
Templates - Information on how to construct various writes (HTTP POSTs, PUTs, DELETEs...)
Error - Information about errors

The key point here is that since Cj includes the metadata about the items, it decouples the client from the Objects in the JSON responses, something both HAL and Siren could only achieve with custom extensions.

So which format? Well two good points to make here:

That can be a practical decision and not just a technical one. You may prefer Cj because out of the box it achieves the most decoupling, but your customer may be used to and prefer HAL.
Rather than trying to support every possible format, think about architecting so that it is possible to support extra formats if you need to. The approach suggested is Amundsen's Representor pattern, which is inspired by the Message Translator pattern.
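The general idea of supporting extra formats can be sketched as below. This is my own rough interpretation, not the book's implementation, and the JSON it emits is heavily simplified. The class and method names are invented. The server keeps one internal model (here just a map of link relations to URLs) and a registry of translators keyed by media type turns it into the wire format the client asked for.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class RepresentorSketch {

    // Registry: media type -> translator from the internal link model to a wire format.
    private static final Map<String, Function<Map<String, String>, String>> TRANSLATORS =
            new LinkedHashMap<>();

    static {
        TRANSLATORS.put("application/hal+json", RepresentorSketch::toHal);
        TRANSLATORS.put("application/vnd.collection+json", RepresentorSketch::toCj);
    }

    // HAL-style output: links live in an object keyed by relation name.
    static String toHal(Map<String, String> links) {
        String body = links.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":{\"href\":\"" + e.getValue() + "\"}")
                .collect(Collectors.joining(","));
        return "{\"_links\":{" + body + "}}";
    }

    // Cj-style output: links are an array of rel/href pairs inside "collection".
    static String toCj(Map<String, String> links) {
        String body = links.entrySet().stream()
                .map(e -> "{\"rel\":\"" + e.getKey() + "\",\"href\":\"" + e.getValue() + "\"}")
                .collect(Collectors.joining(","));
        return "{\"collection\":{\"links\":[" + body + "]}}";
    }

    // Picks the translator for the requested media type, falling back to HAL.
    public static String represent(Map<String, String> links, String mediaType) {
        return TRANSLATORS.getOrDefault(mediaType, RepresentorSketch::toHal).apply(links);
    }

    public static void main(String[] args) {
        Map<String, String> links = Collections.singletonMap("self", "/tasks");
        System.out.println(represent(links, "application/hal+json"));
        System.out.println(represent(links, "application/vnd.collection+json"));
    }
}
```

Adding another format then means registering one more translator rather than touching the code that builds the internal model.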

So in summary, this is another great REST book from O'Reilly. The style of the book in general is pragmatic rather than academic. It really emphasizes and demonstrates the importance of hypermedia in REST APIs and is backed up with practical examples. The central argument in the book is that Cj achieves the most decoupling. Even though Cj was designed by the author, the argument is well made and I don't think it would be fair to accuse him of selection bias, since he does detail how you can extend Siren and HAL to achieve the same level of decoupling.

Bottom line - if you want to understand the hypermedia aspects of REST, read this book.

Sunday, August 13, 2017

From Developer to Architect: Patterns, Architecture Types, Soft Skills, and Continuous Delivery (Video Tutorial)

O’Reilly don’t just publish great technology books; they also produce some great video tutorials, which are available from Safari. I recently finished the series From Developer to Architect: Patterns, Architecture Types, Soft Skills, and Continuous Delivery, which consists of tutorial-style videos about architectural patterns and anti-patterns, soft skills and a run-through of some DevOps ideas and best practices. All 17 tutorials are presented by two seasoned architects, Mark Richards and Neal Ford, and take about

Neal Ford

Even though the course is more at the fundamental / introductory level there are some topics and nuggets of information that are still useful either because you never had exposure to them (for example, not many projects use event driven architectures extensively) or you have just forgotten them - in which case the course serves as an excellent refresher.

Some highlights:

The differences between Application, Integration and Enterprise Architecture are well detailed.
The Expand and Contract pattern is one mechanism to get over the DB coupling the Shared Database pattern introduces.
Shared database has a problem if the disparate applications all use their own caching. It becomes even more complex to determine when applications aren't looking at the latest data.
Even though REST is generally stateless from the client's perspective, the resources have state and therefore using ETags is a good idea. Consideration should also be given to using the 409 / 412 HTTP status codes when clients are using the wrong version of the resource.
In a classical layered architecture, an open layer is one that can be by-passed e.g. Service Layer. However, having too many open layers completely defeats the purpose of a layered architecture.
Different patterns in Event Driven Architectures:

Event processor / Mediator topology: Useful when ordering is required, achieved through orchestration
Broker topology: No central mediator, custom process components receive events directly

Long-running feature branches are antithetical to CI. In my own humble opinion this is another reason why, if you really want to do CI, you should use a tool that is good at branching / merging: Git.
Dietzler's law: this is an excellent application architecture law which reminds people not to be fooled by the initial illusion of progress in rapid prototyping. The user wants a full solution, and delivering a full solution is more difficult than the initial stages suggest. I really believe anyone involved with rapid prototyping, hackathons or any form of agile development should know this law inside out / upside down.
The Anti-Corruption Layer is a good pattern: it uses a facade / adapter approach to hide away bad legacy code.
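The ETag highlight above can be sketched in a few lines. This is a simplified in-memory illustration of my own, not something shown in the course. The class and method names are invented. The resource carries a version, the ETag is derived from it, and an update that arrives with a stale If-Match value is rejected with 412 (Precondition Failed) instead of silently overwriting someone else's change.

```java
public class ETagSketch {

    // A resource whose ETag changes every time its state changes.
    static final class Resource {
        private String body;
        private long version = 1;

        Resource(String body) {
            this.body = body;
        }

        synchronized String etag() {
            return "\"v" + version + "\"";
        }

        synchronized String body() {
            return body;
        }

        // Returns an HTTP-style status: 200 on success, 412 if the caller's ETag is stale.
        synchronized int update(String ifMatch, String newBody) {
            if (!etag().equals(ifMatch)) {
                return 412;
            }
            body = newBody;
            version++;
            return 200;
        }
    }

    // Two clients read the same version; only the first write goes through.
    public static int[] twoWritersScenario() {
        Resource resource = new Resource("draft");
        String seenByBoth = resource.etag();
        int first = resource.update(seenByBoth, "edit from client A");
        int second = resource.update(seenByBoth, "edit from client B");
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        int[] statuses = twoWritersScenario();
        System.out.println(statuses[0] + " then " + statuses[1]);
    }
}
```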

Mark Richards

A common theme throughout the course is that, no matter what the architecture or technology, you need to be prepared to change it, and that means you need to be careful to get the level of coupling right. I would add that an architecture with a lot of tight coupling will struggle to deal with technical debt, because if you need to fix one little part you impact everything.

So any criticisms? Well, not many. Perhaps some of the explanations are too abstract. For example, the Space-based architecture would be better explained with more specifics, using a NoSQL database to achieve the data splits. Other than that, a large amount of time is dedicated to soft skills. While I don't doubt these are important, you can get a lot of this by having a cup of coffee with a decent project manager or development manager, and personally I prefer the technical stuff. Overall it's a great series for an intro to software architecture, and it would be nice to see a follow-up with more deep diving. I suspect that is exactly what the follow-up series Software Architecture Fundamentals Beyond the Basics entails.

Lastly, although this tutorial series is primarily aimed at Developers who want to be (or have become) Architects, I think anyone involved with any sort of SDLC can benefit from it. Project Managers, Dev Managers and Product Owners should all understand Dietzler's law, for example.


Sunday, July 16, 2017

Outputting the given, when, then, Extending Spock

Spock is a testing framework for Java and Groovy applications, created in 2008 by Peter Niederwieser, a software engineer with GradleWare, which facilitates, amongst other things, BDD. Leveraging this example, a story may be defined as:

Story: Returns go to stock

As a store owner
In order to keep track of stock
I want to add items back to stock when they're returned.

Scenario 1: Refunded items should be returned to stock
Given that a customer previously bought a black sweater from me
And I have three black sweaters in stock.
When he returns the black sweater for a refund
Then I should have four black sweaters in stock.

Scenario 2: Replaced items should be returned to stock
Given that a customer previously bought a blue garment from me
And I have two blue garments in stock
And three black garments in stock.
When he returns the blue garment for a replacement in black
Then I should have three blue garments in stock
And three black garments in stock.

Spock makes it possible to map tests very closely to scenario specifications using the same given, when, then format. In Spock we could implement the first scenario as:

import spock.lang.Specification

class SampleSpec extends Specification {
    def "Scenario 1: Refunded items should be returned to stock"() {
        given: "that a customer previously bought a black sweater from me"
        // ... code 
        and: "I have three black sweaters in stock."
        // ... code
        when: "he returns the black sweater for a refund"
        // ... code
        then: "I should have four black sweaters in stock."
        // ... code
    }
}

It would be nice to ensure an accurate mapping of test scenario requirements to test scenario implementation. We could get some way down this path if we could output the text of the given, when, then blocks from our test. Spock allows us to add this functionality through its extension framework.

So, let's say our BA is really curious and wants more confidence that the developer stuck to the same given, when, then format and that the code is in sync. They want to get this information easily. The developer could provide this information by first defining this annotation:

import java.lang.annotation.*
import org.spockframework.runtime.extension.ExtensionAnnotation

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@ExtensionAnnotation(LogScenarioDescriptionExtension)
@interface LogScenarioDescription {}

Followed by this implementation:

import org.apache.log4j.Logger
import org.spockframework.runtime.AbstractRunListener
import org.spockframework.runtime.extension.AbstractAnnotationDrivenExtension
import org.spockframework.runtime.model.FeatureInfo
import org.spockframework.runtime.model.SpecInfo


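class LogScenarioDescriptionExtension extends AbstractAnnotationDrivenExtension<LogScenarioDescription> {
    // Logger name is "scenarioLog." plus the fully qualified class name.
    // The log4j configuration further down assumes this class lives in the package extension.custom.
    final static Logger logger = Logger.getLogger("scenarioLog." + LogScenarioDescriptionExtension.name);

    @Override
    void visitSpecAnnotation(LogScenarioDescription annotation, SpecInfo spec) {
        spec.addListener(new AbstractRunListener() {
            @Override
            void afterFeature(FeatureInfo feature) {
                // Only log when the logScenarios environment variable is set
                if (System.getenv("logScenarios")) {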
                    logger.info("***SCENARIO TEST:*** " + feature.name)
                    for (block in feature.blocks) {
                        logger.info(block.kind);
                        for (text in block.texts) {
                            logger.info(text)
                        }
                    }
                }
            }
        })
    }
}

This will then be applied to the test

@LogScenarioDescription
class SampleSpec extends Specification {
  //...
}

When the test is executed it gives the following output:

***SCENARIO TEST:*** Scenario 1: Refunded items should be returned to stock
GIVEN
that a customer previously bought a black sweater from me
AND
I have three black sweaters in stock.
WHEN
he returns the black sweater for a refund
THEN
I should have four black sweaters in stock.

The output can be directed to a specific log file for scenario logging by using the following log4j configuration:

log4j.rootLogger=INFO, stdout

log4j.logger.scenarioLog.extension.custom=INFO, scenarioLog

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%m%n

log4j.appender.scenarioLog=org.apache.log4j.FileAppender
log4j.appender.scenarioLog.File=logs/scenario.log
log4j.appender.scenarioLog.layout=org.apache.log4j.PatternLayout
log4j.appender.scenarioLog.layout.ConversionPattern=%m%n

And now you have a log file that your BA and QA can read! This helps foster an Agile culture of collaboration and ATDD, where it is possible to check that the test scenarios implemented match those that were agreed.
