Sunday, January 27, 2013

Scala pattern matching: A Case for new thinking?

A new thinking?
The 16th President of the United States. Abraham Lincoln once said: "As our case is new we must think and act anew".   In software engineering things probably aren't as dramatic as civil wars and abolishing slavery but we have interesting logical concepts concerning "case". In Java the case statement provides for some limited conditional branching.  In Scala, it is possible to construct some very sophisticated pattern matching logic using the case / match construct which doesn't just bring new possibilities but a new type of thinking to realise new possibilities.

Let's start with a classical 1st year Computer Science homework assignment: a fibonacci series that doesn't start with 0, 1 but that starts with 1, 1.   So the series will look like: 1, 1, 2, 3, 5, 8, 13, ... every number is the sum of the previous two.

In Java, we could do:
public int fibonacci(int i) {
    if (i < 0) 
        return 0;
    switch(i) {
        case 0:
            return 1;
        case 1:
            return 1;
        default:
            return fibonacci(i-1) + fibonacci(i - 2);
    }
}     
All straight forward. If 0 is passed in it counts as the first element in the series so 1 should be returned. Note: to add some more spice to the party and make things a little bit more interesting I added a little bit of logic to return 0 if a negative number is passed in to our fibonacci method.

In Scala to achieve the same behaviour we would do:
def fibonacci(in: Int): Int = {
  in match {
    case n if n <= 0 => 0
    case 0 | 1 => 1
    case n => fibonacci(n - 1) + fibonacci(n- 2)
  }
}
Key points:
  • The return type of the recursive method fibonacci is an Int. Recursive methods must explictly specify the return type (see: Odersky - Programming in Scala - Chapter 2).
  • It is possible to test for multiple values on the one line using the | notation. I do this to return a 1 for both 0 and 1 on line 4 of the example.
  • There is no need for multiple return statements. In Java you must use multiple return statements or multiple break statements.
  • Pattern matching is an expression which always returns something.
  • In this example, I employ a guard to check for a negative number and if it a number is negative zero is returned.
  • In Scala it is also possible to check across different types. It is also possible to use the wildcard _ notation. We didn't use either in the fibonacci, but just to illustrate these features...
    def multitypes(in: Any): String = in match {
      case i:Int => "You are an int!"
      case "Alex" => "You must be Alex"
      case s:String => "I don't know who you are but I know you are a String"
      case _ => "I haven't a clue who you are"
    }
    
Pattern matching can be used with Scala Maps to useful effect.  Suppose we have a Map to capture who we think should be playing in each position of the Lions backline for the Lions series in Austrailia.  The keys of the map will be the position in the back line and the corresponding value will be the player who we think should be playing there.  To represent a Rugby player we use a case class. Now now you Java Heads, think of the case class as an immutable POJO written in extremely concise way - they can be mutable too but for now think immutable.
case class RugbyPlayer(name: String, country: String);
val robKearney = RugbyPlayer("Rob Kearney", "Ireland");
val georgeNorth = RugbyPlayer("George North", "Wales");
val brianODriscol = RugbyPlayer("Brian O'Driscol", "Ireland");
val jonnySexton = RugbyPlayer("Jonny Sexton", "Ireland");  
val benYoungs = RugbyPlayer("Ben Youngs", "England");
    
// build a map
val lionsPlayers = Map("FullBack" -> robKearney, "RightWing" -> georgeNorth, 
      "OutsideCentre" -> brianODriscol, "Outhalf" -> jonnySexton, "Scrumhalf" -> benYoungs);
    
// Note: Unlike Java HashMaps Scala Maps can return nulls. This achieved by returing
// an Option which can either be Some or None. 
    
// So, if we ask for something that exists in the Map like below
println(lionsPlayers.get("Outhalf"));  
// Outputs: Some(RugbyPlayer(Jonny Sexton,Ireland))
    
// If we ask for something that is not in the Map yet like below
println(lionsPlayers.get("InsideCentre"));
// Outputs: None
In this example we have players for every position except inside centre - which we can't make up our mind about.  Scala Maps are allowed to store nulls as values.  Now in our case we don't actually store a null for inside center. So, instead of null being returned for inside centre (as what would happen if we were using a Java HashMap), the type None is returned.

For the other positions in the back line, we have matching values and the type Some is returned which wraps around the corresponding RugbyPlayer. (Note: both Some and None extend from Option).

We can write a function which pattern matches on the returned value from the HashMap and returns us something a little more user friendly.
def show(x: Option[RugbyPlayer]) = x match {
  case Some(rugbyPlayerExt) => rugbyPlayerExt.name  // If a rugby player is matched return its name
  case None => "Not decided yet ?" // 
}
println(show(lionsPlayers.get("Outhalf")))  // outputs: Jonny Sexton
println(show(lionsPlayers.get("InsideCentre"))) // Outputs: Not decided yet
This example doesn't just illustrate pattern matching but another concept known as extraction. The rugby player when matched is extracted and assigned to the rugbyPlayerExt.  We can then return the value of the rugby player's name by getting it from rugbyPlayerExt.  In fact, we can also add a guard and change around some logic. Suppose we had a biased journalist (Stephen Jones) who didn't want any Irish players in the team. He could implement his own biased function to check for Irish players
def biasedShow(x: Option[RugbyPlayer]) = x match {
  case Some(rugbyPlayerExt) if rugbyPlayerExt.country == "Ireland" => 
     rugbyPlayerExt.name + ", don't pick him."
  case Some(rugbyPlayerExt) => rugbyPlayerExt.name
  case None => "Not decided yet ?"
}
println(biasedShow(lionsPlayers.get("Outhalf"))) // Outputs Jonny... don't pick him
println(biasedShow(lionsPlayers.get("Scrumhalf"))) // Outputs Ben Youngs

Pattern matching Collections

Scala also provides some powerful pattern matching features for Collections. Here's a trivial exampe for getting the length of a list.
def length[A](list : List[A]) : Int = list match {
  case _ :: tail => 1 + length(tail)
  case Nil => 0
}
And suppose we want to parse arguments from a tuple...
  def parseArgument(arg : String, value: Any) = (arg, value) match {
    case ("-l", lang) => setLanguage(lang)  
    case ("-o" | "--optim", n : Int) if ((0 < n) && (n <= 3)) => setOptimizationLevel(n)
    case ("-h" | "--help", null) => displayHelp()
    case bad => badArgument(bad)
  }

Single Parameter functions

Consider a list of numbers from 1 to 10. The filter method takes a single parameter function that returns true or false. The single parameter function can be applied for every element in the list and will return true or false for every element. The elements that return true will be filtered in; the elements that return false will be filtered out of the resultant list.
scala> val myList = List(1,2,3,4,5,6,7,8,9,10)
myList: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> myList.filter(x => x % 2 ==1)
res13: List[Int] = List(1, 3, 5, 7, 9)
Now now now, listen up and remember this. A pattern can be passed to any method that takes a single parameter function. Instead of passing a single parameter function which always returned true or false we could have used a pattern which always returns true or false.
scala> myList.filter {
     |     case i: Int => i % 2 == 1   // odd number will return false
     |     case _ => false             // anything else will return false
     | }
res14: List[Int] = List(1, 3, 5, 7, 9)

Use it later?

Scala compiles patterns to a PartialFunction.  This means that not only can Scala pattern expressions be passed to other functions but they can also be stored for later use.
scala> val patternToUseLater = : PartialFunction[String, String] = {
     |   case "Dublin" => "Ireland"
     |   case _ => "Unknown"
      }
What this example is saying is patternToUseLater is a partial function that takes a string and returns a string.  The last statemeent in a function is returned by default and because the case expression is a partial function it will returned as a partial function and assigned to pattenrToUseLater which of course can use it later.

Finally, Johnny Sexton is a phenomenal Rugby player and it is a shame to hear he is leaving Leinster. Obviously, with Sexton's busy schedule we can't be sure if Johnny is reading this blog but if he is, Johnny sorry to see you go we wish you all the best and hopefully will see you back one day in the Blue Jersey.

1 comment:

  1. fib(i) where i<0 is actually not 0.

    fib(i) = fib(i-1) + fib(i-2)
    so fib(i) = fib(i+2) - fib(i+1)

    which means fib(-1) = fib(1) - fib(0)
    which is 1-0 = 1

    fib(-2) = 0-1 = -1
    fib(-3) = 1--1 = 2
    fib(-4) = -1-2 = -3
    ,, ,, = 5
    ,, ,, = -8

    and so on.

    ReplyDelete