25
Nov 09

Merge sort implementation and performance in Scala and Ruby

I’m not trying to turn this into language A vs. B debate, its just that something interesting happened last night. I’m trying to learn both Scala and Ruby. I’m a bit more enthused by Scala at this point, mostly because I somewhat prefer static typing and I’m a long time Java guy. Now, I am also a long time Perl guy, so I’m not necessarily choosing Scala over Ruby because of lack of experience with dynamic languages.

So, with this said, the best way to learn language idioms is to start actually implementing something. A sample application is good, but due to frameworks and relatively low business logic/algorithm ration to these sample apps, at least the ones I can conceive, you mostly spend time learning the framework idioms vs. actual languages ones.

So I thought I’d start by implementing some algorithms in these languages. I decided on Merge Sort. It’s a O(n log n) divide and conquer algorithm. See the link for more info.

Let’s start off with the code. Again, this is not done in an OO and/or reusable manner, this is mostly procedural/functional code to test mostly functional idioms.

Scala code:

  import java.math._
  import scala.util._
  import scala.collection.mutable._
  object MergeSort extends Application  {

    println("Starting...");

    val list = initListWith(1000)

    val start = System.currentTimeMillis()
    println("Sorted " + merge_sort(list.toList).length + " elements.")
    val end = System.currentTimeMillis()
    println("Total time: " + (end - start))

    def merge_sort(list:List[Int]): List[Int] = {
      if (list.length <= 1) {return list}
      var (left: List[Int], right: List[Int]) = list splitAt Math.round(list.length/2.0).toInt
      left = merge_sort(left); right = merge_sort(right)
      if(left.last > right.head)
        merge(left, right);
      else
        left ::: right
    }

    def merge(l:List[Int], r:List[Int]):List[Int] = {
      var result:ListBuffer[Int] = new ListBuffer()
      var left = l; var right = r;
      while(left.length > 0 && right.length > 0) {
        if(left.head <= right.head) {
          result += left.head
          left = left.tail
        } else {
          result += right.head
          right = right.tail
        }
      }
      if(left.length > 0)
        result.toList ::: left
      else
        result.toList ::: right
    }

    def initListWith(limit:Int):List[Int] = {
      val list:ListBuffer[Int] = new ListBuffer()
      val randGen = new Random()
      (0 until limit).foreach( i => list += randGen.nextInt(1000000) )
      return list.toList
    }

  }

And Ruby code:

  def merge_sort(*list)
    return list if list.size <= 1

    result = []
    middle = (list.size/2.0).round
    left = list[0, middle]
    right =  list[middle..-1]

    left = merge_sort(*left)
    right = merge_sort(*right)

    if left[-1] > right[0]
      result = merge(left, right);
    else
      result = left + right
    end
    return result
  end

  def merge(left, right)
    result = Array.new()
    while left.size > 0 && right.size > 0
      if left.first <= right.first
        result << left.slice!(0)
      else
        result << right.slice!(0)
      end
    end

    if left.size > 0
      result.concat(left)
    else
      result.concat(right)
    end
    return result
  end

  list = (1..1000).collect {|i| rand(1000000); }

  puts "Starting..."
  start = (1000 * Time.now.to_f).to_i
  puts "Sorted " + merge_sort(*list).size.to_s + " elements."
  end_time = (1000 * Time.now.to_f).to_i
  total_time = end_time - start
  puts "Total time: #{total_time}"

So all works great, they both perform the operation correctly. But here is the weird part. JVM bytecode (Scala is very similar to Java) is supposed to be super fast. That’s the assumption I’ve always made on various readings as well as empirical data. Now, check out these benchmarks…

$ scalac merge_sort.scala && scala MergeSort Starting... Sorted 10000 elements. Total time: 374 $ ruby merge_sort.rb Starting... Sorted 10000 elements. Total time: 138

I’m not timing the compilation/startup phase, as you can see from the code. In the case of 10K records, ruby performs almost 3 times as fast. It the case of 100K records:

$ scalac merge_sort.scala && scala MergeSort Starting... Sorted 100000 elements. Total time: 31213 $ ruby merge_sort.rb Starting... Sorted 100000 elements. Total time: 7681

Ruby is 4 times faster than Scala in this case, of course the growth is relative to the (n log n) performance.

I’m not focusing on Ruby here, since Scala is what I’m concerned about. I used a List for most operations that didn’t require in place modifications. I used a ListBuffer for mutable lists, which advertises constant time append/prepend operations. I’m not sure what I’m missing. I’m going to play with it a bit more to see where the culprit is, but again, I’m not a Scala expert, though not very familiar with its collections library and the ins/outs of each implementation.

Any ideas?

** Update **

So after Maxim’s comment, I updated Scala to use

while(left.lengthCompare(0) > 0 && right.lengthCompare(0) > 0)

vs.

while(left.length > 0 && right.length > 0)

All I can say, Scala #$@%^ing rocks. So it seems like the length calculation on a mutable collection is the culprit, which makes perfect sense.

Using lengthCompare(l:Int):Int gets rid of the brute force length calculation. Here is what I get from the docs…

Result of comparing length with operand l. This method is used by matching streams against right-ignoring (…,_*) patterns.

I’m going to dig into the source to figure out how it does this comparison in a bit, but I can only guess that if say, we’re comparing it to length 0, all it needs is to get to the first element in the collection to figure out that the result is false, no need to actually get the full length.

Did I say culprit, well, let me say, BIG culrpit. Here are the benchmark results now…

Scala:

$ scalac merge_sort.scala && scala MergeSort Starting... Sorted 100000 elements. Total time: 387

Ruby:

$ ruby merge_sort.rb Starting... Sorted 100000 elements. Total time: 7667

Wow, what a drastic improvement.

*** Disclaimer: This means nothing in terms of benchmarking! Please don’t waste your time disqualifying this benchmark, because I know it really means nothing in this small context as well as it means nothing without _power of a test measurement. Benchmarks are crap unless properly and fairly compared. This post is not about benchmarking languages, rather I wanted to see how the various idioms and libraries perform while learning both languages. It turned out to be very useful, as a small recommended tuning in Scala’s collections lib completely turned things inside out_ ***

Any ruby folks out there care to comment?

Tags: , ,

8 comments

  1. Try to replace the line #28 while(left.length > 0 && right.length > 0) { with the following while(left.lengthCompare(0) > 0 && right.lengthCompare(0) > 0) { Best regards,

  2. Try to replace the line #28 while(left.length > 0 && right.length > 0) { with the following while(left.lengthCompare(0) > 0 && right.lengthCompare(0) > 0) { Best regards,

  3. I’ve run the Ruby benchmark using the different Ruby interpreters, and I wrapped the loop for benchmark like this: require ‘benchmark’ puts Benchmark.measure{ 3000.times do mergesort(numbers) end } Here’s the promising future of Ruby: $ ruby merge_sort.rb 7.520000 0.520000 8.040000 ( 8.307699) $ ruby1.9 merge_sort.rb 2.830000 0.030000 2.860000 ( 2.951124) $ jruby -J-server merge_sort.rb 4.882000 0.000000 4.882000 ( 4.746000) Cheers.

  4. I’ve run the Ruby benchmark using the different Ruby interpreters, and I wrapped the loop for benchmark like this: require ‘benchmark’ puts Benchmark.measure{ 3000.times do mergesort(numbers) end } Here’s the promising future of Ruby: $ ruby merge_sort.rb 7.520000 0.520000 8.040000 ( 8.307699) $ ruby1.9 merge_sort.rb 2.830000 0.030000 2.860000 ( 2.951124) $ jruby -J-server merge_sort.rb 4.882000 0.000000 4.882000 ( 4.746000) Cheers.

  5. Maxim, thanks. I published an update. This change was the answer. Ilya

  6. Maxim, thanks. I published an update. This change was the answer. Ilya

  7. You’re comparing an implementation with a linked list to one with an array. Why not use Array for the Scala implementation? Also, you’re not giving JIT a chance. Both because of running one sort and because your code runs inside the Application constructor which is not JIT. Try using a regular main and running one sort, then timing another one.

  8. You’re comparing an implementation with a linked list to one with an array. Why not use Array for the Scala implementation? Also, you’re not giving JIT a chance. Both because of running one sort and because your code runs inside the Application constructor which is not JIT. Try using a regular main and running one sort, then timing another one.

Leave a comment