Saturday, December 27, 2008

Learning Scala for its Actors

...and very likely staying on for the rest.

I have been looking at the Actor model over the past couple of weeks as an easy way to write safer multithreaded Java programs. My focus has been on Java frameworks so far, mainly because there is a better chance of being able to build complex multithreaded systems at work than in the random stuff I do for fun, and work is a Java-heavy environment. So while I've been peripherally aware of Scala the language for a while now, mainly through Debashish Ghosh's blog, I've always thought of it as another mildly-interesting fringe language that also ran in the JVM.

However, of late, I've been reading a lot about Scala actors, and I had some time over the Christmas holidays, so I decided to bite the bullet and learn the language. In a way, it was similar to my motivation for learning Ruby for its Rails framework. Having spent about a week with it so far, I find that it is both a beautiful and practical language, once you get the hang of it.

Scala's syntax is similar to Java but different - its a bit like a lecture at college where you start with already knowing what the professor is talking about, so you space out for a bit, but when you get back you have no idea whats happening. Having had some similar experiences reading Scala code in the past, I decided to play it safe and read Ted Neward's A Busy Java Developer's Guide to Scala series on DeveloperWorks in date ascending order. I then worked through Martin Odersky's Scala by Example EBook available from the Scala site. Finally, I bought myself the Programming Scala book by Odersky, Spoon and Venners as a Christmas present, which I am still reading.

In any case, having learned some Scala, I decided to write my toy application that I have been using in my last two posts (here and here) in Scala using Scala's Actor. The result is shown below. It probably looks more like Java than Scala, but it is my first Scala program that actually does something I want to do (as opposed to the various tutorial examples).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
// Source: src/main/scala/myjob/ActorManager.scala
package myjob

import java.lang._
import java.util.concurrent.CountDownLatch
import scala.actors._

object ActorManager {

  val latch = new CountDownLatch(3)
  def decrementLatch(): Unit = {
    latch.countDown
  }

  def main(args: Array[String]): Unit = {
    // start the actors
    DownloadActor.start
    IndexActor.start
    WriteActor.start
    // seed the download actor with requests
    val start = System.currentTimeMillis
    for (i <- 1 until 100000) {
      println("Requested " + i)
      DownloadActor ! Message(i, "Requested " + i)
    }
    // ask them to stop
    DownloadActor ! StopMessage
    // wait for actors to stop
    latch.await
    println("elapsed = " + (System.currentTimeMillis - start))
  }
}

case class Message(id:Int, payload:String)
case class StopMessage()

object DownloadActor extends Actor {
  def act() {
    while (true) {
      receive {
        case Message(id, payload) => {
          println("Downloaded " + id)
          IndexActor ! 
            Message(id, payload.replaceFirst("Requested ", "Downloaded "))
        }
        case StopMessage => {
          println("Stopping download")
          IndexActor ! StopMessage
          ActorManager.decrementLatch
          exit
        }
      }
    }
  }
}

object IndexActor extends Actor {
  def act() {
    while (true) {
      receive {
        case Message(id, payload) => {
          println("Indexed " + id)
          WriteActor ! 
            Message(id, payload.replaceFirst("Downloaded ", "Indexed "))
        }
        case StopMessage => {
          println("Stopping Index")
          WriteActor ! StopMessage
          ActorManager.decrementLatch
          exit
        }
      }
    }
  }
}

object WriteActor extends Actor {
  def act() {
    while (true) {
      receive {
        case Message(id, payload) => {
          println("Wrote " + id)
        }
        case StopMessage => {
          println("Stopping Write")
          ActorManager.decrementLatch
          exit
        }
      }
    }
  }
}

This works as expected, interleaving the output of various actors. The poison pill handling is similar to previous examples, and the manager waiting till all the actors have completed processing and terminated are done using the CountDownLatch similar to the Jetlang example. Compiling and running the code are done using the following commands.

1
2
prompt$ scalac ActorManager.scala
prompt$ scala -cp . myjob.ActorManager

Scala also allows actors to work using an event model. Simply changing the "while (true); receive" to "loop; react" (and import scala.actors.Actor._ for its loop function) achieves the switch to the event based model. The times are comparable, although the latter runs in slightly less time for my application. I ran the code above through progressively increasing number of tasks and charted the elapsed wallclock times, similar to the Jetlang and Kilim examples from my previous two posts. Here is the graphs (Scala-receive is blue and Scala-react is cyan) and the numbers (includes numbers for Kilim (red) and Jetlang (green) from previous posts for comparison).

#-TasksKilimJetlangScala
(receive)
Scala
(react)
19104781
10264992105
10071105172176
10006056308941007
100004648599733443317
10000037076464262431420951

Update - 2009-01-02

As you can see from the comments, the numbers shown above are skewed because of the presence of console IO in the code. Removing the println() calls result in completely different performance numbers. Specifically, Jetlang and Kilim perform way faster than Scala. You can see updated code and performance numbers here.

I am no language expert, but Scala seems to have some nice things going for it. Things I liked about it is its Java-like (to begin with, at least) syntax, and that it is that it is a general-purpose language that can run either in the JVM (or .NET's CLR). The other thing is that it allows one to program in either an object oriented or functional manner, depending on need. While I don't see myself migrating from Java to Scala just yet, it may make sense to use it to build parts of applications and integrate with the main Java application. Another thing I notice is the trend of treating Scala as a sandbox for new ideas for Java, so if I am right about the trend, then it is likely that in the future Java will look more and more like Scala, so the investment in learning Scala would pay off not only in new functional idioms to use in Java or the ability to program in Scala, but also the ability to jumpstart oneself into new Scala-like Java features as they become available.

The ability to add new keywords and operators (really just functions, both of them) to the base language (like LISP's macros) is a great feature, although it can be a bit confusing when starting out. For example, the Actor framework with its receive and send (!) operators are not part of the base language but part of the Actor library. But its quite powerful once you get used to the idea.

14 comments (moderated to prevent spam):

Tim said...

Great to see someone compare the various actor frameworks. I have tried to port your example to Actors Guild (actorsguildframework.org), the one I am working on. You can see the result here:
http://pastebin.com/f1e3dee83

Unknown said...

I think your performance numbers for Jetlang are not accurate. I've benchmarked jetlang against Scala actors and it has never been slower.

Try switching the thread pool to a fixed size set to the same number of cores on your test machine.

ExecutorService exec = Executors. newFixedThreadPool(Runtime.availableProcessors());
PoolFiberFactory factory = new PoolFiberFactory(exec);

Or, you can make the code available and I'll take a look at the performance.

Mike

Rajesh Karmani said...

Nice article! If I understand the number of tasks in the graph corresponds to the limit of the loop in main. Is that right? Based on my little experiments, I am surprised Scala is faster than Kilim.

Sujit Pal said...

@Tim: thanks for the link, I was planning to try out Actors Guild to figure out how to use the API, but this is better :-). I haven't looked at Actors Guild to closely, but I see you are doing bytecode enhancement at runtime. Do you think this may impact performance (say compared to Kilim)?

@Mike: I tried your suggestion, but it actually ran a bit slower for large number of tasks. The original was:

exec = Executors.newCachedThreadPool();

and was changed to:

exec = Executors.newFixedThreadPool(
Runtime.getRuntime().
availableProcessors());

For 10,000 tasks, the elapsed time I got with the old code was 7024ms which went up to 7083ms, and for 100,000 tasks, the old code ran in 37,103ms and the new code ran in 39,255ms.
I am going to jar up the Jetlang example and the Scala example and post it to the jetlang user list for you to look at. Please backlink here with your findings.

@rajesh: Thanks, and yes, the number of tasks corresponds to the limit of the for loop in main. I guess I don't know enough to be surprised at the speed differences, but would definitely appreciate a second pair of eyes on the numbers. If you are familiar with Kilim and Scala, would appreciate you going through the examples to keep me honest.

Unknown said...

I copied the example from your post. Is that the example you use for benchmarking? If so, you need to remove all of the System.out's. The Jetlang example took 37 seconds for 100k on my 2ghz single core with the System.out's. After removing them, the same example only took 1.6 seconds.

Mike

Rajesh Karmani said...

It mostly looks right. Few things that are not uniform across Kilim and Scala: 1) The DownloadActor does not print its message in Kilim. 2) The Kilim version creates a new String object before calling replace (which creates another String object). In fact the Kilim version uses Object type in message and that requires few instanceof checks. In contrast the scala version has String type for Message contents.

Rajesh Karmani said...

Also a minor thing: Scala version starts with index 1, instead of 0. About the thread pool size for Kilim, I believe that have a pool of size 2 works against it because there are only 3 actors in the systems and 6 JVM threads. Each pair is contending for the same mailbox and this creates bottlenecks.
I am working on another Actor framework :) This is called ActorFoundry. I will try to post an AF version as well.

Tim said...

@Sujit Pal: AG creates a few classes for every actor the first time it is instantiated. The exact code is described here. It has no impact on performance after the first actor.

I wonder on what machine and JDK you did those benchmarks. On my Core2Duo T7250 (2 Ghz) with JDK6, output redirected to a file, AG needs about 12s for 100000 tasks. (Please note that if you try it for yourself, the 0.5 release was not very stable and sometimes hangs while running the example with a large number of items. I have uploaded a snapshot of the SVN version which should work fine)

Sujit Pal said...

@Mike: Thanks for looking at the code. I thought of the issue of excessive console IO yesterday evening when I replied to your comment, but I figured that we were still comparing similar things because Kilim has the System.out.println() calls also, did not realize that Scala's println() has a different performance characteristic. I will remove the println() calls from all my examples and republish the times in my next post and point to that from this and the previous post.

Removing the printlns() will compare the frameworks themselves, although the caveat is that most non-trivial actors would probably need to block anyway so the differences between the frameworks would become less significant.

Also, thanks for including my example (with your changes) in the Jetlang distribution. Just curious, do you still think that Fork/Join is a better fit for this example?

@rajesh: Based on Mike's observation, I can see now why you may have been surprised at Scala performing faster than Kilim.

Thank you for looking at the Kilim version and pointing out the problems. Based on your observations, I will do the following to the Kilim code before running the numbers again:
1) Remove the println() call from all the actors
2) Use a String payload to match closer with the Scala version.
3) Use the default Scheduler (1 thread per actor), which would still be higher than number of processors on my box though).

I believe Scala loop (1..100) means 1 to 100 both inclusive, so I think that is the same as the Java for (int i = 0; i < 100; i++).

I looked at the ActorFoundry page, would be great if you can port my example to use it, if you can do this by maybe Saturday this week, I can try running it and doing a side by side with the other frameworks.

@Tim: Thanks for the link to the Actor Proxy article. And yes, I would definitely like to run your Actors Guide port of the example on my machine. My machine is a Fujitsu laptop with AMD Turion 800MHz 64-bit dual core CPU and 2GB RAM, running Fedora Core 9, with Java 6.

Would it be possible to mark your pastebin page as permanent, currently its set to expire in a month? If yes, I can simply point to your code, else I will copy your code over into my post (with attribution of course).

Sujit Pal said...

Tim, did you try for a million tasks? I made the same changes to your code like Mike did for the Jetlang example, ie removing the println() calls and only computing and reporting the elapsed times. For 100,000 tasks, it runs in about 5.2s, but with 1,000,000 tasks, it completes in 68.6s, well above the Scala times of 52s and 38s respectively (for loop/receive and loop/react). I also had to up the default JVM size to 2GB for it to run.

Sujit Pal said...

Mike, I updated the Jetlang example to reflect the changes you suggested, as well as the stuff you changed for the Jetlang download example. For 10K tasks, the elapsed time I am getting is 2.3s which is comparable to your 1.6s.

Tim said...

@Sujit Pal: The original code does not scale and needs a huge amount of memory because of that AsyncResult array. I have modified the code to start only 10000 requests at a time: http://pastebin.com/f350b3137.

Performance here is now 1000 items 0.1s, 10000 items 0.3s, 100000 items 2s and 1000000 19s.


(Feel free to do whatever you want with the code, it's your example anyway :)

Sujit Pal said...

Thanks for the new code, Tim. I am getting better results with it. I changed the batch size from 10,000 to 100,000 since with the first, I was seeing pretty low CPU utilizations and it was taking forever to complete. My numbers with ActorsGuild are now as follows:

1000 items 0.3s, 10000 items 1.14s, 100000 items 4.7s and 1000000 38s.

I will post your code from my local copy to my blog so it is useful for other readers who may come by after your pastebin version is expired.

Unknown said...

Hello Sir,
Actually i am doing project in Scala Language in which i am facing problem with Actors and dbc.
I read some of your comments and thought only you are the right person to help us.
My email id is vidyaadke@yahoo.com
i want some help from you related to Actors and dbc .
if possible please mail me on given yahooid ...

Hoping to get reply from you soon.....