Concurrency != Threads

Paul Brown @ 2006-05-16T22:55:00Z

A recent paper from Edward Lee at UC Berkeley takes a position against threads as a model for concurrent systems. From the abstract:

[...] Many technologists are pushing for increased use of multithreading in software in order to take advantage of the predicted increases in parallelism in computer architectures. In this paper, I argue that this is not a good idea. Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. [...]

There is a lot to think about in Lee's paper, but I concur with its thesis. The purpose of a computer language is to express in human-intelligible terms the intended behavior of a machine, and by that measurement, threads fail as a mechanism for expressing concurrency. (I'll take Dijkstra's definition of "concurrency", i.e., that there is the "(possibility) of simultaneous activity".)

See the discussion on LtU for more commentary on the paper. I'll have more to say about concurrency in Java without threads later this week.

(comment bubbles) 0 comments

More on Erlang Performance and Threading

Paul Brown @ 2006-05-11T23:34:00Z

After I saw Robert Sayre's results, I thought that I'd give Rickard Green's Erlang exerciserbig.erl” a go on my four-core (two 2.5GHz G5 processors with two cores each) PowerMac (MacOS X 10.4.6) to see the effect of different numbers of schedulers. (Joe Armstrong posted some benchmark information in his blog, but I don't have a means to reproduce them for direct comparison.)

Eye-Grabbing Plots

There's code below, so as an amuse bouche, here are a couple of plots that illustrate the results. (I used HippoDraw to draw the plots.) This first graphic shows the time to execute the benchmark plotted against the number of processes.

#SchedulersColor
1 orange 
2 red 
4 green 
8 blue 
16 magenta 
  

The green plot illustrates that four schedulers breaks even with one or two schedulers at 800 processes and wins from there out. (I did try a 32 scheduler run but ditched it part way through because the performance was so poor.) Here's another plot that provides an alternative visualization.

In the plot, lighter is faster, and as the number of processes increases, it's visually apparent that the four scheduler sequence is superior.

Interpretation

OK — so what gives here?

In comparison with Robert's results (look for the graph), multiple schedulers provided better performance but much less dramatically versus a single scheduler, and performance degraded much more rapidly with more than the optimum number of schedulers. More than likely, the root cause lies down deep in the core of the MacOS X kernel. Apple has a technote that explains threading in MacOS X, and a cursory read suggests that the application-level pthread threading model is deeply layered over the low-level kernel threading model. My interpretation would be that Mach is doing extra work to spread load across lower-level threads when relatively few schedulers are used, so it wouldn't be surprising if a single scheduler manages to use slightly more than one of the cores.

In terms of what SMP (a.k.a. “symmetric multi-processing”) means for Erlang, MT (for “multi-threaded”) would be a better term. The current version of Erlang, R10B, uses a single scheduler thread to process a queue of runnables, and Erlang R11B uses multiple scheduler threads to manage the same queue. (See, e.g., this presentation.) Under (naively) ideal circumstances, a threads works so hard that it fully consumes the attention of a processor and then other threads are forced onto other processors (i.e., number of threads converges to number of processors), but as this benchmark illustrates, the strength of that convergence is determined by the extent to which the operating system kernel cooperates.

Code Snippets

Here's a little snippet of Erlang to make running the benchmark with different numbers of processes easier and dump data in a convenient format:

-module(bmark).
-export([go/0]).

n() -> element(1,string:to_integer(
                  lists:nth(1,init:get_plain_arguments()))).

plur(1) -> "";
plur(_) -> "s".

runbmark([]) -> done;
runbmark([Head|Tail]) ->
    io:format("~4w ~4w ~6.1f~n",
              [n(), 
               Head, 
               trunc(big:bang(Head)/100000)/10]),
    runbmark(Tail).

go() ->
    N = n(),
    io:format("// Running with ~w scheduler~s.~n",
              [N,plur(N)]),
    runbmark(lists:seq(50,1500,50)),
    io:format("~n",[]),
    halt().

And here's some bash to run 1, 2, 4, 8, and 16 schedulers in succession:

for ((i=0 ; i<5 ; ++i )); do \
 path/to/otp_src_R11B_2006-05-08/bin/erl -smp +S$((1<<$i)) \
-noshell -eval 'bmark:go()' -- $((1<<$i)); echo; done
(comment bubbles) 1 comment

Single Threading Good

Paul Brown @ 2006-05-09T05:39:00Z

Perhaps surprisingly, a post from Robert Sayre, who's been playing with Erlang on a SUN Fire T2000 (lucky bum on both counts), doesn't surprise me: best performance is achieved when the number of Erlang schedulers is equal to the number of hardware threads (i.e., “CoolThreads” in the case of the SUN box). Experience has borne out that using a single low-level thread under lightweight higher-level structures to manage concurrency is usually a good plan, and at least intuitively, the pigeonhole principle says that this should apply to additional low-level threads (or cores) if the higher-level structures are designed properly. (More threads than cores would mean that at any point in time one of the cores was supporting multiple threads and thus slower than it would be if it were only supporting one.)

(comment bubbles) 0 comments

Scalability and Data Partitioning

Paul Brown @ 2006-05-09T05:16:00Z

Dion made a post about “minimal shared data” as a key to scaling an application like Basecamp that got me thinking. This is indeed a pattern in scalability — partitioning. If the data will cooperate, then partitioning makes a good alternative to either fancy hardware (e.g., a SAN) or fancy software (e.g., a big multi-master database cluster).

A few references and tidbits on the topic come to mind:

  • Brad Fitzpatrick's presentation on the LiveJournal backend: “Can't join. Don't care. Never join user data w/ other user data.” He goes on to talk about how they move users between partitions and how they built the individual partitions.
  • A great paper from Microsoft Research about migrating TerraServer from a SAN to commodity hardware in 2003. The paper goes into gory detail, all the way to data loading and backup strategies, rack configurations, and network topology. (Thanks to Dragos for the pointer to the paper.)

Gory detail on system loads, network topology, and hardware configuration is what you'd expect to find in a discussion of scalability, as opposed to the language(s) used to write the software.

If you aren't stuck in a relational rut, there are very cool things out there. There are the online storage services like Amazon S3 and StrongSpace, academic projects like OceanStore and OpenDHT, whatever Microsoft and Google field in the space, and plenty more. In terms of things to either download and play with (as opposed to access over the net), the Bamboo Distributed Hash Table (“DHT”) comes to mind, and the accompanying paper on churn is a good read. In the same vein, there is (yet another) interesting paper from Microsoft Research on a system called “BitVault” designed for reliable long-term storage of massive amounts of data like images (e.g., every check processed or every X-Ray taken in the US) and media that uses a DHT under the covers.

(comment bubbles) 0 comments

Greenies

Paul Brown @ 2006-05-06T19:25:00Z

I read an article somewhere (NYT, maybe?) recently about the fact that even with tax breaks and subsidies, hybrid cars are not a winning economic proposition. Nonetheless, people buy them for non-economic reasons, as a social and political statement. This is certainly not a new behavior, as most people who own fast cars in big cities don't actually drive them fast from stoplight to stoplight. Come to think of it, it's about the opposite — the people I remember driving like maniacs on the Chicago freeways were usually in low-end Japanese cars with lots of primer spots and great big spoilers to metaphorically show their manhood.

Since we're a Northwest-living, organic-food-eating, hybrid-car-contemplating, trash-recycling, sandal-wearing, co-sleeping, photovoltaic array-considering, eco-touring family that would otherwise fall into the yuppie demographic, that now makes us “greenies”. (“Guppie” is becoming an official term, but I like "greenie" better...) I was thinking about greenies as an economic force yesterday when we were at REI with the kid. There were a lot of other families shopping, too, with kids and strollers, except that unlike the usual mix of Graco and the occasional Maclaren in the rest of Seattle, every stroller was a Bugaboo.

I'm happy with “less of better” being a mantra for my generation, but I can't help but think that the majority are in the “more of whatever is on TV” WalMart Nation instead.

(comment bubbles) 0 comments

More on Meeting-Making for Google Calendar

Paul Brown @ 2006-05-06T05:55:00Z

After having posted about how it would be possible to take the Atom feeds from Google Calendar and make a collaborative appointment scheduler (meeting time picker for multiple people), I decided to give it a shot using the Atom parsing library for Ruby from Martin Traverso and Brian McCallister.

The Atom library is slick, and doing some simple extensions to the basic binding to support the Google Data elements is straightforward. For example, here's a Ruby snippet that will read the start time, end time, and reminder settings from the feed:

require 'atom'
require 'xmlmapping'
require 'time'
require 'date'
require 'net/http'
require 'uri'

module GoogleData

  NAMESPACE = 'http://schemas.google.com/g/2005'
  
  def GoogleData.int_or_nil(s)
    if s.nil?
      nil
    else
      s.to_i
    end 
  end
  
  def GoogleData.date_or_datetime(s)
    if s.length == 10
      Date.parse(s)
    else
      Time.iso8601(s)
    end  
  end
  
  class Reminder
    include XMLMapping
    
    namespace NAMESPACE
    
    has_attribute :absolute_time, :name => 'absoluteTime',
      :transform => lambda { |t| Time.iso8601(t) }
    has_attribute :days, :name => 'days',
      :transform => lambda { |s| GoogleData.int_or_nil(s) }
    has_attribute :hours, :name => 'hours',
      :transform => lambda { |s| GoogleData.int_or_nil(s) }
    has_attribute :minutes, :name => 'minutes',
      :transform => lambda { |s| GoogleData.int_or_nil(s) }
  end
  
  class When 
    include XMLMapping

    namespace NAMESPACE
    
    # The following little hack is required because the
    # datatype switches between xs:date for all-day
    # appointments and xs:dateTime for non-all-day
    # appoinments.

    has_attribute :start_time, :name => 'startTime',
      :transform => lambda { |s| GoogleData.date_or_datetime(s) }
    has_attribute :end_time, :name => 'endTime',
      :transform => lambda { |s| GoogleData.date_or_datetime(s) }
    has_attribute :valueString
    
    has_many :reminders, :name => 'reminder', :type => Reminder
  end

  class Entry < Atom::Entry
    namespace NAMESPACE
    has_one :when, :name => 'when', :type => When
  end
  
  class Feed < Atom::Feed
    has_many :entries, :name => 'entry', :type => Entry
  end

The two key tricks above are extending Atom::Feed and Atom::Entry to add explicit handling for the extension elements that we're after. (Without any changes, Atom::Entry does capture an array of extension elements, but I'd prefer to work with objects.) Similar approaches can be applied to the other "kinds" of things in the feed. As an editorial comment, I'm lukewarm about the datatype of an attribute value determining its semantics; normally the semantics would determine the datatype.

To grab the data from the Atom feed of the calendar:

response = Net::HTTP.get_response(URI.parse(GCAL_FULL_URL))

# TODO: Limit the number of redirects to follow.
# TODO: Gracefully handle other non-200's here, too.
while response.kind_of? Net::HTTPRedirection
  response = Net::HTTP.get_response(URI.parse(response['location']))
end

feed = GoogleData::Feed.new(response.body)

feed.entries.each { |event|
  puts '---'
  puts event.title
  puts event.when.start_time.to_s + ' -- ' + event.when.end_time.to_s
}

Back to the original goal of building a "meeting maker" for Google Calendar based on the Atom feeds for participants' calendars, the additional work to properly handle recurrence and recurrenceException makes the problem look quite a bit more complicated (and interesting). (Fortunately, there does appear to be an iCalendar (RFC2445) library available as well...) So this is turning into more than a one-evening project.

With the added complexity of supporting recurring events and exceptions, there is probably a tidy approach that augments the list merge I suggested before with generators and sequence comprehensions for the recurring events — just enumerate possible meeting times from the complement of the merged list of "busy" times for non-recurring meetings and test for overlaps in the union (i.e., "or") of the sequences for each participant. (If I recall correctly, the meeting makers in the usual Exchange clients don't support optimal scheduling of recurring meetings, so that would be a nice feature as well, i.e., schedule the recurring meeting at the time with the fewest conflicts or at least minimize the conflicts for some subset of the participants.)

(comment bubbles) 0 comments

Enjoying (Temporary) Retirement

Paul Brown @ 2006-05-03T21:18:00Z

I left my position at Amazon at the end of March, and after pretty much not taking a break for almost seven years, I've been enjoying some time off. (I'm serious about not taking a break — for example, I was hitting Internet cafes during our honeymoon in Italy to check the status of FiveSight's Series A negotiations and read drafts of agreements.)

The odd thing is, like people who are genuinely retired, I find my time filled anyway, albeit with reconnecting with people I haven't talked to recently, checking out the local parks, tinkering with software, or grabbing breakfast at a sidewalk bakery/cafe while the kid runs around. (FWIW, I've decided that referring to my daughter by name in my blog either is or should be irrelevant to my readers who aren't otherwise friends in the real world.) It's entirely pleasant.

Nonetheless, I think that 60 days of this is about all I can stand, so my “v.next” kicks off in June.

(comment bubbles) 0 comments

All Posts contains 399 items in 57 pages of 7 items each:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57