SOA 2.0: Mud in the Mud Puddle

Paul Brown @ 2006-05-24T20:05:00Z

As I'm getting caught up on some blogs and industry news, I see that the new version of SOA is out, courtesy of Gartner analyst Yefim Natis:

[...] With SOA 2.0, an event-driven architecture is deployed in which software modules are related to business components, and alerts and event notifications are featured. The initial SOA concept has not been event-driven but instead has featured direct calls from one piece of software to another in a client-server process, Natis said. SOA implementations have focused on Web services and subordinates to clients, he said. [...]

As a dyed-in-the-wool pragmatist and skeptic, I've been kind of grumpy about buzzwords and analyst predictions for a while. (I wish that I'd titled that post "Overhype killed the WS-*", but I'm only that clever retroactively.) SOA is a style of decomposing a software system, not a vendor or an analyst property. (Note the lack of a conspicuous "™"...)

A quick search shows that Yefim has had similar views since the Paleolithic (stone tools, yuk yuk...) era of SOA as a buzzword in 2003:

A large part of the integration problem requires implementation of long-running offline business processes. Here, event-driven architecture (EDA), not SOA, is the industry best practice. Unlike SOA, EDA is the design vision for long-running asynchronous processes (SOA is best applied to real-time request/reply exchanges). In EDA, a process node posts an event (in SOA, a process node makes a targeted processing request). In EDA, posting an event reflects results of some past processing (in SOA, making a processing request directs future processing). In EDA, the poster of the event is disconnected from the processors of the event, if any (in SOA, the requestor of service knows the service and depends on its existence and availability).

If nothing else, this sounds like it confuses SOA with RPC and EDA with publish-subscribe. Yefim's on the right track that orchestration (effectively, BPM, once you add the trimmings) is a programming model for building applications in an SOA, but I disagree that SOA or even web services as a specialization of SOA is narrowly scoped to client-server interactions. My minimalist definition of SOA would be:

A service is an autonomous, opaque software unit that consumes and/or produces well-defined messages.
A service-oriented architecture for a software system is a decomposition into services.

Interaction patterns, delivery mechanisms, programming idioms, and data protocols are often aspects of an application of SOA but are neither controlling (in the sense of terminology in a contract) nor defining. That isn't to say that it isn't worthwhile to think about those aspects and approaches. For a detailed, broadly-scoped, pattern-driven perspective on SOA, compare the definition (and dozens of accompanying patterns) from Dragos Manolescu and Boris Lublinsky. (Among other things, they reference the repository of definitions of software architecture from the Software Engineering Institute at Carnegie Mellon.)


A Commandline Nano Interface for the Slimserver

Paul Brown @ 2006-05-21T23:34:00Z

We've become almost exclusively digital music consumers. It's easily been six months since we bought our last CD, and a Squeezebox has replaced a CD player as the primary means of playing music in some parts of the house. After some experimentation with SoftSqueeze as a client for the Slimserver that streams music to the Squeezebox, I settled on a combination of the headless squeezeslave player with some shell scripts that use the web interface to the Slimserver (via curl) to play, pause, skip, and shuffle, e.g.:

#!/bin/sh
curl -s http://slim.internal:9000/status_header.html\
\?p0=pause\&player=172.16.1.201 > /dev/null

So far, so good with leaving iTunes behind. Now I just need to write a Quicksilver integration...


No Need for First-Class Continuations in Java

Paul Brown @ 2006-05-21T23:03:00Z

Gilad Bracha posted a longish entry about continuations:

[...] I’ve thought about this a bit, and here’s my take on why we really shouldn’t add continuations to the JVM. It’s bound to stir up controversy and annoy people, which is a good reason to post it. By far [t]he most compelling use case for continuations are continuation-based web servers. [...]

At least from my point of view, I don't miss continuations in Java, and considering that I program competently in languages that do offer continuations, I will claim that it's not out of ignorance. (I do frequently wish that Java were a functional language.) Every time I've had a real use for continuations, it was worth implementing something specific that wouldn't have been served by a language feature, but then I've only occasionally felt an urge to compact my code down into the smallest and least comprehensible form. (For what it's worth, I think I've used Java's weak goto, i.e., labeled statements, exactly once in seven years of writing Java code...) I've had customers ask for something to use XML or J2EE or web services, but I've never had a customer ask if I could make sure that continuations get used under the covers. (At least for my money, the coolest thing about Seaside being implemented in Smalltalk is the debugging functionality.)

Continuations are a valid architectural approach to building a participant in one or more stateful conversations, but I don't necessarily see that aligning to a language-level feature. A web server is one example, where the next request is handled by a continuation of whatever handled the previous request, but you'd have to argue with me that snapshotting the stack is the best way to snapshot a session. Process and workflow engines are another example of a system amenable to implementation with continuation-style programming, where a continuation handles the next message or event (e.g., a timeout) to an instance. In the engine case, the execution state is the state of the engine and not necessarily the state of the underlying programming language runtime (e.g., the call stack) and has properties (e.g., durability) not normally provided by the execution state of a traditional programming language.
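To make the engine case concrete, here is a minimal sketch in (modern) Java of a process instance whose "continuation" is an ordinary object held by the engine rather than a snapshot of the call stack. Everything in it is hypothetical and for illustration only: the OrderProcess class, the Step interface, and the message strings are not taken from any real engine.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a process instance with an explicit continuation. */
final class OrderProcess {
    /** A step consumes one message and returns the step that handles the next one. */
    interface Step {
        Step onMessage(String message);
    }

    private final List<String> log = new ArrayList<>();
    // The current continuation: whatever will handle the next message or event.
    private Step next = this::awaitingOrder;

    /** Deliver a message (or an event such as a timeout) to this instance. */
    void deliver(String message) {
        if (next != null) {
            next = next.onMessage(message);
        }
    }

    boolean isFinished() {
        return next == null;
    }

    List<String> log() {
        return log;
    }

    private Step awaitingOrder(String message) {
        log.add("order:" + message);
        return this::awaitingPayment;
    }

    private Step awaitingPayment(String message) {
        if (message.equals("timeout")) {
            log.add("cancelled");
        } else {
            log.add("paid:" + message);
        }
        return null; // no continuation left: the instance is complete
    }

    public static void main(String[] args) {
        OrderProcess process = new OrderProcess();
        process.deliver("book");
        process.deliver("timeout");
        System.out.println(process.log());
    }
}
```

Because the execution state is just the `next` field plus the instance's data, an engine could in principle persist it between messages, which is the kind of durability that a language-level stack snapshot doesn't normally provide.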

Many situations in Java where an anonymously created Runnable is passed (e.g., in Swing GUI programming with invokeLater) to be run sight-unseen are essentially instances of a continuation. (Yes, this is more of a closure, since it's local variables that are getting snapshotted and not really the stack...) Other instances where a continuation might be used, e.g., in implementing a generator for a sequence, are a little awkward in Java because it isn't a functional language, but still possible by wrapping up an Iterator the right way. Also, jumping out of nested loops can be accomplished with labeled statements.
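As a rough illustration of the generator case, here is one way to wrap up an Iterator so that it behaves like a generator for a sequence. The FibonacciGenerator class is a made-up example, and the fields hold by hand the state that a real generator would keep on its suspended stack.

```java
import java.util.Iterator;

/** Hypothetical sketch: an endless Fibonacci "generator" written as an Iterator. */
final class FibonacciGenerator implements Iterator<Long> {
    // State that a language-level generator would keep implicitly between yields.
    private long current = 0;
    private long following = 1;

    @Override
    public boolean hasNext() {
        return true; // the sequence never ends (long overflow is ignored in this sketch)
    }

    @Override
    public Long next() {
        long result = current;
        long sum = current + following;
        current = following;
        following = sum;
        return result; // the equivalent of "yield result"
    }

    public static void main(String[] args) {
        Iterator<Long> fib = new FibonacciGenerator();
        for (int i = 0; i < 10; i++) {
            System.out.print(fib.next() + " ");
        }
        System.out.println();
    }
}
```

The awkwardness is visible in next(): the loop body of the generator has to be turned inside out so that each call resumes from saved fields rather than from a saved point in the code.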

Also, from a purely pragmatic perspective, if a given feature of a system really demands an approach that uses continuations at the language level, do I need Java? For one thing, I can get all sorts of fancy language features from other JVM languages like Scala (which is functional), Groovy (which has closures), Jython (which has generators), and JRuby (which is slated to have continuations in v0.9). For another, I could just implement a simple service (SOAP, POX, REST, XML-RPC, etc.) to encapsulate the required functionality or (gasp) write in a language that compiles to native code and get at it using JNI.

(btw, here's a thread at LtU on the topic of Gilad's post.)


Most Impressive...

Paul Brown @ 2006-05-21T04:37:00Z

Via BoingBoing, a story about a townhouse in Utah where the occupant accumulated ~70,000 empty cans of Coors Light. (As mentioned in the story, that's a case of beer every day for eight years, although I find it more impressive to think of someone drinking a roughly 20 foot by 10 foot, three-foot-deep pool full of beer...)

This reminds me of some silliness from my senior year at Reed College where my roommate Nathaniel Thurston and I managed to accumulate 2,248 empty Coke cans over a year's time. While we were proud of ourselves at the time, it pales in comparison to the guy stockpiling Silver Bullets. We also made a large “funnelator” that used two chimneys on the roof of Old Dorm Block as uprights, and I hope that one doesn't get one-upped anytime soon.

On that subject, the dynamics of the überfunnelator were interesting. The balloons often left the “muzzle” elongated, and some spun end-over-end stretched out into a dumbbell shape. I'd be happy to entertain explanations, although my thinking was that the surface of the balloon was sticking to the funnel as the water was thrown forward, resulting in the stretched out shape.


Note to Readers on Categories and Tags

Paul Brown @ 2006-05-21T04:28:43Z

I've pruned the number of categories down to four: business, personal, software, and technology. My interpretation of things like categories is that they let you avoid content (e.g., the Personal category) that you don't have an interest in.

Typo allows the categories to be subscribed to individually, and I have FeedBurner feeds for each of them. The feeds here will shortly be 302'd to the FeedBurner versions (to keep traffic off of this server), but here's the URL format:

http://mult.ifario.us/xml/atom/category/[CATEGORY]/feed.xml

I'm also now tagging all of my posts within Typo, and individual tags can be subscribed to as well. For example, you could subscribe only to posts about BPEL or about Erlang or about entrepreneurship. I won't be creating FeedBurner equivalents for the tag feeds; here's the URL format:

http://mult.ifario.us/xml/atom/tag/[TAG]/feed.xml

There is presently no way to subscribe to multiple tags or multiple categories or a combination of both, but you could always build a frankenfeed with something like Suprglu.


Concurrency != Threads

Paul Brown @ 2006-05-16T22:55:00Z

A recent paper from Edward Lee at UC Berkeley takes a position against threads as a model for concurrent systems. From the abstract:

[...] Many technologists are pushing for increased use of multithreading in software in order to take advantage of the predicted increases in parallelism in computer architectures. In this paper, I argue that this is not a good idea. Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. [...]

There is a lot to think about in Lee's paper, but I concur with its thesis. The purpose of a computer language is to express in human-intelligible terms the intended behavior of a machine, and by that measure, threads fail as a mechanism for expressing concurrency. (I'll take Dijkstra's definition of "concurrency", i.e., that there is the "(possibility) of simultaneous activity".)

See the discussion on LtU for more commentary on the paper. I'll have more to say about concurrency in Java without threads later this week.


More on Erlang Performance and Threading

Paul Brown @ 2006-05-11T23:34:00Z

After I saw Robert Sayre's results, I thought that I'd give Rickard Green's Erlang exerciser “big.erl” a go on my four-core (two 2.5GHz G5 processors with two cores each) PowerMac (MacOS X 10.4.6) to see the effect of different numbers of schedulers. (Joe Armstrong posted some benchmark information in his blog, but I don't have a means to reproduce it for direct comparison.)

Eye-Grabbing Plots

There's code below, so as an amuse bouche, here are a couple of plots that illustrate the results. (I used HippoDraw to draw the plots.) This first graphic shows the time to execute the benchmark plotted against the number of processes.

# Schedulers  Color
1             orange
2             red
4             green
8             blue
16            magenta

The green plot illustrates that four schedulers break even with one or two schedulers at 800 processes and win from there on. (I did try a 32 scheduler run but ditched it part way through because the performance was so poor.) Here's another plot that provides an alternative visualization.

In the plot, lighter is faster, and as the number of processes increases, it's visually apparent that the four scheduler sequence is superior.

Interpretation

OK — so what gives here?

In comparison with Robert's results (look for the graph), multiple schedulers improved on a single scheduler much less dramatically, and performance degraded much more rapidly once the number of schedulers exceeded the optimum. More than likely, the root cause lies down deep in the core of the MacOS X kernel. Apple has a technote that explains threading in MacOS X, and a cursory read suggests that the application-level pthread threading model is deeply layered over the low-level kernel threading model. My interpretation would be that Mach is doing extra work to spread load across lower-level threads when relatively few schedulers are used, so it wouldn't be surprising if a single scheduler manages to use slightly more than one of the cores.

In terms of what SMP (a.k.a. “symmetric multi-processing”) means for Erlang, MT (for “multi-threaded”) would be a better term. The current version of Erlang, R10B, uses a single scheduler thread to process a queue of runnables, and Erlang R11B uses multiple scheduler threads to manage the same queue. (See, e.g., this presentation.) Under (naively) ideal circumstances, a thread works so hard that it fully consumes the attention of a processor and then other threads are forced onto other processors (i.e., number of threads converges to number of processors), but as this benchmark illustrates, the strength of that convergence is determined by the extent to which the operating system kernel cooperates.
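For readers who think in Java, the "several scheduler threads, one queue" arrangement can be sketched roughly as follows. This is only an analogy under my own assumptions, not a description of how the Erlang runtime is implemented: the SharedRunQueue class and its runAll method are invented for illustration, and plain Runnables stand in for runnable Erlang processes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical analogy: N scheduler threads draining one shared run queue. */
final class SharedRunQueue {
    /** Runs every task using the given number of scheduler threads; returns how many ran. */
    static int runAll(Iterable<Runnable> tasks, int schedulers) {
        BlockingQueue<Runnable> runQueue = new LinkedBlockingQueue<>();
        for (Runnable task : tasks) {
            runQueue.add(task);
        }
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[schedulers];
        for (int i = 0; i < schedulers; i++) {
            threads[i] = new Thread(() -> {
                Runnable task;
                // poll() returns null once the queue is empty, which ends this scheduler.
                while ((task = runQueue.poll()) != null) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for schedulers", e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        AtomicInteger hits = new AtomicInteger();
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            tasks.add(hits::incrementAndGet);
        }
        System.out.println("tasks run: " + runAll(tasks, 4));
    }
}
```

Whether four such threads actually end up on four cores is decided by the operating system, not by this code, which is the point the benchmark makes.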

Code Snippets

Here's a little snippet of Erlang to make running the benchmark with different numbers of processes easier and dump data in a convenient format:

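-module(bmark).
-export([go/0]).

%% Number of schedulers, read from the first plain argument
%% (the value passed after "--" on the erl command line).
n() -> element(1,string:to_integer(
                  lists:nth(1,init:get_plain_arguments()))).

%% Plural suffix for the banner line.
plur(1) -> "";
plur(_) -> "s".

%% Run big:bang/1 once per process count in the list and print a row:
%% scheduler count, process count, and the big:bang/1 result divided
%% by 1,000,000 and truncated to one decimal place.
runbmark([]) -> done;
runbmark([Head|Tail]) ->
    io:format("~4w ~4w ~6.1f~n",
              [n(), 
               Head, 
               trunc(big:bang(Head)/100000)/10]),
    runbmark(Tail).

%% Sweep from 50 to 1500 processes in steps of 50, then exit.
go() ->
    N = n(),
    io:format("// Running with ~w scheduler~s.~n",
              [N,plur(N)]),
    runbmark(lists:seq(50,1500,50)),
    io:format("~n",[]),
    halt().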

And here's some bash to run 1, 2, 4, 8, and 16 schedulers in succession:

for ((i=0 ; i<5 ; ++i )); do \
 path/to/otp_src_R11B_2006-05-08/bin/erl -smp +S$((1<<$i)) \
-noshell -eval 'bmark:go()' -- $((1<<$i)); echo; done
