Actors, Scala, and JaCOb

Paul Brown @ 2006-07-13T08:06:04Z

LtU carried an announcement of a paper from Martin Odersky and Philipp Haller at EPFL called “Event-Based Programming without Inversion of Control”. To quote a snippet:

The central idea is as follows: An actor that waits in a receive statement is not represented by a blocked thread but by a closure that captures the rest of the actor's computation. The closure is executed once a message is sent to the actor that matches one of the message patterns specified in the receive. The executing closure is “piggy-backed” on the thread of the sender. If the receiving closure terminates, control is returned to the sender as if a procedure returns. If the receiving closure blocks in a second receive, control is returned to the sender by throwing a special exception that unwinds the receiver's call stack.
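As a rough illustration of the mechanism in that quote, here is a minimal single-threaded Java sketch. It is not the Scala library's API: `SketchActor`, `Suspend`, `receive`, and `send` are hypothetical names, message patterns are omitted (every message matches), and there is no synchronization.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from the Scala library.
class SketchActor {
    /** Control-flow exception that unwinds the receiver's stack back into send(). */
    static final class Suspend extends RuntimeException {
        Suspend() { super(null, null, false, false); } // skip the stack trace
    }

    private final Queue<String> mailbox = new ArrayDeque<>();
    private Consumer<String> continuation;  // the rest of the actor's computation
    private boolean running;                // true while a continuation executes
    final List<String> log = new ArrayList<>();
    Thread lastRunner;                      // thread that last ran a continuation

    /** Park the rest of the computation as a closure instead of blocking a thread. */
    void receive(Consumer<String> rest) {
        continuation = rest;
        if (running) throw new Suspend();   // nested receive: unwind to the sender
    }

    /** Deliver a message; a waiting continuation runs on the caller's thread. */
    void send(String message) {
        mailbox.add(message);
        if (running) return;                // sent from inside a continuation: the outer loop delivers it
        while (continuation != null && !mailbox.isEmpty()) {
            Consumer<String> k = continuation;
            continuation = null;
            running = true;
            try {
                lastRunner = Thread.currentThread();
                k.accept(mailbox.remove());
            } catch (Suspend unwound) {
                // the receiver parked itself in another receive; control is back with the sender
            } finally {
                running = false;
            }
        }
    }

    static SketchActor demo() {
        SketchActor actor = new SketchActor();
        actor.receive(first -> {
            actor.log.add("first: " + first);
            actor.receive(second -> actor.log.add("second: " + second));
            actor.log.add("never reached");  // receive() above threw Suspend
        });
        actor.send("ping");
        actor.send("pong");
        return actor;
    }
}
```

Calling `SketchActor.demo()` delivers two messages; both continuations run on the calling thread, and the statement after the nested `receive` is skipped because the exception has already unwound the stack.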

Scala has had actors in the form of scala.concurrent.Actor since version 1.x, but as extensions of java.lang.Thread they are thread-dependent. Thread-dependent means that a Scala Actor might be preferable to a Java Thread for semantic reasons but that there is no performance advantage. (In fact, a benchmark in the paper shows that it's a disadvantage, but semantics are often more important than raw performance.)

The PXE BPEL engine relies on a little-publicized Java-based actor framework called JaCOb (short for “Java Concurrent Objects”) that makes some different choices than Martin and Philipp did but that is ultimately based on hooking closures to model the receipt of messages in the future. (For a quick tour of JaCOb, there's either the source code or a detailed tutorial written by Matthieu Riou on the Apache Ode incubation wiki.) For one, JaCOb's syntax would be cleaner in Scala, as Java lacks delegates. JaCOb does not use exceptions to break the stack but instead relies on passing communication channels. (Using exceptions for control flow is generally regarded as a bad thing to do. In addition to the philosophical reasons, throwing an exception is involved and, depending on the JVM implementation (e.g., implementing exceptions as signals), can be expensive.) JaCOb doesn't include explicit support for distributed operation as Martin and Philipp describe for their framework, but a suitable flavor of JaCOb's soup could be implemented.
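To make the contrast with the exception-based approach concrete, here is a small Java sketch of the general idea of hooking closures to channels and returning normally. It is emphatically not JaCOb's API: `ChannelSketch`, `Channel`, `listen`, `send`, and `run` are hypothetical names, and the sketch allows only one listener per channel.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch of channel passing; this is not JaCOb's API.
class ChannelSketch {
    /** A channel is just an identity that listeners and senders agree on. */
    static final class Channel { }

    private final Map<Channel, Consumer<Object>> listeners = new HashMap<>();
    private final Map<Channel, Queue<Object>> pending = new HashMap<>();
    private final Queue<Runnable> reactions = new ArrayDeque<>();

    /** Hook a closure to the next message on a channel, then return normally. */
    void listen(Channel channel, Consumer<Object> closure) {
        Queue<Object> waiting = pending.get(channel);
        if (waiting != null && !waiting.isEmpty()) {
            Object message = waiting.remove();
            reactions.add(() -> closure.accept(message));
        } else {
            listeners.put(channel, closure);
        }
    }

    /** Pair a message with a waiting closure, or queue it until one appears. */
    void send(Channel channel, Object message) {
        Consumer<Object> closure = listeners.remove(channel);
        if (closure != null) {
            reactions.add(() -> closure.accept(message));
        } else {
            pending.computeIfAbsent(channel, c -> new ArrayDeque<>()).add(message);
        }
    }

    /** Run queued reactions one at a time; each starts on a fresh, shallow stack. */
    void run() {
        Runnable next;
        while ((next = reactions.poll()) != null) next.run();
    }

    static List<String> demo() {
        ChannelSketch soup = new ChannelSketch();
        List<String> log = new ArrayList<>();
        Channel request = new Channel();
        Channel reply = new Channel();
        // The server learns where to answer from the channel passed in the message.
        soup.listen(request, replyTo -> soup.send((Channel) replyTo, "pong"));
        soup.listen(reply, answer -> log.add("client got " + answer));
        soup.send(request, reply);
        soup.run();
        return log;
    }
}
```

Because every closure returns normally and the next reaction is pulled from a queue, no exception is needed to unwind the stack; the reply channel travels inside the request message.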

I'm looking forward to checking out the event-based approach from Martin and Philipp when it comes out with Scala 2.1.7, and in the meantime, I should finish up the JaCOb tutorial that's been mouldering on my to-do list.


PXE Going to Apache

Paul Brown @ 2006-02-16T06:10:00Z

Ismael pinged me today with an email that he sent to the Apache Incubator list:

[...] Our company [Intalio] would be interested in participating to the Ode project through a donation of the PXE BPEL 2.0 engine and the dedication of development resources to the project. [...]

There is already the Agila BPM project under incubation, and there was recently the contribution of a BPEL4WS 1.1 implementation by Sybase. It's my opinion that adding PXE to the mix should actually help clear things up and shorten the time to market for the combined effort — PXE is a complete implementation of the WS-BPEL 2.0 standard with a great team behind it, and it will round out the set of ideas and concepts that the Twister/Agila and Sybase codebases embody. Bill Flood, who represents Sybase on the OASIS BPEL committee, put it in perspective succinctly:

It's all good news for the community at large!

Populating a Java Object Model from XML

Paul Brown @ 2006-02-05T01:02:00Z

This post describes an approach to populating a Java object model from an XML document. It's an approach that I came up with when working on a particular parsing problem.

Updates: A couple of people have mentioned XStream and XMLBeans, but those fail my tests below. XStream is a serialization tool (as its docs say), and XMLBeans was chubby in terms of the size of the libraries. For what it's worth, if I were willing to suffer a large dependency, XMLBeans version 2 looks pretty good in that it provides a token-oriented interface and location information (via XmlLineNumber).

A closet full of clothes, and not a thing to wear...?

My self-imposed requirements were as follows:

  • Populate a pre-existing Java object model from SAX events.
  • Support multiple XML dialects mapping directly to a single object model.
  • Both the XML dialects and the object model are specified a priori.
  • Impose zero additional dependencies beyond SAX; ideally, the implementation will be just a fancy ContentHandler.
  • Expose SAX location information from the parse (e.g., line and column) to the target object model during construction.
  • Expose namespace context from the parse so that expressions like QNames and XPaths in attribute values can be properly post-processed.
  • Use programmatic configuration, not properties files or XML or annotations to the schema.
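Two of the requirements above (location information and namespace context) can be met with nothing but the SAX API that ships with the JDK. The following is a minimal sketch, not the PXE implementation: `LocatingHandler`, `resolve`, and the `type` attribute are hypothetical, and treating an unprefixed value as being in the default namespace is an assumption that will not suit every dialect.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each element with its line and any resolved "type" QName.
class LocatingHandler extends DefaultHandler {
    private Locator locator;
    private final Map<String, Deque<String>> prefixes = new HashMap<>();
    final List<String> seen = new ArrayList<>();

    @Override public void setDocumentLocator(Locator locator) {
        this.locator = locator;  // the parser hands this over before startDocument
    }

    @Override public void startPrefixMapping(String prefix, String uri) {
        prefixes.computeIfAbsent(prefix, p -> new ArrayDeque<>()).push(uri);
    }

    @Override public void endPrefixMapping(String prefix) {
        prefixes.get(prefix).pop();
    }

    /** Resolve a lexical QName from an attribute value against the in-scope prefixes. */
    QName resolve(String lexical) {
        int colon = lexical.indexOf(':');
        String prefix = colon < 0 ? "" : lexical.substring(0, colon);
        Deque<String> uris = prefixes.get(prefix);
        String uri = (uris == null || uris.isEmpty()) ? "" : uris.peek();
        return new QName(uri, lexical.substring(colon + 1), prefix);
    }

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        String type = atts.getValue("type");
        int line = locator == null ? -1 : locator.getLineNumber();
        seen.add(local + "@" + line + (type == null ? "" : " type=" + resolve(type)));
    }

    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            LocatingHandler handler = new LocatingHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real target object model would receive the line, column, and resolved names during construction rather than having them appended to a list of strings.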

For my particular application, the XML documents would be BPEL processes in either flavor (1.1 or 2.0), and the target object model would be PXE's BPEL Object Model or “BOM”. There were additional requirements around handling extensions, but those aren't directly relevant to the approach that I settled on.

Now, surely someone else has had the same or a similar set of requirements, more or less the same sensibilities, and the altruism to post it as open source...

Of the various approaches to XML binding (the bindmark project has a good list), I didn't find any that fit the requirements. Many tools generate an object model from a schema, and while the generated models don't usually meet my taste for API ergonomics (JAXB 1.0 had a particularly rank code smell when applied to the BPEL4WS 1.1 schema...), that would be one way to go if additional dependencies were acceptable. (The idea would be to use the generated object models as data transfer objects and then maintain multiple mappings onto the internal object model as domain objects.) JiBX looked particularly interesting, but it requires XPP3 and uses bytecode enhancement, which would rule out the simultaneous support for multiple XML dialects without intermediate object models. Digester has approximately the right flavor, but the target object model wasn't particularly JavaBean-ish and location information wasn't exposed.

One of the flaws in schema-driven bindings is that XML schema rarely (if ever) encapsulates all of the semantics of the XML language that it can be used to (loosely) validate, so automated or generated bindings do at most a partial job.

The Idea and Outcomes

So I came up with a different approach. The basic idea was to construct a graph of event consumers that closely resembles the grammar for the XML document and use SAX events to walk the graph. Each edge of the graph is decorated with a function that accepts a single SAX event and returns true or false, e.g., a QName with or without an attribute mask, or a non-whitespace characters event. The edges incident to a vertex are ordered, and events are matched (or not) according to the ordering.
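The graph walk can be sketched in a few lines of Java. This is an illustration under simplifying assumptions, not the bpel-parser code: `GrammarGraph`, `Event`, `Edge`, and `Node` are hypothetical names, `Event` is a stand-in for real SAX callbacks, and the actions that would build the object model along each edge are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: vertices with ordered, predicate-decorated edges.
class GrammarGraph {
    /** Simplified stand-in for a SAX event: kind is START, END, or TEXT. */
    static final class Event {
        final String kind, value;
        Event(String kind, String value) { this.kind = kind; this.value = value; }
    }

    static final class Edge {
        final Predicate<Event> matches;
        final Node target;
        Edge(Predicate<Event> matches, Node target) { this.matches = matches; this.target = target; }
    }

    static final class Node {
        final String name;
        final List<Edge> edges = new ArrayList<>();  // order matters: first match wins
        Node(String name) { this.name = name; }
        Node edge(Predicate<Event> matches, Node target) {
            edges.add(new Edge(matches, target));
            return this;
        }
    }

    static Predicate<Event> start(String name) {
        return e -> e.kind.equals("START") && e.value.equals(name);
    }
    static Predicate<Event> end(String name) {
        return e -> e.kind.equals("END") && e.value.equals(name);
    }
    static Predicate<Event> whitespace() {
        return e -> e.kind.equals("TEXT") && e.value.trim().isEmpty();
    }

    /** Follow the first matching edge for each event; fail when nothing matches. */
    static List<String> walk(Node start, List<Event> events) {
        List<String> path = new ArrayList<>();
        Node current = start;
        for (Event event : events) {
            Node next = null;
            for (Edge edge : current.edges) {
                if (edge.matches.test(event)) { next = edge.target; break; }
            }
            if (next == null) {
                throw new IllegalStateException(
                    "unexpected " + event.kind + " '" + event.value + "' at " + current.name);
            }
            path.add(next.name);
            current = next;
        }
        return path;
    }

    /** Toy grammar: a process element containing any number of empty invoke elements. */
    static Node toyGrammar() {
        Node doc = new Node("doc"), process = new Node("process");
        Node invoke = new Node("invoke"), done = new Node("done");
        doc.edge(start("process"), process);
        process.edge(whitespace(), process)
               .edge(start("invoke"), invoke)
               .edge(end("process"), done);
        invoke.edge(end("invoke"), process);
        return doc;
    }
}
```

An event that no edge accepts is a parse error at a known vertex, which is where location information from the parser becomes useful for error messages.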

From another perspective, this uses the XML parser like a lexer and the graph like a parser.

From yet another perspective, the idea is rather like Haskell's pattern matching, in which case the whole thing could be looked at as a collection of functions that accept a list of SAX events and return an object. Each function consumes the head event from the list, selects another function to pass the tail of the list to, and adds the result of the call to the current object. (The presumption is that objects know how to add various kinds of children or metadata to themselves.) Of course, Haskell wasn't an option. (And of the two projects named Jaskell, one has a few too many moving parts in the toolchain for my taste, and the other doesn't have pattern matching.)

My first-pass implementation in Java (PXE's bpel-parser module) did the job nicely but wasn't quite as pretty at the code level as I might have liked, as it required a good amount of boilerplate. That said, and in line with the lexer/parser observation above, the boilerplate and transition set could easily be generated from a RELAX NG grammar.

Considering that JAXB 2 looks slick, has a non-regressive license, will be part of both Java EE 5 and Java SE 6, and supports passing through some XML fragments in raw form, it would be a difficult call if I faced the same problem at present. (Like JAXB 1, JAXB 2 doesn't expose location information. It can be added to the XML document as content using some SAX tricks, but that's a hack.) That said, the need to support semantics beyond those present in the schema might very well drive me down the same path again.
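One way the SAX trick mentioned above might look is a filter that copies the parser's line number into an extra attribute on every element before the events reach the binding layer. This is a sketch under assumptions: `LocationFilter`, the `urn:example:location` namespace, and the `loc:line` attribute are all hypothetical, and a plain `DefaultHandler` stands in for whatever would consume the events downstream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: smuggles location information into the document as content.
class LocationFilter extends XMLFilterImpl {
    static final String LOC_NS = "urn:example:location";  // made-up namespace
    private Locator locator;

    @Override public void setDocumentLocator(Locator locator) {
        this.locator = locator;
        super.setDocumentLocator(locator);
    }

    @Override public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        AttributesImpl copy = new AttributesImpl(atts);
        if (locator != null) {
            copy.addAttribute(LOC_NS, "line", "loc:line", "CDATA",
                    Integer.toString(locator.getLineNumber()));
        }
        super.startPrefixMapping("loc", LOC_NS);  // declare the prefix for downstream consumers
        super.startElement(uri, local, qName, copy);
    }

    @Override public void endElement(String uri, String local, String qName) throws SAXException {
        super.endElement(uri, local, qName);
        super.endPrefixMapping("loc");
    }

    /** Parse a document through the filter and report what a downstream handler sees. */
    static List<String> linesOf(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            LocationFilter filter = new LocationFilter();
            filter.setParent(factory.newSAXParser().getXMLReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override public void startElement(String uri, String local, String qName,
                                                   Attributes atts) {
                    lines.add(local + ":" + atts.getValue(LOC_NS, "line"));
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }
}
```

The hack is visible in the sketch: the downstream model has to know about an attribute that was never in the document, which is a fair reason to prefer a handler that sees the `Locator` directly.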


Intalio Acquires FiveSight

Paul Brown @ 2005-12-08T07:02:00Z

On December 6, Intalio announced that it acquired FiveSight, the company that I founded in 1999 and then led through six years of ups and downs.

Steve Shu has a few comments on the FiveSight experience. (Steve has always been too humble for his own good; he was a vital member of the team from the moment he joined and played a key role in every win that the company had, including our relationship with Union Pacific and our distribution partner in Japan.) I can still remember Steve's first day on November 6, 2000. He showed up at my apartment in Chicago, and we went over FiveSight's "books" (an Excel sheet) at the kitchen table. That was just before FiveSight kicked off in earnest with the original crew of six in a big loft in downtown Chicago, complete with a coffee maker, a pool table, and a fish tank.

Looking back, I like to think of FiveSight like a piece of software. We had a pretty good v1.0 release with a good product and some great customers, all driven by the moxie to turn minimal funding and maximal ambition into a going concern. FiveSight v1.0 won awards for rapid growth (426% for 2002, not half bad...) from Deloitte & Touche and the Chicago Software Association. We hit growing pains in 2003, trying to get over the $3M/year revenue hump against an increasingly competitive Java EAI market and the IT spending doldrums that followed the bubble burst and the 9/11 attack.

PXE was FiveSight's v2.0 business. We took stock in the middle of 2003, ramped down FiveSight v1.0, and decided to bet on BPEL as a standard that would decouple design and protocols from execution. The first version of the BPM component for our previous platform morphed into Maciej's elegant design for the PXE execution core, and we were off to the races with an OEM-focused strategy that emphasized our strengths in low-level Java and high-level, targeted sales focused on the build/buy proposition. (I picked Sleepycat (for their OEM business) and Zope (for their community) as businesses to emulate, but I also started with the assumption that the market for BPEL execution and the overall market context would be completely different, so FiveSight v2.0 would have to find its own way.) Open source was part of the thinking from the beginning but not part of the public strategy until June of 2005, when we posted PXE as open source to accompany SUN's tooling demo at JavaOne.

I'll be watching Intalio's progress from my seat on the sidelines, but more on that later.


SUN Provides BPMN Tooling for PXE

Paul Brown @ 2005-11-26T07:01:00Z

Via Charles Dietzel and others, the upcoming Java Studio Enterprise contains BPMN tooling and integration with PXE! The getting started guide for BPEL functionality has some nice screen shots and a walkthrough of a classic travel reservation example process.
