Going Hiking with Google Earth

Paul Brown @ 2006-02-16T05:41:00Z

Now that I can run Google Earth on my Mac (finally), I tried looking up some of the more out-of-the-way places that I've been and was pleasantly surprised to find that trails in some of the National Parks (e.g., North Cascades National Park) are marked! It's hardly a replacement for a good topographical map, but it does help put things in proper perspective. For example, the mountain goats below are standing just about where the trail goes over the pass. (There were a bunch of marmots running around that day, too, but I didn't get good pictures of them.)

 
0 comments

Ph.D. Holding Me Down

Paul Brown @ 2006-02-13T22:07:00Z

From some spam today:

Have you ever thought that the only thing stopping you from a great job and better pay was a few letters behind your name?
0 comments

Zillow, Redfin, and Househunting

Paul Brown @ 2006-02-11T06:03:00Z

I finally checked out Zillow today after seeing the NYT article, and the accuracy and inaccuracy of the “Zestimate” surprised me. The details for our current house in Seattle were completely wrong (number of bedrooms, bathrooms, square footage, etc.), as was the last purchase price, but once the details were updated, the estimated value was within $1,000 of the price that we paid. Zillow wasn't out of stealth when we were house hunting, but we did make extensive use of Redfin to look at listings and past sales.

At least from where I sit, I'd use Zillow if I were selling a home and Redfin if I were buying one. Zillow would be useful for setting a price and assessing the values of improvements, although various Seattle-specific strategies (e.g., slight underpricing to encourage competition) would still apply to final pricing. In contrast to Zillow's approximations, Redfin draws straight from the MLS, and the sale prices and dates are dead-on. Redfin also includes the pictures from the MLS for current listings.

Commissions may decrease or switch to flat fees as home prices increase (and services like Redfin's “buy it online” may help force the issue), but I don't know if any amount of Web 2.0 juiciness is going to take away the value of the advocacy of a good realtor. (We had great experiences with Jenny Ames in Chicago and Melissa Veilleux-Kiser in Seattle.)

0 comments

First Steps

Paul Brown @ 2006-02-06T07:00:00Z

The kid took her first steps today, about eight of them, to walk from Hien to me. Even without us praising her (which we did immediately), she had a big smile on her face and a definite sense of accomplishment.

During the first few sleepless months, we smiled at and silently cursed anyone who claimed that children grow up quickly, but I'm starting to see what they mean.

2 comments

Populating a Java Object Model from XML

Paul Brown @ 2006-02-05T01:02:00Z

This post describes an approach to populating a Java object model from an XML document. It's an approach that I came up with when working on a particular parsing problem.

Updates: A couple of people have mentioned XStream and XMLBeans, but those fail my tests below. XStream is a serialization tool (as its docs say), and XMLBeans was chubby in terms of the size of the libraries. For what it's worth, if I were willing to suffer a large dependency, XMLBeans version 2 looks pretty good in that it provides a token-oriented interface and location information (via XmlLineNumber).

A closet full of clothes, and not a thing to wear...?

My self-imposed requirements were as follows:

  • Populate a pre-existing Java object model from SAX events.
  • Support multiple XML dialects mapping directly to a single object model.
  • Both the XML dialects and the object model are specified a priori.
  • Impose zero additional dependencies beyond SAX; ideally, the implementation will be just a fancy ContentHandler.
  • Expose SAX location information from the parse (e.g., line and column) to the target object model during construction.
  • Expose namespace context from the parse so that expressions like QNames and XPaths in attribute values can be properly post-processed.
  • Use programmatic configuration, not properties files or XML or annotations to the schema.
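
To make the location and namespace requirements concrete, here is a minimal sketch (not the PXE code) of a ContentHandler that captures the SAX Locator and the in-scope prefix mappings, then uses both when it sees a QName-valued attribute. The class name LocationAwareHandler, the Resolved holder, and the portType attribute are assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: expose line numbers and namespace context during a SAX parse. */
class LocationAwareHandler extends DefaultHandler {

    /** Illustrative holder for what a target object might receive. */
    static final class Resolved {
        final String namespace;
        final String localPart;
        final int line;
        Resolved(String namespace, String localPart, int line) {
            this.namespace = namespace;
            this.localPart = localPart;
            this.line = line;
        }
    }

    private Locator locator;
    // One stack of URIs per prefix, so nested redeclarations shadow outer ones.
    private final Map<String, Deque<String>> prefixes = new HashMap<>();
    final List<Resolved> resolved = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // the parser supplies this before startDocument
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        prefixes.computeIfAbsent(prefix, p -> new ArrayDeque<>()).push(uri);
    }

    @Override
    public void endPrefixMapping(String prefix) {
        prefixes.get(prefix).pop();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String value = atts.getValue("portType"); // assumed QName-valued attribute
        if (value == null) {
            return;
        }
        int colon = value.indexOf(':');
        String prefix = colon < 0 ? "" : value.substring(0, colon);
        Deque<String> uris = prefixes.get(prefix);
        String namespace = (uris == null || uris.isEmpty()) ? "" : uris.peek();
        int line = locator == null ? -1 : locator.getLineNumber();
        resolved.add(new Resolved(namespace, value.substring(colon + 1), line));
    }

    /** Parses the given XML text and returns the resolved QNames with their lines. */
    static List<Resolved> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefix-mapping events
            XMLReader reader = factory.newSAXParser().getXMLReader();
            LocationAwareHandler handler = new LocationAwareHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.resolved;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The line reported by a Locator is where the current event ends, so a start tag that spans several lines is attributed to its last line.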

For my particular application, the XML documents would be BPEL processes in either flavor (1.1 or 2.0), and the target object model would be PXE's BPEL Object Model or “BOM”. There were additional requirements around handling extensions, but those aren't directly relevant to the approach that I settled on.

Now, surely someone else has had the same or a similar set of requirements, more or less the same sensibilities, and the altruism to post it as open source...

Of the various approaches to XML binding (the bindmark project has a good list), I didn't find any that fit the requirements. Many tools generate an object model from a schema, and while the generated models don't usually meet my taste for API ergonomics (JAXB 1.0 had a particularly rank code smell when applied to the BPEL4WS 1.1 schema...), that would be one way to go if additional dependencies were acceptable. (The idea would be to use the generated object models as data transfer objects and then maintain multiple mappings onto the internal object model as domain objects.) JiBX looked particularly interesting, but it requires XPP3 and uses bytecode enhancement, which would rule out the simultaneous support for multiple XML dialects without intermediate object models. Digester has approximately the right flavor, but the target object model wasn't particularly JavaBean-ish and location information wasn't exposed.

One of the flaws in schema-driven bindings is that XML schema rarely (if ever) encapsulates all of the semantics of the XML language that it can be used to (loosely) validate, so automated or generated bindings do at most a partial job.

The Idea and Outcomes

So I came up with a different approach. The basic idea was to construct a graph of event consumers that closely resembles the grammar for the XML document and use SAX events to walk the graph. Each edge of the graph is decorated with a function that accepts a single SAX event and returns true or false, e.g., a QName with or without an attribute mask, or a non-whitespace characters event. The edges incident to a vertex are ordered, and events are matched (or not) according to the ordering.
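
A stripped-down sketch of the idea might look like the following. It is not the bpel-parser code: the class GraphWalker, its nested types, and the three-element toy grammar are all made up for illustration, and only start-element edges are modeled (no attribute masks or character events).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: SAX events walk a graph shaped like the grammar. */
class GraphWalker extends DefaultHandler {

    /** Minimal stand-in for a start-element event. */
    static final class Event {
        final String uri;
        final String localName;
        Event(String uri, String localName) {
            this.uri = uri;
            this.localName = localName;
        }
    }

    /** A vertex with an ordered list of outgoing edges (test plus target). */
    static final class Node {
        final String name;
        final List<Predicate<Event>> tests = new ArrayList<>();
        final List<Node> targets = new ArrayList<>();
        Node(String name) {
            this.name = name;
        }
        Node edge(Predicate<Event> test, Node target) {
            tests.add(test);
            targets.add(target);
            return this;
        }
    }

    /** Edge function that accepts a start element with the given local name. */
    static Predicate<Event> element(String localName) {
        return e -> localName.equals(e.localName);
    }

    private final Deque<Node> stack = new ArrayDeque<>();
    final List<String> trace = new ArrayList<>();

    GraphWalker(Node root) {
        stack.push(root);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        Node current = stack.peek();
        Event event = new Event(uri, localName);
        // Edges are tried in declaration order; the first match wins.
        for (int i = 0; i < current.tests.size(); i++) {
            if (current.tests.get(i).test(event)) {
                Node next = current.targets.get(i);
                trace.add(next.name); // a real consumer would build an object here
                stack.push(next);
                return;
            }
        }
        throw new SAXException("Unexpected <" + localName + "> in state " + current.name);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    /** Walks a toy grammar: process contains sequence; sequence contains invoke or sequence. */
    static List<String> walk(String xml) {
        Node start = new Node("start");
        Node process = new Node("process");
        Node sequence = new Node("sequence");
        Node invoke = new Node("invoke");
        start.edge(element("process"), process);
        process.edge(element("sequence"), sequence);
        sequence.edge(element("invoke"), invoke).edge(element("sequence"), sequence);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            GraphWalker walker = new GraphWalker(start);
            reader.setContentHandler(walker);
            reader.parse(new InputSource(new StringReader(xml)));
            return walker.trace;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the sequence vertex has an edge back to itself, the graph can have cycles wherever the grammar is recursive, and an element that matches no edge fails the parse at the point where it appears.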

From another perspective, this uses the XML parser like a lexer and the graph like a parser.

From yet another perspective, the idea is rather like Haskell's pattern matching, in which case the whole thing could be looked at as a collection of functions that accept a list of SAX events and return an object. Each function consumes the head event from the list, selects another function to pass the tail of the list to, and adds the result of the call to the current object. (The presumption is that objects know how to add various kinds of children or metadata to themselves.) Of course, Haskell wasn't an option. (And of the two Jaskells, one has a few too many moving parts in the toolchain for my taste, and the other doesn't have pattern matching.)

My first-pass implementation in Java (PXE's bpel-parser module) did the job nicely but wasn't quite as pretty at the code level as I might have liked, as it required a good amount of boilerplate. That said, and in line with the lexer/parser observation above, the boilerplate and transition set could easily be generated from a RELAX NG grammar.

Considering that JAXB 2 looks slick, has a non-regressive license, will be part of both Java EE 5 and Java SE 6, and supports passing through some XML fragments in raw form, it would be a difficult call if I faced the same problem at present. (Like JAXB 1, JAXB 2 doesn't expose location information. It can be added to the XML document as content using some SAX tricks, but that's a hack.) That said, the need to support semantics beyond those present in the schema might very well drive me down the same path again.
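
For the curious, one form the location hack could take is an XMLFilter that copies the Locator's line number into an extra attribute on every element before the events reach the binding layer. This is only a sketch under assumptions: the class name LineNumberFilter and the un-namespaced attribute name x-line are invented here, and the downstream schema or binding would have to tolerate the extra attribute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical sketch: smuggle line numbers into the event stream as attributes. */
class LineNumberFilter extends XMLFilterImpl {

    static final String LINE_ATTR = "x-line"; // assumed name, chosen for illustration

    private Locator locator;

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
        super.setDocumentLocator(locator);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Copy the attributes, append the line number, and pass the event downstream.
        AttributesImpl copy = new AttributesImpl(atts);
        if (locator != null) {
            copy.addAttribute("", LINE_ATTR, LINE_ATTR, "CDATA",
                    Integer.toString(locator.getLineNumber()));
        }
        super.startElement(uri, localName, qName, copy);
    }

    /** Runs the filter over the XML text and reports "element@line" for each element. */
    static List<String> lines(String xml) {
        final List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            LineNumberFilter filter = new LineNumberFilter();
            filter.setParent(factory.newSAXParser().getXMLReader());
            // Stand-in for the real downstream consumer (e.g., a binding framework's handler).
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes atts) {
                    seen.add(localName + "@" + atts.getValue(LINE_ATTR));
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```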

6 comments

Fifteen Minutes to Burn

Paul Brown @ 2006-01-31T17:41:00Z

I would prefer to blow my 15 minutes all at once on something major, but here's a snap that the barista at the local Tully's took of Pat Helland and me on our way back from some pho:

0 comments

Modulus is Versatile

Paul Brown @ 2006-01-30T17:10:00Z

I recently came across a post to a mailing list that asked how to truncate a Java double to two decimal places. The usual answers, either using a formatter and parser or multiplying by 100 and using floor() and then dividing by 100, will work, but there's an even simpler one:

double x = 12345.6789d;
double y = x - (x % 0.01);

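As expected, y is 12345.67, to within floating-point rounding (0.01 has no exact binary representation, so the last few bits can differ).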
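
One caveat: because 0.01 is not exactly representable as a double, the remainder carries a small error, and an input that already sits on (or very near) a multiple of 0.01 may come out slightly off. When exact two-place truncation matters, a BigDecimal with RoundingMode.DOWN is a safer route. The sketch below puts the two side by side; the class and method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical sketch comparing the modulus trick with exact decimal truncation. */
class Truncation {

    /** The modulus trick: result is a double, so it is only approximately two-place. */
    static double viaModulus(double x) {
        return x - (x % 0.01);
    }

    /** Exact truncation toward zero at two decimal places. */
    static BigDecimal viaBigDecimal(double x) {
        // BigDecimal.valueOf uses the double's canonical string form (Double.toString).
        return BigDecimal.valueOf(x).setScale(2, RoundingMode.DOWN);
    }

    public static void main(String[] args) {
        double x = 12345.6789d;
        System.out.println(viaModulus(x));
        System.out.println(viaBigDecimal(x));
    }
}
```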

0 comments
