Scripting for the Cloud

Paul R. Brown @ 2007-12-18T20:50:59Z

Paul Fremantle posted a brief analogy between classic UNIX pipelines and scripting for web services, and I posted a comment about some work going on with Ode that deserves a somewhat broader audience. Matthieu, Tammo, et al. are working on a JavaScript-y language called SimPEL that maps to BPEL. They're making good progress, and the straw man examples are looking good, both for terseness and for legibility. It's still a way off, but it's on its way.

XML languages are known to be awful (hard on the hands, hard on the eyes, underspecified by document- or data-oriented schemata), and an orchestration DSL that maps onto BPEL has been on various wish lists for a while. (See, e.g., Brian's thought.) WSDL and XSLT should also be on the chopping block, and there are at least a few general approaches to abbreviating an XML syntax from which to draw inspiration. (See, e.g., RELAX NG compact syntax and Tom Moertel's PXSL.)


Move, Rename, and Ne'er the Twain Shall Meet

Paul R. Brown @ 2007-12-12T20:27:05Z

The mv command in Unix has some subtlety to it in that it either moves or renames the file in question (never mind that it uses rename(2) to do it), but that subtlety is a good thing in that mv usually does what you expect it to do.

Not so with IntelliJ. I use IntelliJ for Java development, and while I do occasionally dally with Eclipse or NetBeans, I always come back. Anyway, with a bunch of packages to rearrange across a multi-module project, I tried using the "move" refactor, e.g., to change com.foo.x to com.foo.y. Contrary to my expectations, this produces com.foo.y.x, and because I didn't notice immediately, I ended up with goofy package names like com.bar.impl.impl.impl. ((in classic Monty Python falsetto) That doesn't have much impl in it...) Grrrr.

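A little command-line action will get things cleaned up in terms of file layout, e.g.:

find src -type f -path '*/impl/impl/*' \
| sed -E 'h;s:(impl/)+:impl/:;x;G;s:\n: :' \
| xargs -n 2 mv

The h, x, and G commands are a recipe for echoing the original path as the first hunk of the line: h saves it in the hold space, the substitution collapses the repeated impl/ run, and x followed by G puts the original back in front of the cleaned-up copy, so that each line reads "old new" for mv. (This assumes no whitespace in the paths.)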
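A dry run is a cheap safeguard before any bulk mv. The sketch below prints the old and new path for each affected file without moving anything. The function names collapse_impl and preview_moves are hypothetical, and the sketch assumes paths contain no whitespace and that only runs of impl/ need collapsing.

```shell
# collapse_impl (hypothetical helper): print "old new" for a single path,
# collapsing any run of repeated impl/ components into a single impl/.
collapse_impl() {
  printf '%s\n' "$1" | sed -E 'h;s:(impl/)+:impl/:;x;G;s:\n: :'
}

# preview_moves (hypothetical helper): list the planned moves under a
# source root (default: src) without touching any files.
preview_moves() {
  find "${1:-src}" -type f -path '*/impl/impl/*' | while IFS= read -r path; do
    collapse_impl "$path"
  done
}
```

Running preview_moves src and eyeballing the pairs first makes it less likely that a slip in the sed expression turns into a second round of cleanup.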


Just a Scoatch More Memory

Paul R. Brown @ 2007-12-12T20:23:05Z

After I saw a post about a WTF-worthy comment a while back, I had meant to post this gem, which I'll paraphrase from memory:

foo *f;
f = (foo *) malloc(sizeof(foo)+4); // seems about right

The comment was the tip that code elsewhere was abusing the allocated memory, and the developer had added the padding to avoid segmentation faults...

There were quite a few similar gems lurking around in that particular app, like a catch block wrapped around printing "Whoops!" to stdout, which wouldn't be so bad if it hadn't been cut-and-pasted all over the place, making it difficult to determine where a "Whoops!" on stdout for the app actually came from...


Analyst Predictions, Pricing, and Open Source

Paul Brown @ 2006-11-02T21:25:48Z

Dave Rosenberg, the CEO of the recently funded MuleSource, wrote an op-ed for Sandhill.com about wrapping a business around the Mule project. (And, of course, way to go Ross! Another great project incubated at the Codehaus.) Two of Dave's comments struck a chord with me, since I'd gone over the same ground many times over the six years I spent on FiveSight.

On market sizing:

Market data is fairly easy to find. Analyst firms such as Gartner and IDC frequently publish data that provide a base for your research. In our case we looked at the broadest market opportunity for our product over the next five years. We were able to determine that the aggregate market was $8.5 billion.

Of course, looking at their track records in terms of correct predictions, you realize that folks like Gartner, Forrester, and IDC are usually wrong, and most VCs, being smart folks, know this, too. Moreover, injecting change into a market will change the size and dynamics of the market, and exerting pressure on incumbents will cause them to change their tactics. As an example, one of the things that we ran into was "red" (Oracle) or "blue" (IBM) companies where all-you-can-eat licensing combined with single-sourcing initiatives made selling into those companies impossible, and backing those numbers out of broad predictions wasn't practical without gathering new data. Selection, adoption, and installation cycles further complicate making naive estimates, and this is especially true if your revenue is concentrated in one phase of the customer lifecycle (e.g., up front with training or in back with production support). Nonetheless, the market size slide with at least a $1B market is part of the obligatory small talk that's part of any funding pitch; I just have trouble doing it with a straight face.

On the other hand, you can attack the sizing challenge from the bottom up, and Dave hints at that — the community around the project is all of the data you need. What's the composition of your community? At what rate and under what conditions are community participants converted to customers? What products are people asking for? Part of the beauty and magic of open source is that your customers come to you, and preserving the polarity of that relationship is important. The community activity is really the first stage in the sales funnel for an open source customer, and you can choose the levers that you want to pull (evangelism, training, partnering) to alter the rate and composition of the flow. My preference would be to use this approach combined with rough market segment sizes (number of servers, etc.) to build projections. In FiveSight's case (BPEL execution component), the data told us that our product offering didn't have the breadth (e.g., full BPM platform, full ESB, full integration product) to appeal to a large market but that we could build a profitable OEM-oriented business, and that's the direction that led us to an (intended and expected) exit via acquisition.

On pricing:

Pricing remains one of the great mysteries of any business. Open source companies tend to look at the cost of their nearest competitor and price their offering at some percentage discount.

From my perspective, open source is a method of packaging and delivery of software, which if you're not selling the software, is irrelevant to pricing. Services, support, training, and access to information have the same value as they do for "proprietary" vendors, and the fine line for the open source vendor to walk is in charging for access to information while reinforcing and nurturing the community.


Doneness and the Boundary Between Development and Operations

Paul Brown @ 2006-09-10T19:33:00Z

Engineers have a variety of perspectives on ready-for-production, from "It works on my desktop" to "The functional and integration suites all pass" to "The theorem prover and state space traversal heuristics assert that it's perfect". Ultimately, "Are we ready to go to production?" is the wrong question. The right questions are more business-oriented:

  • How would it impact the customer and what would it cost us to flip the switch right now?
  • How can we respond to issues and get the system back on its feet?

The first question is for the business owner to answer outside of the context of what the development team has to say about the state of the system. Failure is a fact, and managing the probability of a failure requires time and investment. It's the business owner's prerogative to choose the points on the curves of magnitude of customer impact and ongoing investment in development, independent of the asserted outcome of the development effort. It's a bit like cooking a steak à point in that you want the software to be just barely done.

The second question is for development leads and architects to answer, both at design time and during implementation. Recovery options need to be matched to failure modes and possible inconsistencies, and failures need to be detectable other than via confused or angry customers. The design-time imperative is to pick the right design patterns — stateless when possible, idempotent always, and minimal horizontal co-footprint per unit of traffic come to mind. (I'm saying "co-footprint" because "footprint" isn't quite the right term. A customer spread across a couple of tables is no big deal, but needing to have all of the customers in a single database is constraining.) The development solution is to engineer-in instrumentation and plug-points where data or operations can be cleanly injected into the system, ideally to the point that an operator can use the system in the same ways that the system uses itself. (The body of thought on recovery-oriented computing is a good read on the subject of designing systems so that failures are just bumps in the road.)
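As one illustration of "idempotent always," the sketch below makes a credit operation safe to replay, whether the retry comes from the system itself or from an operator re-injecting a request. Everything in it (apply_credit, the request ids, the file-based ledger) is a hypothetical stand-in rather than anything from a real system, and a production version would need the marker and the balance update to commit together.

```shell
# Hypothetical file-based ledger, for illustration only.
STATE_DIR=$(mktemp -d)
BALANCE_FILE="$STATE_DIR/balance"
echo 0 > "$BALANCE_FILE"

# apply_credit REQUEST_ID AMOUNT
# Applies the credit at most once per request id and prints the balance.
apply_credit() {
  request_id=$1
  amount=$2
  # mkdir fails if the marker already exists, so only the first
  # delivery of a given request id gets to change the balance.
  if mkdir "$STATE_DIR/seen.$request_id" 2>/dev/null; then
    balance=$(cat "$BALANCE_FILE")
    echo $((balance + amount)) > "$BALANCE_FILE"
  fi
  cat "$BALANCE_FILE"
}

apply_credit req-1 10   # first delivery: prints 10
apply_credit req-1 10   # replay of the same request: still prints 10
```

Because a replay is a no-op, an operator recovering from a failure can resubmit anything in doubt instead of first working out which requests had already landed.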

In all practical terms, the developers are the only capable and knowledgeable operators for the system, and refining and documenting the system to the point that first-tier and second-tier support can exist is an organic process that needs to happen in response to observed issues, not a phase on a project plan. To extend the steak analogy, this would be bien cuit, which is only suitable as a chew toy for a dog or for the manufacture of footwear. In my mind, intimacy with the system is the bond that links development to operations, but just handing out pagers to software engineers without the proper groundwork is both illegitimate and destructive.

In a maxim, system failures are to be expected, but business failures are unacceptable.
