Crunching Java Class Versions with bash-fu

Paul R. Brown @ 2010-08-20T22:59:49Z

I recently needed to root out the JDK 6 classes lurking in an application that was supposed to run on JDK 5, and it turns out that it's not that difficult with a little bash-fu. After unpacking all of the constituent JAR files:

$ find . -name '*.class' | tee -a classes | xargs -n 1 head -n 1 | \
  cut -b 8 | xargs -IX printf '%d\n' "'X" | \
  paste -d ' ' - classes | grep '^50'

Et voilà! I have the culprit:

50 ./jlayer-1.0.1.jar/javazoom/jl/converter/Converter$PrintWriterProgressListener.class
50 ./jlayer-1.0.1.jar/javazoom/jl/converter/Converter$ProgressListener.class
[...]

A rebuild of the JLayer library, and all's well again.
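
The pipeline works because a class file starts with 4 bytes of magic number, then 2 bytes of minor version, then 2 bytes of major version, so byte 8 is the low byte of the major version, and 50 is the major version that JDK 6 emits. A minimal sketch of the same check using od in place of the head/cut/printf trick is below. The function name class_major is made up for illustration, and the sketch assumes a POSIX od and well-formed class files of at least 8 bytes.

```shell
# Print the major version number of a Java class file.
# Layout: bytes 1-4 magic (CAFEBABE), bytes 5-6 minor, bytes 7-8 major.
class_major() {
  # Skip 6 bytes, read 2, and print each as an unsigned decimal byte;
  # word splitting puts the two values into $1 and $2.
  set -- $(od -An -j6 -N2 -tu1 "$1")
  echo $(( $1 * 256 + $2 ))
}

# Example use (not run here): list every class compiled for JDK 6.
#   find . -name '*.class' | while IFS= read -r f; do
#     printf '%s %s\n' "$(class_major "$f")" "$f"
#   done | grep '^50 '
```

Reading both bytes rather than just the low one makes no difference for real-world version numbers, but it avoids depending on how printf interprets a raw character.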

Come Work for Me

Paul R. Brown @ 2010-05-12T09:39:56Z

Multifarious has been running comfortably and profitably for the past couple of years with just me and the occasional subcontractor, but it's time to grow the business: I'm looking for an engineer to come and work for me full-time.

The basic "win" for the position is obvious — great pay, great environment, challenging work — but the bigger picture should be compelling as well. As I described last year, Multifarious is intended to be a springboard for an ongoing series of business experiments, and this is a place where the right candidate can gain knowledge and experience that they wouldn't otherwise have access to in a purely engineering context.

Splitting XML Well with XSLT 2

Paul R. Brown @ 2009-09-30T18:25:32Z

I recently had the need to split up a result set from a Solr query into a collection of smaller groups of add requests for POSTing into a different core. There are some ways to make the split work with text processing tools (split and friends), but it's always an open question whether an ad hoc approach will trip over some markup — it's just better to use XML tooling. By no coincidence (based on features missing from ), XSLT 2 makes it easy to do the right thing.

First up is grouping in chunks of 2000 records:

<xsl:for-each-group select="/response/result/doc"
                    group-by="(position() - 1) idiv 2000">
...
</xsl:for-each-group>

Outputting each hunk to a file named for the index of the group is also a one-liner:

<xsl:result-document href="{current-grouping-key()}_out.xml">
  <add>
    <xsl:for-each select="current-group()">
      <doc>
        <xsl:apply-templates />
      </doc>
    </xsl:for-each>
  </add>
</xsl:result-document>

And that's it. The only trick is choosing an XSLT processor, and the superlative Saxon (from Saxonica) is my default choice.
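
One thing worth checking with a position-based grouping key is where the chunk boundaries actually fall. The sketch below is not XSLT. It is a small shell/awk stand-in (the function name group_sizes is made up for illustration) that counts how many records land in each group when the key is round(position() div size) versus (position() - 1) idiv size. XPath's round() rounds halves up, which the awk expression int(x + 0.5) mimics for positive values.

```shell
# Tabulate how many of `total` records fall into each group for a key formula.
#   mode "round": stands in for XPath round(position() div size)
#   mode "idiv":  stands in for XPath (position() - 1) idiv size
group_sizes() {
  awk -v mode="$1" -v total="$2" -v size="$3" 'BEGIN {
    for (p = 1; p <= total; p++) {
      key = (mode == "round") ? int(p / size + 0.5) : int((p - 1) / size)
      count[key]++
      if (key > max) max = key
    }
    # Emit "key:count" pairs separated by single spaces.
    for (k = 0; k <= max; k++) printf "%s%d:%d", (k ? " " : ""), k, count[k]
    print ""
  }'
}

group_sizes round 10 4   # prints 0:1 1:4 2:4 3:1
group_sizes idiv 10 4    # prints 0:4 1:4 2:2
```

With ten records and a chunk size of 4, the round-based key gives groups of 1, 4, 4, and 1, while the idiv-based key gives 4, 4, and 2. The same shift shows up at a chunk size of 2000: the round-based key yields a first group of 999 records.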

Commandline Puzzler

Paul R. Brown @ 2009-09-25T19:39:39Z

Suppose that you have two files that consist of records, one per line, and you want to ensure that none of the records in the second file appear in the first. How do you do it with only the text processing commandline tools commonly available on *nix systems?
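
One possible answer, offered as a sketch rather than as the intended solution: treat every line of the second file as a fixed string that must match a whole line, and print only the lines of the first file that match none of them. The function name without_records is made up for illustration, and the sketch assumes a grep that treats an empty pattern file as matching nothing (GNU grep does).

```shell
# Print the records of file $1 that do not appear in file $2.
#   -F  treat patterns as fixed strings, not regular expressions
#   -x  a pattern must match the whole line, so "a" does not remove "ab"
#   -v  invert: keep only the lines that match no pattern
#   -f  read the patterns, one per line, from file $2
without_records() {
  grep -v -x -F -f "$2" -- "$1"
}

# Example use (not run here): rewrite first.txt without the records in second.txt.
#   without_records first.txt second.txt > first.filtered
```

This keeps the order and any duplicates of the first file intact, which a sort-based approach using comm would not.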

NoNoSQL

Paul R. Brown @ 2009-09-24T05:38:08Z

Ben Black, one of the organizers of the no:sql(east). conference, tweeted, and I twote:

my current vote for renaming #nosql is #altdb. what are your ideas?

Chris Williams, another no:sql(east). organizer, has had similar sentiments, but what really needs to happen is for people to stop using the "NoSQL" term. I originally proposed "dbng" for next-generation database (and with an intended allusion to RELAX NG), but I'm warming up to Ben Black's suggestion of "altdb" for the hint of Usenet alt.* if nothing else.

I propose a new movement called the NoNoSQL movement. It is a movement for those interested in alternative and next-generation databases but not in the inaccurate "NoSQL" neologism.

Seems like some cool altdb schwag (t-shirts, mugs, etc.) is in order — "Not your daddy's database." or "Joiners need not apply." or...

Voldemort-Based Twitter Clone Talk at OSCON

Paul R. Brown @ 2009-07-24T00:08:00Z

Dan and I just finished up our talk at OSCON. You can download the slides or view them on Slideshare. I'll probably take it down at some point in the near future, but the sample system from the presentation is up and running for the moment.

We got started on the material for the talk several months back with the Twitter one-to-many publishing problem as a motivating problem to play with various non-relational data stores, and after some dabbling with Cassandra and HBase, we ended up focusing on Voldemort as an initial backend for the system. It is very likely that we'll craft some additional backends, and I'd particularly like to get to a more forgiving model for storing lists. (I'm already part way there on Dynomite with Osmos as the storage engine.)

The system described in the talk uses a small (two nodes) Voldemort cluster and a small cluster of web nodes (JAX-RS with a jQuery front-end) to implement enough microblogging functionality to be interesting — users, follow/followed, publishing — along with a simple dashboard implemented with Cacti and rudimentary deployment automation. The source is out on GitHub if you want to take a look. (Feel free to fork with it...)

[dashboard snapshot]

Dan's blog entry on the presentation is here.

If you have nothing to say, say nothing

Paul R. Brown @ 2009-06-05T20:18:16Z

There is never a good reason to announce that you're going to make an announcement. This rule came to mind when I saw this tweet scroll by this morning:

[screenshot of tweet]

This belongs in the same category of non-actions as a blog post to say you haven't been blogging, telling people about your "stealth" startup, or a statement like "with all due respect".
