Apache2 + mod_fcgid + Rails

Paul Brown @ 2006-11-05T20:39:46Z

With a default Typo configuration running on lighttpd, the server was writing more to the error log than to the access log (something about "missing cleanup"), so I finally got around to migrating over to apache2 and mod_fcgid. (The short-term solution was a couple of cron jobs to keep the lighttpd process up.) The usual configuration work (see, e.g., James Duncan Davidson's write-up) was a dead end, but turning up the logging verbosity on the apache2 instance revealed the problem:

[Sun Nov 05 18:13:19 2006] [info] mod_fcgid: read data timeout in 5 seconds

Initialization for one of the processes takes way longer than that, so putting some more sane values into the mod_fcgid configuration along the lines of what Spencer Miles used was the solution. At least so far, the apache2 and mod_fcgid combination is much snappier and more stable than the lighttpd configuration.

Sure, it runs like a top. How does it idle?

Paul Brown @ 2006-11-04T03:59:00Z

A couple of weeks back, I wrote a simple but well-instrumented Java framework to handle SEDA-like use cases (thread pools linked by queues) for a consulting customer. The java.util.concurrent package and friends make this sort of thing much easier than it used to be, and it was surprisingly easy to crank it out. As a smoke test, I set up a torture test for a simple configuration and left it running overnight, and it appeared solid: no memory or thread leaks, no lock-ups.

Someone working on a different problem needed something similar and took the framework for a quick test drive, ending up with an out of memory error after a night of doing nothing! It turns out that there was a bug that meant that the poll(long,TimeUnit) on an empty LinkedBlockingQueue leaks, and I'd never run across it in testing because I hadn't tested what happens when the system has no load for an extended period.

The lesson is that no load doesn't mean that the system is actually doing nothing, and it's an important scenario to add to a test plan.

Analyst Predictions, Pricing, and Open Source

Paul Brown @ 2006-11-02T21:25:48Z

Dave Rosenberg, the CEO of the recently funded MuleSource, wrote an op-ed for Sandhill.com about wrapping a business around the Mule project. (And, of course, way to go Ross! Another great project incubated at the Codehaus.) Two of Dave's comments struck a chord with me, since I'd gone over the same ground many times over the six years I spent on FiveSight.

On market sizing:

Market data is fairly easy to find. Analyst firms such as Gartner and IDC frequently publish data that provide a base for your research. In our case we looked at the broadest market opportunity for our product over the next five years. We were able to determine that the aggregate market was $8.5 billion.

Of course, looking at their track records in terms of correct predictions, you realize that folks like Gartner, Forrester, and IDC are usually wrong, and most VCs, being smart folks, know this, too. Moreover, injecting change into a market will change the size and dynamics of the market, and exerting pressure on incumbents will cause them to change their tactics. As an example, one of the things that we ran into was "red" (Oracle) or "blue" (IBM) companies where all-you-can-eat licensing combined with single-sourcing initiatives made selling into those companies impossible, and trying to back those numbers out of broad predictions was pretty much impossible without gathering new data. Selection, adoption, and installation cycles further complicate naive estimates, and this is especially true if your revenue is concentrated in one phase of the customer lifecycle (e.g., up front with training or in back with production support). Nonetheless, the market size slide with at least a $1B market is part of the obligatory small talk of any funding pitch; I just have trouble doing it with a straight face.

On the other hand, you can attack the sizing challenge from the bottom up, and Dave hints at that — the community around the project is all of the data you need. What's the composition of your community? At what rate and under what conditions are community participants converted to customers? What products are people asking for? Part of the beauty and magic of open source is that your customers come to you, and preserving the polarity of that relationship is important. The community activity is really the first stage in the sales funnel for an open source customer, and you can choose the levers that you want to pull (evangelism, training, partnering) to alter the rate and composition of the flow. My preference would be to use this approach combined with rough market segment sizes (number of servers, etc.) to build projections. In FiveSight's case (BPEL execution component), the data told us that our product offering didn't have the breadth (e.g., full BPM platform, full ESB, full integration product) to appeal to a large market but that we could build a profitable OEM-oriented business, and that's the direction that led us to an (intended and expected) exit via acquisition.

On pricing:

Pricing remains one of the great mysteries of any business. Open source companies tend to look at the cost of their nearest competitor and price their offering at some percentage discount.

From my perspective, open source is a method of packaging and delivery of software, which if you're not selling the software, is irrelevant to pricing. Services, support, training, and access to information have the same value as they do for "proprietary" vendors, and the fine line for the open source vendor to walk is in charging for access to information while reinforcing and nurturing the community.

Hack for Remotely Quitting Apps on MacOS X

Paul Brown @ 2006-11-01T16:15:00Z

Every so often, I'm away from my primary home machine (accessible via secure shell on a non-standard port) and I want to shut down Mail or Adium or RemoteDesktop or some other application. The following hack provides a more graceful shutdown than a kill from the command line:

$ osascript
tell application "Mail"
        quit
end
<CTRL-D>

Where the "Mail" could be "Adium" or whatever.

Solitaire Cipher in Haskell

Paul Brown @ 2006-10-25T23:59:00Z

Jim Burton started a thread on haskell-cafe about working the Ruby Quiz problems in Haskell, and I decided to give it a go. I can't say that I'll work them all, but here's my solution to the first problem — implementing Bruce Schneier's Solitaire encryption algorithm. Among other things, a solution provides a quick walk-through of using Haskell's built-in Enum classes and list operations.

Step 1: A Deck of Cards

One of the ingredients for the cipher is a deck of 52 cards, numbered bridge-style from the ace of clubs through the king of spades and then followed by two jokers with suits "A" and "B". I'd like to implement the deck as a 2-tuple of a suit Enum, where the two jokers come from different suits, and a face Enum, like so:

data Suit = Clubs | Diamonds | Hearts | Spades | A | B
            deriving (Enum, Show, Bounded, Eq)

data Face = Ace | Two | Three | Four | Five | Six | Seven 
          | Eight | Nine | Ten | Jack | Queen | King | Joker
            deriving (Enum, Show, Bounded, Eq)

The "deriving" expression is worth some explanation after a 30-second, 30,000-foot look at Haskell's type system. A class in Haskell is a set of assertions of the form "there exists a function f with signature..." and potentially some default definitions, and a type can be an instance of the class if it has functions that meet the assertions. For example, the Eq class is defined:

class Eq a where
    (==), (/=) :: a -> a -> Bool

    x /= y = not (x == y)
    x == y = not (x /= y)

For a given type that would play the role of the a, it's up to the implementer to supply (==) and (/=) functions with the correct signatures. The second and third statements mean that if the implementer only defines one of the two, the other is defined in the standard way. Nonetheless, the precise semantics of the functions — e.g., whether == remotely resembles "equals" or whether x==y implies not(x/=y) — are up to the implementer.
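
As a small illustration of those defaults, here is a sketch with a made-up Color type (it is not part of the cipher code). Only (==) is written by hand, and (/=) falls back to the default definition from the class:

```haskell
-- A hypothetical two-color type, used only for illustration.
data Color = Red | Green

-- Only (==) is supplied; (/=) comes from the class default.
instance Eq Color where
    Red   == Red   = True
    Green == Green = True
    _     == _     = False

-- Relies on the defaulted (/=).
different :: Color -> Color -> Bool
different x y = x /= y
```

Nothing stops a hand-written instance from defining an (==) that behaves strangely, which is the point about semantics being left to the implementer.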

Back to the Suit and Face enumerated type definitions, the deriving tells Haskell that the type is an instance of the listed classes by inheriting default implementations. In simplest terms:

  • An instance of Enum has (at least) functions that convert from and to integer indices.
  • An instance of Bounded has a least element and a greatest element.
  • An instance of Show has a function to convert to a String.
  • An instance of Eq has (at least) an == operator.

(The links above are to the Zvon Haskell reference.) Haskell supplies these functions by numbering the enumerated elements starting at 0. A quick example with ghci:

*Main> Ace
Ace
*Main> succ Ace
Two
*Main> succ it
Three
*Main> fromEnum Queen
11
*Main> Ace == Two
False

(In ghci, it refers to the last result.)
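
The Bounded instance doesn't show up in that session, so here is a minimal sketch of what it buys you. The names allSuits and suitIndices are mine, chosen for illustration:

```haskell
data Suit = Clubs | Diamonds | Hearts | Spades | A | B
            deriving (Enum, Show, Bounded, Eq)

-- Bounded supplies minBound and maxBound; Enum supplies the range notation.
allSuits :: [Suit]
allSuits = [minBound .. maxBound]

-- The derived Enum numbers the constructors from 0 in declaration order.
suitIndices :: [Int]
suitIndices = map fromEnum allSuits
```

The type annotation on allSuits is what tells Haskell which minBound and maxBound to use.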

Now, with a little more effort, we can create a Card type that enumerates the deck as tuples of (Suit,Face), except that we want to supply a custom enumeration, either using dictionary ordering for a normal card or a custom index for the jokers:

data Card = Cd Suit Face
          deriving Eq

As above, this means that Haskell will supply an == for us, and it's important to have, e.g., to use functions like elemIndex:

elemIndex :: Eq a => a -> [a] -> Maybe Int

I'll come to the Maybe type below, but the Eq a => means that the a in the definition must be an instance of Eq. Next up are a couple of convenience functions to access the components of a Card:

suit :: Card -> Suit
suit (Cd s _) = s

face :: Card -> Face
face (Cd _ f) = f
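
To make the elemIndex signature above a little more concrete, here is a minimal sketch using plain characters rather than cards. The helper names rankIndex and describe are hypothetical:

```haskell
import Data.List (elemIndex)

-- Position of a character in the rank string, if it is there at all.
rankIndex :: Char -> Maybe Int
rankIndex c = elemIndex c "A23456789TJQK"

-- Unwrapping the Maybe forces both outcomes to be handled.
describe :: Char -> String
describe c = case rankIndex c of
    Just i  -> c : " is at index " ++ show i
    Nothing -> c : " is not a rank"
```

Char is an instance of Eq, so it satisfies the Eq a => constraint; a type without an == would be rejected at compile time.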

The Solitaire cipher imposes the bridge dictionary ordering on the deck with the A Joker and B Joker coming after the king of spades in the default order. So, the instance declaration that makes Card into an Enum:

instance Enum Card where
    toEnum 53 = (Cd B Joker)
    toEnum 52 = (Cd A Joker)
    toEnum n = let  d = n `divMod` 13
               in Cd (toEnum (fst d)) (toEnum (snd d))
    fromEnum (Cd B Joker) = 53
    fromEnum (Cd A Joker) = 52
    fromEnum c = 13* fromEnum(suit c) + fromEnum(face c)

Among other things, making Card an instance of Enum means that the arithmetic sequence notation .. can be used to construct ranges, so the whole deck would be:

[(Cd Clubs Ace) .. (Cd B Joker)]

Note that typing this into ghci will result in an error. The type doesn't implement Show, so Haskell doesn't know how to display the elements of the list. This is easy enough to fix up:

show_suit :: Suit -> String
show_suit s = (take 1) (show s)

show_face :: Face -> String
show_face f = (take 1) (drop (fromEnum f) "A23456789TJQK$") 

instance Show Card where
    show c = (show_face (face c)) ++ (show_suit (suit c))

Now we can get a look at our deck:

*Main> [(Cd Clubs Ace) .. (Cd B Joker)]
[AC,2C,3C,4C,5C,6C,7C,8C,9C,TC,JC,QC,KC,
 AD,2D,3D,4D,5D,6D,7D,8D,9D,TD,JD,QD,KD,
 AH,2H,3H,4H,5H,6H,7H,8H,9H,TH,JH,QH,KH,
 AS,2S,3S,4S,5S,6S,7S,8S,9S,TS,JS,QS,KS,
 $A,$B]

(The linebreaks are added.) We're almost done, but the Solitaire cipher assigns different values to the cards than our enumeration does, so we wrap that up in a function:

value :: Card -> Int
value (Cd B Joker) = 53
value c = fromEnum c + 1
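
As a quick sanity check, the definitions above can be gathered into one file along with a few properties worth testing. This is a sketch, and the names deck, roundTrips and jokersShareValue are mine rather than part of the linked implementation:

```haskell
data Suit = Clubs | Diamonds | Hearts | Spades | A | B
            deriving (Enum, Show, Bounded, Eq)

data Face = Ace | Two | Three | Four | Five | Six | Seven
          | Eight | Nine | Ten | Jack | Queen | King | Joker
            deriving (Enum, Show, Bounded, Eq)

data Card = Cd Suit Face
          deriving Eq

instance Enum Card where
    toEnum 53 = Cd B Joker
    toEnum 52 = Cd A Joker
    toEnum n  = let (s, f) = n `divMod` 13
                in Cd (toEnum s) (toEnum f)
    fromEnum (Cd B Joker) = 53
    fromEnum (Cd A Joker) = 52
    fromEnum (Cd s f)     = 13 * fromEnum s + fromEnum f

value :: Card -> Int
value (Cd B Joker) = 53
value c = fromEnum c + 1

-- The full deck in bridge order, jokers last.
deck :: [Card]
deck = [Cd Clubs Ace .. Cd B Joker]

-- toEnum followed by fromEnum should give back every index from 0 to 53.
roundTrips :: Bool
roundTrips = map fromEnum deck == [0 .. 53]

-- Both jokers count as 53 in the cipher.
jokersShareValue :: Bool
jokersShareValue = value (Cd A Joker) == value (Cd B Joker)
```

Note that toEnum is only meaningful for indices 0 through 53 here; anything else ends up as a runtime error from the derived Suit instance.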

Step 2: Implement Shuffling

The Solitaire cipher uses a shuffling algorithm to generate a sequence of letters from the cards in the deck (thus the name for the cipher), and the next step is to implement the shuffling algorithm on top of the Card data type. There are three fundamental operations:

  • "Move down" moves a card down in the deck. The deck is imagined to be circular, so moving a card "down" really involves swapping it with the card immediately below, where the card below the bottom of the deck is the top of the deck.
  • "Triple cut" fixes the (inclusive) interval between two cards and swaps the top and bottom portions.
  • "Count cut" takes a number of cards off the top of the deck equal to the value of the card on the bottom of the deck and inserts those above the bottom card.

One approach would be to model these three operations as functions:

m :: Card -> [Card] -> [Card]             -- "move down"
t_cut :: Card -> Card -> [Card] -> [Card] -- "triple cut"
c_cut :: [Card] -> [Card]                 -- "count cut"

With these in hand, the shuffle algorithm is:

c_cut ( (t_cut ja jb) ( (m jb) ((m jb) ( (m ja) ( deck )))))

where I'm using ja for (Cd A Joker) and jb for (Cd B Joker).
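
For a feel of how the three operations might look, here is a simplified sketch that is not the linked implementation. To keep it short it assumes the deck is a non-empty list of Ints from 1 to 54, with 53 standing in for the A joker and 54 for the B joker, and it follows Schneier's rule that a joker on the bottom of the deck moves to just below the top card. The names moveDown, tripleCut, countCut and shuffleStep are mine:

```haskell
-- Assumption: cards are Ints 1..54; 53 is the A joker, 54 the B joker.
ja, jb :: Int
ja = 53
jb = 54

-- Both jokers count as 53.
value :: Int -> Int
value = min 53

-- "Move down": swap the card with the one below it; a card on the
-- bottom of the deck goes just below the top card.
moveDown :: Int -> [Int] -> [Int]
moveDown c deck = case break (== c) deck of
    (before, _ : next : after) -> before ++ next : c : after
    (top : rest, [_])          -> top : c : rest
    _                          -> deck   -- card missing or deck too small

-- "Triple cut": swap the cards above the first joker with the cards
-- below the second joker, leaving the jokers and the cards between them.
tripleCut :: Int -> Int -> [Int] -> [Int]
tripleCut j1 j2 deck = bottom ++ middle ++ top
  where
    isJoker x       = x == j1 || x == j2
    (top, rest)     = break isJoker deck
    (inner, lastOn) = break isJoker (drop 1 rest)
    middle          = take 1 rest ++ inner ++ take 1 lastOn
    bottom          = drop 1 lastOn

-- "Count cut": move (value of the bottom card) cards from the top to
-- just above the bottom card. Assumes a non-empty deck.
countCut :: [Int] -> [Int]
countCut deck = rest ++ cut ++ [bottomCard]
  where
    bottomCard  = last deck
    (cut, rest) = splitAt (value bottomCard) (init deck)

-- One round of the shuffle, composed in the same order as above.
shuffleStep :: [Int] -> [Int]
shuffleStep = countCut . tripleCut ja jb . moveDown jb . moveDown jb . moveDown ja
```

Starting from an unkeyed deck, shuffleStep [1 .. 54] gives 2 through 52 followed by 53, 54, 1, which agrees with the worked example in the quiz description.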

The whole implementation, complete with some inelegant bits for improvement, is here (or pretty-printed code here) and works:

*Main> encode "Code in Ruby, live longer!"
"GLNCQMJAFFFVOMBJIYCB"
*Main> decode it
"CODEINRUBYLIVELONGER"

Not all of the code is that pretty (I got a little bored toward the end...), so I'll just include a few snippets here that demonstrate basic list handling and Maybe.

Maybe is a convenience that sidesteps the null return type problem in other languages. For example, here's a function that splits a String into five-character groups with all non-letters removed, all letters capitalized, and the last group padded:

import Data.Char (isAlpha, toUpper)
import Data.List (intersperse, unfoldr)

cleanse :: String -> String
cleanse c = (map toUpper) ((filter isAlpha) c)

pad :: Int -> Char -> String -> String
pad n c s | length s < n = s ++ (replicate (n-length s) c)
pad n c s = s

maybe_split :: String -> Maybe(String,String)
maybe_split [] = Nothing
maybe_split s | w == "" = Just (pad 5 'X' s,w)
              | otherwise = Just (take 5 s, w)
              where w = drop 5 s

quintets :: String -> [String]
quintets s = (unfoldr maybe_split) (cleanse s)

The Nothing value is just that, while Just wraps a real value. (Note that Nothing is outside of the normal value space of the wrapped type, so unlike null, this makes the semantics of "no return value" explicit.) The unfoldr function is a way to generate a list by repeatedly applying a function to a seed value. It appends the first component of the return value to the list and then applies the function to the second component, continuing until the function returns Nothing. The quintets function is almost the pretty-print routine discussed in the quiz and in the cipher:

*Main> quintets "That was an interesting exercise."
["THATW","ASANI","NTERE","STING","EXERC","ISEXX"]
*Main> concat (intersperse " " it)
"THATW ASANI NTERE STING EXERC ISEXX"

That said, the pretty-printed version is useless for computing the cipher...
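
Since unfoldr does most of the work in quintets, here is a stripped-down sketch of the same pattern without the cleansing and padding. The names countdown and chunks are made up for the example, and chunks assumes a positive chunk size:

```haskell
import Data.List (unfoldr)

-- Each step either stops (Nothing) or yields an element and the next seed.
countdown :: Int -> [Int]
countdown = unfoldr step
  where
    step n | n <= 0    = Nothing
           | otherwise = Just (n, n - 1)

-- Same shape as the quintets splitter: peel k items off the front
-- until nothing is left. Assumes k > 0; otherwise it never terminates.
chunks :: Int -> [a] -> [[a]]
chunks k = unfoldr step
  where
    step [] = Nothing
    step xs = Just (splitAt k xs)
```

The pair returned by splitAt already has the right shape (next element, remaining seed), which is why the step function in chunks is so short.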

I can think of a few ways to make this more elegant and efficient, and maybe I'll give that a shot later. In the meantime, hopefully it's an entertaining example.


Update. There is now a page on the Haskell wiki devoted to solutions.

Da da, More or Less

Paul Brown @ 2006-10-25T00:38:00Z

The kid has started talking in earnest. She started using a few made-up words for things ("mum mum" was food, "boo-ey" was book) as much as a year ago, but now she understands dozens of words and uses a few dozen — car, baby, cookie, etc. — that are close enough to pass for the real thing. This morning, she started saying "da da" to refer to me, which was a nice way to kick off the day.

Tonight, she pointed out a package of Brawny paper towels. The strapping, flannel-clad man on the package is "Dada". (I'll admit a certain resemblance, at least in wardrobe.) Then Paul Allen is on the cover of Seattle magazine, and he's "Dada", too. Who knows how many other Dadas could be lurking around...? I'm sure that we can put a finer point on just who "Da da" is and isn't in a couple of days, and it's good to be one of her Dadas at any rate.

Bye bye TextDrive, Hello Linode

Paul Brown @ 2006-10-21T01:11:27Z

The TextDrive folks are (not unjustifiably) cracking down on people who bump into their resource limits (e.g., a hard limit of 48Mb resident and 80Mb virtual per process), both in the form of a watchdog process that kills off processes that cross the line and in the form of emails out to folks (like me) who use a process nanny to restart their killed processes. Unfortunately, TextDrive doesn't sell more per-process memory with any of their plans short of the $250/month price point, so I packed up and moved over to Linode.

Linode provides user-mode Linux virtual private servers with reasonable specs, and so far so good. I dropped in an Ubuntu 6.06 distribution, got things configured, and had the Typo instance moved over with a total of about 30 minutes of time invested, most of which went into fiddling with the libraries needed by the mysql and sqlite gems. (The Debian/Ruby team has an official position about why RubyGems isn't included, but it should be apparent that they've already lost to momentum in the Ruby community.) For $20/month instead of $12/month, Typo has space to run (until I'm done writing its replacement), 500s are a thing of the past, and I've got root access instead of a goofy control panel. Much better.
