Beust Sequence Ruminations

Paul R. Brown @ 2008-07-03T06:10:59Z

Cédric posted a nice puzzle on his blog:

[W]rite a counter function that counts from 1 to max but only returns numbers whose digits don't repeat.

Bob tweeted about a minimal solution in terms of execution time, although (like the guy in the cartoon) I still like my Haskell version of the sequence (up to some details with use of Data.Int.Int64 and typing the enumeration):

f k = [ n | n <- [1..k], (length . show $ n) == (length . nub . show $ n) ]

And the function to compute the maximum gap:

drop_tail k = reverse . (drop k) . reverse
by_pairs u = zip (drop 1 u) (drop_tail 1 u)
g k = maximum . map (uncurry (-)) . by_pairs . f $ k

This approach, while visually appealing, is unacceptably slow, even with some of the usual optimizations applied. The fact is that any solution that is implemented in terms of testing all of the numbers between 0 and k will perform orders of magnitude more poorly than the recursive style that Bob's using. But how would I know that...?

Subfactorials, Derangements, and Chooses

A derangement is a permutation with no fixed points, e.g., (123) is a derangement of the set {1,2,3}, but (12) is not because it maps 3 to 3. Like the number of permutations of an n-element set (n!), the number of derangements has its own function called the subfactorial and written !n. MathWorld has a decent write-up, with the major takeaway for present purposes being that !n ~ n!.

The number of elements in Cédric's set, for a fixed radix n is the sum from k=0 up to n of !k (n choose k), and that's much smaller than nn. (This reminds me of the whole "bad shuffle" meme from a long time ago.)

Just how much less work does the enumeration approach require than the iteration approach? Here's a quick snippet to compute the number of derangements and the corresponding Beust number for a given radix:

fac :: Integer -> Integer
fac 0 = 1
fac n = n * (fac (n-1))

choose :: Integer -> Integer -> Integer
choose n k = (fac n) `div` ((fac k) * (fac (n-k)))

d :: Integer -> Integer
d 1 = 0
d 2 = 1
d n = (n-1) * ( d (n-1) + d (n-2) )

b :: Integer -> Integer
b n = sum [ (d k) * (n `choose` k) | k <- [ 1..n ] ]

Then, b10 is 3628799, so a solution like Bob's should have around a 2500-fold advantage over a more brute force method. And that advantage gets huge in short order. As reminder of how much bigger nn is than n! and friends, 2626 is roughly 15,264,691,107 times larger than b26...

(comment bubbles) 0 comments

Getting bash Completion Magic on OS X

Paul R. Brown @ 2008-06-25T07:18:06Z

One of the many nifty features of bash is that it provides context-sensitive completion, but for some reason this capability isn't part of the bash that ships with Mac OS X, at least as of 10.5.3, which is what I'm presently using.

To rectify the oversight, first install the bash_completion port via MacPorts:

$ sudo port install bash-completion

And then modify your ~/.profile as directed, e.g., by adding:

if [ -f /opt/local/etc/bash_completion ]; then
    . /opt/local/etc/bash_completion
fi

To load your own local collection of completion hooks, create the directory ~/.bash_completion.d and then put the following in ~/.bash_completion (essentially cut/pasted out of /opt/local/etc/bash_completion):

if [ -d $USER_BASH_COMPLETION_DIR -a -r $USER_BASH_COMPLETION_DIR -a \
     -x $USER_BASH_COMPLETION_DIR ]; then
        for i in $USER_BASH_COMPLETION_DIR/*; do
                [[ ${i##*/} != @(*~|*.bak|*.swp|\#*\#|*.dpkg*|.rpm*) ]] &&
                        [ \( -f $i -o -h $i \) -a -r $i ] && . $i
        done
fi
unset i

Next, above the other block added to ~/.profile, add:

export USER_BASH_COMPLETION_DIR=~/.bash_completion.d

As for the contents of .bash_completion.d, I put the completion files supplied with darcs and cabal-install there, so in addition to the usual niceties like completion of paths on remote servers (e.g., for scp), I also get convenient completion behavior at the commandline, e.g., with cabal:

$ cabal install hsc[TAB]
hsc2hs hsc3 hsclock hscolour hscurses
(comment bubbles) 0 comments

No, Dad, I'm Lowly Worm...

Paul R. Brown @ 2008-06-17T19:46:44Z

Kid #1 is doing well enough that sometimes I forget that she's just a little over three years old, but then she'll whip out a malaprop or mangle an idiom to remind me. For example, while she was explaining to me that she was Lowly Worm and I was Huckle the Cat, she tripped and fell flat on her face, and the following conversation ensued:

Dad: Whoa! Are you alright?

Kid #1: No, Dad! [somewhat exasperated] I'm Lowly Worm...

(comment bubbles) 0 comments

A Simple Asynchronous Handler for hslogger

Paul R. Brown @ 2008-05-29T18:34:23Z

The hslogger package provides a few handlers, e.g., for files and for Syslog, but the implementation of the file logger uses an MVar as a lock for writing log events. This isn't a bad thing, per se, since the GHC runtime will do a good job of queuing waiters for that lock, but it had a noticable impact on performance for perpubplat when I added some initial logging.

To decouple the log event dispatch and disk access, I whipped up a quick AsyncLogHandler (source) that does the job. Here is a quick tour in snippets, starting with a base data structure:

type Timestamp = String

type LogMessage = String

data AsyncLogHandler
    = AsyncLogHandler { channel :: Chan (LogRecord, LogMessage, Timestamp)
                      , level :: Priority }

An admittedly imperfect instantiation for the LogHandler type class:

instance LogHandler AsyncLogHandler where
    setLevel alh p = alh { level = p }
    getLevel alh = level alh
    emit alh lr msg = do { n <- now
                         ; writeChan (channel alh) $! (lr,msg,n) }
    close _ = return () -- make this better

This is less than perfect because it doesn't deal with altering the priority level or closing the stream on the fly, but I'm not concerned with either of those things for the moment. (To make that work, I'd need an additional level of encapsulation on the Chan that wraps log messages and control requests.) I'm on the fence with the necessity of making the writeChan strict in the second argument, but I believe it's necessary to ensure that the timestamp is computed when the message is dispatched as opposed to when it's output.

Setup just requires forking a lightweight thread to pull log messages out of the channel:

asyncHandler :: Int -> Handle -> Priority -> IO AsyncLogHandler
asyncHandler n h pri = do { c <- newChan
                          ; forkIO $ append n 0 h c
                          ; return $ AsyncLogHandler { channel = c
                                                     , level = pri } }

And then a tail-recursive function to pull the events, format output, and periodically (every n messages) flush the stream:

append :: Int -> Int -> Handle -> Chan (LogRecord, LogMessage, Timestamp) -> IO ()
append n i h c = do { ((p,m),l,ts) <- readChan c
                    ; (hPutStrLn h $ ts ++ " [" ++ (show p) ++ "] " ++ l ++ " - " ++ m)
                      `CE.catch` (printex h)
                    ; if i == n then
                          do { (hFlush h) `CE.catch` (printex h)
                             ; append n 0 h c }
                      else
                          append n (i+1) h c }

printex :: Handle -> CE.Exception -> IO ()
printex h e = hPutStrLn System.IO.stderr $ "Error writing to log handle "
              ++ (show h) ++ ": " ++ show e

And that's it.

(comment bubbles) 1 comment

Just cancel, OK?

Paul R. Brown @ 2008-05-20T23:36:03Z

Usability according to Abbott and Costello:

Are you sure you want to cancel? OK or Cancel.

(comment bubbles) 0 comments

Fatherhood, Take Two

Paul Brown @ 2008-04-09T06:00:00Z

Our son was born today.

(comment bubbles) 4 comments

Conditional GET Support for perpubplat

Paul R. Brown @ 2008-03-05T08:34:12Z

As part of being a good netizen, I added conditional GET support (per 9.3 in the HTTP 1.1 spec) to perpubplat in the form of ETag (MD5 of feed URI and last modified date) and Last-Modified headers on generated Atom feeds and corresponding If-None-Match and If-Modified-Since headers on requests for Atom feeds with proper precedence. (For precedence, the spec dictates that a successful If-None-Match assertion means that any If-Modified-Since assertion is ignored.) A quick curl experiment shows that things appear to work:

$ curl -i -s -o - http://mult.ifario.us/f/t/haskell/atom.xml | \
> egrep \^ETag\|\^Last-Modified\|\^HTTP\/
HTTP/1.1 200 OK
Last-Modified: Sat, 23 Feb 2008 23:55:23 GMT
ETag: 78790a6a7d6bddd10f6f9c412f2aba97
$ curl -H 'If-None-Match: 78790a6a7d6bddd10f6f9c412f2aba97' \
> -i -s -o - http://mult.ifario.us/f/t/haskell/atom.xml | \
> egrep \^HTTP\/
HTTP/1.1 304 Not Modified
$ curl -H 'If-Modified-Since: Sat, 23 Feb 2008 23:55:23 GMT' \
> -i -s -o - http://mult.ifario.us/f/t/haskell/atom.xml | \
> egrep \^HTTP\/
HTTP/1.1 304 Not Modified
$ curl -H 'If-None-Match: foo' -H 'If-Modified-Since: Sat, 23 Feb 2008 23:55:23 GMT'
> -i -s -o - http://mult.ifario.us/f/t/haskell/atom.xml | \
> egrep \^HTTP\/
HTTP/1.1 200 OK
(comment bubbles) 0 comments

All Posts contains 397 items in 57 pages of 7 items each:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57