No Arrow Notation in Java, Please

Paul Brown @ 2006-12-28T05:41:00Z

Via Dion (via Kirk, via Danny), the syntax

a -> foo = b -> foo

as shorthand for

a.setFoo(b.getFoo())

is on the table. I'll just be blunt and say that I don't like it. First off, it makes me think of Haskell, where <- and -> have meaning consistent with the use of arrows in mathematical notation, and I can't help but read the fragment above as "a maps to foo assigned to b maps to foo". (Yes, it also makes me think of the the C syntax dereferencing a member of a structure, ptr ->member as shorthand for (*ptr).member.) Second, what does it really save? At least in the example above, it saves two characters for each property ("->" in place of ".set" or ".get") and a total of four parentheses. Unlike oil and polar bears, we have an infinite supply of parentheses, so any arguments of efficiency or efficacy aren't going to win me over.

On the other hand, if some more visually appropriate syntax were to be added that supported indirect access, e.g.:

String foo = "bar";
a !! foo = b !! "bar"

you might get me interested, since this would provide a useful idiomatic syntax that would save me work with dynamic proxies and reflection.

(comment bubbles) 2 comments

...and boy are my arms tired.

Paul Brown @ 2006-12-27T15:39:00Z

I woke up this morning with my right hand stiff, sore, and roughly claw-shaped, but it took me a minute to groggily recall the cause. We drove up from visiting family in Portland (i.e., the one in Oregon) yesterday, and the drive was about as bad as it could have been: almost six hours straight of either crawling through congestion in Tacoma or blearily staring at the tail lights of other cars through driving rain. I tend to tighten my grip on the wheel if we hit standing water, and that was pretty much the whole trip. Maybe we'll take the train next time...

(comment bubbles) 1 comment

Secret Santas in Haskell III: Lather, Rinse, Repeat

Paul Brown @ 2006-12-22T08:47:00Z

In Part I and Part II, I introduced the problem and observed some properties of the problem. This part exploits some of those observations and describes a straightforward but possibly (probably) inefficient solution.

No Experimenting on People

Experimenting with Participants is awkward, and I introduced a Marked class in Part I in the interest of working with easier-to-manipulate data:

class Marked a where
    marker :: a -> String

to that end, I'd like to use the classic combinatorial experimental subject of colored balls instead:

data Ball = Red | Blue | Green | Purple | Orange
            deriving (Show,Enum,Bounded,Eq)

instance Marked Ball where
    marker b = [(head.show) b]

The following convenience functions provide either a randomly colored ball or a list of randomly colored balls:

ball :: Int -> Ball
ball = toEnum

random_ball :: IO Ball
random_ball = do { r < newStdGen
                 ; let (x,_) = randomR (fromEnum(minBound::Ball),
                                        fromEnum(maxBound::Ball)) r
                 ; return (toEnum x) }
                  
random_balls :: Int -> IO [Ball]
random_balls n = do { r < newStdGen
                    ; let rs = randomRs (fromEnum(minBound::Ball),
                                         fromEnum(maxBound::Ball)) r
                    ; return (map toEnum (take n rs)) }

For example, in ghci:

Main> :load secretsantas.lhs
[1 of 1] Compiling Main             ( secretsantas.lhs, interpreted )
Ok, modules loaded: Main.
*Main> x < random_ball
Green
*Main> y < random_balls 10
[Blue,Purple,Red,Purple,Blue,Purple,Green,Blue,Blue,Orange]
*Main> concat (map marker y)
"PPBORGOPOR"

The "<-" notation at the prompt is for binding a symbol to the result of an action in the IO monad. (This is explained in the ghci documentation.)

Idea of the Solution

Let's start with a raw list of marked items:

The first step is to gather together the consecutive entries with the same marker, like so:

Next, take one element from each segment and put the resulting list aside. Combine the leftovers together, eliminating any gaps and combining any adjacent groups with the same marker. Note that we're working on a circular list, so the leftmost and rightmost groups are adjacent. Continue until either all of the groups are gone or there is only a single group (all one marker) remaining. Here's a pictorial example:

In steps:

  1. Collect one element from each group. No leftover groups are empty, so continue.
  2. Collect one element from each group.
  3. One leftover group is empty, so eliminate it. No adjacent leftover groups are the same marker, so continue.
  4. Collect one element from each list.
  5. Two leftover groups are empty, and once they are removed, the adjacent groups have the same marker (orange), so combine them.
  6. Collect one element from each group.
  7. Two leftover groups (purple and rightmost light green) are empty, so eliminate them. No adjacent groups are the same marker, so continue.
  8. Collect one element from each group. No leftover groups are empty, so continue.
  9. Collect one element from each group. The one leftover group consists of elements all with the same marker.

With the collected groups assembled, the last group of leftovers either fits or does not, and in this case, it's works out:

If it hadn't worked out, then there would have been no solution.

The Grouping Step in Haskell

Haskell has a well-chosen selection of list operations, so this is almost a one-liner:

segments :: Marked a => [a] -> [[a]]
segments = groupBy (\ x y = (marker x == marker y))

A quick note on notation: the backslash notation (it's supposed to look like a λ) is Haskell's syntax for inline function definition, and the expression defines the segments function by currying the first argument of groupBy with the function defined in-line. This isn't quite right, however, as the marker of the first group and the marker of the last group might be the same, so what we really want is a version that simulates grouping on a circular list:

align :: Marked a => [a] -> [a]
align x = let q = marker(last x) in
          f(g (q,[],x))
              where
                f (q,[],y) = y
                f (q,y,z) = z ++ (reverse y)
                g (q,y,[]) = (q,y,[])
                g (q,y,z@(w:ws)) | marker w == q = g (q,w:y,ws)
                g (q,y,z) = (q,y,z)

segments :: Marked a => [a] -> [[a]]
segments = (groupBy (\x y -> (marker x) == (marker y))).align

This is both a little ugly and has the desired behavior:

*Main> y < random_balls 10
[Blue,Green,Orange,Purple,Green,Blue,Blue,Orange,Blue,Blue]
*Main> segments y
[[Green],[Orange],[Purple],[Green],[Blue,Blue],[Orange],[Blue,Blue,Blue]]

It also behaves correctly for the two edge cases:

*Main> let y = [Blue,Blue,Blue] 
*Main> align y
[Blue,Blue,Blue]
*Main> let y = [Purple]
*Main> align y
[Purple]

But what about the other edge case, align []? The initial definition of q looks like it should cause an exception, as last [] is undefined. (Actually, [] isn't sufficiently specific to allow Haskell to evaluate it in ghci; the right thing is to qualify it as an empty list of Marked items, i.e., []::[Marked].) We could (and probably should) specify the behavior of align [], but laziness means that the definition of q is never evaluated in this case:

align [] = f(g (q,[],[]))
         = f((q,[],[]))
         = []

The Extraction and Simplification Step in Haskell

Back to the problem at hand, so long as the collected list has more than one entry, one element from each of the lists would meet the secret santa constraint because no two consecutive elements have the same marker, by construction. Stripping an element from each segment gives us a valid list fragment and a new list of segments, some of which may be empty. Implementing the "remove empties and combine common" step in Haskell is equivalent to flattening the leftover lists and re-executing the segment operation. Here's one function that both captures the first item in each group and prepares a new set of groups:

reap :: Marked a => [[a]] -> ([a],[[a]])
reap x = ((map head) x,(segments (concat ((map tail) x))))

And this works more or less as it should (with some line breaks added for legibility):

*Main> y < random_balls 25
[Green,Green,Green,Blue,Red,Purple,Purple,Purple,Orange,Purple,
 Orange,Purple,Blue,Green,Blue,Purple,Orange,Purple,Red,Blue,
 Blue,Green,Orange,Green,Blue]
*Main> reap (segments y)
([Green,Blue,Red,Purple,Orange,Purple,Orange,Purple,Blue,Green,
  Blue,Purple,Orange,Purple,Red,Blue,Green,Orange,Green,Blue],
 [[Green,Green],[Purple,Purple],[Blue]])

We're almost ready for a recursive solution, but not quite. We need to be able to collect the good lists reaped at each pass, and this uses the observation from Part II about combining orbits together. Combination of orbits in Haskell:

compatible :: Marked a => [a] -> [a] -> Bool
compatible x y = ((marker (head x)) /= (marker (last y))) &&
                 ((marker (head y)) /= (marker (last x))) 

combine :: Marked a => [a] -> [a] -> [a]
combine [] y = y
combine x [] = x
combine x y | compatible x y = x ++ y
combine x y = (reverse x) ++ y

And now a slightly fancier, recursive version of the reap function:

reap :: Marked a => ([a],[[a]]) -> ([a],[[a]])
reap (w,[]) = (w,[])
reap (w,x) | length x == 1 = (w,x)
reap (w,x) = reap (w `combine` ((map head) x),
                         (segments (concat ((map tail) x))))
*Main> y < random_balls 25
[Green,Orange,Purple,Green,Green,Purple,Orange,Blue,Blue,
 Green,Red,Green,Blue,Green,Purple,Orange,Blue,Blue,Orange,
 Purple,Green,Purple,Green,Purple,Purple]
*Main> reap ([],segments y)
([Green,Orange,Purple,Green,Purple,Orange,Blue,Green,Red,
  Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green,
  Purple,Green,Purple],
 [[Green],[Blue,Blue],[Purple]])
*Main> reap it
([Green,Orange,Purple,Green,Purple,Orange,Blue,Green,Red,
  Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green,
  Purple,Green,Purple,Green,Blue,Purple],
 [[Blue]])

Intermingling the Leftover Items in Haskell

Assuming that reap is run until either there are no leftovers or all of the leftovers are the same color, the next piece is a function that Maybe intermingles the leftover group (one Blue ball in the above example) into the list in the first coordinate. The intermingle function below models the circular list as two halves, and moving one place in the circular list is modeled as moving the head of the third coordinate to the head of the second coordinate; the list of leftovers is passed in the first coordinate.

intermingle :: Marked a => ([a],[a],[a]) -> ([a],[a],[a])
intermingle ([],y@(y1:y1s),y2) = ([],reverse(y) ++ y2,[])
intermingle ([],[],y) = ([],y,[])
intermingle ((x:xs),y,[]) = ((x:xs),y,[])
intermingle (x:xs,[],y@(y1:ys)) | (marker x /= marker y1) &&
                                  (marker x /= marker (last y))
                                      = (xs,[],x:y)
intermingle ((x:xs),y@(y1:y1s),(y2:y2s)) | (marker x /= marker y1) &&
                                           (marker x /= marker y2)
                                               = intermingle (xs,y2:(x:y),y2s)
intermingle (x,y,(y2:y2s)) = intermingle (x,y2:y,y2s)

With the example from above with the one leftover Blue, intermingle returns the one Blue at the head of the list, as the third rule would define:

*Main> intermingle ((snd it)!!0,[],fst it)
([],[Blue,Green,Orange,Purple,Green,Purple,Orange,Blue,Green,
 Red,Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green,
 Purple,Green,Purple,Green,Blue,Purple],
 [])

The intermingle function also does the right thing when there is no solution:

*Main> intermingle ([Red,Red,Red],[],[Green,Green])
([Red],[Green,Red,Green,Red],[])

Putting it All Together

Now we can pull the pieces together into a function that solves the original problem:

result :: Marked a => ([a],[a],[a]) -> Maybe [a]
result ([],y,[]) = Just y
result ((x:xs),_,[]) = Nothing
result (_,_,z) = error "Not expecting to see data in the third coordinate."

santa_solve :: Marked a => [a] -> [a]
santa_solve x  = case outcome of { Nothing -> error "No solution."
                                 ; (Just w) -> w }
    where
      reaped = reap ([],segments x)
      leftovers = concat(snd reaped)
      partial_solution = fst reaped
      outcome = result (intermingle (leftovers,
                                     [],
                                     partial_solution))

Our <50 lines of Haskell work like a charm:

*Main> santa_solve [Red,Green,Blue,Red,Red]
*** Exception: No solution.
*Main> y < random_balls 300
**Main> concat ((map marker) (santa_solve y))
"OROGORPBPGRGRPBGRPGOBGORPOBGOBPGPORGBORGBOPRGBPRGBGOPOPGOGP
 RBRGRGBOGBPRPRPRPOGBPGPGPGORBOBGRBPBPRGBRPGBROBRBGPRPORGPRG
 PBGPGORGBPORBRPRPGPGORORGBPGPGOBRPGROPOROGOROBPGBGBGBOGOPOB
 ORBPOPORPRPRPOGOPBPOGROBOGBPOGOPGPBRPGPGBROBPBRGBOBPRBPGOGP
 RPOPROBPRORGORPGPBRGRPRBOROBOROPBRPRBRGORGBPGRGPOPOGBPGPOGP
 GPOGP"

Including the data structure and parsing code for Participants from Part I (but not including the code to generate random lists of Balls), a nicely-formatted Haskell solution weighs in at under 100 lines without sending email.

This implementation is more than fast enough for most practical purposes, but it's still far from optimal. On my development box, running a compiled santa_solve on 106 balls takes 1-2 seconds, and processing 107 takes ~50 seconds. If I have time before I get distracted, I'll come back to a more efficient approach. (For what it's worth, one path to a more efficient solution would be to sort the initial list and do away with the successive flattening and grouping.)

(comment bubbles) 1 comment

Oh no you didn't...?!

Paul Brown @ 2006-12-21T00:19:56Z

My blog (i.e., this blog) suddenly stopped working today, so I went to the logs to figure out why. The logs provided:

ActionView::TemplateError (undefined method `first' for nil:NilClass) on line #9
  of vendor/plugins/flickr_sidebar/views/content.rhtml:
[... lots and lots of stuff ...]

Ah, OK. I haven't gotten Typo current since I decided to roll my own (which I'm still doing...), so it serves me right if some bug is biting me. Off to look at the Flickr sidebar, and this chestnut is what's choking things:

def image
    description.scan( /(http:\/\/(static|photos).*?\.jpg)/ ).first.first
end

Gah! Using a regular expression to capture a link to an image out of escaped HTML markup passed in the RSS feed... Not something I would recommend. Flickr changed their URLs so that the regular expression no longer matched, and the error was caused by scan returning nil. It's easy enough to fix by adding some namespaces:

@@NS = {"media" => "http://search.yahoo.com/mrss/" }

And then, a couple of tweaks to extract the thumbnail from the appropriate extension element:

picture.thumbnail = XPath.match(elem, "media:thumbnail/@url",
  namespaces=@@NS).to_s

And we're back in business.

(comment bubbles) 2 comments

Where did the Exception go...?

Paul Brown @ 2006-12-20T03:03:23Z

It's easy to get used to throwing away return values in Java. For example, I don't usually need or use the return value from Map#put(Object) or Map#remove(Object). Innocuous for collections, this can have unintended consequences (or, rather lack of consequences) if the return value that you're throwing away is a Future.

For example, suppose that you do something like:

Executors.newSingleThreadExecutor().submit(new Runnable() {
  public void run() {
    throw new UnsupportedOperationException();
  }
});

The exception is lost and gone forever. There are four choices, at least in the case of a ThreadPoolExecutor as above:

  1. Use execute(...) instead of submit(...), in which case the exception will be thrown as though we'd just started a thread to run the Runnable.
  2. Capture the Future returned by submit(...), which is actually a FutureTask in this case, and deal with completion.
  3. Subclass the ThreadPool and supply an afterExecute(Runnable,Throwable) implementation.
  4. Perform any exception handling (logging, retry, etc.) in the body of the run() method.

My preference is for number 4, since it doesn't involve stack traces on the console, a fragile cast, or subclassing, but I'm open to other opinions.

(comment bubbles) 2 comments

Secret Santas in Haskell II: Orbits and Lists

Paul Brown @ 2006-12-18T03:17:48Z

Part I set up a data structure and covered input and output, so it's back to the problem at hand. This post establishes a useful equivalence and establishes a strong condition on the existence of a solution.

From Orbits to Lists

From a purely academic perspective, it's worth observing that there aren't always solutions to the Secret Santa problem. For example, the degenerate case of only a single family is one unsolvable case, but it's also the case that if a single family comprises more than half of the participants, then there is no solution (by the pigeonhole principle). A more useful observation is that the problem of a mapping can easily be turned into a problem of lists.

First, observe that if there is a circular list that contains each of the members of P (the set of participants from Part I) exactly once where no two consecutive list elements have the same last name, then the successor function is a secret santa function. Conversely, if there is any secret santa function, it can be used to generate such a list, as follows. (It is not the case that every secret santa function arises from a list.) For a secret santa function s, the orbits of elements of P are a finite number of disjoint subsets, each of which has the form a of a circular list of elements of P where no two consecutive elements have the same last name. If there are at least two orbits O1 and O2, combine the two of them by breaking each orbit at a randomly selected element and concatenating the two lists head-to-tail and tail-to-head to get a new circular list, like so:

If O1 merged with O2 violates the successor condition, then O1 merged with O2 in reverse order does not. Sketch the cases on a piece of paper if this isn't obvious; in brute-force terms, it would look something like this:

The two colors squares in a column or row heading represent the "ends" of the broken circular lists, and concatenation is performed by gluing the two entries touching the table and the two entries not touching the table; a yellow square indicates that one of the lists should be reverse prior to gluing. For example, the upper left square would be R...G + G...R or, after reversing the second list, R...G + R...G, which is a legal combination.

This proves that if there is a secret santa function that decomposes the set into n>1 orbits, then there is a secret santa function that decomposes it into n-1 orbits. Thus, to create a solution, it's enough to create a circular list that exhausts P and where no two markers are repeated.

Orbits and Drafting

The observation above about there being no solution when one family is larger than the sum of all of the other families can be strengthened: that is the only case when there is no solution. The idea of the proof is to sort the families by size and then draft members from the smallest family into the second largest family, as depicted below:

In the image, each layer is an orbit, and the diagram up to the top of the yellow-green (i.e., second largest) family represents a valid secret santa function. The yellow-green family can continue to draft members from the smallest family until either the yellow-green and dark red families are the same size or until the smaller families are exhausted, i.e., the dark red family contains more members than the other families combined.

As an aside, a similar strategy works to obtain an as-rectangular-as-possible arrangement:

In Practice

Up next, a couple of posts (like this one, in the secretsanta tag) with solutions implemented using these ideas.

(comment bubbles) 1 comment

Secret Santas in Haskell I: Preliminaries

Paul Brown @ 2006-12-18T03:03:00Z

A while back I posted a solution to the first Ruby Quiz in Haskell. The next quiz problem is about "secret santas", which I'll rephrase as follows:

Given a set P of people, each with a first name, last name, and email address, define a bijective "secret santa" function s :: P -> P such that (last . s)(p) /= last(p) for all p in P, where last :: P -> String maps a person to their last name.

There are also some requirements about reading the names from a file and sending email, and I'll deal with those up front, here in Part I. Subsequent parts (all available under the secretsanta tag, in case you choose to follow the thread) contain some additional background on the problem (including a simple, strong characterization of when solutions exist) and a couple of solution techniques.

Reading and Writing People

Haskell has the Read and Show type classes that provide, respectively, the ability to read and write a type. Writing out one of the secret santa participants is straightforward:

data Participant = Participant (String,String,String)

first_name :: Participant -> String
first_name (Participant (f,_,_)) = f

last_name :: Participant -> String
last_name (Participant (_,l,_)) = l

email_addr :: Participant -> String
email_addr (Participant (_,_,a)) = a

instance Show Participant where
    show p = (last_name p) ++ ", " ++ (first_name p) ++ " <"
             ++ (email_addr p) ++ ">"

(Show and Read can be obtained in this case by deriving their implementations, but the representation is practical but not that appealing.) To read in the same representation, a very simple parser is preferable to implementing the low-level functions that Haskell requires (see, e.g., 8.3 of the GITH). Parsec makes it simple:

normalize :: String -> String
normalize = unwords.words

participant_parser :: Parser Participant
participant_parser = do { last_n <- many1 (letter <|> space)
                        ; char ','
                        ; first_n <- many1 (letter <|> space)
                        ; char '<'
                        ; email_a <- many1 (letter <|> oneOf ".-_@+")
                        ; char '>'
                        ; return (Participant (normalize first_n,
                                               normalize last_n,
                                               normalize email_a))
                        }

Parsec also makes it simple to parse a list of participants from a flat file, like so:

participants_parser :: Parser [Participant]
participants_parser = participant_parser `sepEndBy` (many1 newline)

load_participants :: String -> IO [Participant]
load_participants filename = do { result <- parseFromFile participants_parser filename
                                ; case (result) of 
                                    Left err -> (error (show err))
                                    Right val -> return val
                                }

If you're not a Haskell person, it's worth stopping to scratch your head about why the load_participants function creates values of type IO [Participant] instead of [Participant]. When I first started learning Haskell, I searched in vain for a function with a signature like f :: IO a -> a, but the nature of Haskell dictates that no such function can exist. (This is not a general trait of monads, as it's straightforward to unwrap Maybe or Either.) In a nutshell, IO exists to both separate and integrate the purely functional world where referential transparency reigns and the external world, and the list doesn't exist in the functional world until the I/O operation (reading the file) is performed, thus the need to deal with the inner value within an IO context, e.g., within a main function, in an interactive session, or by lifting functions into the IO monad.

Sending Email

There are two easy ways to send email. The simplest is to use the System.Cmd module included with the core libraries and the system function. The other way is the sendmail function from the Network.Email.Sendmail module in MissingH. Neither one is particularly exciting...

A Little Abstraction Goes a Long Way

Before I actually get going in the other parts, I'd like to introduce a layer of abstraction:

class Marked a where
    marker :: a -> String

instance Marked Participant where
    marker = last_name

This gets me out of having to type Participant everywhere and makes the functions easier to reuse, e.g., for balls where marker is color or animals where marker is species or music tracks where marker is artist and album, or...

(comment bubbles) 1 comment

All Posts contains 399 items in 57 pages of 7 items each:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57