How to Monitor Java Applications on EC2 with Cacti

Paul R. Brown @ 2009-05-19T05:53:21Z

As part of a scale-out effort for a customer moving from a single node hosted on Slicehost to a multi-node environment hosted in the US and EU on Amazon EC2, I wanted a way to introduce a combination of application and host-level monitoring for the nodes. I settled on the combination of RRDTool graphs served by Cacti and an alive check provided by a third party (Monitis), but there was no immediately obvious way to bridge the gap between the Java services and the Cacti convenience wrapper around RRDTool.

This was before the recent announcement by Amazon of monitoring functionality for EC2 nodes, but that service wouldn't meet the primary use case of application versus host monitoring. A tool like JConsole didn't make sense because I was interested in getting a single portal view across the fleet and in having retrospective data to make visual day-to-day or week-to-week comparisons.

This post describes how to bring the pieces together, and the technique is equally applicable to non-Java systems — any system that can serve HTTP requests can be instrumented. In the end, about a day's worth of experimentation and work was enough to get me the level of instrumentation I was after.

Host Configuration Requirements

Each of the nodes in the fleet runs on a slightly modified CentOS 5.2 AMI (based on one (ami-1363877a) provided by Rightscale), and getting basic host information exposed over SNMP is straightforward:

$ yum install net-snmp
[... lots of output ...]
$ mv /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf-old
$ echo 'rocommunity public' > /etc/snmp/snmpd.conf
$ /etc/init.d/snmpd restart

The underlying assumption, of course, is that the instance was launched under a security group that exposes UDP ports 161 and 162 to the host that will be running Cacti. This can all be made to work without assigning elastic IP addresses to the nodes and to the Cacti host, but it's easier with them.

For the Cacti host, more or less any modern Linux distribution (e.g., Ubuntu or CentOS) will do, and I'd recommend following Eric Hammond's very nice tutorial about setting up MySQL on an EBS volume before doing the Cacti install. For the same reason it makes sense to have MySQL on the attached EBS volume (survive instance termination, support backups, etc.), it makes sense to store RRDTool's backing data there as well.

Instrumentation and Collection

The Java application in question (SmartFox) has no explicit support for exporting metrics and no MBeans exposed for access via JMX, but it does provide some API-level support for basic information and an embedded servlet container (Jetty, of course). (SmartFox does bundle a Flash-based administrative tool, but like JConsole it's single-node and does not provide much in the way of retrospective data.)

After some poking around (i.e., reading PHP source code) in Cacti, I found that Cacti's standard "Script/Command" data input method consumes data as space-separated name/value pairs on a single line:

name1:value1 name2:value2 ...

So I put together a simple servlet to grab the server singleton object from the SmartFox API and print metrics out on a text/plain response. This could just as easily be done with an MBean instance looked up via the JVM's default JMX infrastructure or a metric facade injected into the servlet as part of the overall web application — the point is that the single line of name/value pairs is the required interface to Cacti.
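
As a minimal sketch of that interface (not the servlet from this post), the following renders a map of metrics as the single line of name/value pairs. The class name CactiMetrics, the metric names "users" and "rooms", and their values are hypothetical placeholders; a real servlet would read the values from the application's API and write the same line to its text/plain response.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: not part of SmartFox or Cacti.
class CactiMetrics {

    // Renders metrics as one line of space-separated name:value pairs,
    // in the iteration order of the map.
    static String format(Map<String, ? extends Number> metrics) {
        StringJoiner line = new StringJoiner(" ");
        for (Map.Entry<String, ? extends Number> e : metrics.entrySet()) {
            String name = e.getKey();
            // A space or colon inside a name would make the line ambiguous.
            if (name.isEmpty() || name.contains(" ") || name.contains(":")) {
                throw new IllegalArgumentException("bad metric name: " + name);
            }
            line.add(name + ":" + e.getValue());
        }
        return line.toString();
    }

    // Writes the line plus a trailing newline; a servlet would pass its
    // response's output stream here after setting text/plain.
    static void writeTo(OutputStream out, Map<String, ? extends Number> metrics)
            throws IOException {
        out.write((format(metrics) + "\n").getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values standing in for numbers read from the application.
        Map<String, Number> metrics = new LinkedHashMap<>();
        metrics.put("users", 42);
        metrics.put("rooms", 7);
        writeTo(System.out, metrics); // prints: users:42 rooms:7
    }
}
```

Using an insertion-ordered map keeps the output stable from one poll to the next, which makes it easier to eyeball against the output field names configured in Cacti.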

The data is then accessed via a curl invocation templated for variables:

curl http://<host>:<port>/<webapp>/sfs-status?zone=<zone>

The fields in angle brackets are input fields that will be filled-in by other objects in Cacti, and the output fields for the data input method should be named to match the names in the name/value pairs from above.

The downside of this approach is that there is quite a bit of configuration that goes on top of this one-liner (graphs instantiate graph templates and pull from data sources that reference data templates that in turn reference data input methods, or something like that), but it more or less just works. (Even at that, it is less painful and more forgiving than some other tools I've worked with, e.g., ZenOSS.) A couple of hours of experimentation should be enough to get a decent set of basic graphs customized for the application at hand.

[an RRD graph]

As with SNMP above, the EC2 security groups for the instances need to be set up so that this data is not generally accessible but can still be seen by the Cacti host.

Tips and Tricks

The only real issues that I encountered in the process were some disconnects between what Cacti allows you to enter and what RRDTool accepts as input. Once you're done with the necessary setup or some tweaks, if your graphs either don't appear or disappear, there's a good chance that RRDTool doesn't like what Cacti is asking it to do. In that case, turn on the "graph debug" option to see what Cacti is sending to RRDTool and adjust your configuration accordingly.

Product Management for the Busy Entrepreneur

Paul R. Brown @ 2009-03-16T18:59:23Z

I was talking with a budding entrepreneur in the open source "big data" space (with my Entreprementor hat on), and we talked about his sales pipeline and potential customers. He had a list of a couple dozen companies that had expressed some interest, and that much was great. Just the same, I asked about the elephant in the room: Interest in what? Interest in your wonderful open source doohickey might get you Internet Famous like the Star Wars Kid but isn't likely to pay your bills let alone be something to build a company on.

This brings me to the subject of product management for the busy entrepreneur.

Product Versus Business

It's easy to get confused about the difference between a product and a business. A business is a machine that turns something you have into money. (One that produces more money than it consumes is a good business...) A product is a describable, sellable thing that your business can produce over and over and customers can consume over and over without too many changes. Products can be broad, like professional services, or very specific, like machine parts, but the aspects of commonality in description and delivery need to be there.

The point of a product is that the commonalities enable scale in a business's internal processes, from production to sales to accounting. It is also that commonality that makes a business an investment prospect because you can make reasonable inferences about capital in versus capital out (ergo value). Defining a product involves a bit of intuition and guesswork, but refining that definition is simple: Ask potential customers if they would spend money on it. I've never been able to understand the relative reluctance of some entrepreneurs to get on the phone or hit the street with an idea, maybe out of a reluctance to have their idea trashed by reality, but the customer's money is the one source of Truth. If it's difficult to explain, if it's not something that the customer's business can readily consume, or if the customer doesn't "get" it, then it needs to change.

Rows and Columns

Once you have a panel of potential customers assembled, it's time to sit down with a spreadsheet and figure things out. Customers go down column "A", and potential products go across row 1. Put an "x" and maybe a note in a cell if that customer would pay for that product, and then look for the column with the most x's. Alternatively, you can use the price that the customer would pay as the value for the cell in the column and try to make a more refined decision based on the profitability of the offerings, but the idea is the same — Get real data on what customers want.

There's an aspect to survey design that's important when interviewing a potential customer. To get a real response, ask specific questions and set the expectation with that potential customer that you'll very likely be back to get a check from them. You should expect to iterate on the process a few times, as customers may help you add columns to the spreadsheet, but you should avoid open-ended questions.

Real product management is quite a bit more involved and detailed but equally necessary as your product and base of customers grow and evolve. Nonetheless, this should be enough to get started.

Basic product management is one of those times where stating the obvious is useful: Data is helpful in making decisions.

Up to What I Am

Paul R. Brown @ 2009-02-25T06:47:16Z

After a short stint at Amazon.com in 2005, I took some time off, where "time off" means not being an entrepreneur or having open-ended commitments, other than at home. I had a lot of fun getting back to basics as a software developer building applications with some great customers (learning some new industries in the process), working closely with a few entrepreneurs, and being a hands-on Dad for my kids.

At the start of 2009, with kid #2 up and crawling, kid #1 finally sleeping decently (most of the time, knock wood), and a fresh calendar year ahead, I sat down to think about what to do next. I made a list of things that I think are interesting and things that I do well and/or enjoy. Next, I brainstormed and ranked concepts for businesses with rank roughly defined by the combination of my level of interest, the impact of the idea, and the value both to and from my network and experience in making it a success.

All of the business concepts were subject to the following constraints:

  • Minimal startup costs. Getting going should take no more than an LLC (~$200), some assorted licenses (<$100), a domain (~$10), Google Apps for email, and a modicum of non-free infrastructure if required (no more than $50/month total: a private repository on GitHub, and maybe CRM/SFA like PipelineDeals, etc.).
  • Short path to gauge interest and engage customers. Typical customers, partners, and advisors/boosters should all be present within my personal network and if not, easily identified, enumerated, and contacted.
  • Strong collaborators. The reinforcement and feedback that comes from working closely (and occasionally butting heads with) collaborators is important.
  • Head start or unique angle. The business should have a built-in competitive advantage in the form of knowledge, relationships, or intellectual property.

I ended up with a few dozen things and about half that many business concepts. Some obvious things made the list, e.g., open source, simpler and lighter middleware, big data, mathematics (including statistics and probability), functional languages, visualization, consuming less, and being data-driven in everyday life. Some less obvious things made the list, too, e.g., teaching/mentoring, generative music, ultra-local agriculture (in your yard or even home), and scholarly communications. I intend to revisit the list of things and businesses as the year unfolds and my perspective evolves.

The first two businesses that percolated to the top of the list were FasterXML and Fremont Analytics. Both are now their own LLCs and spinning up. I am interested in the combination of open source, middleware, and big data in the mold of Cassandra, CouchDB, and Voldemort, but I'm not ready to place a bet there just yet.

A Little Processing

Paul R. Brown @ 2009-02-24T09:40:30Z

I wanted a visual representation of the usual continued fraction expansion of the irrational number e that used concentric rings to represent the successive anapests of the tail of the expansion, with grouping added for emphasis:

2, 1,1,2, 1,1,4, 1,1,6, ...

The first ring would have four sectors; the second would have six, the third eight, etc. Something like this:

The question was how to get it drawn, and after a little thought, I settled on writing a Processing program to generate the image. The language doesn't include a primitive for drawing sectors, but it's possible to represent a sector as a larger filled arc and then a smaller filled arc that covers the same angle but is filled with the background color:

fill(foreground);
arc(x0,y0,r1,r1,theta0,theta1);
fill(background);
arc(x0,y0,r0,r0,theta0,theta1);

Or, in my case, just stacking up pie charts with the smallest on top is sufficient. The source is here.

What Chess Club Should Teach You About Pitching

Paul R. Brown @ 2009-02-02T20:27:09Z

Cross-posted on nPost.

The fundamental goal of selling is to clearly communicate to your audience, and this applies equally to pitching a company to investors or pitching a product to potential customers. It's obvious that confusing the audience conflicts with that fundamental goal, but it can be surprisingly difficult to avoid.

Your goal should be to provide your audience with exactly the information they need to make a decision, no more and no less. It's easy enough to fix providing too little information — just provide more. Providing too much information, probably confusing your audience in the process, is much more difficult to cure, and this is where a lesson from chess club comes in.

Keeping the feedback loop tight is the best way to throttle the flow of information, and using a chess clock, imagined or actual, is one way to do that. When you start talking, think about tapping the clock to start your turn, or if you're on the phone or otherwise not in front of the customer, feel free to use a stopwatch or countdown timer. Twenty to thirty seconds is a reasonable benchmark, and if you've got other people in on the pitch with you, it's up to you to set the ground rules for them to ensure that everyone's bound by the clock. It's enough time to communicate an idea but not so much time that your audience starts reaching for their Blackberries while you ramble.

.editrc Tidbit for ghci

Paul R. Brown @ 2008-12-03T05:48:03Z

The only thing that I lost in the transition between readline in GHC 6.8 and editline in GHC 6.10 was backwards history search in ghci bound to ^R. I use that particular feature quite a bit, so that had a big negative impact on my productivity.

Here's how to get it back. Create a file in your home directory called .editrc with the following contents:

edit on
bind ^R em-inc-search-prev

The analogous setting for forward search (bind ^S em-inc-search-next) doesn't work for some reason, but that's not one that I'm going to miss.

LinkedIn Group for Mathematicians

Paul R. Brown @ 2008-11-24T21:38:41Z

I can understand that some might find a bit of irony in a social network for mathematicians, but I created a LinkedIn group for "current and former mathematicians" late last week.

Mathematicians, where my working definition is people who have significant formal training in the form of a Ph.D. or A.B.D. in mathematics or a similar field like theoretical physics or computer science, usually start out with a very narrow career trajectory — academia. Graduate mathematics programs have the job of preparing students to be researchers. There is barely an acknowledgment of alternatives, but the realities of the economy and academic job market will continue to draw (or drive) mathematicians to other fields. (Programs like MISI at UIC, with which I was involved when I was on the faculty at UIC, are a notable exception.)

The reality is that there is no irony. The mathematicians that I know are equally distributed and successful across academia, industrial applications (e.g., quantitative finance, marketing analysis, etc.), and entrepreneurship; but formal training in mathematics isn't one of the axes that LinkedIn or other social networks support for search or networking. I hope that the group is a combination of:

  • a support structure for mathematicians of any age pursuing or just considering non-academic career tracks or even just extra-curricular consulting;
  • a place to share interesting problems and opportunities;
  • a virtual tea time, which I miss.