Testing is great, but instrumentation is both critically important and often ignored. After all, failures always come from untested scenarios.
Basic instrumentation (like heap data) is good, but basic instrumentation is also a bit like taking a picture of a race car. Other than from context, it's impossible to know whether it's going 200MPH or just parked in the middle of the racetrack. For example, the average velocity of a response time parameter indicates whether response times are getting longer or shorter or staying about the same, and the average rate of change of the average rate of change indicates whether a linear extrapolation is likely to miss high or low.
Here's a simple approach to get thumbnail rates of change for a
stream of values; I'll explain how to integrate it with JMX a little
later. The first ingredient is a useful little utility class that
computes the moving average of a stream of long values.
(Clearly the values can't be that large, ideally no larger than
MAX_LONG/2*size.)
public class CircularWindow {
private long sum;
private long last;
private long[] x;
private int ptr;
private int size;
private boolean full;
public CircularWindow(int size) {
x = new long[size];
this.size = size;
}
public void add(long y) {
sum += (last = y) - x[ptr];
x[ptr] = y;
if (++ptr == size) {
ptr = 0;
full = true;
}
}
public long average() {
return full?(sum/size):((ptr!=0)?(sum/ptr):sum);
}
public long last() {
return last;
}
}
It is built with the assumption that the average will be requested frequently but not necessarily every time a value is added. (If the average was requested infrequently versus the size of the buffer, it would make more sense to compute the sum only when the average was requested.)
To get at rates of change, chain up two of the windows and add values like so:
D_s.add(value - s.last()); s.add(value);
Now, what does this do? For a hint, try this:
CircularWindow x = new CircularWindow(100);
CircularWindow y = new CircularWindow(100);
CircularWindow z = new CircularWindow(100);
for (long i=0; i < 50000; ++i) {
long value = i*i;
z.add(value - y.last() - x.last());
y.add(value- x.last());
x.add(value);
if (i > 49990) {
System.out.println("x av = " + x.average());
System.out.println("y av = " + y.average());
System.out.println("z av = " + z.average());
}
}
The series y ticks up two every time, and the z series is constant 2. It looks like the cleverly-named D_s series from above approximates the first derivative of the s series, and presuming that the values in the series come from a suitable well-behaved differentiable function, that's obvious if you write it out:
D_s(n+1) ~ (s(n+1) - s(n))/((n+1) - n) = s(n+1) - s(n)
The main window and the second window for tracking the rate of change can be encapsulated conveniently:
public class VelocityWindow {
private CircularWindow s;
private CircularWindow D_s;
public VelocityWindow(int size) {
s = new CircularWindow(size);
D_s = new CircularWindow(size);
}
public void add(long x) {
D_s.add(x - s.last());
s.add(x);
}
public long average() {
return s.average();
}
public long velocity() {
return D_s.average();
}
public long ticksToClear() {
if (D_s.average >= 0) return -1;
return s.last() / D_s.average();
}
}
The fully reusable path for integration into an application's JMX
instrumentation is via a MonitorMBean
implementation. A MonitorMBean typically uses a timer
(e.g., a TimerTask)
to sample an attribute and then either exposes attributes or fires
notifications. (The GaugeMonitor
is similar in flavor but not really the same thing.)
I haven't ever gotten around to taking the
MonitorMBean approach, since the quick and dirty path is
simply to embed the VelocityWindow directly in the MBean
and expose average and velocity as
attributes. Another derived metric that is useful when tracking
backlogs is ticks to clear, which would either be infinite if
D_s.average() is non-negative or
-1*s.last()/D_s.average() otherwise.
I got a SPAM this weekend for "Blogger and Podcaster Magazine". In all likelihood, this is just a case of someone identifying a demographic where the advertising revenues from a magazine could exceed the cost of printing and mailing a bunch of free copies, but this strikes me as a case of unclear on the concept. A hard copy, snail-mailed magazine targeted at people who are deeply involved in blogging, podcasting, and other purely on-line activities? That's just the kind of content that you can't find on-line... [sic.]









