You can find the original web page that I pointed you to here.

Gratuitous Power Law in Color

I removed the key entry that described the equation for the regression line to save space. If anyone asks, the regression line has a slope of -2.54.

You can contrast the green pluses (method bodies) with the yellow triangles (Java class headers) and brown stars (Java interface headers) as an example of the split from interface to implementation.

Connect to HOT Talking Points

HOT (Highly Optimized Tolerance) is John Doyle's theory that engineering a system to avoid failures only transfers the failure possibilities into a smaller, rarer part of the system. Failures become rarer, but more severe.

I'm making the connection between HOT and software evolvability by equating "failure" with "change to the source code".

By splitting the system into interface and implementation, you can reduce the expected size of the change to source code, but at the expense of risking rare, major changes when the interface changes.

In HOT, failure rate is related to failure intensity by a power law. If you squint at the chart, you can see one there.

likelihood = 18.7 * impact^(-2.54).

Improved Package Plot

The gnuplot png driver seems to be ignoring my line thickness commands. Let me know if you want me to take a closer look at figuring out a way to do it.

Graph Paritioning Talking Points

The Kernighan-Lin algorithm is a heuristic for solving the graph paritioning problem.

Don't confuse this with the Lin-Kernighan algorithm, which is a heuristic for the Travelling Salesperson Problem.

Graph partitioning attempts to divide a graph's nodes into two equal sets while minimizing the number of cross edges. It's been shown to be NP-hard.

Kernighan-Lin does a form of hillclimbing, but if the best move available to it is downhill, it'll take it in hope of finding an uphill move later.

GECCO 2002 Talking Points

The GECCO 2002 paper demonstrated that a software engineering technique (code factoring) could improve a piece of software's evolvability.

The basic experiment did symbolic regression on the equation
y = A * sin(A * x)
where A is a real constant that was chosen at random from [0, 6] every 5 generations.

When we gave the genomes an ADF, they evolved a factored representation that improved in fitness faster, with fewer changes to the tree.

The big trick here is that a factored representation only needs to mutate one part of the solution, rather than two. Thus, the possibility of a fortuitous mutation was made much more likely.

Real-World Software Systems

Jikes RVM is a research Java virtual machine written in Java.

Jakarta Tomcat is the servlet container for Jakarta Apache, a web server written in Java.

Net Beans is an integrated development environment written using Java Beans technology.