Shear Evolvability Experiments

August 16, 2002

The basic task:  track a moving target, composed of an n-dimensional vector of real numbers.  The target moves in space based on mutations specified by an evolvability component.  Each genome is also composed of an n-dimensional vector of real numbers and an evolvability component.  In the process of tracking the target, we hope that the genomes will adopt evolvability components that behave the same way as the target's.

I studied the following evolvabilities:

Matrix
This is simply an n-by-n matrix of real numbers.  Multiplying it by a vector of gaussian distributed random numbers (N(0,1)) gives us a multivariate normally distributed vector.  This vector either replaces the genotype, or is added to it.

The correlation between dimensions i and j is the (i, j)th entry of the matrix multiplied by its transpose, divided by the product of the standard deviations of variables i and j.
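A minimal sketch of the Matrix evolvability as described above, in plain Python. The function names are illustrative and not taken from the original code. The standard deviation of variable i is taken to be the square root of the (i, i) entry of the matrix times its transpose, which holds when the random inputs are independent N(0,1) draws.

```python
import math
import random

def sample_mutation(A, rng=random):
    """Return A * r, where r holds independent N(0, 1) draws."""
    n = len(A)
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(A[i][k] * r[k] for k in range(n)) for i in range(n)]

def correlation(A, i, j):
    """Correlation between dimensions i and j implied by matrix A."""
    def cov(a, b):
        # (a, b) entry of A times its transpose
        return sum(x * y for x, y in zip(A[a], A[b]))
    # Note: a row of all zeros has zero variance and would divide by zero.
    return cov(i, j) / math.sqrt(cov(i, i) * cov(j, j))

# Hypothetical 2-by-2 example: dimension 1 shares the first random input
A = [[1.0, 0.0],
     [1.0, 1.0]]
print(correlation(A, 0, 1))  # 1 / sqrt(2), about 0.707
```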

Columns
Evolvability is represented by a vector of integers, (c1, c2, ..., cn).  Row i of the evolvability matrix is composed of a vector of all 0's, except for a single 1 in column ci.

The correlation between dimensions i and j is 1 if ci == cj, and 0 otherwise.
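As a rough illustration, the Columns representation can be expanded into its evolvability matrix like this. The names are hypothetical, and 0-based indices are assumed.

```python
def columns_to_matrix(columns):
    """Row i is all 0's except for a single 1 in column columns[i]."""
    n = len(columns)
    return [[1 if columns[i] == k else 0 for k in range(n)] for i in range(n)]

def columns_correlation(columns, i, j):
    """1 if dimensions i and j read the same random input, 0 otherwise."""
    return 1 if columns[i] == columns[j] else 0

# Hypothetical example: dimensions 0 and 1 share column 0, dimension 2 is alone
example = [0, 0, 2]
for row in columns_to_matrix(example):
    print(row)
```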

Object-Oriented
Evolvability is a vector of integers, (s1, s2, ..., sn), where si is the superclass of i.  If i is a root, then si == i.  If the vector represents a tree, then it is valid.  The target evolvability is selected repeatedly until a valid one comes up.  Genomes with an invalid evolvability are given minimal fitnesses.

The correlation between dimensions i and j is 1 if both inherit from the same root, and 0 otherwise.
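A sketch of the validity check and the root-based correlation. It assumes that "valid" means every node reaches a root without hitting a cycle, so several separate trees are allowed. The function names are illustrative only.

```python
def find_root(supers, i):
    """Follow superclass links from i; return the root, or None on a cycle."""
    seen = set()
    while supers[i] != i:
        if i in seen:
            return None  # came back to a visited node: not a tree
        seen.add(i)
        i = supers[i]
    return i

def is_valid(supers):
    """Assumed validity rule: every node must lead to some root."""
    return all(find_root(supers, i) is not None for i in range(len(supers)))

def oo_correlation(supers, i, j):
    """1 if i and j inherit from the same root (only meaningful if valid)."""
    return 1 if find_root(supers, i) == find_root(supers, j) else 0

# Hypothetical example: 2 inherits from 1, 1 from 0; 3 is its own root
supers = [0, 0, 1, 3]
print(is_valid(supers), oo_correlation(supers, 2, 0), oo_correlation(supers, 2, 3))
```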

Blend
Defines each modification as a linear combination of a random variable and all the entries with a lower index:

x_i = c_i * R_i + Sum(j=1..i-1) c_ij * x_j

where the c_i's and c_ij's are encoded in the evolvability.

I haven't defined a correlation for this one yet. Instead, I use the direct coefficient c_ij for the "correlation" between i and j, and I use c_i for the "correlation" between i and itself.

This essentially means that when Blend is used, the correlation distance is actually a real distance between the target evolvability and the genome's.
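The Blend formula can be sketched as follows. The Gaussian distribution for the R_i's and the `sigma` parameter are assumptions, as is storing the c_ij's in a lower-triangular list of lists `C`.

```python
import random

def blend_mutation(c, C, sigma=1.0, rng=random):
    """Compute x_i = c[i] * R_i + sum over j < i of C[i][j] * x_j."""
    x = []
    for i in range(len(c)):
        r_i = rng.gauss(0.0, sigma)  # assumed Gaussian random variable R_i
        x.append(c[i] * r_i + sum(C[i][j] * x[j] for j in range(i)))
    return x

# Hypothetical example: entry 1 has no noise of its own and copies 2 * x_0
c = [1.0, 0.0]
C = [[0.0, 0.0],
     [2.0, 0.0]]
print(blend_mutation(c, C))
```

Because each entry only looks at entries with a lower index, the values can be filled in with a single pass in index order.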

All the experiments used the following settings:
 
 
Population Size    1000
Selection Type     Tournament, Generational
Epoch Size         10
Dimensions         7

Mostly I applied the experiments to a single target architecture that correlated dimensions 0 and 6, 1 and 2, 4 and 5, and dimension 3 was uncorrelated with anything.  For the object-oriented case, this corresponded to the following type hierarchy:

0 <- 6
1 <- 2
3
4 <- 5

Matrix Experiments

Baseline settings were the following:
 
 
Tournament Size                   7
Epoch Count                       500
Prob Mutate Matrix Entry          0.001
Matrix Mutate Method              A[i,j] *= exp(gauss(0.0, MatrixStdDev))
MatrixStdDev                      1.0
Initial Matrix Min, Max           0, 1
Normalize Rows                    false
Target Genotype Mutation Method   g = A * r
Random Numbers for r              Gaussian(0, 4)

That produces this graph:

 Matrix Baseline

I was concerned that the system might just be learning to hang out around the zero vector, so I changed the target mutation method to be the same as is used for the genomes themselves:
 
 
Target Genotype Mutation Method g += A * r

 Brownian Target

This became the new baseline, so all subsequent Matrix experiments used Brownian targets.
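The difference between the two target mutation methods (g = A * r versus g += A * r) can be sketched like this. The helper names are hypothetical, and the 4 in Gaussian(0, 4) is read here as a standard deviation, which is an assumption.

```python
import random

def mat_vec(A, r):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, r)) for row in A]

def move_target(g, A, brownian, sigma=4.0, rng=random):
    """Move the target genotype g using evolvability matrix A."""
    r = [rng.gauss(0.0, sigma) for _ in g]
    step = mat_vec(A, r)
    if brownian:
        return [gi + si for gi, si in zip(g, step)]  # g += A * r
    return step                                      # g = A * r

identity = [[1.0, 0.0], [0.0, 1.0]]
print(move_target([10.0, -3.0], identity, brownian=True))
```

With replacement, the target is redrawn around the zero vector at every move. With the Brownian method, it wanders away from wherever it was last.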

Expanding out to 1000 epochs:

 1000 Epochs

Part of the initial evolution gets rid of that part of the initial population that doesn't lie on the proper axis.  Let's try to eliminate that effect by setting all genotypes to the zero vector (in the target as well).  This guarantees that correlated dimensions will always have the exact same value.

 Population and Target start at 0

This formed the new baseline.  All subsequent Matrix experiments started at 0.

Expanding out to 2000 epochs:

 2000 Epochs

Testing out the hypothesis that larger tournament sizes lead to better evolvability:

 Tournament Size 14
 Tournament Size 28
 Tournament Size 112
 Tournament Size 3

Hypothesis is incorrect in this case.  As the tournament size goes down (lower selection pressure), there's a slight improvement in the final correlation distance.
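For reference, a minimal sketch of generational tournament selection, showing how the tournament size controls selection pressure. It assumes higher fitness is better and that contestants are drawn uniformly with replacement. Neither detail is stated in these notes, and the function names are illustrative.

```python
import random

def tournament_select(population, fitness, k, rng=random):
    """Draw k contestants at random and return the fittest one."""
    contestants = [rng.choice(population) for _ in range(k)]
    return max(contestants, key=fitness)

def next_generation(population, fitness, k, mutate, rng=random):
    """Generational replacement: every slot is filled by a mutated winner."""
    return [mutate(tournament_select(population, fitness, k, rng))
            for _ in population]

# Hypothetical toy problem: individuals are numbers, fitness is the value
pop = [1.0, 2.0, 3.0, 4.0]
print(next_generation(pop, lambda x: x, k=3, mutate=lambda x: x))
```

A larger k makes it more likely that the best individuals win every tournament, which raises selection pressure. A smaller k lets weaker individuals through more often.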

I knew from initial tests that normalizing the rows was actually hurting performance, but here's the official experiment to show it.

 Normalizing the rows

Switching over to a binary matrix, where all matrix entries are constrained to be either 0 or 1.  Mutation of an entry now just replaces that entry with a random 0 or 1.

 Binary Matrix

I thought perhaps reducing the probability of mutation would help.
 
Prob Mutate Matrix Entry 0.0001

 P = 0.0001

Seems to just make things worse.

I'm noticing that there seems to be a lower limit to how close the correlation distance can come.  It always seems to bottom out around 2.5.  The runs that look like they're doing the best on correlation distance are simply starting at a higher value.

Changing the epoch size seems to have an effect on the evolvability.  When the epoch is longer, the genomes have more of a chance to cluster around the target, which means that they'll be less fit once the target shifts.  Compare with the Brownian baseline:

 Epoch Size = 5
 Epoch Size = 20

Columns

Baseline settings were the following:
 
Tournament Size                   7
Epoch Count                       500
Prob Mutate Column Entry          0.001
Column Mutate Method              replace by random column
Target Genotype Mutation Method   g[i] = r[column[i]]
Random Numbers for r              Gaussian(0, 4)

producing this graph:

 Baseline Columns

I tried going with a uniform target:
 
Random Numbers for r Uniform(-4, 4)

but this produced much worse results:

 Uniform Target

Then tried Brownian target:
 
Target Genotype Mutation Method g[i] += r[column[i]]

 Brownian Target
 

Object Oriented

Baseline settings:
 
Tournament Size                   7
Epoch Count                       500
Prob Mutate Superclass            0.001
Superclass Mutate Method          replace by random superclass
Target Genotype Mutation Method   g[i] = r[root[i]], where root is ultimate superclass
Random Numbers for r              Gaussian(0, 4)

produced this graph:

 Baseline OO

I noticed that the mutation method was producing one big tree a lot of the time, which resulted in everything being correlated with everything else.  So I introduced a ProbRoot parameter.
 
Superclass Mutate Method   super[i] = i        if R(0,1) < ProbRoot
                           super[i] = random   otherwise
ProbRoot                   0.5

which led to improved results:

 ProbRoot = 0.5
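The ProbRoot mutation can be sketched as below. The function name is hypothetical, and "random" is taken to mean a uniformly chosen index, which is an assumption.

```python
import random

def mutate_superclass(supers, i, prob_root=0.5, rng=random):
    """Return a copy of supers with entry i mutated using ProbRoot."""
    new = list(supers)
    if rng.random() < prob_root:
        new[i] = i                           # make i a root
    else:
        new[i] = rng.randrange(len(supers))  # assumed: uniform random superclass
    return new

print(mutate_superclass([0, 0, 1, 3], 2))
```

Making roots more likely breaks large trees apart, which counteracts the tendency for everything to end up correlated with everything else.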

I wondered how typical the target was, so I did an experiment where the target was chosen at random and fixed for every run.  If the target was invalid, I kept selecting until I got a valid one.

 Random targets

Seemed to get similar results, suggesting that the target architecture I was using was fairly typical.

Since OOGA keeps inherited implementations identical, I also initialized the population and target to 0 here, as in the Matrix experiments:

 Population and Target start at 0

Didn't seem to make much of a difference.
 

Perfect Evolvabilities

For one series of experiments, I fixed the evolvability component of each genotype to be a clone of the target's evolvability component. Then I set the mutation rate for the evolvability to 0. Theoretically, this should provide perfect evolvability to all the genomes. Here are the results:

 Matrix, perfect evolvability
 Columns, perfect evolvability
 OO, perfect evolvability

The interesting thing to note is that even when the correlation distance is 0, the system can't achieve perfect fitness.

More Experiments

OK, I'm starting fresh again, instead of attempting to classify these experiments in the previous framework. Most of them have to do with the Matrix evolvability class, simply because I got the best results with it.

But first, here is an attempt at using the object-oriented framework on a problem with double the number of dimensions. The targets were selected at random at the beginning of each run.

OO, perfect, 14 dimensions

OO, not perfect, 14 dimensions, epoch=20

Took a look at increasing the epoch length, to see if that helped achieve perfect fitness:

Matrix, perfect, epoch=20

Matrix, perfect, epoch=20, zero init

Matrix, perfect, epoch=40

Strangely, not. This suggests that the problem doesn't have anything to do with the amount of time at our disposal. I then tried changing each random variable independently of the others. That is, I set only one of the random variables to a random value. All the rest were set to 0. This is known as serial dimensions.
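A sketch of drawing the random vector under serial dimensions. The notes do not say how the active dimension is picked, so the uniform random choice here is an assumption, as are the function name and the `sigma` parameter.

```python
import random

def serial_random_vector(n, sigma=1.0, rng=random):
    """Only one random variable gets a random value; the rest stay at 0."""
    r = [0.0] * n
    active = rng.randrange(n)  # assumed: active dimension picked uniformly
    r[active] = rng.gauss(0.0, sigma)
    return r

print(serial_random_vector(7))
```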

Matrix, perfect, epoch=20, serial

Matrix, perfect, epoch=40, serial

Seems to work a little better.

Here's the case where the target doesn't move at all:

Matrix, static

Matrix, static, serial

OK then, how does serial dimensions work on regular problems?

Matrix, not perfect, serial

Matrix, not perfect, serial, epoch=20

Matrix, not perfect, serial, epoch=80

Doesn't seem to help much.

A new approach. The standard deviation of the noise in the organisms is far too large. Reduce it to something about an order of magnitude smaller than that of the target, to give the organisms finer control.

Matrix, organism=1.0, target=4.0

Matrix, organism=0.5, target=4.0

Matrix, organism=0.1, target=4.0

Combine with serial dimensions:

Matrix, organism=0.1, target=4.0, serial

Here's the norm of the matrix. It's just the Euclidean norm of the matrix entries (the square root of the sum of their squares). Note that it blows up substantially. This only seems to happen for serial dimensions. The hypothesis is that the multiplicative mutation blows up the entries when they're not in use.

Norm
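For concreteness, the norm being plotted can be computed as below, assuming it is the square root of the sum of squared entries. The function name is illustrative.

```python
import math

def matrix_norm(A):
    """Euclidean (Frobenius) norm of the matrix entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

print(matrix_norm([[3.0, 0.0], [0.0, 4.0]]))  # 5.0
```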

Here's the smaller organism standard deviation applied to Columns and OO.

Columns, organism=0.1, target=4.0

OO, organism=0.1, target=4.0

Flirted briefly with the idea of putting a sigmoid function in.

Matrix, organism=0.1, target=4.0, sigmoid

Norm

Still More Experiments

Matrix, organism=genomic, target=4.0 -- Norm -- Entries -- Online Entries

Matrix, organism=0.1, target=4.0 -- Norm

Matrix, organism=genomic, target=4.0, serial dimensions -- Norm -- Entries -- Online Entries

Matrix, organism=genomic, target=4.0, serial dimensions, normalize rows -- Norm -- Entries -- Online Entries

Matrix, organism=0.5, target=4.0, Local Population -- Norm

Matrix, organism=genomic, target=4.0, Local Population -- Norm -- Entries -- Online Entries

Seeing what effect a local population has on the problem. Doesn't seem to do much.

Matrix, 14 dimensions, organism=0.5, target=4.0, Local Population -- Norm

Here's the local population on the static case.

Matrix, Static, organism=0.5, target=4.0, Local Population -- Norm

And on the perfect evolvability case, 7 dimensions.

Matrix, organism=0.5, target=4.0, Perfect, Local Population -- Norm

Expanding to 14 dimensions...

Matrix, 14 dimensions, Perfect, organism=0.5, target=4.0, Local Population -- Norm

Now without perfect evolvability, and a regular panmictic population

Matrix, 14 dimensions, organism=0.5, target=4.0, Panmictic -- Norm

...and with the local population.

Matrix, 14 dimensions, organism=0.5, target=4.0, Local Population -- Norm

Introduced the Blend representation, where each class's inheritance is a blend of previous classes. This is a baseline.

Blend, organism=0.5, target=4.0 -- Norm

Expanded out to 4000 epochs...

Blend, organism=0.5, target=4.0 -- Norm

Trying out a local population on the situation.

Blend, organism=0.5, target=4.0, Local -- Norm

There seem to be a lot of dips in the fitness, corresponding to spikes in the norm. Try a heavy-handed approach: no entry in the matrix can rise above 5. This seems to tame the dips a little.

Blend, organism=0.5, target=4.0, clamping at 5 -- Norm -- Entries
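The clamping step might look like this. The function name is hypothetical, and the optional lower bound is an added convenience that is off by default, so that by default only the upper limit of 5 applies.

```python
def clamp_entries(A, upper=5.0, lower=None):
    """Return a copy of A with no entry above upper (or below lower, if given)."""
    def clamp(a):
        if lower is not None and a < lower:
            return lower
        return min(a, upper)
    return [[clamp(a) for a in row] for row in A]

print(clamp_entries([[7.0, -2.0], [0.5, 5.0]]))  # [[5.0, -2.0], [0.5, 5.0]]
```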

Use genomic standard deviation for the R-values as a way of compensating for the clamping at 5 on the matrix entries.

Blend, organism=genomic, target=4.0, clamping at 5 -- Norm -- Entries -- Online Entries

Trying a slightly different architecture, with one tree, one chain, and one independent variable.

Blend 2, organism=genomic, target=4.0, clamping at 5 -- Norm -- Entries -- Online Entries

Restrict the clamping further down to [0,1], keep the genomic standard deviation to compensate.

Blend, organism=genomic, target=4.0, clamping at 1 -- Norm -- Entries -- Online Entries

I'm suspecting that the spikes are due to the possibility of extreme changes in the matrix entries in our multiplicative scheme. Let's reduce that by a factor of 10, while increasing the probability of a mutation by a factor of 10 to keep things "constant". In this run we don't do any clamping, just to see how much this can be relied upon to keep the matrix entries under control.

Blend, organism=0.5, PMutate=0.01, MatrixStdDev=0.1 -- Norm -- Entries -- Online Entries

As before, but re-introducing the clamping at [0,1]

Blend, org=genomic, clamp(1), PMutate=0.01, StdDev=0.1 -- Norm -- Entries -- Online Entries

OK, screw it. Instead of multiplicative mutations to the evolvability, we'll try additive. No genomic standard deviation either.

Blend, org=0.5, clamp(1), PMutate=0.01, StdDev=0.1, additive -- Norm -- Entries -- Online Entries

We'll go one step further with the additive, decreasing standard deviation by a factor of 100, while increasing probability of mutation by the same amount.

Blend, org=0.5, clamp(1), PMutate=0.1, StdDev=0.01, additive -- Norm -- Entries -- Online Entries

Same standard deviation as before, but we'll bring probability of mutation back to baseline level.

Blend, org=0.5, clamp(1), PMutate=0.001, StdDev=0.01, additive -- Norm -- Entries -- Online Entries

Mutations are too rare. Let's try 1/100 for both.

Blend, org=0.5, clamp(1), PMutate=0.01, StdDev=0.01, additive -- Norm -- Entries -- Online Entries

Let's go back to basics here. What's the best that Blend can do with an optimal matrix and a genomic standard deviation?

Baseline, 7 dims, perfect -- Norm

And this is what the problem is properly reduced down to: a 4-dimensional problem with completely independent entries in the vectors.

Baseline, 4 dims, independent, perfect -- Norm

Going back to the Matrix model, we'll try normalizing the rows. Not bad.

Matrix, organism=genomic, target=4.0, normalize rows -- Norm

Normalizing the rows when we're doing additive mutations to the entries.

Matrix, organism=genomic, normalize rows, additive -- Norm

OK, we're getting close to the target evolvability, but the fitness values aren't all they could be. Let's take an evolvability matrix from the previous experiment and give it to all genomes. They're not allowed to change it. Notice that the fitness values plummet. Very strange.

Pinned down the evolved evity from exp71 -- Norm

Because, you see, if we use the perfect evolvability matrix, we do fine. It looks like very small imperfections in evolvability have a big influence on fitness values.

Pinned down the perfect evity, compare with exp73 -- Norm

Here's the perfect evolvability when the standard deviation is fixed at 0.5 instead of being genomic.

Pinned down the perfect evity, organism=0.5 -- Norm

So I tried out a number of simplifications of the near-perfect evolvability matrix. I narrowed it down to this: adding a single 0.1 entry to the matrix is enough to cause the plummet.

Pinned down the almost-perfect evity, organism=0.5 -- Norm

Reducing that 0.1 down to 0.01 still causes the effect, but only a 10th as severe.

Pinned down the almost-perfect evity, organism=0.5 -- Norm

Same effect happens in the Column model, except more so. False dependence is catastrophic, but false independence isn't so bad. In fact, fitness using independent dimensions isn't that far off from perfect.

Pinned down the almost-perfect evity, organism=0.5 -- Norm -- All Runs

So let's start off the population with evolvabilities where all dimensions are independent. Much better results than before.

Columns, Start the population out with independence -- Norm

To do this with Matrix, we have to go to additive, since entries at 0 are nailed there under multiplicative mutations.
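A short sketch of why zero entries are stuck under the multiplicative scheme, A[i,j] *= exp(gauss(0.0, MatrixStdDev)), but not under an additive one. The function names are illustrative, and the additive form a + gauss(0, std) is an assumed reading of "additive".

```python
import math
import random

def mutate_multiplicative(a, std, rng=random):
    """a *= exp(gauss(0, std)): rescales a, so 0 stays 0 and the sign is kept."""
    return a * math.exp(rng.gauss(0.0, std))

def mutate_additive(a, std, rng=random):
    """a += gauss(0, std): can move an entry away from 0 in either direction."""
    return a + rng.gauss(0.0, std)

print(mutate_multiplicative(0.0, 1.0))  # 0.0, no matter what is drawn
print(mutate_additive(0.0, 0.1))
```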

Matrix, Start the population out with independence, additive(0.1) -- Norm

Here's the Object-Oriented version, starting at independence (all nodes are roots). Doesn't seem quite so happy as Columns.

OO, Start the population out with independence -- Norm

Gearing up to a 16-dimensional Columns world, where the dimensions within each block of 4 depend on each other in the target. The genomes just don't seem to be able to hold onto any blocks that they generate in the correlation matrix.

Columns, Independent, 16-dimensional -- Norm

Here's the Matrix model using multiplicative mutations. I achieved this by setting the 0 values to 0.001 instead.

Matrix, Start the population out with independence, multiplicative -- Norm

Trying out inclusive fitness. In this run, the fitness of each node is equal to the average fitness of itself and 10 mutated potential offspring.

Columns, Inclusive Average, N=10 -- Norm

Same as before, but we're setting fitness to be the best of parent's fitness and 10 potential offspring.

Columns, Inclusive Best, Parent Included -- Norm

Same as before, but we don't count the parent's fitness into the contest for best.

Columns, Inclusive Best, Parent Excluded -- Norm

In all these cases, using inclusive fitness seems to make absolutely no difference to the graphs.
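The three inclusive-fitness variants can be sketched with one function. The names, the `mode` flag, and the callback signatures are all hypothetical, and higher fitness is assumed to be better.

```python
def inclusive_fitness(parent, fitness, mutate, n_offspring=10,
                      mode="average", include_parent=True):
    """Score a parent by the fitness of its mutated potential offspring."""
    scores = [fitness(mutate(parent)) for _ in range(n_offspring)]
    if include_parent:
        scores.append(fitness(parent))
    if mode == "average":
        return sum(scores) / len(scores)
    return max(scores)  # mode == "best"

# Hypothetical toy setup: fitness is the value, every mutation lowers it by 1
worse = lambda x: x - 1.0
print(inclusive_fitness(1.0, lambda x: x, worse, mode="best"))
print(inclusive_fitness(1.0, lambda x: x, worse, mode="best", include_parent=False))
```

In this toy setup the parent beats all of its offspring, so the two "best" variants return 1.0 and 0.0 respectively.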

Take the baseline Blend experiment (exp56), and add row normalization.

Blend, organism=0.5, target=4.0, Normalize Rows -- Norm

Take the baseline Blend experiment (exp56), and set the genomes to start out with independent dimensions.

Blend, organism=0.5, target=4.0, Independent -- Norm

Take the baseline Blend experiment (exp56), and add row normalization. Also add genomic standard deviation, which might be necessary to compensate for the normalization.

Blend, organism=genomic, target=4.0, Normalize Rows -- Norm

See what happens when we use both independent dimensions and normalized rows.

Blend, organism=0.5, target=4.0, Normalize Rows, Independent -- Norm

Big kahuna experiment with 16-dimensions representing 4 different trees, but no independent start.

Blend, 16 dim, organism=0.5, target=4.0, Normalize Rows -- Norm

As before, but add in independent start.

Blend, 16 dim, organism=0.5, target=4.0, Normalize Rows -- Norm

New baseline. No population initialization to 0. Additive noise with standard deviation of 0.1. Independent start. Normalized rows.

Blend, organism=0.5, Normalize, Independent, additive(0.1), No Zero -- Norm

New baseline + incremental fitness

Exp94 (Blend) with incremental fitness -- Norm

New baseline + incremental fitness, without the independent start

Exp94 (Blend) with incremental fitness and no independent start -- Norm

I'm surprised I never did this experiment before. Set the evolvability to be optimal, but allow it to mutate. Let's see how much the system will wander away from perfection as a gauge of how strong the pressure is to stay.

Blend, baseline with perfect evity that isn't pinned -- Norm

Original experiments with a uniform, replaced scheme failed, I think, because I applied those qualifiers to both target and organism. I'm trying them on just the target, with organisms still creeping via Gaussian noise, as before.

Blend, baseline, target is uniform and replaced -- Norm

Replacement + Uniform, but perfect evolvability to test upper bound.

Blend, perfect, target is uniform and replaced -- Norm

16-dimensional blend with independent start and uniform/replaced target.

Blend, 16 dim, organism=0.5, Normalize Rows, target replace/uniform -- Norm

Same replacement + uniform with perfect evolvability as before, but we allow the evolvability to drift. Do we get the same type of drifting as before? How about the dips?

Blend, perfect driftable, target is uniform and replaced -- Norm

Hmmmm, I wonder whether we get the same major plummets from a near-perfect evolvability matrix that we were getting with a creeping target?

Blend, near-perfect pinned, target is uniform and replaced -- Norm