Galapagos: how does it react to the current best solution?


(Olivier Stocker) #1

After reading David Rutten’s blog “I Eat Bugs For Breakfast”, I didn’t manage to understand how Galapagos reacts to the current best solution (or to the best x% of the current generation).

  1. They have a higher chance of reproduction, but will they stay alive?
  2. What killing mechanism does Galapagos use?

These questions came to me after I ran into some optimization “drops” where the best solutions seem to disappear!

(David Rutten) #2

The N% weakest genomes are culled without being allowed to reproduce.* From the remainder, everyone is in principle allowed to reproduce, although fitter genomes are more likely to be randomly selected.

Once selected, the breeding genome will pick a partner from the remaining population based on genomic distance. I.e. it doesn’t care much about the fitness of its partner, it just cares that the gene values are close to but not too close to itself (too close is incestuous, too far is zoöphilic, neither being healthy).
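To make the "close but not too close" idea concrete, here is a minimal Python sketch. It is not Galapagos's actual code: the Euclidean distance on normalised gene values, the `pick_mate` helper, and the `target_distance` parameter are all assumptions made for illustration.

```python
import math

def genomic_distance(a, b):
    """Euclidean distance between two genomes whose genes are normalised to [0, 1]."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_mate(parent, population, target_distance):
    """Return the candidate whose distance to `parent` is closest to the target.

    `target_distance` is a hypothetical stand-in for the preferred similarity:
    candidates much closer (incestuous) or much farther (too alien) score worse.
    Fitness of the candidates is deliberately ignored here.
    """
    candidates = [g for g in population if g is not parent]
    return min(
        candidates,
        key=lambda g: abs(genomic_distance(parent, g) - target_distance),
    )

population = [(0.10, 0.10), (0.12, 0.11), (0.40, 0.35), (0.95, 0.90)]
parent = population[0]
mate = pick_mate(parent, population, target_distance=0.4)
print(mate)  # the mid-distance genome, not the near-clone or the far outlier
```

In this toy population the near-identical genome and the very distant one are both passed over in favour of the one at a moderate distance.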

Then, at the end of all the breeding, the K% weakest individuals in the current generation are compared to the K% fittest individuals from the previous generation and swapped out if they perform worse. If K% and the population size are both small, rounding may cause zero individuals to carry over from one generation to the next.
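The carry-over step can be sketched roughly as follows. This is an illustration under stated assumptions, not the real implementation: the function names are hypothetical, the count is truncated with `int()`, higher fitness is treated as better, and the best previous individual is paired against the weakest current one.

```python
def carry_over_count(population_size, maintain_fraction):
    """Number of individuals eligible to carry over (truncation is an assumption)."""
    return int(population_size * maintain_fraction)

def maintain(previous, current, maintain_fraction):
    """Swap the K weakest of `current` for the K fittest of `previous` when worse.

    Both arguments are lists of (genome, fitness) pairs; higher fitness is better.
    """
    k = carry_over_count(len(current), maintain_fraction)
    if k == 0:
        return list(current)  # nothing is protected, the old best can vanish
    elite = sorted(previous, key=lambda g: g[1], reverse=True)[:k]  # best first
    ranked = sorted(current, key=lambda g: g[1])                    # weakest first
    weakest, rest = ranked[:k], ranked[k:]
    survivors = [old if old[1] > new[1] else new for old, new in zip(elite, weakest)]
    return survivors + rest

previous = [("p1", 9.0), ("p2", 7.0), ("p3", 3.0), ("p4", 1.0)]
current = [("c1", 8.0), ("c2", 6.0), ("c3", 2.0), ("c4", 7.5)]
next_generation = maintain(previous, current, maintain_fraction=0.5)
print(next_generation)
print(carry_over_count(75, 0.05), carry_over_count(10, 0.05))
```

With 75 individuals at 5% the count comes out at 3, while 10 individuals at 5% gives 0, which is the case where nothing survives from the previous generation.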

The inbreeding option controls what percent similarity is considered attractive in a mate; the maintain value is K%.

Do note that Galapagos keeps a full record of all genomes+fitnesses and it will never create a new genome which is identical to one that occurred before. Especially if the total possible number of different genomes is low (i.e. a small number of sliders, each with only a small number of possible states), you’ll get some weird effects over time as patently unfit genomes are computed anyway, because all the good ones have already been done.
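A quick way to judge whether this applies is to count the distinct genomes the sliders allow. The sketch below is only arithmetic; the helper names and the example numbers are made up for illustration.

```python
import math

def genome_space_size(states_per_slider):
    """Total number of distinct genomes: the product of each slider's state count."""
    return math.prod(states_per_slider)

def generations_until_exhausted(states_per_slider, population_size):
    """Upper bound on generations before every possible genome has been tried once,
    assuming each generation evaluates `population_size` never-seen genomes."""
    return math.ceil(genome_space_size(states_per_slider) / population_size)

# Hypothetical setup: three sliders with five states each, population of 50.
sliders = [5, 5, 5]
print(genome_space_size(sliders))                # 125
print(generations_until_exhausted(sliders, 50))  # 3
```

In a space that small, a no-repeat rule forces the solver onto poor genomes within a handful of generations.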

* Although looking at the interface, N is either a constant or may even be hard-coded to zero. I don’t remember whether the bias towards fitter genomes is the only bias left in the most recent version.

(Olivier Stocker) #3

I had a population size of 75 with 5% maintain (3.75 individuals, rounded to 3), so it’s not coming from there!

What can explain that the median fitness value seems to decrease sometimes?

(David Rutten) #4

Indeed, I’m surprised the top edge of the graph went down on a few occasions.

The drop in mean fitness is just one of those things that happens over time. Perhaps some local maximum was exhausted, forcing the genomes downslope into worse and worse states, until they found another local maximum and were able to climb a new peak (the fitness going up again in the last part of the graph).

The population at large isn’t really that interesting to Galapagos, it really only wants to find a single good answer. It doesn’t care if the other individuals are all flunkies.

(Olivier Stocker) #5

As my problem is quite complex, the genes aren’t totally independent from each other. That may explain the fitness change, but I still have no clue about the drop in best fitness.

I also use a little trick to get a true/false for each solution: I put in a Stream Filter to make sure no fitness value is registered (it registers “NaN”) if the solution doesn’t match my criteria. It seems to be harmless to the optimization process (apart from the fact that I need a large population size). Can you confirm this, especially regarding the comparison between the K% weakest of generation i and the K% fittest of generation i-1?

(Olivier Stocker) #6

@DavidRutten After some tests I’m quite sure that Galapagos isn’t capable of handling an empty fitness (“NaN”). It may be the comparison between the K% weakest of generation i and the K% fittest of generation i-1 that fails and results in the loss of the fittest solution.

Since I stopped using my trick, I haven’t faced the fittest-solution drop anymore.

(David Rutten) #7

Yeah NaNs will definitely be a problem. Unless handled specifically (which I don’t think happens) they really mess with equality tests.
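The general floating-point behaviour behind this can be shown in a few lines of Python. This is only a demonstration of how NaN behaves in comparisons, not Galapagos's code; the two `keep_elite_*` functions are hypothetical and assume higher fitness is better.

```python
import math

nan = float("nan")

# Every ordered or equality comparison involving NaN evaluates to False.
print(nan == nan, nan < 1.0, nan > 1.0)  # False False False

def keep_elite_naive(elite_fitness, newcomer_fitness):
    """Keep the old elite only if the newcomer 'performs worse'."""
    if newcomer_fitness < elite_fitness:  # always False when the newcomer is NaN
        return elite_fitness
    return newcomer_fitness               # so a NaN newcomer displaces the elite

def keep_elite_nan_safe(elite_fitness, newcomer_fitness):
    """Same rule, but a NaN fitness is treated as worse than any real value."""
    if math.isnan(newcomer_fitness):
        return elite_fitness
    if math.isnan(elite_fitness):
        return newcomer_fitness
    return max(elite_fitness, newcomer_fitness)

print(keep_elite_naive(9.0, nan))     # nan: the best solution is lost
print(keep_elite_nan_safe(9.0, nan))  # 9.0: the best solution survives
```

If a comparison of this naive kind sits anywhere in the carry-over step, a single NaN fitness is enough to knock out the fittest genome. Returning a very poor but finite fitness for invalid solutions avoids the problem from the definition side.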