The genesis file is the main configuration file for Avida. With this file, the user can set up all of the basic conditions for a run. Below are detailed descriptions for some of the settings in genesis, with the ones you should be most concerned about colored in green. You will probably never need to change the non-colored entries unless you are doing a very specialized project.
This section covers all of the basic variables that describe the Avida run. This is effectively a miscellaneous category for settings that don't fit anywhere below.
MAX_UPDATES MAX_GENERATIONS END_CONDITION_MODE | These settings allow the user to determine for how long the run should progress in generations and in updates, and determine if one or both criteria need to be met for the run to end. The run will also end if ever the entire population has died out. A setting of -1 for either ending condition will indicate no limit. End conditions can also be set in the events file, as is done by default, so you typically won't need to worry about this. |
WORLD-X WORLD-Y | The settings determine the size of the Avida grid that the organisms populate. In mass action mode the shape of the grid is not relevant, only the number of organisms that are in it. |
MAX_CPU_THREADS | At the moment, I think this feature isn't working. Ideally, it determines the number of simultaneous processes that an organism can run. That is, basically, the number of things it can do at once. |
RANDOM_SEED | The random number seed initializes the random number generator. You should alter only this seed if you want to perform a collection of replicate runs. Setting the random number seed to zero (or a negative number) will base the seed on the starting time of the run -- effectively a random random number seed. In practice, you want to always be able to re-do an exact run in case you want to get more information about what happened. |
DEFAULT_DIR | This entry allows the user to enter a directory name where Avida can find the other needed configuration files if they are not local. |
INST_SET EVENT_FILE ANALYZE_FILE ENVIRONMENT_FILE START_CREATURE | These settings indicate the names of all of the other configuration files used in an Avida run. See the individual documents for more information about how to use these files. |
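The way MAX_UPDATES, MAX_GENERATIONS, and END_CONDITION_MODE interact can be sketched in a few lines of Python. This is only an illustration of the description above, not Avida's own code: the function name `run_should_end` and the `require_both` flag (standing in for END_CONDITION_MODE, whose numeric values are not listed here) are hypothetical, and treating an unlimited criterion as "never met" is an assumption.

```python
UNLIMITED = -1  # per the description above, -1 means "no limit"


def run_should_end(update, generation, num_organisms,
                   max_updates=UNLIMITED, max_generations=UNLIMITED,
                   require_both=False):
    """Return True when the run should stop (hypothetical helper)."""
    # The run always ends once the entire population has died out.
    if num_organisms == 0:
        return True
    # Assumption: a criterion set to "no limit" is never considered met.
    updates_met = max_updates != UNLIMITED and update >= max_updates
    generations_met = (max_generations != UNLIMITED
                       and generation >= max_generations)
    if require_both:
        return updates_met and generations_met
    return updates_met or generations_met
```

For example, `run_should_end(100, 5, 10, max_updates=100)` is true because the update limit has been reached, while the same call with `max_generations=50` and `require_both=True` is false until generation 50.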
These settings control how creatures are born and die in Avida.
BIRTH_METHOD | The birth method sets how the placement of a child organism is determined. Currently, there are six ways of doing this -- the first four (0-3) are all grid-based (offspring are only placed in the immediate neighborhood), and the last two (4-5) assume a well-stirred population. In all non-random methods, empty sites are preferred over replacing a living organism. |
DEATH_METHOD AGE_LIMIT | By default, replacement is the only way for an organism to die in Avida. However, if a death method is set, organisms will die of old age. In method 1, organisms will die when they reach the user-specified age limit. In method 2, the age limit is a multiple of their length, so larger organisms can live longer. |
ALLOC_METHOD | During the replication process, parent organisms must allocate memory space for their child-to-be. Before the child is copied into this new memory, it must have an initial value. Setting the alloc method to zero sets this memory to a default instruction (typically nop-A). Mode 1 leaves it uninitialized (and hence keeps the contents of the last organism that inhabited that space; if only a partial copy occurs, the child is a hybrid of the parent and the dead organism, hence the name necrophilia). Mode 2 just randomizes each instruction. This means that the organism will behave unpredictably if the uninitialized code is executed. |
DIVIDE_METHOD | When a divide occurs, does the parent divide into two children, or else do we have a distinct parent and child? The latter method will allow more age structure in a population where an organism may behave differently when it produces its second or later offspring. |
GENERATION_INC_METHOD | The generation of an organism is the number of organisms in the chain between it and the original ancestor. Thus, the generation of a population can be calculated as the average generation of the individual organisms. When a divide occurs, the child always receives a generation one higher than the parent, but what should happen to the generation of the parent itself? In general, this should be set the same as divide method. |
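The DEATH_METHOD and ALLOC_METHOD rules lend themselves to a small Python sketch. The helper names (`age_limit_for`, `allocate_child_memory`) and their parameters are invented for illustration; padding with the default instruction when the leftover memory is too short is an assumption, as is the use of a plain list of instruction names to stand in for a genome.

```python
import random


def age_limit_for(death_method, age_limit, genome_length):
    """Effective age limit for an organism, or None if it cannot die of age."""
    if death_method == 0:
        return None                       # replacement is the only cause of death
    if death_method == 1:
        return age_limit                  # fixed, user-specified limit
    if death_method == 2:
        return age_limit * genome_length  # limit scales with genome length
    raise ValueError(f"unknown death method: {death_method}")


def allocate_child_memory(alloc_method, size, old_contents, inst_set, rng,
                          default_inst="nop-A"):
    """Initial contents of newly allocated child memory (illustrative only)."""
    if alloc_method == 0:
        return [default_inst] * size
    if alloc_method == 1:
        # "Necrophilia": keep whatever the previous occupant left behind.
        # Assumption: pad with the default instruction if it was shorter.
        leftover = list(old_contents[:size])
        return leftover + [default_inst] * (size - len(leftover))
    if alloc_method == 2:
        return [rng.choice(inst_set) for _ in range(size)]
    raise ValueError(f"unknown alloc method: {alloc_method}")
```

With an age limit of 20, a 100-instruction organism would live at most 20 time units under method 1 and 2000 under method 2 in this sketch.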
These place limits on when an organism can successfully issue a divide command to produce an offspring.
CHILD_SIZE_RANGE | This is the maximal difference in genome size between a parent and offspring. The default of 2.0 means that the genome of the child must be between one-half and twice the length of the parent. This is to prevent out-of-control size changes. Setting this to 1.0 will ensure fixed-length organisms (but make sure to also turn off insertion and deletion mutations). |
MIN_COPIED_LINES MIN_EXE_LINES | These settings place limits on what the parent must have done before the child can be born; they set the minimum fraction of instructions that must have been copied into the child (vs. left as default) and the minimum fraction of instructions in the parent that must have been executed. If either of these is not met, the divide will fail. These settings prevent organisms from producing pathological offspring. In practice, either of them can be set to 0.0 to turn it off. |
REQUIRE_ALLOCATE | Is an allocate required between each successful divide? If so, this will limit the flexibility of how organisms produce children (they can't make multiple copies and divide them off all at once, for example). But if we don't require allocates, the resulting organisms can be a lot more difficult to understand. |
REQUIRED_TASK | This was originally a hack. It allows the user to set the ID number for a task that must occur for a divide to be successful. At -1, no tasks are required. Ideally, this should be incorporated into the environment configuration file. NOTE: A task can fire without triggering a reaction. To add a required reaction see below. |
IMMUNITY_TASK | Allows the user to set the ID number for a task which, if it occurs, provides immunity from the required task (above) -- the divide will proceed even if the required task is not done, as long as the immunity task is done. Defaults to -1, no immunity task present. |
REQUIRED_REACTION | Allows the user to set the ID number for a reaction that must occur for a divide to be successful. At -1, no reactions are required. |
DIE_PROB | Determines the probability of an organism dying when the 'die' instruction is executed. Set to 0 by default, making the instruction neutral. |
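Taken together, the size, copy, execution, and task restrictions in this section amount to a single yes/no check at divide time. The sketch below is one way to write that check in Python; `divide_allowed` and its arguments are hypothetical names, the 0.5 defaults for the two fractions are purely illustrative, and REQUIRED_REACTION is left out for brevity.

```python
def divide_allowed(parent_len, child_len, copied_count, executed_count,
                   tasks_done, child_size_range=2.0,
                   min_copied_lines=0.5, min_exe_lines=0.5,
                   required_task=-1, immunity_task=-1):
    """Return True if a divide passes the restrictions described above."""
    # CHILD_SIZE_RANGE: child length must lie within a factor of the parent's.
    if not (parent_len / child_size_range <= child_len
            <= parent_len * child_size_range):
        return False
    # MIN_COPIED_LINES: fraction of the child that was actually copied.
    if copied_count < min_copied_lines * child_len:
        return False
    # MIN_EXE_LINES: fraction of the parent that was actually executed.
    if executed_count < min_exe_lines * parent_len:
        return False
    # REQUIRED_TASK, unless the IMMUNITY_TASK was performed (-1 disables each).
    if required_task != -1 and required_task not in tasks_done:
        if immunity_task == -1 or immunity_task not in tasks_done:
            return False
    return True
```

Here `tasks_done` is assumed to be a set of task ID numbers the parent has performed. With the default range of 2.0, a parent of length 100 may produce a child of length 50 to 200; with a range of 1.0 only a child of length 100 passes.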
These settings control how and when mutations occur in organisms. Ideally, there will be more options here in the future.
POINT_MUT_PROB | Point mutations (sometimes referred to as "cosmic ray" mutations) occur every update; the rate set here is a probability for each site that it will be mutated each update. In other words, this should be a very low value if it is turned on at all. If a mutation occurs, that site is replaced with a random instruction. In practice, a non-zero value also slows Avida down because it requires so many random numbers to be tested every update. |
COPY_MUT_PROB | The copy mutation probability is tested each time an organism copies a single instruction. If a mutation occurs, a random instruction is copied to the destination. In practice this is the most common type of mutation that we use in most of our experiments. |
INS_MUT_PROB DEL_MUT_PROB | These probabilities are tested once per gestation cycle (when an organism is first born) at each position where an instruction could be inserted or deleted, respectively. Each of these mutations changes the genome length. Deletions just remove an instruction while insertions add a new, random instruction at the position tested. Multiple insertions and deletions are possible each generation. |
DIVIDE_MUT_PROB DIVIDE_INS_PROB DIVIDE_DEL_PROB | Divide mutation probabilities are tested when an organism is being divided off from its parent. If one of these mutations occurs, a random site is picked for it within the genome. At most one divide mutation of each type is possible during a single divide. |
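The difference between copy mutations (tested on every copied instruction) and divide mutations (tested once per divide, at most one of each type) may be easier to see in code. The following Python sketch is a simplified model under stated assumptions: genomes are lists of instruction names, the function names are invented, and the order in which the three divide mutations are applied is arbitrary.

```python
import random


def copy_instruction(inst, copy_mut_prob, inst_set, rng):
    """Copy one instruction, replacing it with a random one on mutation."""
    if rng.random() < copy_mut_prob:
        return rng.choice(inst_set)
    return inst


def apply_divide_mutations(genome, inst_set, rng, divide_mut_prob=0.0,
                           divide_ins_prob=0.0, divide_del_prob=0.0):
    """Apply at most one substitution, one insertion, and one deletion."""
    child = list(genome)
    if child and rng.random() < divide_mut_prob:
        # Substitute a random instruction at a random site.
        child[rng.randrange(len(child))] = rng.choice(inst_set)
    if rng.random() < divide_ins_prob:
        # Insert a random instruction at a random position.
        child.insert(rng.randrange(len(child) + 1), rng.choice(inst_set))
    if child and rng.random() < divide_del_prob:
        # Delete the instruction at a random site.
        del child[rng.randrange(len(child))]
    return child
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible, in the same spirit as RANDOM_SEED.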
This section covers tests that are very CPU intensive, but allow for Avida experiments that would not be possible in any other system. Basically, each time a mutation occurs, we can run the resulting organism on a test CPU, and determine if that mutation was lethal, detrimental, neutral, or beneficial, as well as the type of mutation it was. This section allows us to act on this. (Note that as soon as anything here is turned on, the mutations need to be tested. Turning on additional settings causes no further slowdown.)
REVERT_FATAL REVERT_DETRIMENTAL REVERT_NEUTRAL REVERT_BENEFICIAL | When a mutation occurs of the specified type, the number listed next to that entry is the probability that the mutation will be reverted. That is, the child organism's genome will be restored as if the mutation had never occurred. This allows us either to manually manipulate the abundance of certain mutation types or to eliminate them entirely. |
STERILIZE_FATAL STERILIZE_DETRIMENTAL STERILIZE_NEUTRAL STERILIZE_BENEFICIAL | The sterilize options work similarly to revert; the difference being that an organism never has its genome restored. Instead, if the selected mutation category occurs, the child is sterilized so that it still takes up space, but can never produce an offspring of its own. |
FAIL_IMPLICIT | If this toggle is set, organisms must be able to produce exact copies of themselves or else they are sterilized and cannot produce any offspring. An organism that naturally (without any external effects) produces an inexact copy of itself is said to have implicit mutations. If this flag is set, explicit mutations (as described in the mutations section above) can still occur. |
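As a rough illustration of how the REVERT_* and STERILIZE_* probabilities could be applied once a mutation has been classified on the test CPU, consider the Python sketch below. The function `resolve_mutation`, the dictionary-based configuration, and the choice to test reversion before sterilization are all assumptions made for the example.

```python
import random


def resolve_mutation(effect, revert_prob, sterilize_prob, rng):
    """Decide the fate of a mutated child.

    effect is one of "fatal", "detrimental", "neutral", "beneficial".
    revert_prob and sterilize_prob map an effect to a probability;
    missing entries are treated as 0.0 (setting turned off).
    """
    if rng.random() < revert_prob.get(effect, 0.0):
        return "reverted"    # genome restored as if no mutation occurred
    if rng.random() < sterilize_prob.get(effect, 0.0):
        return "sterilized"  # keeps its space but can never reproduce
    return "kept"
```

For instance, `resolve_mutation("fatal", {"fatal": 1.0}, {}, random.Random(0))` always reverts, which corresponds to eliminating fatal mutations entirely.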
AVE_TIME_SLICE | This sets the average number of instructions an organism should execute each update. Organisms with a low merit will consistently obtain fewer, while organisms of a higher merit will receive more. |
SLICING_METHOD | This setting determines the method by which CPU time is handed out to the organisms. Method 0 ignores merit, and hands out time on the CPU evenly; each organism executes one instruction for the whole population before moving on to the second. Method 1 is probabilistic; each organism has a chance of executing the next instruction proportional to its merit. This method is slow due to the large number of random values that need to be obtained and evaluated (and it only gets slower as merits get higher). Method 2 is fully integrated; the organisms get CPU time proportional to their merit, but in a fixed, deterministic order. |
SIZE_MERIT_METHOD | This setting determines the base value of an organism's merit. Merit is typically proportional to genome length; otherwise there is a strong selective pressure for shorter genomes (shorter genome => less to copy => reduced copying time => replicative advantage). Unfortunately, organisms will cheat if merit is proportional to the full genome length -- they will add unexecuted and uncopied code to their genomes, creating code bloat. This isn't the most elegant fix, but it works. |
TASK_MERIT_METHOD | This toggle determines if merit can be increased by performing tasks. Ideally, this should just be taken care of in the environment file. |
MAX_LABEL_EXE_SIZE | Labels are sequences of nop (no-operation) instructions used only to modify the behavior of other instructions. Quite often, an organism will have these labels in its genome where the nops are used by another instruction, but never executed directly. To represent the executed length of an organism correctly, we need to somehow count these labels. Unfortunately, if we count the entire label, the organisms will again "cheat", artificially increasing their length by growing huge labels. This setting limits the number of nops that are counted as executed when a label is used. |
MERIT_TIME | When should merit be updated for an organism? A 0 here indicates that every time a task is completed, the merit should immediately be updated to reflect that task. A 1 means that the merit is only updated on a divide (taking into account all the merit earned over the organism's lifetime) and passed on to both the parent and child for their next gestation cycle. Since there are such radical merit changes over the lifetime of an organism, method 0 can cause some odd effects where once an organism builds up enough merit it can have lots of offspring rapidly, but most organisms die in infancy. Method 1 keeps merit constant for an organism's entire life, but innovations are only rewarded one generation removed. At some point we could add an option of the highest of the two. |
MAX_NUM_TASKS_REWARDED | This setting allows the user to limit the total number (but not magnitude) of rewards that an organism gets by performing tasks. This is another quick hack, and should probably be incorporated into the environment configuration file. |
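To make the relationship between AVE_TIME_SLICE, merit, and the probabilistic slicing method concrete, here is a short Python sketch. It is a toy model, not Avida's scheduler: the function names are invented, merits are assumed to be positive numbers, and the total number of instructions per update is assumed to be AVE_TIME_SLICE times the population size.

```python
import random
from collections import Counter


def expected_slices(merits, ave_time_slice):
    """Expected instructions per organism when CPU time follows merit."""
    total = ave_time_slice * len(merits)  # instructions handed out per update
    merit_sum = sum(merits)
    return [total * m / merit_sum for m in merits]


def probabilistic_schedule(merits, ave_time_slice, rng):
    """Hand out one update's instructions at random, weighted by merit."""
    total = ave_time_slice * len(merits)
    picks = rng.choices(range(len(merits)), weights=merits, k=total)
    return Counter(picks)  # organism index -> instructions executed
```

With merits `[1, 1, 2]` and an average time slice of 30, `expected_slices` gives 22.5, 22.5, and 45 instructions. The probabilistic schedule fluctuates around those values, and it draws one weighted random choice per instruction, which hints at why that method is described as slow.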
These settings control how Avida monitors and deals with genotypes, species, and lineages.
THRESHOLD | For some statistics, we only want to measure organisms that we are sure are alive, but it's not worth taking the time to run them all in isolation, without outside effects (and in some ecosystem situations that isn't even possible!). For these purposes, we call a genotype "threshold" if there have ever been more than a certain number of organisms of that genotype. A higher number here ensures a greater probability that the organisms are indeed "alive". Recently, we've been shifting away from using threshold genotypes and instead finding other, more accurate testing methods. |
GENOTYPE_PRINT | Should all genotypes be printed out upon reaching threshold? Each will receive its own file in the genebank directory, so this can get very hard-disk intensive. Many runs will have millions of organisms. |
GENOTYPE_PRINT_DOM | Printing only the dominant genotype keeps track of the most successful individual genotypes without costing a huge amount of memory. The number you place here is the total number of updates that a genotype must remain dominant for it to be printed out. A 0 turns this off. |
SPECIES_THRESHOLD | In Avida, two organisms are said to be of the same species if you can perform all possible crossovers between them, and no more than a certain threshold (set here) fail to be viable offspring. The crossovers are done in isolation, and never affect the population as a whole. |
SPECIES_RECORDING | This entry sets if and how species should be recorded in Avida. A setting of 0 turns all species tests off. A setting of 1 means that every time a genotype reaches threshold, it is tested against all currently existing species to determine if it is part of any of them. If so, its species is set, and if not, it becomes the prototype of a new species. Finally, a setting of 2 only tests a new threshold genotype against the species of its parent (since each species test can take a long time) and, if that fails, immediately creates a new species. In practice, methods 1 and 2 produce similar results, but method 1 can take a lot longer to run. |
SPECIES_PRINT | Toggle: Should new species be printed as soon as they are created? |
TEST_CPU_TIME_MOD | Many of our analysis methods (such as species testing) require that we be able to run organisms in isolation. Unfortunately, some of these organisms we test might be non-viable. At some point, we have to give up the test and label it as non-viable, but we can't give up too soon or else we might miss a viable, though slow replicator. This setting is multiplied by the length of the organism's genome in order to determine how many CPU-cycles to run the organism for. A setting of 20 effectively means that the average instruction must be executed twenty times before we give up. In practice, most organisms have an efficiency here of about 5, so 20 works well, but for accurate tests on some pathological organisms, we will be required to raise this number. |
TRACK_MAIN_LINEAGE | In a normal Avida run, the genebank keeps track of all existing genotypes, and deletes them when the last organism of that genotype dies out. With this flag set, a genotype will not be deleted unless both it and all of its descendants have died off. This allows us to track back from any genotype to its distant ancestors, monitoring all of the differences along the way. Once this information is being saved, see the events file for how to output it. |
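Two of the rules in this section reduce to simple arithmetic: the test-CPU budget derived from TEST_CPU_TIME_MOD, and the species test governed by SPECIES_THRESHOLD. The Python sketch below spells both out; the function names are hypothetical, and counting failed crossovers as a plain integer is a simplification.

```python
def cpu_cycle_budget(genome_length, time_mod=20):
    """CPU cycles to run an organism in isolation before giving up."""
    # TEST_CPU_TIME_MOD is multiplied by the genome length.
    return time_mod * genome_length


def same_species(num_failed_crossovers, species_threshold):
    """Two genotypes share a species if few enough crossovers fail."""
    # "No more than a certain threshold" fail to be viable offspring.
    return num_failed_crossovers <= species_threshold
```

With the setting of 20 mentioned above, a 100-instruction organism is run for at most 2000 CPU cycles; an organism with a typical efficiency of about 5 would need roughly 500 of those to replicate.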
Log files are printed every time a specified event occurs. By default, all log settings are 0 (i.e. the logs are turned off). Each time a logged event is printed, the update and identifying information on the individual that triggered it is always included. There are more entries listed in the genesis file than here, but I think all of the rest are deprecated.
LOG_CREATURES | If toggle is set, print an entry to "creature.log" whenever a new organism is born. Includes position information, parent organism, and a link to its genotype so the run can be reconstructed. This gets very large. |
LOG_GENOTYPES | If toggle is set, print an entry to "genotype.log" whenever a new genotype is created. Includes information on its parent genotype. |
LOG_THRESHOLD | If toggle is set, print an entry to "threshold.log" whenever a genotype reaches threshold. Includes information on what species it is. |
LOG_SPECIES | If toggle is set, print an entry to "species.log" whenever a new species is created. Includes information on the genotype that triggered the creation. |
LOG_LINEAGES | Lineages can be given unique identifiers and printed (into the file "lineage.log") whenever they are created. Includes details about the event that created the lineage. |
LINEAGE_CREATION_METHOD | Details when lineages are created. This should probably be listed in an earlier section, but there is way too much descriptive information in the genesis file that should probably go instead into a manual. Basically, this allows you to decide under exactly which conditions a new lineage will be created. I'll let you read the genesis file itself to see the methods. |