IT-377 Assignment #4

Due Date: Tuesday March 24th (This is your final assignment before the project)

The object of this assignment is to use genetic programming (GP) to evolve neural networks which will be used as controllers for simple simulated organisms. The organisms, called "gridwalkers", will move around a 2d grid and consume whatever quantity is in the cells that they visits. The cells will contain a value between 0 and 2, with the following "nutritional values":

0: 0 food units (empty cell); 1: 1 food unit; 2: -1 food units (toxic).

So clearly, a smart gridwalker will visit as many 1 cells as possible while avoiding 2 cells. A gridwalker's fitness is simply the amount of resource it acquires during k simulation steps. After a gridwalker consumes the food or toxin in a cell, the cell's value is set to 0.

Each gridwalker should be run by itself through the food grid for k simulation steps. On each step, the agent must determine the value of 12 boolean variables, input these 12 variables into its neural network, and receive as network outputs 4 boolean values that determine the next action.

The 12 inputs test the 3 state conditions (empty, food, toxin) for all 4 neighbors (up, down, left, right):

EU, ED, EL, ER: Is there an empty cell up, down, left and/or right of the agent's current cell.
FU, FD, FL, FR: Is there food (a value of 1) up, down, left and/or right of the agent's current cell.
TU, TD, TL, TR: Is there toxin up, down, left and/or right of the agent's current cell.

The 4 neural-net outputs specify movement in each of 4 directions (up, down, left and right): MU, MD, ML, MR. If ML and MR (or MU and MD) are both true, then they should cancel so that no horizontal movement occurs. Similarly, MU and MD will cancel if both are true. If one horizontal and one vertical movement are specified, then they should be resolved in a deterministic manner as either a) one of the specified directions or b) a diagonal movement; i.e. if the network says to move up and left, the robot should either move up or left or diagonally up and to the left. You decide. The agent should consume the

The easiest approach to neural network evolution via GP is to use Koza's P/W formalism, as discussed in the last lecture. This requires strong typing in GP, and EVOKE has been modified accordingly, so check section 6.2 of the updated EVOKE manual and download the latest version of EVOKE.

In LISP, you will need to write a wrapper macro for the P/W s-expressions. The wrapper should bind the 12 input variables, all of which should be part of the terminal list. A 13th terminal should be *R*. This will cause EVOKE to generate a random real number in the closed range [-1 1]. These numbers can then be used as weights on the network arcs (i.e. inputs to the W function). Your s-expressions should be strongly typed so that all of them begin with the function LIST4, which should take in 4 elements and produce a list containing all 4. These will correspond to the 4 network output actions (move up, down, left and/or right). LIST4 should not appear anywhere else in the s-expressions.

Test your system on a 10x10 grid with k=100 and the cells randomly initialized to 0, 1 or 2. The GP trees will need to be of sufficient depth, probably 12 levels or more. This means that running your system on a PC could be too time consuming. If so, talk with the student assistants about running on Unix. Also test your system on a structured input grid of some sort, where, for example, all food is in one region of the grid but the agent must navigate through a region containing many toxic cells on the way to the food. Are there any differences in the fitnesses achieved in the random versus the structured grids? Discuss why or why not.

The main purpose of this assignment is to get a "hands on" understanding of the connections between evolutionary computation and neural networks. Koza's P/W model for neural networks is far from the optimal neural net representation for this task, but it's quite straightforward and conforms easily to the GP formalism. However, there is no guarantee that P/W networks using the 12 inputs and 4 outputs described above will be sufficient to produce highly intelligent grid walkers. To receive full credit on this assignment, you need only show that you've combined GP and P/W models to evolve grid walkers, and you must test them on at least the two grids described above and describe what happened. Beyond that, you are free to change the conditions of the experiment by using larger or smaller grids, including more, less or different inputs to the network, getting different types of output from the network, etc. Other possibilities include running all agents at once and including an input bit telling whether or not another agent is in each of the 4 agent cells. Moving into an occupied cell could incur some "fighting" cost, for example. Feel free to modify the relative numeric values of food and toxin, also. This assignment can also be extended in various ways to form an interesting final project!

You are not required to write a graphic interface, but something that shows the best of generation individuals moving through the grid can often be helpful in debugging and quite fun to watch!

You have an extra week to work on this assignment due to (a) the possible added headaches of moving to UNIX, (b) the considerably longer runtime of this type of application, and (c) the possible need to read a little extra material on neural networks.