Stephanie Forrest

Stephanie Forrest - DNA

CANNOT INCLUDE FILE links.org

DNA Fragment Assembly


This project is in collaboration with Rebecca Parsons. We are applying the genetic algorithm to a computational biology problem—the assembly of DNA sequence fragments from a parent clone whose sequence is unknown into a consensus sequence corresponding to the parent sequence. This is called the ``fragment assembly problem'' and was chosen as one of two challenge problems for the 1994-95 DIMACS competition. It is a permutation problem, similar to the Traveling Salesman Problem (TSP), and similarly, NP-complete, but with some important differences (circular tours, noise, and special relationships between entities). In this work we have studied two different representations and many different genetic operators. We have compared the two representations and their associated operators on problems ranging from 2 kilobases to 34 kilobases. After experimenting with different combinations, we were able to solve a 10 kilobase sequence, consisting of 177 fragments, with no manual intervention. We recently extended our work to a 752 fragment set and achieved very close agreement with the published consensus sequence.