2 The Genome

Robert H. Waterston, John E. Sulston, Alan R. Coulson


Our knowledge of the Caenorhabditis elegans genome has increased substantially since the publication of the 1988 C. elegans book (Emmons 1988); even the genome size has changed from an estimated 80 × 10⁶ base pairs to 100 × 10⁶ base pairs. Systematic study of the genome in the intervening years has seen the construction of a nearly complete physical map and the release of more than half the assembled sequence. Yet it is an awkward time to be writing about the genome, since our view of the genome is changing rapidly (~2 Mb of newly assembled sequence is being released per month), and as yet most of the sequence has been obtained from the gene-rich regions of the genome, with very little from the gene-poor autosomal arms. As a result, analysis of the overall sequence remains frustratingly anecdotal. Nevertheless, much has been learned, and this chapter will attempt to summarize our current understanding of the C. elegans genome.

The genome is the physical basis for genetics and includes both nuclear and cytoplasmic DNAs. For C. elegans, the mitochondrial genome (13,794 bp) has been fully sequenced (Okimoto et al. 1992). The nuclear genome contains approximately 100 × 10⁶ base pairs, organized into six chromosomes ranging in size from 14 × 10⁶ to 22 × 10⁶ base pairs (Coulson et al. 1991); it is thus approximately 20 times the size of the Escherichia coli genome (the underestimate of the E. coli genome size, used as a standard, led to the underestimate of...

