The human genome was sequenced in 2003, an important step in understanding the blueprint of life. However, before this information can be fully utilized, the location, identity, and function of all protein-encoding and non-protein-encoding genes must be determined. Moreover, the human genome contains many other functional elements, including promoters, regulatory sequences, and other factors that determine chromatin structure. These must also be characterized to fully understand the human genome.
The ENCODE (Encyclopedia of DNA Elements) project aims to solve these problems by delineating all functional elements of the human genome. To accomplish this goal, a consortium was formed to guide the project. The consortium aimed to develop and advance technologies for annotating the human genome with higher accuracy, completeness, and cost-effectiveness, along with greater standardization. It also aimed to develop a series of computational techniques to parse and analyze the data obtained.
The project began with a pilot phase. The ENCODE pilot project, which ran roughly from 2003 to 2007, aimed to study 1% of the human genome in depth. From 2007 to 2012, the ENCODE project ramped up to annotate the entire genome. Finally, from 2012 onwards, the ENCODE project aims for further increases in all dimensions: deeper sequencing, more assays, more transcription factors, and so on.
This chapter will describe some of the experimental and computational techniques used in the ENCODE project.