Learning the alphabet of gene control
Scientists at Karolinska Institutet in Sweden have taken a large step towards understanding how human genes are regulated. In a new study, published in the journal Cell, they identified the DNA sequences that bind to over four hundred proteins that control the expression of genes. This knowledge is required for understanding how differences between the genomes of individuals affect their risk of developing disease.
After the human genome was sequenced in 2000, it was hoped that knowledge of the entire sequence of human DNA could rapidly be translated into medical benefits such as novel drugs and predictive tools that would identify individuals at risk of disease. This, however, turned out to be harder than anticipated. One reason was that only the 1 percent of the genome that codes for proteins could in fact be read. The remaining part, much of which describes how these proteins should be expressed in different cells and tissues, could not be understood. This, in turn, was because scientists did not know which DNA sequences are functional and bind to the specific proteins, called transcription factors, that regulate gene expression.
"The genome is like a book written in a foreign language, we know the letters but cannot understand why a human genome makes a human or the mouse genome a mouse," says Professor Jussi Taipale, who led the study at the Department of Biosciences and Nutrition. "Why some individuals have higher risk to develop common diseases such as heart disease or cancer has been even less understood."
The human genome encodes approximately 1000 transcription factors, which bind specifically to short sequences of DNA and control the production of other proteins. In the work published in Cell, the scientists at Karolinska Institutet describe the DNA sequences that bind to over 400 such proteins, representing approximately half of all human transcription factors. The data were generated with a new method that uses a modern DNA sequencer producing hundreds of millions of sequences, giving the results unprecedented accuracy and reliability.
In addition, the binding specificities of human transcription factors were compared to those of the mouse. Surprisingly, no differences were found. According to the scientists, these results suggest that the basic machinery of gene expression is similar in humans and mice, and that the differences in size and shape between the two species are caused not by differences in the transcription factor proteins, but by the presence or absence of the specific sequences that bind to them.
"Taken together, the work represents a large step towards deciphering the code that controls gene expression, and provides an invaluable resource to scientists all over the world to further understand the function of the whole human genome," says Professor Taipale. "The resulting increase in our ability to read the genome will also improve our ability to translate the rapidly accumulating genomic information to medical benefits."
Source: Karolinska Institutet