Learning @ Georgetown


Dr. Chris Elsik Unravels Bioinformatic Code

By Megan Weintraub

At first, it seems as if honey bees, cattle, and video games have almost nothing in common. But Dr. Chris Elsik, an assistant professor of bioinformatics in Georgetown’s Department of Biology, sees a clear connection. In one way or another, all three factor into her professional interests. Dr. Elsik is fascinated by all types of puzzles in the natural world, and she has found that the field of bioinformatics offers her a way to marry her skills in biology and programming. For the past several years, her passion for solving problems has led her to study one of nature’s most beguiling riddles: the gene.

Genomes include long strings of deoxyribonucleic acid (DNA), the complicated blueprint of an organism’s function and composition. Scientists face a formidable challenge when they attempt to locate relevant bits of information within these extensive sequences of DNA. This research, called genome annotation, first tries to discern which parts of the sequence code for genes. From there, scientists can try to identify the trait dictated by a particular gene within the organism. Scientists who work on genome sequencing projects, such as those run by Dr. Elsik, approach this challenge with a healthy dose of patience.

“Unraveling the strings of DNA’s nucleotide code in order to find the genes hidden within is like searching for a sentence embedded in a chapter,” she explains.

To parse meaning from a long genome sequence, Dr. Elsik runs computer programs and creates databases that allow scientists around the world to research some of today’s most important issues in medicine and social organization. The gene prediction programs automate the process of sifting through large amounts of data. When the programs finish analyzing a data set, they produce gene prediction results that can identify the genes embedded in the long strings of DNA.

Genome annotation challenges scientists to find similarities and differences at the molecular level among the world’s species. Comparing genomes in this way yields a family tree of animals that shows scientists where the genes of certain species diverged at various points in the past.

“We are interested in organisms that have a particular evolutionary distance from humans because this can tell us a lot about our own genes,” explains Dr. Elsik.

In particular, she helps to annotate the honey bee and bovine genomes. While working at Texas A&M University, Dr. Elsik began her collaboration with Baylor College of Medicine’s Human Genome Sequencing Center, which is funded by the National Institutes of Health to determine the genome sequences of many organisms, including the honey bee and cattle. With funding from the United States Department of Agriculture, Dr. Elsik started the Bovine Genome Database Project and BeeBase (the honey bee genome database), which catalog decoded gene sequences and allow researchers to discover more about the behavior of cattle and honey bee genes.

“Cattle are a useful species to study because they are domesticated, so they are part of a pedigree,” explains Dr. Elsik. “Pedigrees can show us how a certain trait was inherited from one generation to the next.”

Her work with honey bees has implications for the study of medicine since, like humans, honey bees are highly social creatures. Scientists can observe honey bee colonies in order to learn more about the spread of disease within tight-knit populations.

“The honey bee is a good subject for genome sequencing research because their colonies represent an intricate kind of society and system of communication. The queen bees and worker bees display a high level of organization,” she says.

Dr. Elsik also created a database with a graphical interface that lets approximately 200 scientists from around the world collaborate as they review the gene prediction results produced by different algorithms. The website serves as a bridge between bioinformaticians and biologists. It is useful for scientists because computational research can often save them precious time in the lab.

“Scientists can look at their favorite genes online and correct the output or provide extra information that might be missing,” she explains. “When they return to the lab, they already know where in the sequence to locate a particular gene for research.”

Scientists are always learning about new attributes and functions of the gene, but the gene prediction algorithms they use are not flawless. Occasionally, the algorithms fail to predict the location of a gene in the DNA sequence accurately. Dr. Elsik points out that genome sequencing research has come a long way in its short history. Yet, while the algorithms are sophisticated, they still require a manual annotation step in which human reviewers inspect the results to verify that the algorithm made an accurate prediction. In recent months, she has recruited several undergraduate students at Georgetown to perform this labor-intensive evaluation of the software’s output.

Dr. Elsik first made the jump from traditional “wet lab” biology to bioinformatics when she was studying plant molecular genetics and working toward her Ph.D. at Texas A&M.

Eventually, Dr. Elsik decided to pursue postdoctoral research in bioinformatics so that she could find new answers to the puzzles that fascinated her.

“I love to look at a large data set to find the previously unexplored,” she says. “Writing programs is like playing video games. I try something out and get rapid results. If the program doesn’t give me a helpful answer, then I just try to solve the problem a different way. Most of all, I love the challenge of trying to figure out what the genome is telling us.”
