Learning @ Georgetown

Predicting the Best Cut

Dr. Chris Elsik

Dr. Elsik created a computer-based statistical model to assist biologists working with protein strings. (Photo: Roland Dimaya)

By Megan Weintraub

Proteins are strings of amino acids that carry out many of the essential processes within our cells. Some proteins, such as those studied with X-ray crystallography, are particularly large and unwieldy. Scientists often want to cut these protein strings into smaller, more manageable pieces for their research, but they must be wary of the structure of the molecule because it contains both independent and dependent parts. While at Texas A&M University, Dr. Chris Elsik, now a professor of bioinformatics in Georgetown’s Department of Biology, set out to find a way to predict the correct place to cut the string.

“Depending on where you cleave the string, the protein can either fold into an independent structure of its own, or it can wither and be useless for study,” she explains.

These independent structures are called protein domains. The human genome contains roughly 20,000 protein-coding genes, each encoding a protein with its own domain structure, a staggering number to categorize by function. As genome sequencing projects have taken off in recent years, the need for a method to determine the boundaries between domains has grown.

“It’s great when we find new proteins to study,” says Dr. Elsik, “but the boundaries between the domains of these new structures are even harder to figure out.”

At Texas A&M, Dr. Elsik teamed up with a statistician, Bani Mallick, and a Ph.D. student, Kyounghwa Bae, to develop a model for predicting the location of linkers, the short amino acid sequences that connect multiple domains in a long protein string.

“We developed algorithms to detect the difference in composition between linkers and domains,” explains Dr. Elsik. “Our challenge was to develop a method that would achieve this with a high level of accuracy and efficiency.”
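To give a rough sense of what detecting a difference in composition can look like, here is a minimal Python sketch. It is not Dr. Elsik's model, which the article does not describe in detail. It assumes a much simpler stand-in technique: a log-odds score for each amino acid, averaged over a sliding window. The example sequences, the function names, and the window size are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy training data, invented for this sketch (not real proteins).
LINKER_EXAMPLES = ["GSPGSPTPSG", "PSGSPTGSPE"]
DOMAIN_EXAMPLES = ["MKVLAIVLWAFCIY", "LVIAFWMKVLCIYA"]


def composition(seqs, pseudocount=1.0):
    """Amino acid frequencies across seqs, smoothed so none is zero."""
    counts = Counter()
    for s in seqs:
        counts.update(s)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log_odds(linker_freq, domain_freq):
    """Positive score: residue is more typical of linkers than of domains."""
    return {a: math.log(linker_freq[a] / domain_freq[a]) for a in AMINO_ACIDS}


def window_scores(sequence, scores, window=5):
    """Average log-odds score in a window centered on each position."""
    half = window // 2
    result = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        result.append(sum(scores.get(a, 0.0) for a in segment) / len(segment))
    return result


def best_cut(sequence, scores, window=5):
    """Index of the most linker-like position (first one if there is a tie)."""
    ws = window_scores(sequence, scores, window)
    return max(range(len(ws)), key=ws.__getitem__)


if __name__ == "__main__":
    scores = log_odds(composition(LINKER_EXAMPLES), composition(DOMAIN_EXAMPLES))
    # A made-up protein: two domain-like stretches joined by a linker-like one.
    protein = "MKVLAIVLWA" + "GSPGSPT" + "LVIAFWMKVL"
    print("Suggested cut position:", best_cut(protein, scores))
```

On this toy input, the suggested position falls inside the linker-like stretch in the middle of the string. A real predictor would need to be trained on known protein structures and would have to handle the accuracy and efficiency concerns Dr. Elsik mentions.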

Dr. Elsik and her colleagues eventually created a statistical model to predict the correct place for scientists to cleave the amino acid string. On the strength of these findings, she was invited to present the work at the University of Vienna, and the team has published papers on the topic.

The potential of the statistical model is significant. Eventually, Dr. Elsik would like to see the model run within a computer program that automates the search and is available to scientists all over the world.

“This would save biologists time and money in the lab. If they knew where to cleave the protein string, they would be able to start their objective research goals sooner,” explains Dr. Elsik.

Although this project is not currently funded, it may be of significant value to the National Institutes of Health, which launched the Protein Structure Initiative in 2000 to discover as many different protein structures as possible.
