Our group is currently engaged in the development and application of computational methods to study bio-molecular systems (such as proteins and nucleic acids), often with explicit all-atom representation of both solvent and the macromolecules. Typical calculations involve molecular dynamics simulations of the system to provide an atomic-detail picture of the behavior of a single molecule.

Protein Folding and Aggregation

The self-assembly of proteins from a random-coil state into their native state, in which they carry out their biological functions, has been studied extensively for several decades. Understanding the mechanism of folding would enable us to predict protein structures more accurately and to design new proteins. Because the folding process is extremely difficult to probe with experimental methods, computer simulations have gained importance in filling this gap. The current challenge lies in understanding how particular chemical details in proteins lead to particular protein structures and folding mechanisms.

Protein misfolding has been linked to a number of human diseases, including cystic fibrosis, Alzheimer’s disease and other amyloidoses, and prion spongiform encephalopathies such as Creutzfeldt-Jakob disease. An understanding of the competition between folding and aggregation is critical to preventing misfolding and misassembly of proteins. We have carried out a series of computational studies on short peptides to study their aggregation behavior.

Simulation Method Development

Because of the large number of particles present and the fine integration time step required, it has been difficult to study the long-time behavior of biomolecules at the time scales important to biochemical events. One area of focus in our group is the development of new algorithms for accurate and efficient simulations. As part of the AMBER development team, we are developing highly efficient simulation algorithms that take advantage of modern massively parallel computers and allow us to study the long-time dynamics of biomolecules. We are also developing efficient conformational sampling methods that allow us to sample substantially more conformational space than that dictated by the fundamental dynamics. Such methods will facilitate applications in structural refinement.
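To make the time-step limitation concrete, the following is a minimal sketch of a generic velocity Verlet integrator applied to a single one-dimensional harmonic oscillator. It is a textbook scheme in reduced units, not the AMBER implementation, and every function name and parameter value in it is a hypothetical choice made for illustration. The 2 fs step used in the step-count helper is likewise an assumption (a typical value when bonds to hydrogen are constrained), not a figure taken from our simulations.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance one particle with the velocity Verlet scheme.

    Returns the list of (position, velocity) pairs, including the start.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def max_relative_energy_error(dt, total_time=50.0, mass=1.0, k=1.0):
    """Largest relative deviation of the total energy from its initial value
    for a harmonic oscillator (illustrative reduced units, hypothetical values)."""
    def force(x):
        return -k * x

    def energy(x, v):
        return 0.5 * mass * v * v + 0.5 * k * x * x

    n_steps = round(total_time / dt)
    trajectory = velocity_verlet(1.0, 0.0, force, mass, dt, n_steps)
    e0 = energy(*trajectory[0])
    return max(abs(energy(x, v) - e0) / e0 for x, v in trajectory)


def steps_needed(total_time_fs, dt_fs=2.0):
    """Number of integration steps needed to cover total_time_fs.

    The 2 fs default is an assumed, commonly used time step.
    """
    return math.ceil(total_time_fs / dt_fs)


if __name__ == "__main__":
    for dt in (0.01, 0.5):
        print(f"dt = {dt}: max relative energy error = "
              f"{max_relative_energy_error(dt):.2e}")
    # One microsecond is 1e9 fs.
    print("steps for 1 microsecond:", steps_needed(1e9))
```

The larger step covers the same simulated time in far fewer force evaluations but conserves energy much less accurately, and in a real biomolecule the fastest bond vibrations set the upper limit on the step. Under the assumed 2 fs step, a single microsecond already requires 500 million force evaluations over all particles, which is why both parallel algorithms and enhanced sampling methods are needed.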

Structure and Dynamics of G-Protein Coupled Receptors

G-protein coupled receptors (GPCRs) are membrane proteins that typically act as the initiators of signal transduction pathways. They can be activated or deactivated by external signals, and they interact with the G-proteins, which act as signal transducers. Because of their convenient location and their physiological significance, receptors of the GPCR superfamily have been the premier targets of drug development efforts. About 50% of the existing drugs on the market are designed to interact with GPCRs.

Despite the effort, it is still rather challenging to obtain high-resolution membrane protein structures. So far, only one GPCR structure has been solved. Fortunately, GPCRs are believed to share a common seven-transmembrane architecture with relatively flexible extracellular and cytoplasmic domains. Computational modeling therefore plays an important role in providing detailed and accurate information on the structure and dynamics of GPCRs.


DNA-Protein Interaction

The interactions between proteins and DNA play a central role in molecular biology and genomics. These interactions include both sequence-specific and sequence-non-specific types. In sequence-specific interactions, proteins recognize a specific DNA sequence (e.g., restriction enzymes and transcription factors). In the non-specific type, proteins interact with DNA regardless of the sequence. An example, shown on the left, is the structure of a DNA-histone complex. In this structure, a piece of DNA wraps about two turns around the histone octamer. This is the smallest structural unit in chromatin and is thought to be part of the higher-level organization that enables tight packing of the long DNA double helix. Interactions with histones are therefore important steps in processes such as transcription and replication, and histone dynamics are believed to play crucial roles during these processes.

Computer Aided Drug Design

We are currently collaborating with Dr. Honggao Yan of Michigan State University to study 6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase (HPPK) as a potential drug target. HPPK is an enzyme that catalyzes the first reaction in the folate pathway. Because mammals obtain folate through their diet while microorganisms must synthesize this essential metabolite de novo, HPPK is an ideal target for antimicrobial agents. Initial studies will focus on molecular dynamics simulations of the induced conformational changes in the active site, to provide the structural and dynamic information needed for structure-based drug design.

HIV integrase is an enzyme responsible for integrating the viral DNA into the host DNA. Shown on the left is the crystal structure of the catalytic (core) domain. In this structure, however, the loop overhanging the catalytic site (marked by the Mg2+ ion, shown as the green sphere) is partially disordered, making it rather difficult to interpret the functional role of the loop. In collaboration with Jim Briggs of the University of Houston, we studied the dynamics of the core domain and found that the loop can potentially undergo a conformational transition between the open and closed forms.


Computing Resources

Major supercomputer resources are provided by the National Partnership for Advanced Computational Infrastructure (NPACI) and the Pittsburgh Supercomputing Center (PSC).
Local clusters:
Elan cluster: 40 dual-CPU Pentium IV Xeon 2.4 GHz compute nodes and 3 storage nodes (1.8 TB RAID 5).

Opteron cluster: We are in the process of building an Opteron cluster which will have approximately 250 nodes in the summer and will be expanded to about 400-500 nodes within 2-3 years.

Acknowledgement

Funding has been provided by:
National Center for Research Resources (NIH) (P20 RR15588)
National Institute of General Medical Sciences (NIH) (1R01GM64458 and 1R01GM67168)