|THE NIH CATALYST||JANUARY-FEBRUARY 1997|
|PEOPLE|
My work in computational biology at NIH has been dedicated to four main activities: exploring the evolution of viruses, detecting conserved motifs in proteins, predicting new protein functions and characterizing protein families and superfamilies, and undertaking the comparative analysis of complete bacterial genomes.
In 1994, my NCBI colleagues Roman Tatusov and Stephen Altschul and I developed a new program called MoST (Motif Search Tool) for detecting conserved and potentially functionally important motifs in protein sequences. This new method, combined with existing approaches, helped us and others to discover several new motifs in proteins, which led to important functional predictions. These include, for example, a novel nucleotide-binding motif shared by eukaryotic translation initiation factor eIF-2B and a variety of nucleotidyltransferases, and another motif that is conserved in splice-junctions of self-splicing proteins and in the hedgehog family of development regulators.
Lately, we have been concentrating on proteins implicated in human disease or development. Many of these proteins have multiple domains and are primarily regulatory rather than enzymatic. They contain motifs that define critical protein-protein interactions that are subtle and hard to detect yet are likely to provide important clues to the proteins' mechanisms of action. An example of a recent discovery in this area is a domain shared by BRCA1, the product of the breast and ovarian cancer susceptibility gene, and proteins involved in DNA damage-responsive cell-cycle checkpoints. Experimental pursuit of this lead may advance our understanding of BRCA1's involvement in cell-cycle control and malignant transformation. Very recently, my colleagues Arcady Mushegian and Mark Boguski and I completed a detailed analysis of the protein sequences encoded by all positionally cloned human disease genes.
A major achievement in the past two years in genome research has been the sequencing of the first complete genomes of single-celled species. By the end of 1996, complete genome sequences were available for four bacteria, one archæa (Methanococcus jannaschii), and one eukaryote (the yeast Saccharomyces cerevisiæ). Comparative analyses of these sequences open up a whole new area of research and may eventually result in the reconstruction of the list of specific genes that must have been present in the last common ancestor of bacteria, eukaryotes, and archæa. So far, our computer analyses have resulted in the prediction of several new gene functions and the reconstruction of biochemical pathways in bacterial and archæal species that have not been extensively characterized experimentally (e.g., Hæmophilus influenzæ and M. jannaschii). Mushegian and I have proposed the deduction of a theoretical minimal gene set for cellular life that could be derived by comparing genomes of distantly related species, detecting conserved genes, and supplementing the conserved genes with unrelated genes that perform the same essential functions in each of the bacteria. By comparing the genomes of H. influenzæ and Mycoplasma genitalium, we converged on a set of about 250 genes that may present a reasonable approximation of the minimal gene repertoire required for a cell to function. We are now working simultaneously on the detailed comparison of bacterial, archæal, and eukaryotic genomes and the development of an automated system for genome analysis.
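The logic of the minimal-gene-set comparison described above can be illustrated with a toy sketch. It is not the analysis pipeline actually used: the gene names, function labels, and the `minimal_gene_set` helper are all hypothetical, and the ortholog pairs are assumed to be supplied by some prior sequence-similarity search.

```python
def minimal_gene_set(genome_a, genome_b, ortholog_pairs):
    """Toy illustration of deriving a minimal gene set from two genomes.

    genome_a, genome_b: dict mapping gene name -> function label (hypothetical data).
    ortholog_pairs: set of (gene_in_a, gene_in_b) pairs assumed to come from
    a separate sequence-similarity search.
    Returns (conserved, displaced): the conserved genes (named as in genome_a)
    and a dict of function -> gene for functions that both genomes carry out
    with unrelated genes.
    """
    # Step 1: genes conserved between the two distantly related genomes.
    conserved = {a for a, b in ortholog_pairs if a in genome_a and b in genome_b}
    covered = {genome_a[a] for a in conserved}

    # Step 2: functions present in both genomes but not covered by a
    # conserved gene, i.e. performed by unrelated genes in each species.
    shared_functions = set(genome_a.values()) & set(genome_b.values())
    displaced = {}
    for gene, function in sorted(genome_a.items()):
        if function in shared_functions and function not in covered:
            displaced.setdefault(function, gene)
    return conserved, displaced


# Hypothetical toy genomes; names and annotations are invented for illustration.
toy_a = {"a1": "DNA replication", "a2": "translation",
         "a3": "glycolysis step", "a4": "toxin"}
toy_b = {"b1": "DNA replication", "b2": "translation",
         "b3": "glycolysis step", "b5": "adhesion"}
toy_orthologs = {("a1", "b1"), ("a2", "b2")}

conserved, displaced = minimal_gene_set(toy_a, toy_b, toy_orthologs)
print(sorted(conserved), displaced)
```

In this toy case the "glycolysis step" function is carried out in both genomes by genes with no detected ortholog, so one representative is added to the conserved set; species-specific functions such as "toxin" are left out.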
Computational biology obviously has been on the rise in the 1990s, but the real excitement lies in the near future, when multiple genome sequences of model organisms and the human genome become available. The approaches and tools we are developing now will prepare us intellectually and technically to begin mining the wealth of information about life encoded by these sequences.
My principal research interest has been the investigation of the dynamic mechanisms mediating vascular-function abnormalities that may have pathophysiological and clinical implications. In particular, my lab's studies center on endothelial function in patients with essential hypertension and patients with hypercholesterolemia, a focus informed by the observation that the contractile state of vascular smooth muscle is dependent on the presence and integrity of endothelial cells.
We have performed intra-arterial infusion of drugs into the brachial artery with noninvasive measurement of the response of the forearm vasculature and found that both hypertensive and hypercholesterolemic patients have impaired endothelial function. In both sets of patients, impaired function is due to decreased activity of nitric oxide, a small molecule released by endothelial cells during resting conditions and in response to a variety of physiological and pharmacological stimuli. Although we do not yet understand the precise mechanisms accounting for this abnormality, we have observed important differences between hypertensive and hypercholesterolemic patients, suggesting that distinct pathophysiological pathways underlie the endothelial dysfunction in these two conditions.
Endothelium-derived nitric oxide plays a central role in vascular homeostasis by regulating not only vascular tone but also other important processes, such as thrombus formation, lipid transport, and oxidation of lipid molecules. Therefore, a defect in nitric oxide activity might constitute a link between risk factors and the development of atherosclerosis. Our goal is to further characterize the precise mechanisms that regulate endothelial function and that contribute to endothelial dysfunction. This research may lead to a more rational and specific approach to the prevention and treatment of atherosclerosis.
In addition to my research in vascular physiology, I have directed the clinical and research activities of the Echocardiography Laboratory of NHLBI since 1990. This laboratory is responsible for the performance and interpretation of approximately 2,000 studies per year. Over the past few years, we have focused on using cardiac ultrasound imaging to study coronary artery disease. Routine transthoracic echocardiographic examination can identify the pattern of myocardial contraction at rest, during inotropic stimulation to increase cardiac muscle contraction, and during stress, but there is significant attenuation of the ultrasound signal due to the density of the fat, muscle, and bone of the chest wall. Transesophageal echocardiography overcomes the limitations of the transthoracic examination by obtaining heart images through a transducer positioned within the esophagus. We initially reported on the accuracy of transesophageal dobutamine stress echocardiography for the identification of obstructive coronary artery disease in patients undergoing coronary angiography. More recently, we have focused on the study of the myocardial response to dobutamine (a positive inotropic agent) to unmask viable myocardium in patients with left ventricular systolic dysfunction.
Future research directions include the use of novel methodologies, including myocardial contrast echocardiography and three- and four-dimensional imaging, that we anticipate will expand the usefulness of echocardiography as a tool for the clinical investigation of heart disease.
My research has focused on two fundamental questions that turn out to be closely related. First, given that there are limited direct data about human immunodeficiency virus (HIV) infection rates, how does one track its spread in the United States? Second, how does one model the incubation period of the disease?
When I began my research in 1988, 83,000 AIDS cases had been reported to the Centers for Disease Control. That figure now stands at more than half a million. To get a handle on the extent of HIV infection, both diagnosed and unsuspected, I helped to develop what is now known as the "back-calculation" method, whereby one works backwards, on the basis of the AIDS incubation period, to learn how many people must have been infected over time to account for the subsequent numbers of diagnosed AIDS cases.
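The idea behind back-calculation can be sketched in a few lines of code. Expected AIDS diagnoses in each period are modeled as past infections spread forward by the incubation distribution, and the infection curve is then recovered by deconvolution. The sketch below uses a simple expectation-maximization update as one possible way to do this; it is not necessarily the method used in the published estimates, and the incubation probabilities and counts are invented for illustration.

```python
def expected_cases(infections, incubation_pmf):
    """Forward model: diagnoses per period from infections per period.

    incubation_pmf[d] is the assumed probability that diagnosis occurs
    d periods after infection.
    """
    n = len(infections)
    cases = [0.0] * n
    for s, infected in enumerate(infections):
        for d, p in enumerate(incubation_pmf):
            if s + d < n:
                cases[s + d] += infected * p
    return cases


def back_calculate(cases, incubation_pmf, iterations=200):
    """Estimate nonnegative infections per period from observed diagnoses."""
    n = len(cases)
    # Probability that an infection in period s is diagnosed before the
    # end of the observation window.
    seen = [sum(incubation_pmf[: n - s]) for s in range(n)]
    infections = [max(sum(cases) / n, 1e-9)] * n  # flat starting guess
    for _ in range(iterations):
        fitted = expected_cases(infections, incubation_pmf)
        updated = []
        for s in range(n):
            if seen[s] == 0:
                # No information about this period yet; keep the current value.
                updated.append(infections[s])
                continue
            ratio_sum = 0.0
            for d, p in enumerate(incubation_pmf):
                t = s + d
                if t < n and fitted[t] > 0:
                    ratio_sum += p * cases[t] / fitted[t]
            updated.append(infections[s] * ratio_sum / seen[s])
        infections = updated
    return infections


# Invented example: a short incubation distribution and infection curve.
toy_pmf = [0.1, 0.3, 0.4, 0.2]
toy_infections = [10, 20, 40, 30, 20, 10]
toy_cases = expected_cases(toy_infections, toy_pmf)
estimate = back_calculate(toy_cases, toy_pmf)
print([round(x, 1) for x in estimate])
```

Estimates for the most recent periods are the least reliable, because few of those infections have yet progressed to a diagnosis; the real AIDS incubation period is also far longer than in this toy distribution.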
This so-called back-calculation method has become a major approach worldwide to estimating the size of the AIDS epidemic. In collaboration with epidemiologists at the CDC, I have used these techniques to help make official Public Health Service estimates of HIV prevalence. On the basis of the concordance of back-calculation with other data, the estimate of about 1 million HIV-infected people in the United States was revised downward to 630,000 to 900,000 Americans living with HIV or AIDS in 1992. A careful look at the numbers underlying this total, however, cautions against complacency. People living with AIDS today are typically in their late 30s, but my research indicates that the majority of newly infected individuals are in their teens and 20s. Thus, the "stability" of HIV prevalence in the United States is misleading: typically, people become infected in their 20s, progress to AIDS in their 30s, and are dead by age 40. Clearly, prevention efforts need to focus on teenagers and young adults.
The calculations of national infection rates are made in light of careful assessment of the natural history of the disease. As chief statistician for the NCI Multicenter Hemophilia Cohort Study (MHCS), I have helped to monitor the experience of more than 1,200 HIV-positive people with hemophilia. Our results suggest that HIV-positive hemophiliacs infected as children progress to AIDS more slowly than hemophiliacs infected as adults (or any other group, for that matter). By exploiting the extensive database of clinical, immunologic, and virologic outcomes in the MHCS, we are working to zero in on the biological mechanisms that account for the protective effect of younger age at infection.
The scientific environment of NIH has allowed me to work at the interface of statistics and epidemiology. The software I developed for back-calculation has evolved into a general-purpose "toolbox" for statistical deconvolution. To estimate the incubation period for AIDS, I developed new methods to obtain smooth estimates of the hazard function from survival data that are directly applicable in cancer and other diseases. My current investigative focus is on identifying those groups at highest risk of HIV infection in the 1990s. Preliminary results point to young homosexual men and young women exposed to heterosexual contact with at-risk individuals as most vulnerable. Minorities within those two groups are at especially high risk.
The activity of virtually every cell in the body is regulated by extracellular signals (e.g., neurotransmitters, hormones, and sensory stimuli) that are transmitted into the cell via distinct plasma membrane receptors, most of which are members of the superfamily of G protein-coupled receptors (GPCRs). By using different muscarinic acetylcholine receptors (m1-m5) and various members of the vasopressin peptide receptor family (V1a and V2) as model systems, my group has addressed the following fundamental questions regarding the structure and function of GPCRs: How are GPCRs arranged (assembled) in the lipid bilayer? How do GPCRs bind ligands? Which structural elements determine the specificity of receptor-G protein interactions? What conformational changes do activating ligands induce in the receptor protein?
Given the lack of high-resolution structural information on any GPCR, we have used a molecular genetic strategy (involving the functional rescue of misfolded mutant muscarinic receptors by complementary mutations) to gain insight into GPCR structure. We have identified specific contact sites between individual transmembrane helices, thus providing insight into the molecular architecture of the transmembrane receptor core.
We recently found that GPCRs can be assembled from multiple independently stable building blocks. We have shown that coexpression of muscarinic or vasopressin receptor fragments - obtained by splitting the wild-type receptors in various intracellular and extracellular loops - results in functional receptor complexes. Immunocytochemical studies revealed that the individual receptor fragments (even when expressed alone) were stably inserted, with proper orientation, into lipid bilayers. Moreover, we have demonstrated that truncated V2 vasopressin receptors known to be responsible for X-linked nephrogenic diabetes insipidus can be functionally rescued (in cultured cells) by coexpression with a C-terminal V2 receptor fragment missing in the mutant receptors. Such findings have potential therapeutic relevance.
We were among the first to comprehensively map the ligand-binding domain of a GPCR (m3 muscarinic receptor). The amino acids forming the acetylcholine binding site were identified by site-directed mutagenesis, and a molecular model of the acetylcholine-receptor complex was delineated. We also showed that the binding site for muscarinic antagonists is distinct from the acetylcholine binding domain, although some amino acids are shared by both sites.
Characteristically, each GPCR can activate only a limited set of the many structurally similar G proteins expressed within a cell. Using different muscarinic and vasopressin receptor subtypes as model systems, we could identify distinct intracellular receptor segments (as well as single amino acids contained within these regions) that are sufficient to dictate receptor-G protein coupling selectivity. On the basis of these findings, we proposed a structural model of the receptor surface critical for G protein recognition.
A major focus of our current work is identifying specific regions on the G protein(s) that are contacted by the different, functionally critical receptor sites. To address this issue, we developed a new experimental approach involving the coexpression of hybrid GPCRs with hybrid G protein alpha subunits. Using this approach, we identified a functionally critical contact site between a short segment of the m2 muscarinic receptor and a short sequence on Galpha1.
The molecular nature of the ligand-induced structural changes in GPCRs (resulting in receptor activation) is as yet unknown and represents a major focus of our future work. Interestingly, we recently identified a series of mutant m2 muscarinic receptors that can activate the proper G proteins even in the absence of ligands. The predicted structural characteristics of these constitutively active mutant receptors suggest that ligand-induced receptor activation involves a translational and/or rotational movement of one of the transmembrane helices.
Since all GPCRs, as well as all heterotrimeric G proteins, share a high degree of structural homology, our findings should be of great general relevance. A better understanding of the molecular basis of ligand-receptor-G protein interactions should pave the way for the development of novel therapeutic strategies.
My laboratory focuses on the role of cell adhesion molecules and cytokines in the pathogenesis of ocular inflammation. Initial studies in two experimental models - autoimmune uveoretinitis and endotoxin-induced uveitis - demonstrated that the expression of E-selectin, ICAM-1, and VCAM-1 is upregulated in the eye before the influx of inflammatory cells. We then showed that monoclonal antibodies against several cell adhesion molecules, including ICAM-1, LFA-1, Mac-1, VLA-4, E-selectin, and P-selectin, could inhibit both autoimmune and endotoxin-induced ocular inflammation. We subsequently treated ragweed-induced allergic conjunctivitis in mice by blocking ICAM-1 and LFA-1 with monoclonal antibodies or the selectins with a small molecule inhibitor.
More recently, we investigated changes in cell adhesion molecule expression on lymphocytes during cell activation. These studies involved transgenic animals that express hen egg lysozyme (HEL) in the lens. Transgenic mice develop severe ocular inflammation only if injected with in vitro-activated splenocytes taken from wild-type animals immunized with HEL; nonactivated cells cause no ocular disease. Fluorescence-activated cell sorting (FACS) analysis showed that activation is associated with upregulation of VLA-4 on the cell surface, and anti-VLA-4 antibody inhibited the adoptive transfer of disease. We have also demonstrated upregulation of adhesion molecule expression in the retina and choroid of patients with uveitis, as well as in human corneas undergoing allograft rejection. Our work led to the granting of the U.S. patent for treating uveitis by blocking cell adhesion molecules with monoclonal antibodies; studies in patients with sight-threatening uveitis are planned.
Our research on the involvement of cytokines in uveitis yielded the interesting observation that two pro-inflammatory cytokines, TNF-alpha and IL-12, paradoxically ameliorate ocular inflammation while provoking systemic inflammation or even death. This observation underscores the uniqueness of the ocular environment and supports the hypothesis that cytokines can have varying effects, depending on the type of inflammation, time course of the disease, and other cytokines present. Our ongoing studies use knockout mice deficient in ICAM-1, LFA-1, and IL-6 to further define the role of adhesion molecules and cytokines in uveitis. We also plan to investigate what effects blocking CD40 ligand may have on ocular inflammation.
In addition to my laboratory research, I am involved in clinical studies on the pathogenesis, diagnosis, and treatment of uveitis. In recent clinical trials, we have investigated the safety and efficacy of the carbonic anhydrase inhibitor, acetazolamide, for cystoid macular edema, a major cause of vision loss in patients with uveitis, and the combination of prednisone and cyclosporine for ocular Behçet's disease.
Intraocular lymphoma is a disease that frequently masquerades as an idiopathic uveitis. We have shown that elevated ratios of IL-10 to IL-6 in the vitreous or the cerebrospinal fluid are associated with the presence of malignant cells, which can be extremely difficult to recognize by cytopathology. Also, in collaboration with investigators at NCI, we are conducting a Phase I/II trial of combination chemotherapy for lymphoma of the central nervous system or eye.
I am also studying the ocular complications of AIDS. We were the first to recognize retinal toxicity associated with the antiretroviral agent didanosine (ddI). Histopathological examination revealed destruction of the retinal pigment epithelium and overlying neural retina; electron micrography showed a membranous cytoplasmic inclusion consistent with a metabolic storage abnormality. Finally, we are involved in investigating new therapies for cytomegalovirus retinitis and are currently studying whether increases in CD4+ T-cell counts that are induced by anti-HIV medication will prevent progression of this disease.
Most living organisms are continually subjected to a variety of chemicals, both synthetic and natural, that damage their DNA. Although many organisms have evolved elaborate repair processes to deal with this damage, under certain conditions not all of the damage can be processed by error-free repair mechanisms. As a result, the DNA is replicated with a much lower fidelity than normal. My laboratory focuses on trying to understand the molecular mechanisms of this mutagenic process. To date, most of our efforts have focused on Escherichia coli, but we are now using Saccharomyces cerevisiæ and Xenopus lævis as model systems in our investigations of similar processes in eukaryotic cells.
Genetic experiments with E. coli indicate that DNA polymerase III holoenzyme (the main replicative enzyme), RecA, and the UmuDC-like mutagenesis proteins - all of which are induced as part of the cell's multigene so-called "SOS" response to DNA damage - are directly required for the mutagenic process. In the mid 1980s, Bryn Bridges and I proposed a two-step model to explain UmuDC and RecA activities. We suggested that the RecA protein might act to influence the incorporation of incorrect nucleotides opposite DNA lesions, and the Umu proteins might act at a later stage by promoting continued DNA synthesis from the incorrectly paired primer.
In an attempt to test this hypothesis, we overproduced and purified the UmuD and UmuC proteins and demonstrated that UmuD undergoes a RecA-mediated post-translational cleavage reaction that generates a shorter, but active, UmuD' protein. We also discovered that UmuD' exists as a dimer in solution and that it interacts with a monomer of UmuC to form a mutagenically active UmuD'C complex. Indeed, using these purified proteins together with RecA and DNA polymerase III, we were able to reconstitute the mutagenic process in vitro and demonstrate translesion DNA synthesis. Experiments undertaken by Ekaterina Frank in my laboratory revealed that UmuD' physically interacts with RecA protein and that this protein-protein interaction provides a means by which Umu proteins can target DNA lesions.
Recently, in collaboration with Wayne Hendrickson at Columbia University in New York, we were able to crystallize the UmuD' protein. The structure was refined to 2.5 Å and elucidated the self-cleavage process UmuD undergoes during its conversion to UmuD'. In addition, we discovered that whereas UmuD forms a molecular dimer with itself, the extended amino and carboxyl terminals of one UmuD' protomer can interact with a protomer from another dimer to form an extended polymeric structure that we believe is essential for mutagenic activity.
We have also investigated the in vivo stability of the Umu proteins in E. coli. Because relatively few molecules of active UmuD'C complex are required to promote mutagenesis, E. coli has evolved an exquisite mechanism to reduce the cellular concentration of UmuD': instead of forming a homodimer with itself, UmuD' preferentially forms a heterodimer with the intact UmuD protein. This heterodimeric complex is specifically recognized by the ClpXP serine protease, and the UmuD' protein is therefore rapidly degraded. We are now performing experiments aimed at elucidating the signals that allow ClpXP to recognize the heterodimeric UmuD-UmuD' complex but not the homodimeric UmuD' protein.
A major goal of our work is to identify similar processes in eukaryotic cells. As part of a collaborative study with Eric Ackerman (NIDDK), we have shown that whereas X. lævis oocytes can efficiently replicate undamaged single-stranded DNA, they are unable to replicate DNA that contains adducts. Interestingly, this replication arrest was alleviated in progesterone-matured oocytes and in oocytes microinjected with mRNAs encoding the prokaryotic UmuD' and UmuC mutagenesis proteins. This finding strongly suggests that the basic mechanisms contributing to mutagenesis are conserved between prokaryotic and eukaryotic cells. Indeed, both structural and functional homologs to UmuC that have been identified in S. cerevisiæ, mice, and human cells are now under investigation.