
Keith Crandall

Director, Computational Biology Institute
Faculty: Full-Time
Address: Innovation Hall
Phone: 571-553-0107

Areas of Expertise

Infectious Disease




Keith A. Crandall, PhD, is the founding Director of the Computational Biology Institute at George Washington University. Professor Crandall studies the computational biology, population genetics, and bioinformatics of a variety of organisms, from crustaceans to agents of infectious diseases. His lab also focuses on the development and testing of Big Data methods for DNA sequence analysis. He applies these methods and others to the study of the evolution of infectious diseases, with particular focus on HIV evolution. Professor Crandall has published over 270 peer-reviewed publications, as well as three books (The Evolution of HIV, Algorithms in Bioinformatics, and Decapod Crustacean Phylogenetics). Dr. Crandall’s research has been funded by both the National Science Foundation and the National Institutes of Health, as well as by a variety of other agencies, including the American Foundation for AIDS Research, National Geographic, the US Forest Service, the Pharmaceutical Research and Manufacturers of America (PhRMA) Foundation, and the Alfred P. Sloan Foundation. He has been a Fulbright Visiting Scholar at Oxford University and has held an Allan Wilson Centre for Molecular Ecology and Evolution Sabbatical Fellowship at the Bioinformatics Institute at the University of Auckland. Professor Crandall has received a number of awards for research and teaching, including an Alfred P. Sloan Foundation Postdoctoral Fellowship in Molecular Evolution at the University of Texas, the American Naturalist Society Young Investigator Award, an NSF CAREER Award, a PhRMA Foundation Faculty Development Award in Bioinformatics, an NIH James A. Shannon Director's Award, ISI Highly Cited Designation, the Honors Professor of the Year award at Brigham Young University, and the Edward O. Wilson Naturalist Award. He was also recently elected a Fellow of the American Association for the Advancement of Science (AAAS) and of the Linnean Society of London.
Professor Crandall earned his BA degree from Kalamazoo College in Biology and Mathematics, an MA degree from Washington University in Statistics, and a PhD from Washington University in Biology and Biomedical Sciences.  He also served as a Peace Corps Volunteer in Puyo, Ecuador.


Current Research

My research program has three main aspects. The first and central component is the development of methods for the analysis of DNA sequence data and the testing of those methods through computer simulation. We have developed methods for estimating gene genealogies, detecting recombination, detecting selection, and measuring genetic diversity and demographic events in the history of a population. We develop software to implement many of these methods, and then further software to compare our methods and many others through computer simulation. Through comparison and tests of robustness to assumption violations, we gain insight into why particular methods perform well or poorly and are then in a good position to develop improved methodology. In fact, we are currently embarking on the development of a comprehensive simulation software package that allows one, for the first time, to examine the impact of a host of population genetic phenomena at the same time (e.g., migration, mutation, recombination, selection, fluctuating population sizes, tracking geographic locations of alleles in a population, and multilocus populations). Through such studies, we gain a better understanding of the methods we use to infer evolutionary and population demographic histories, and of the associated parameter estimates, when we apply them to empirical data. Furthermore, such studies provide valuable insights into the development of new and improved theory and methodology for inference from DNA sequence data.

The remaining two aspects of my research program deal with applications of the above methodologies in two fairly distinct arenas. The first is in molecular ecology, conservation biology, and systematics research. We have applied the methods developed and tested in our lab (and many others) to examine the population genetics, historical demography, and molecular ecology of various species of freshwater crayfish. We have also examined the molecular systematics of a variety of organisms, from the origin of dogs to the origin of freshwater crayfishes. You will see from my CV that the systematic studies are typically collaborative. I firmly believe in collaboration, especially in systematic studies that require both organismal (including morphological and ecological) expertise and molecular (including phylogenetic analysis) expertise. With the freshwater crayfishes, we are typically the morphological and taxonomic experts as well. However, with all the other organismal groups, we assemble (often international) teams of experts to tackle outstanding questions in systematic biology. We then apply our results to a diversity of biogeographic and conservation biology questions. These typically lead naturally into broader issues in conservation biology, such as diagnosing species and weighing the relative importance of different sources of information (for example, ecological data versus genetic data) in setting conservation priorities and assessing conservation status.

The second focus of my empirical research is the evolution of infectious diseases. Here our main system has been HIV, but we have also become very active in bacterial genetics, especially of Neisseria gonorrhoeae. Our main goal with these research projects is to explore the population dynamics of infectious disease, particularly relative to the evolution of drug resistance. We have been heavily involved in evaluating the performance of multi-locus sequence typing (MLST) methods to track population dynamics of bacterial species. Our results suggest that MLST loci are not as selectively neutral as researchers had hoped and that different MLST loci perform with varying success across species. We plan to continue the exploration of MLST and its application in tracking population dynamics of bacterial agents of bioterrorism, as well as tracking dynamics of infectious disease. Our extension into bacterial genomics, coupled with an interest in environmental samples, has naturally led to an exploration of novel computational techniques to identify pathogens from environmental samples, using next-generation sequencing approaches to collect relevant data and novel statistical and computational approaches for analysis. We are involved in all phases of this work, from the molecular approaches to the statistical models to the computational implementation.

The research in these three main areas of my lab has enjoyed a diversity of funding from the National Institutes of Health, the National Science Foundation, and private agencies such as the Alfred P. Sloan Foundation and the Pharmaceutical Research and Manufacturers of America Foundation. My research program is moving ever more into the genomics and bioinformatics arena and applying these insights to conservation management, human health, and biomedical applications.

I am currently working with the DC Center for AIDS Research, the Clinical and Translational Science Institute at Children's National, the Data Science Institute, and the GW Cancer Center.


BA Kalamazoo College, 1987 (Mathematics and Biology)
MA Washington University in St. Louis, 1993 (Statistics)
PhD Washington University in St. Louis, 1993 (Biology & Biomedical Sciences)


For a complete publication list, please visit Dr. Crandall's Google Scholar page.

Hahn, A, ML Bendall, K Gibson, H Chaney, I Sami, GF Perez, AC Koumbourlis, TA McCaffrey, RJ Freishtat, and KA Crandall. 2018. Benchmark evaluation of true single molecular sequencing to determine cystic fibrosis airway microbiome diversity. Frontiers in Microbiology in press.

Houzet, L, M Pérez-Losada, G Matusali, C Deleage, N Dereuddre-Bosquet, AP Satie, F Aubry, E Becker, B Jégou, R Le Grand, BF Keele, KA Crandall, and N Dejucq-Rainsford. 2018. Semen in SIV chronically-infected cynomolgus macaques is dominated by viruses originated from multiple genital organs. Journal of Virology in press.

Stern, DB and KA Crandall. 2018. The evolution of gene expression underlying vision loss in cave animals. Molecular Biology and Evolution in press.

Lewin, H.A., G. Robinson, W.J. Kress, W. Baker, J. Coddington, K. Crandall, R. Durbin, S. Edwards, F. Forest, T. Gilbert, M. Goldstein, I. Grigoriev, K. Hackett, D. Haussler, E. Jarvis, W. Johnson, A. Patrinos, S. Richards, J.C. Castilla Rubio, M.A. van Sluys, P. Soltis, X. Xu, H. Yang, and G. Zhang. 2018. The Earth BioGenome Project: Sequencing Life for the Future of Life. Proceedings of the National Academy of Sciences 115(17):4325-4333. doi: 10.1073/pnas.1720115115

Pérez-Losada M, Castel AD, Lewis B, Kharfen M, Cartwright CP, Huang B, Maxwell T, Greenberg AE, Crandall KA (2017) Characterization of HIV diversity, phylodynamics and drug resistance in Washington, DC. PLoS ONE 12(9): e0185644. https://doi.org/10.1371/journal.pone.0185644

Restrepo, P., M. Movassagh, N. Alomran, C. Miller, M. Li, C. Trenkov, Y. Manchev, S. Bahl, S. Warnken, L. Spurr, T. Apanasovich, K. Crandall, N. Edwards, and A. Horvath. 2017. Overexpressed somatic alleles are enriched in functional elements in Breast Cancer. Scientific Reports 7:8287. doi:10.1038/s41598-017-08416-w

Stern, D. B., Nallar, E. C., Rathod, J., & Crandall, K. A. (2017). DNA Barcoding analysis of seafood accuracy in Washington, DC restaurants. PeerJ 5:e3234.

Perez-Losada M, Alamri L, Crandall KA, Freishtat RJ. 2017. Nasopharyngeal Microbiome Diversity Changes over Time in Children with Asthma. PLoS One 12: e0170543.

Sun Y, Huang Y, Li X, Baldwin CC, Zhou Z, Yan Z, Crandall KA, Zhang Y, Zhao X, Wang M, et al. 2016. Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies. GigaScience 5:18.

Pérez-Losada M, Crandall K.A., Freishtat R.J. 2016. Two sampling methods yield distinct microbial signatures in the nasopharynges of asthmatic children. Microbiome 4:25 DOI: 10.1186/s40168-016-0170-5

Hilton SK, Castro-Nallar E, Perez-Losada M, Toma I, McCaffrey TA, Hoffman EP, Siegel MO, Simon GL, Johnson WE, Crandall KA. 2016. Metataxonomic and Metagenomic Approaches vs. Culture-Based Techniques for Clinical Pathology. Frontiers in Microbiology, 7:484.

Hinchliff, C., S.A. Smith, J.F. Allman, J.G. Burleigh, R. Chaudhary, L.M. Coghill, K.A. Crandall, J. Deng, B.T. Drew, R. Gazis, K. Gude, D.S. Hibbett, L.A. Katz, H.D. Laughinghouse, E.J. McTavish, P.E. Midford, C.L. Owen, R. Ree, J.A. Rees, D.E. Soltis, T. Williams, and K.A. Cranston. 2015. Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proceedings of the National Academy of Sciences, USA 112(41):12764-12769. doi: 10.1073/pnas.1423041112

Castro-Nallar, E., Y. Shen, R. J. Freishtat, M. Pérez-Losada, S. Manimaran, G. Liu, A. Spira, W. E. Johnson, K. A. Crandall. 2015.  Integrating metagenomics and host gene expression to characterize asthma-associated microbial communities.  BMC Medical Genomics 8:50, DOI 10.1186/s12920-015-0121-1.

Hong, C., S. Manimaran, Y. Shen, J.F. Perez-Rogers, A.L. Byrd, E. Castro-Nallar, K.A. Crandall, and W.E. Johnson. 2014. PathoScope 2.0: A complete computational framework for strain identification in environmental or clinical sequencing samples.  Microbiome 2:33.

Faison, W.J., A. Rostovtsev, E. Castro-Nallar, K.A. Crandall, K. Chumakov, V. Simonyan, and R. Mazumder. 2014. Whole genome single-nucleotide variation profile-based phylogenetic tree building methods for analysis of viral, bacterial, and human genomes. Genomics 104(1):1-7

Byrd, A.L., J.F. Perez-Rogers, C. Hong, S. Manimaran, E. Castro-Nallar, I. Toma, T. McCaffrey, S. Siegel, G. Benson, K.A. Crandall, and W.E. Johnson. 2014. Clinical PathoScope: Rapid alignment and filtration for accurate pathogen identification in clinical samples using unassembled sequencing data.  BMC Bioinformatics 15:262.


  • Steering Committee - US Node Chair, International Barcode of Life, 2010-Present
  • Editorial Board, PLoS Currents: Tree of Life, 2010-Present
  • Editorial Board, Bioinformatics, 2007-2010
  • Scientific Advisory Board, Canadian Barcode of Life Network, 2005-2009
  • Associate Editor, Bioinformatics, 2005-2007
  • Associate Editor, Evolution, 2005-2007
  • Honorary Research Fellow, Bioinformatics Institute, University of Auckland, 2005-2006
  • Editor, Animal Conservation, 2003-2006
  • Council Member, Society of Systematic Biologists, 2003-2006
  • Executive Vice President, Society of Systematic Biologists, 2003-2006
  • Associate Editor, Molecular Biology and Evolution, 1999-2003
  • Associate Editor, Systematic Biology, 1999-2003
  • Editorial Board, Animal Conservation, 2000-2002
  • Postdoctoral Fellow, University of Texas, 1993-1996


Classes Taught

BISC 3584 - Introduction to Bioinformatics

HSML 6299 - Research Analytics