Doctoral Thesis Ming Zhang: Tracking HIV-1 Genetic Variation
The last 26 years saw a rampant global epidemic of HIV-1. HIV-1’s extraordinary diversity is seeded by a high mutation rate, rapid replication, frequent recombination, and the strategic placement, loss, and gain of N-linked glycosylation sites. In this thesis, the genetic variation of HIV-1 was investigated, with special focus on recombination and N-linked glycosylation sites.
Using phylogenetic analyses, distance methods, and HIV-1 subtyping tools, including the jumping profile hidden Markov model (jpHMM), we examined the HIV-1 recombinants that dominate the epidemic in three different geographical regions. We found that CRF13_cpx includes sections of the rare subtype J, and that breakpoint inference can be greatly improved by using all available sequences within a CRF family. We confirmed that CRF02_AG, a recombinant between subtypes A and G that is prevalent in West and West Central Africa, is an old recombinant. The main recombination events that generated CRF02 took place before the 1970s, before HIV-1 had started to spread worldwide and before the currently recognized subtypes had formed. Recombinants consisting of subtypes B and C are frequently found in the HIV-1 epidemic in Asia, especially in southwest China, where they are associated with different drug trafficking routes. Our study suggested that CRF07 was derived from a recombination between CRF08 and subtype B. However, it is possible that the currently defined CRF07 is not the direct product of this recombination event. Lastly, we found that recent recombination between subtypes B and F in Argentina and Brazil, two epicenters in South America, has created many different, but related, recombinant forms. Taken together, these findings indicate that the HIV-1 epidemic is becoming more complex over time. Recombination among co-circulating forms creates new forms of HIV-1 that are now starting to dominate the epidemic in certain parts of the world.
We developed methods to track N-linked glycosylation sites (sequons) in HIV-1 as they shift positions and vary in local density. Comparing primate lentiviruses, hepatitis C virus, and influenza A viruses showed that generating and tolerating shifting sequons is a unique evolutionary avenue for HIV-1 immune evasion. In addition, we found that the primate lentiviral lineages have host species-dependent levels of sequon shifting, with HIV-1 in humans being the most extreme. Further, unlike influenza A hemagglutinin H3 HA1, which accumulates sequons over time, HIV-1 shows no net increase in the number of sites over time at the population level. This indicates that variation in the number and placement of N-linked glycosylation sites, rather than their accumulation, is more critical for HIV-1 immune evasion.
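As an illustration of what tracking sequons across aligned sequences can involve, the following minimal sketch locates sequons using the standard N-X-[S/T] motif (X being any amino acid except proline) and reports which sites are shared, lost, or gained between two aligned protein sequences. It is not the method developed in the thesis; the function names find_sequons and compare_sequons, the gap character, and the use of alignment columns as site coordinates are assumptions made for this example.

```python
import re

# Standard N-linked glycosylation motif: N-X-[S/T], where X is not proline.
# A lookahead is used so that overlapping sequons (e.g. "NNTS") are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(aligned_seq, gap="-"):
    """Return the 0-based alignment columns of the N in each sequon.

    Gaps are skipped when matching the motif, but positions are reported
    in alignment coordinates so that sites are comparable across sequences.
    """
    # Alignment columns that hold a residue rather than a gap.
    cols = [i for i, c in enumerate(aligned_seq) if c != gap]
    ungapped = "".join(aligned_seq[i] for i in cols).upper()
    return [cols[m.start()] for m in SEQUON.finditer(ungapped)]


def compare_sequons(seq_a, seq_b):
    """Classify sequon columns as shared or unique to either sequence."""
    a, b = set(find_sequons(seq_a)), set(find_sequons(seq_b))
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


if __name__ == "__main__":
    # Hypothetical toy sequences: the sequon moves by one column.
    print(compare_sequons("NATAAA", "ANATAA"))
```

In this toy comparison the total number of sequons is the same in both sequences while the position differs, which is the kind of shift in placement, as opposed to accumulation, that the paragraph above describes.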
The studies detailed in this thesis, together with our effort to re-subtype more than 150,000 sequences in the Los Alamos HIV sequence database, enable us to draw a more comprehensive and dynamic picture of the global HIV-1 epidemic.