How Did the SARS-CoV-2 Virus Originate? | Alex Washburne | #97
Updated: Dec 15, 2022
Full auto-generated transcript below. Beware of typos & mistranslations!
Alex Washburne 5:47
Yeah, I have kind of a checkered past. I grew up wanting to be an ecologist. I always loved lizards, and I wanted to study lizards. As an undergraduate, I did a lot of different research projects from year to year, everything from the ecology of lizards in the desert Southwest, to working on protein evolution at Brandeis University, to studying the immune system of snails, and finally kind of finding my niche doing mathematical modeling and mathematical biology of viruses in their hosts. I got two undergraduate degrees, one in math and one in biology, graduated summa cum laude or something like that, straight A's, valedictorian, and went to Princeton with the National Science Foundation Graduate Research Fellowship, which is a prestigious award for graduate studies. At Princeton, I got a PhD in quantitative and computational biology, studying, again, kind of math and biology. I was really interested in statistics and evolution and ecology and parasites and hosts. You know, broadly, ecology is the interactions of organisms with each other and with their environment, and that includes predator-prey interactions as well as host-parasite interactions. So under the broad stroke of ecology and evolutionary biology is basically everything in biology, and so I really, again, have this checkered, or one might also call it holistic, path in biology. I did a postdoc at Duke University, working on novel methods to analyze microbiome datasets. Microbiomes are the set of all the microbes that live on us or in us or in the soil or in any given place. And so you get these huge datasets of thousands of microbial species and you want to understand how they're changing. You know, are there some microbes that are associated with disease, whether that's inflammatory bowel disease, or Crohn's disease, or whatever?
And so I built some methods to analyze these microbiome datasets in light of their evolutionary tree, for instance, to find which lineages of microbes are impacted by an antibiotic or something like that. After Duke, I went on to Montana State University, which is what brought me to Bozeman, and this was back in 2017. I started out doing a postdoc that quickly turned into a research scientist position, studying pathogen spillover from bats to people, of all things. I was part of a team that received the DARPA PREEMPT grant, which was one of the biggest grants available in the world to study pathogen spillover, with the goal of preempting pathogen spillover and preventing pandemics. The team that I worked with studies henipaviruses, like Nipah and Hendra, which are really bad viruses. They have, you know, like a 30 to 50% infection fatality rate. So very, very bad. And yeah, so while I was working on the DARPA PREEMPT work, I was doing a lot of modeling of pathogen spillover and the risks of pathogen spillover. I was studying which sorts of features predict, or can help us understand or prevent, pathogen spillover. Then COVID happened. You know, there's kind of no linear way to explain this, but we'll go back in time to the postdoc at Duke. My postdoc advisor at Duke passed away just a few months into that position. So all the, you know, tree math and stuff that I did was just kind of on my own. I started at that time moonlighting at a hedge fund to do a bunch of quantitative data analysis and time series analysis. So I did a lot of forecasting on the side. And I did that throughout, and it was one of the real assets for predicting pathogen spillover. When COVID happened, these worlds collided, because COVID was the biggest macroeconomic issue and there were a lot of forecasting needs. When will the outbreak happen? How bad will it be?
Stuff like that. So I switched from pathogen spillover to doing medical demand forecasts for years throughout COVID. This included February 2020, when I was one of the first, if not the first person, to say that there could be a huge surge in March 2020. At the time, the conventional forecast was that there'd be slow, 6.2-day doubling times of cases, with peaks in June or July of 2020. Remdesivir was slated to pass clinical trials in late April 2020. Whereas my forecast said that places like New York City could experience these huge surges in March. You know, in late March 2020 we could see peaks, with a lot of people being hospitalized and cases doubling every two to three days. So that was, you know, the first real war zone of COVID: before major surges happened outside of Wuhan, trying to figure out what the heck was going to happen. And as you know, there was a huge surge in March 2020 in New York City. I shared this on the CDC forecasting call, and not a lot of people believed it until they saw the two-to-three-day doubling times of ICU arrivals across providers in New York City. At that point, I got really connected with some medical managers in New York and elsewhere. And I mean, I could talk the entire session today just about the odyssey of COVID forecasting. That's been a huge journey, but the long and short of it is that it started in February 2020, and I kept doing forecasts for every outbreak cycle all the way to Omicron and then BA.5. I provided my final forecast to some researchers connected with managers in New York City, and then finally decided, you know, I want to get back into the spillover question: where did this virus come from? During COVID, I left academia and just kind of branched off and did a lot of consulting and other things.
And so, you know, now I'm not a research scientist at Montana State University, I'm just a guy. With the forecasting, I ended up doing a lot of trading on the stock market, just kind of anticipating when peaks would happen, how bad they would be, etcetera. So right now I pay the bills with capital gains. And, you know, I just had a lot of time to go back to my field of study and study the virus and read the literature on where people think it came from, what evidence they had to support those claims, etc. And so that's the long and short of it. It's kind of a checkered past, a complicated background in ecology and economics and evolution and math and statistics and pathogen spillover.
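The gap between the two forecasts described above comes down to exponential-growth arithmetic. The following is a minimal sketch, assuming pure exponential growth with a constant doubling time; the function names and the 30-day horizon are illustrative choices, not anything taken from the forecasts discussed in the interview.

```python
import math


def growth_factor(days: float, doubling_time_days: float) -> float:
    """Multiplicative growth in cases after `days` of pure exponential growth."""
    return 2.0 ** (days / doubling_time_days)


def days_to_grow(factor: float, doubling_time_days: float) -> float:
    """Days needed for cases to grow by `factor` at a constant doubling time."""
    return doubling_time_days * math.log2(factor)


if __name__ == "__main__":
    horizon = 30  # days; an arbitrary illustrative horizon
    scenarios = [("6.2-day doubling", 6.2), ("3-day doubling", 3.0), ("2-day doubling", 2.0)]
    for label, doubling_time in scenarios:
        print(
            f"{label}: cases grow x{growth_factor(horizon, doubling_time):,.0f} "
            f"in {horizon} days; a 1000-fold rise takes "
            f"{days_to_grow(1000, doubling_time):.0f} days"
        )
```

Under these assumptions, a shorter doubling time moves a given case count forward by weeks or months, which is why the two sets of forecasts implied such different peak dates.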
Nick Jikomes 12:39
And so we're going to talk primarily today about this question of where the SARS-CoV-2 virus came from. There are basically two hypotheses for where this thing came from. One is called the zoonotic origin hypothesis, or the natural spillover hypothesis, and the other is the lab leak hypothesis. So taking these one at a time, and just describing for people what those hypotheses are and what they state, not worrying yet about what we think the balance of evidence is: what is a natural spillover? What does that hypothesis say? And what would be true if that's the way that the virus originated?
Alex Washburne 13:21
The natural origin hypothesis, or theory, however you want to call it, I call it a hypothesis, is that SARS-2 was in an animal that was not being manipulated by researchers. It could have been, you know, held by animal traders, it could have been someone's pet, who knows what, but it was in an animal, and then it went from that animal to people without any research-related activities involved. The lab origin hypothesis says that there could have been research-related activities that brought the virus to Wuhan or that played a role in the virus entering the human population. So the question here is whether scientific research brought the virus into the human population.
Nick Jikomes 14:10
And so you've got spillover events, where a virus is naturally circulating in some animal species. And then, you know, which is where the name comes from, it spills over to us. It gets to us, maybe via another animal, through a second, intermediate host species. The lab leak is basically, you know, there are scientists doing research in the lab, tinkering with viruses, and it spills out of the lab as a result of that research. How common or how rare are each of these scenarios? Do we see, in general, over the years, lots of viruses jumping into humans from animals? And do we see any, like, lab leaks happening each year?
Alex Washburne 14:46
Yeah, so, you know, all viruses come from somewhere, and every virus that's infected humans came from somewhere. Some of them came from, you know, thousands of years ago, in the agricultural revolution, when we started living with domestic animals a little bit more closely. And that led some viruses to, you know, adapt to human populations. There are other viruses that are spilling over every single day. A good example of this would be vector-borne diseases like malaria or dengue or Zika. These pathogens are just hopping from host to host, and it's part of their natural life cycle to hop from host to host. So viruses range in how specialized or generalized they are. Spillover is really common. Natural spillover happens every day with Lyme disease, malaria, dengue, Zika, etc. And then there are some more dangerous pathogens that cause, you know, really dangerous outbreaks, like Ebola, for example. Or, I mean, another good example is Nipah virus. This is the henipavirus clade that we were studying before COVID, or that I was working with people studying before COVID. Nipah spills over commonly in the Nipah belt in Bangladesh and India, by way of fruit bats. What happens is that people go to these date palm trees, and they like drinking the sap. So they put a stake in the tree and put a bucket under the stake, so the sap drips down. But bats also like the sap. They go try to drink the sap, they pee in the bucket, someone drinks from the bucket, and they get sick, with a 50% infection fatality rate. Half of them die. And that can have onward transmission from person to person. And so spillover is really common. The severity of the disease varies significantly depending on the virus and other factors. And then how much onward transmission there is, and how well we're able to contain it, varies a lot as well.
And so, broadly speaking, like 99.9999999%, I could go on, most of the viruses that have entered the human population have done so through spillover from nature. Okay. And it's only recently that we've really started seeing this proliferation of biological research in these labs that study pathogens. I mean, we didn't even know, you know, about DNA as the... I mean, we didn't know about the genetic code until the late 1900s. We didn't sequence the first human genome until, I forget, the late 1900s, early 2000s. But biology is a rapidly changing field. And so we're now seeing this proliferation of new kinds of research facilities that study viruses in a lab. And accidents happen. You know, I mean, this is why we have chemical safety protocols when you're working in a lab. In just old-school chemistry, you know, you have showers in case you spill a chemical on yourself. And there are a lot of precautions taken in these BSL-2, BSL-3, or BSL-4 labs. Honestly, I can't speak much about that as an expert, because I haven't worked in one of those more intensive biosafety labs. But I can say that accidents happen. And, you know, no matter what lab you work at, accidents can happen. There are a lot of efforts to reduce the risk of accidents, but that risk never drops to zero. And especially if you start to have, as we saw, for instance, in the Wuhan Institute of Virology reports, some signs that they may not have had an adequate maintenance budget. Maintenance is important. You need your air filters to work, you need seals, and you need things to be sealed so that, you know, air that may contain a pathogen stays in a place where you want it to stay.
Nick Jikomes 18:32
You know, if we think about a virus accidentally getting leaked out of a lab where they're working on viruses, is this ultra rare today? Is this something that's, like, once in a human generation, or is it more common than that, something that happens every year?
Alex Washburne 18:47
So it's a lot more common than that. For instance, there have been seven SARS outbreaks since the original SARS outbreak in 2002. Six of them were lab leaks, two of them in China. And so from that sample, I would guess that if it's a SARS coronavirus, it probably came from a lab, if you just, you know, drew one of those past events out of a hat. So it's sadly very common. And it wasn't until, you know, the SARS outbreak that people started being really worried and studying a lot of SARS coronaviruses in labs, and then that led to more lab accidents happening. And so, sadly, yeah, these accidents are more common than we'd like to acknowledge, or certainly more common than the public is aware. So it happens, you know, I would guess, several times a year, and other people can speak more authoritatively on, like, which viruses, in which labs, you know, the geography.
Nick Jikomes 19:40
Yeah, I mean, based on the conversations I've had, I think the basic point here is that lab leaks, just like natural spillovers, are common, meaning they happen every year, multiple times a year usually. And it's not like every single time it's some major world-changing event. You know, it could be a fairly innocuous virus that doesn't really cause problems. But the point is, you know, this is not an ultra rare thing. Both of these things are a common way that viruses get into the human population.
Alex Washburne 20:07
Exactly. I mean, as an undergrad, I studied schistosomes, which are these worms that infect snails, and, you know, I spilled water on myself that might have had schistosomes in it, and I was like, oh, crap. And so I had to then take some precautions, and, you know, wash my hands and do some other things like that. But it's just that people stumble and fumble and bumble. And that's just part of the unfortunate reality: mistakes happen. Yeah, lab leaks are common. And I think point two is that there's important information in the geography and the virus itself. If someone has malaria in Brazil, my first guess is that it's not a lab leak. You know, malaria is common in Brazil, and it's commonly spilling over from animals into people. Same with dengue or Zika, etc. You know, the baseline hypothesis can vary depending on where you see an outbreak happen. If Ebola pops up in Congo, I might think that, well, it's probably natural spillover as well, because they don't have many labs studying Ebola in Congo. On the other hand, there's Hamilton, Montana, which has a BSL-4 lab that studies a lot of dangerous pathogens. And so if you saw an outbreak of Ebola or Nipah in Hamilton, Montana, my first guess might be that it leaked from a lab. And so the location and the virus itself can provide a lot of information to kind of change our minds about what's likely.
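The idea that location and pathogen shift the baseline hypothesis can be written as an odds update. The sketch below is a generic Bayes' rule calculation, not a method attributed to either speaker, and every number in it is a hypothetical placeholder chosen only to show the mechanics.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update P(hypothesis) with a likelihood ratio.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | alternative).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical numbers only: a low prior for a lab-related origin,
    # updated by evidence assumed to be 50 times more likely under that
    # hypothesis (for example, an outbreak next to a lab studying the pathogen).
    prior = 0.01
    updated = posterior_probability(prior, likelihood_ratio=50.0)
    print(f"prior = {prior:.2f}, posterior = {updated:.2f}")
```

With a likelihood ratio of 1 the probability does not move, which corresponds to evidence, such as an outbreak's location, that is equally expected under both hypotheses.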
Nick Jikomes 21:39
So, you know, before we get into your work and your views on the likelihood that this was a spillover or a lab leak for SARS-CoV-2, can you sort of steelman each case? So, just for everyone listening, there are experts who are very knowledgeable on the subject, some of whom favor the lab leak hypothesis, some of whom favor the natural spillover hypothesis. Can you steelman each of these for us, one at a time? So, starting with the natural origins hypothesis, what are the strongest arguments or best pieces of evidence that align with that hypothesis that we have today?
Alex Washburne 22:13
Yeah, so the natural origin hypothesis has taken on many forms, and many pieces of evidence have been presented over time. So there are a couple of ways to look at this. One way is, like, what did people say over time, and how did that kind of back-and-forth happen? Another way is, like, if I were to just say, okay, I'm going to put on my natural origin hat today. People would say that the strongest evidence for natural origin is that natural spillover happens, and it's more common than lab leaks for most viruses and most of human history. And then they would say that, look, cases in December of 2019 were centered around the wet market, and the early-outbreak evolutionary tree of SARS-2 has these two lineages that form what are called basal polytomies. Basal means that they're at the bottom of the evolutionary tree. They're polytomies because they don't have just this bifurcating branch; they have many, many sub-lineages radiating out of a single common ancestor. So they asked, how are we getting these two basal polytomies? They ran some simulations and said, well, it seems unlikely in their simulations that you'd get two basal polytomies from one spillover event. So they hypothesized there must have been two spillover events. Then they look at these lineages, lineage A and B, and they find some early cases of lineage A and lineage B, and try to make the argument that these are centered around the wet market. So then they say, look...
Nick Jikomes 23:47
Just for people who don't know, what is a wet market?
Alex Washburne 23:52
I might not have the right definition here. As I understand it, it's an animal market. The Huanan Seafood Market sells a whole bunch of different animals. And, you know, that's where viruses come from, from animals. And so this is a very natural point for humans in Wuhan to come in contact with animals that could have a coronavirus that is typically found in Laos or Vietnam.
Nick Jikomes 24:17
And so there are wet markets in China and elsewhere. There's an association between the early SARS-CoV-2 outbreak in China and this wet market. Do we know for a fact that, like, the first infection happened in the wet market, or is there just a general association, in that we know that people were getting infected who had gone to the market and had been there?
Alex Washburne 24:39
Yeah. Well, to get straight to your point, we don't know that the first case came from the wet market, because we don't actually know the index case. We haven't found a patient zero. In the early outbreak, there were actually credible claims of cases going all the way back to mid-November that didn't have any connection with the wet market. Many of us who were kind of following the case data early on, which I was, you know, because I was part of the forecasting game, and you had to know everything about every case to know if it was going to peak in March in New York City or in June or July, saw that there were early cases in mid-November that were reported. And those cases were not represented in this dataset. The dataset that was analyzed, showing a spatial clustering of cases around the wet market, came from the following process. A hospital realized there was a bunch of patients with a pneumonia of unknown etiology. And they asked people: where do you come from, where'd you go, you know, what's your story here? And they found a lot of them had ties, about half of them had connections, to the wet market. After that, they started contact and location tracing. So they looked for people who were connected to the wet market, and they found more cases there. So a lot of these early cases that were centered around the wet market came from a sampling process of early cases connected to the wet market, followed by contact tracing of people with connections to the wet market, which gave us a very strong signal of cases around the wet market. Now, you can do the same thing with a choir, right? If a bunch of people showed up and they're hospitalized, and they were all at the same choir practice, you can do contact tracing and find more cases with a choir connection. That doesn't mean the choir was the site of spillover. So yeah, early cases had some, you know, different stories.
But again, those cases haven't actually had the sort of transparent, you know, testing to demonstrate whether each case was or wasn't COVID. They've kind of fallen aside in history. And we don't really know why they didn't, like, take that patient and run a serosurvey to confirm or reject whether they'd actually had the virus. Yeah, another thing that happened was that some Chinese CDC officials went into the wet market and sampled surfaces around it. And they found there was no significant difference in the probability that a test was positive on a surface, whether that surface was near an animal vendor or a vegetable vendor. And you might expect that if there were some outbreak within animals, the animals would be infected and there'd be more positive samples on surfaces near the animals. They also sampled over 420 animals at the wet market. None of them were positive. And this is important to contrast with the first SARS outbreak. SARS-1 also had an outbreak, one that is much more clearly shown to be tied to the animal trade. One thing that shows it's tied to the animal trade is that, you know, infected animals were put in cages and trucks, and they infected each other, and we saw this pattern of cases over a lot of Guangdong Province. In contrast, we just see a single outbreak starting in Wuhan. Not all around Hubei province, but just a single outbreak in Wuhan, very close to the Wuhan Institute of Virology. So we don't see this geographic pattern of an animal trade outbreak like we saw in SARS-1. What's more, in SARS-1 they sampled just 25 animals, and in that small sample of 25 animals they found six, I believe it was civets or raccoon dogs, that were positive, and another, either a raccoon dog or a civet, that was positive. So they found positives right away.
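The sampling concern raised in this answer, that looking for cases through market connections will tend to find market-connected cases, can be illustrated with a small calculation. This is a minimal sketch under made-up assumptions: the true fraction of market-linked cases and both detection probabilities are hypothetical inputs, and nothing here reproduces the actual Wuhan case data.

```python
import random


def observed_linked_fraction(true_linked: float, p_detect_linked: float,
                             p_detect_unlinked: float) -> float:
    """Expected share of detected cases that are market-linked."""
    linked = true_linked * p_detect_linked
    unlinked = (1.0 - true_linked) * p_detect_unlinked
    return linked / (linked + unlinked)


def simulate(n_cases: int, true_linked: float, p_detect_linked: float,
             p_detect_unlinked: float, seed: int = 0) -> float:
    """Monte Carlo version of the same calculation."""
    rng = random.Random(seed)
    found_linked = found_total = 0
    for _ in range(n_cases):
        is_linked = rng.random() < true_linked
        p_detect = p_detect_linked if is_linked else p_detect_unlinked
        if rng.random() < p_detect:
            found_total += 1
            found_linked += is_linked  # bool counts as 0 or 1
    return found_linked / found_total if found_total else float("nan")


if __name__ == "__main__":
    # Hypothetical: 30% of all cases are truly market-linked, but tracing
    # focused on the market detects linked cases far more often.
    print(observed_linked_fraction(0.3, 0.8, 0.1))
    print(simulate(100_000, 0.3, 0.8, 0.1))
```

When both groups are detected at the same rate, the observed fraction matches the true one; the clustering signal inflates only as detection becomes uneven, which is the choir-practice point made above.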
Nick Jikomes 28:24
So for the SARS-1 outbreak, that was a number of years ago now in China. That was a wildlife spillover event. And we know that because they were able to quickly find animals, multiple animals, that were infected with just this virus. And that happened very soon after identifying what the virus actually was.
Alex Washburne 28:43
That's correct. And the virus that they found in the animals was much more closely related to SARS-1 than anything we've found yet for SARS-2.
Nick Jikomes 28:52
I see. So for SARS-2, well, just to sort of summarize the argument. For the wildlife spillover, it would basically be that there have been lots of wildlife spillovers where viruses get to humans and other animals in history. Very common. We know that in Wuhan, where the first known outbreak happened, you have these wet markets where there are lots of different animals that could potentially have been infected. And there are lots of people in that initial outbreak who were infected and had some tie to the market. They had been there recently, and they had plausibly been near animals of different kinds. And so there's this sort of circumstantial evidence that might lead you to believe, like, okay, well, an animal in the wet market had this thing, someone got it, and now it's in the human population. But then you said there are also these weird observations, such as, well, they didn't actually find any animals in the wet market that were infected by it. And we still, to this day, you know, a couple of years later, have not identified an animal from which we can clearly say that this virus came.
Alex Washburne 29:54
Yep. And the surfaces were just as likely to have the virus whether it was near a vegetable vendor or an animal vendor. And another thing they did in SARS-1, very shortly after the outbreak, is they looked at the serological evidence, you know, to see who had antibodies against the virus, and they found a very high rate of seropositivity in animal handlers, much higher than in people who handle vegetables. So we have that evidence for SARS-1. We don't have that evidence for SARS-2. In fact, we have the opposite. That is, you know, we don't have the serological evidence from people at the wet market, but we do have the observation that the surfaces were just as likely to be positive regardless of whether they were near animals or vegetables, which suggests that there was probably a human who brought the virus to the market, and that it was a superspreading event, human to human.
Nick Jikomes 30:40
Hmm. Okay. So, you know, I might as well ask you now: what would the steelman case be for the lab leak hypothesis? What's the best evidence consistent with that that we have today?
Alex Washburne 30:53
So the lab origin, the research-related origin hypotheses, cover a lot of different possibilities. You know, it could be anything from someone who was trying to catch a bat and sample it for normal, you know, coronavirus surveillance, and they got sick. So it could have been a completely natural, you know, bat-to-human spillover that just incidentally had a research connection as the reason why that person contacted that bat. On the other end of the spectrum, you have some, you know, other theories of, oh, maybe it's a bioweapon. So, in this huge distribution of possibilities, the middle ground is that, well, this arose in Wuhan, which is far from the hotspots of coronavirus diversity, so the geographic evidence becomes important. The lack of a geographic footprint like we saw with SARS-1 makes us less inclined to believe that this was an animal trade outbreak capable of causing two spillover events in humans in Wuhan without also causing a spillover event somewhere else in Hubei province, or on the 1000-mile journey from Hubei province to where the closest relatives of SARS-2 are found. So there's the lack of a geographic trail, and the virus arising in Wuhan, and the virus had some very unusual features in its genome. Most of all, there's this furin cleavage site. And the furin cleavage site is this site that, and others, again, can tell you more about it than I could, because I'm, you know, kind of a jack-of-all-trades kind of guy, but I think the furin cleavage site is the site that the protein furin cuts, and it enables cell entry. So it helps the virus enter into the cell. The furin cleavage site in SARS-CoV-2 is not found in any other SARS coronavirus.
So every other SARS coronavirus that we've sampled over the, you know, almost two decades since SARS-1, and all the coronaviruses that we sampled before then, give us a very clear signal that there just is not any furin cleavage site in the sarbecovirus lineage. So this is an outlier. This is a weird virus. This is not what we would expect to see from a SARS coronavirus outbreak, both in terms of the geographic fingerprint of the outbreak and the furin cleavage site. And the fact that it arose in Wuhan, which contains one of the world's, you know, hotspots of coronavirus research, led people to hypothesize that maybe there's some connection to the coronavirus research that's conducted in Wuhan. So that was the big early evidence for the lab origin hypothesis. You know, the furin cleavage site is really strange, because not only is this the only SARS coronavirus with a furin cleavage site, but it also has these two codons, CGG CGG, which are not found anywhere else in SARS coronaviruses. It's a particular codon that isn't very common in bats, so it's unclear why that would happen in a bat virus, but it's very common in humans. It's optimized for people.
Nick Jikomes 34:01
Okay, so to summarize this piece: since sort of the early part of the pandemic, the stuff that might make someone think that the lab leak was possible or plausible is, well, the first outbreak happened in Wuhan, which just happens to have this Wuhan Institute of Virology, where they work on SARS-CoV viruses like this. The pattern of early infection, how it spread, doesn't really look quite like SARS-1 and other viruses that we know for sure were wildlife spillover events. And the virus itself has molecular features that, at the very least, don't really look like the wild, natural populations of coronaviruses that you would think this one came from if it did, in fact, spill over in a wildlife spillover event. So there are sort of these observations that might make one think, okay, well, maybe it's not a wildlife spillover; it actually came from research activities of some kind.
Alex Washburne 35:00
Yeah, and, you know, researchers knew about furin cleavage sites, and they were interested in whether furin cleavage sites might enhance the transmissibility of a coronavirus. And early on in the outbreak, we didn't have this evidence that, you know, the researchers were actively thinking about inserting a furin cleavage site into a SARS coronavirus. So it's important to emphasize the evolutionary novelty. The furin cleavage site is this insertion of 12 nucleotides. So some way, somehow, this perfect 12-nucleotide string just got inserted into exactly the right place in the virus. And it has these codons that are optimized for people. And it happens in the SARS coronavirus that spills over in Wuhan. So that was curious. And shortly after this, when people started looking at that and saying, I think this could have leaked from the lab, it's something we need to examine and investigate, we had some people publishing articles saying, no way could this have come from a lab. And one of those people, it turns out, and we didn't know this at the time, was Peter Daszak. With Jeremy Farrar, he wrote an article in The Lancet condemning conspiracy theories of a laboratory origin, without disclosing a conflict of interest that we discovered a year later, in 2021: that Daszak himself had co-authored a grant with the Wuhan Institute of Virology proposing to insert a furin cleavage site into a SARS coronavirus in Wuhan. So there's something unusual about that. You know, that's a piece of evidence in terms of the statement of intent from 2018.
Nick Jikomes 36:39
And I just want to say, too, for the listeners, I've had a couple of guests on to talk about this general subject, or just coronavirus evolution and things like that. I have invited Peter Daszak onto the podcast. Obviously, he's a very busy guy, and he's probably got a really full inbox. He may never have seen my emails. But just so everyone knows, I have invited some of those people on to hear what they think about all of this, but I've not gotten responses from them.
Alex Washburne 37:04
Yeah, I think, you know, it's unfortunate that we haven't had the kind of transparency that could reject the laboratory origin if the zoonotic origin were true. If the zoonotic origin were true, you know, EcoHealth Alliance has been collecting coronaviruses for over a decade. They have a huge database of coronaviruses with the Wuhan Institute of Virology that would give us a lot more information on natural coronavirus evolution. And so if we had those data and the zoonotic origin were true, it would be much quicker to rule out a lab origin hypothesis. So it's a bit unusual that not only have we not seen these conflict of interest disclosures, but there's also the lack of transparency, both in terms of not sharing the grant proposing to insert a furin cleavage site, as well as, you know, not sharing the database that could help us understand this quickly. It's something to just keep in mind. I mean, it's not strong evidence, but it's an important piece of the puzzle in terms of, like, what we know and what we don't know and why we don't know it. So yeah.
Nick Jikomes 38:10
So can you maybe give people now a little synopsis? You know, we're going to talk about your paper in a few moments here and really dive into the details. But up to the paper that you recently put out, not including that quite yet, how has the debate and the balance of evidence in favor of either the spillover hypothesis or the lab leak hypothesis changed, in your view? What new evidence has come out that might shift people one way or the other over the last couple of years? And how has the scientific community been responding to that and talking about it and debating it? You know,
Alex Washburne 38:46
so again, I was kind of sucked up into medical demand forecasts for a few years. And when I finally went back to read the literature on the zoonotic origin, it was extremely unusual for me, because I was reading spillover literature before COVID. And there were standards of evidence, there were ways that people described things that had this characteristic humility and, you know, necessary statement of limitations. That was just the culture of the field. And that was just strangely, completely thrown out the window when it came to SARS2 origins. You get these extremely strong claims that were oftentimes just speculations masquerading as fact. One example of this is the Proximal Origin paper, which was a paper written in early 2020 that basically tried to refute a laboratory origin. In that paper, they would make an argument like this. They said, well, in a computer simulation, the surface protein of SARS2 isn't optimal in its binding of the human receptor. So therefore it couldn't have been research related, or it couldn't have been a bioweapon, you know, because it's suboptimal. But that's really weird, because nowhere was that ever a necessary criterion for a laboratory origin. In fact, sometimes too optimal a binding of a receptor can be bad for the virus, even if it is specifically adapted to people, because then it might not let go of the receptor and do the rest of its viral life cycle. So it was a really weird argument, and the virologists making it, I feel, should have known that and would have known that in 2019. And so there were these arguments presented, and I read them, and I was like, this seems very unusual. It was a straw man argument, taking the most extreme scenario and presenting it as the only one for a laboratory origin: that in order for this to have come from a lab, it has to be an optimally binding bioweapon, and it's not.
And so it was a very weird effort to try to rule out a lab origin. Just transparently, clearly, that's not right: given this whole spectrum of possibilities for a lab origin, saying we think we can rule out this one from a computer simulation, so therefore everything's ruled out,
Nick Jikomes 41:20
doesn't hold. Yes. I want to emphasize for people, too, just how weird that is. Because, you know, in the field of molecular biology, very broadly speaking, and not all of the scientists here, but, you know, molecular biology is sort of what we're talking about, the people that have the most expertise in this area are going to have a lot of molecular-level knowledge of how these viruses work and all of this stuff. They're very functional, very cause-and-effect people, people that do cause-and-effect experiments in labs for a living. And I was trained by these types of scientists, which is why I can speak to this. For someone to base an argument off of a computer simulation is culturally very weird in this area of science.
Alex Washburne 42:10
And it didn't stop there, you know. So that was 2020. Those were the first arguments refuting the lab leak, like Daszak saying it's a conspiracy theory, not telling the world that he wrote a grant to insert a furin cleavage site in a coronavirus in Wuhan. And then fast forward, and you have more recent papers in Science that have said, oh, look at these two basal polytomies. They base their analysis off a simulation. They run an outbreak simulation, and they say, in our outbreak simulation we didn't find two basal polytomies, and so therefore they couldn't have happened by one natural spillover, or they couldn't have happened by a lab leak; they must have happened by two spillover events. And there's nothing to that. In fact, the way you generate a polytomy is by one person spreading the virus to many people. A superspreading event generates a polytomy, and this was shown in many other cases, in Austria and choir practices, etc.: one person will have predominantly one strain of the virus within them. If they infect six more people, all six of those people will then have viruses whose strains branch off of this common ancestor. And that's a polytomy. So all the two basal polytomies tell us is that there were probably two superspreading events, which are very common for SARS2. You know, superspreading events account for the majority of cases in SARS2. So that was unusual. Then there are the arguments about, like, oh, cases from late December were clustered around the wet market. But they throw out cases that were earlier that had no connection to the wet market. And they don't consider the fact that contact tracing amplified our, you know, focus of cases around the wet market. So I saw that, and their language was unusual. They call this dispositive evidence, you know, that it rules out a laboratory origin, and it doesn't.
And so I think that when I saw these claims, there's something inside you, there's just a feeling when someone goes too far in an argument. You just have to push back and say, no, you can't say that, that's not true, that's ridiculous. And there was a little bit of that in me. But then I was like, why are they doing this? Why is there this consistent pattern of a very different argumentative style and a much lower standard of evidence, suddenly, on this specific topic, and for the purpose of making this specific claim that absolutely no way did this come from a lab? So that was weird. And it was also weird when we learned that Anthony Fauci and Francis Collins played some role. Kristian Andersen, who is an author on the Proximal Origin paper, thanks Fauci and Collins for their leadership in helping with this paper. And that's unusual, because Fauci has run the US biodefense funding since SARS1. And NIAID funded EcoHealth Alliance for very similar work to what was proposed in their grant saying they were going to insert the furin cleavage site. So yeah, that's a
Nick Jikomes 45:09
conflict of interest. Yeah. And again, just for listeners, just for the record: Kristian Andersen is a very, very credentialed person in this general field. He strongly favors the spillover hypothesis, not the lab leak hypothesis. I've also reached out to him to ask him to give his perspective on this topic, and I know that some other people have as well. Again, very busy guy, very voluminous inbox, I assume. But he has not responded to me either. You know, I'm trying to talk to experts who have different viewpoints on this, but it's been difficult. So anyways, I don't think we need to belabor the history here too much more. I've had other episodes on the subject, and I think we've given people a decent survey of both hypotheses. So the question is, where did this virus come from? Is it related to a wildlife spillover event, or is it related to laboratory research of some kind? You recently, with a couple of colleagues, published a preprint that speaks to this, and we're going to go through that now. So I'm going to do a share screen. It's not like we're going to go through this line by line, but it will be up on the video version for people. We're obviously going to do a really good job of verbalizing what we're talking about for those of you that can't see this. And the paper is called "Endonuclease fingerprint indicates a synthetic origin of SARS-CoV-2." So to start out with, Alex, can you give us an extremely simplified, bare-bones overview of what this paper says before we go into the nitty gritty?
Alex Washburne 46:44
Absolutely. So if a virus were made in a lab, and especially if it had a furin cleavage site inserted in a lab, there's a particular way that researchers would have had to do that. Specifically, SARS-CoV-2 is an RNA virus, and we can't work with just the RNA and insert a furin cleavage site into a single-stranded RNA molecule. Instead, what we typically do is build the DNA version, a double-stranded DNA version, of the virus. And then you can work with the DNA using your classic tricks of, you know, cutting it with restriction enzymes to put in your furin cleavage site or whatever. So this technology, to assemble the DNA version of the virus, then transcribe it to a single-stranded RNA molecule, insert that RNA molecule into a cell and, poof, it starts making virus, that's called infectious clone technology. And there were specific ways that researchers tended to build these DNA clones of the virus before COVID. So we did a meta-analysis to look at all the infectious clones that were built on coronaviruses before COVID. And overwhelmingly they were assembled by this method called type II directional assembly. In type II directional assembly, you know, there's a specific kind of enzyme that cuts DNA but leaves these, as best I can do with my hands here, these sticky ends. It doesn't cut the DNA straight in half, but instead it kind of unzips it a little bit to leave these three-to-four-nucleotide sticky ends. So that way, you can reassemble the same DNA segments, you know, one block at a time. The sticky ends will find their complementary sticky ends and glue together.
So these restriction enzymes, type II restriction enzymes, enable this cutting and pasting of DNA blocks. You can make one segment, cut it with a restriction enzyme to get these sticky ends, build another segment, cut it with a restriction enzyme that will give the complementary sticky end, and you can kind of glue them together. And that's how you can assemble a large, 30-kilobase DNA version of the virus. So this is a common method. Yeah.
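The cutting step described above can be sketched in a few lines of Python. This is only an illustration, not the method used in the paper: it locates the recognition sequences of BsaI (GGTCTC) and BsmBI (CGTCTC), two enzymes named later in the conversation, on either strand of a sequence, and then splits the sequence at those positions. The function names and the toy sequence are hypothetical, and for simplicity the cut is placed at the start of the recognition sequence rather than at the enzyme's true offset cut position, so sticky-end overhangs are not modeled.

```python
# Recognition sequences (5'->3') for two type IIS restriction enzymes.
RECOGNITION = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(genome: str, enzymes=("BsaI", "BsmBI")) -> list[int]:
    """Sorted 0-based start positions of recognition sequences on either strand."""
    positions = set()
    for name in enzymes:
        motif = RECOGNITION[name]
        # A site on the opposite strand appears as the reverse complement.
        for pattern in {motif, reverse_complement(motif)}:
            start = genome.find(pattern)
            while start != -1:
                positions.add(start)
                start = genome.find(pattern, start + 1)
    return sorted(positions)


def fragment_lengths(genome_length: int, sites: list[int]) -> list[int]:
    """Lengths of the linear fragments obtained by cutting at each site.

    Simplifying assumption: the cut is made at the site's start position.
    """
    cuts = [0] + sorted(sites) + [genome_length]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Toy sequence (hypothetical): one BsaI site on the forward strand and
# one BsmBI site on the reverse strand.
toy = "AAAA" + "GGTCTC" + "TTTT" + "GAGACG" + "CCCC"
sites = find_sites(toy)
print(sites, fragment_lengths(len(toy), sites))
```

On a real genome the same two functions would give the restriction map (site positions) and the fragment sizes that the rest of the conversation refers to.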
Nick Jikomes 49:06
So, okay. Just for people who have absolutely no background in this: if you want to put together a synthetic genome in a lab, basically what you're saying is that a common way to do that is you take a genome, a big hunk of DNA, and you chop it at certain parts into smaller chunks, using enzymes that cut the double-stranded double helix of DNA so that, at the ends where it's being cut, one strand sort of overhangs and is single-stranded. And what that allows you to do is then stitch together smaller pieces to reassemble or create a bigger piece of DNA.
Alex Washburne 49:48
Yep, exactly. Exactly. It's all about
Nick Jikomes 49:51
chopping up big pieces of DNA at particular locations, and then stitching things back together in the way that you want them stitched back together.
Alex Washburne 49:58
Yep. And so typically what would happen is you would get the viral RNA, and you can get the genome directly from that. And then you could print out these chunks of the genome, cut them, and put those cut segments inside of a plasmid, and then you can grow the E. coli containing that plasmid. That's how you get many, many copies and clone the virus, you know, and have more of it. You can put a furin cleavage site inside one chunk, and you have these blocks, the building blocks of the virus, that eventually you can cut out of the plasmids with the same restriction enzymes, glue together to make the full-length cDNA clone of the virus, then transcribe that DNA, put it in a cell, and then, poof, the virus is born. So that was the method. Now, there are many methods of assembly, and this ends up being important when it comes to, you know, the later discussions of our manuscript after release. There are many ways to do this assembly. One of the common ways that was implemented, and that was proposed for making these very efficient reverse genetic systems, was to look at the viral genome and say, hmm, well, it doesn't have all the cutting and pasting sites exactly where I want them, but with silent mutations I can add and remove cutting sites to turn the coronavirus genome into roughly five to seven equally sized segments that are each cut out by these enzymes. And so researchers would look at a genome, they would see where they could add and remove these cutting sites by silent mutations, make those silent mutations, and then they would have a slightly modified version of the virus that they would use to build their segments. So the infectious clone looks a little bit different from the wild-type virus,
Nick Jikomes 51:52
I see. So hold on there. If you compare a synthetically created virus genome to a natural one, there are going to be differences between the two. And what you're saying is, in order to create a genome synthetically, you basically want to chop a genome up into five or six or seven about equally sized chunks using these things called restriction enzymes. But because the natural virus genome that you might be working from doesn't necessarily contain those restriction enzyme sequences in exactly the right places for you to conveniently do this the optimal way, you introduce mutations into the genome, unnatural mutations that you're introducing as the experimenter. You said silent mutations. So you're introducing mutations that allow you to chop up this genome in the way that's going to be easy for you to build stuff with it, but in a way that doesn't disrupt what proteins are being made, or what that virus genome is actually doing.
Alex Washburne 52:50
That's exactly right. And you want to make sure they're silent mutations, because if you add a non-silent mutation, you can disrupt the virus, and then the thing you're studying isn't at all like the wild virus. So that's why the silent mutations were pretty essential. But yeah, so we looked at these historical examples of infectious clones and studied how researchers chose to place these cutting sites. And we saw that in wild-type viruses, these cutting sites are pretty random, whereas in the infectious clones, they had a very clear pattern of regular spacing. And so if you want to show the people who are watching this, we can look at figure two. I think that kind of captures, you know, the essence of our study. Okay, so
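To make the idea of a silent mutation concrete, here is a minimal Python sketch, assuming the standard genetic code and an in-frame toy sequence. It checks whether a nucleotide change leaves the encoded amino acids unchanged, and shows a single-base change that removes a GGTCTC (BsaI) recognition sequence without altering the protein. The helper names and the six-nucleotide example are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if both sequences have equal length and encode the same protein."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


# Toy in-frame example (hypothetical): GGT CTC encodes Gly-Leu and
# contains the BsaI recognition sequence GGTCTC.
wild_type = "GGTCTC"
site_removed = "GGCCTC"   # GGT -> GGC, still Gly-Leu, site destroyed
non_silent = "GGTCCC"     # CTC -> CCC changes Leu to Pro

print(is_silent(wild_type, site_removed), "GGTCTC" in site_removed)
print(is_silent(wild_type, non_silent))
```

The same check run in the other direction (a synonymous change that creates the six-letter motif) would correspond to adding a cutting site.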
Nick Jikomes 53:37
I'm going to share my screen again. Can you see that paper, Alex? Sure can. Yeah, that's one, we're looking at figure two. Let's go back to figure one real quick, just so we have a visual here. And, you know, we don't need to go into too much detail, but this is kind of a cartoon showing basically the process Alex was describing, where you would chop up a genome into different chunks and then use these things called restriction sites to stitch things back together. Is there anything else you want to say here, Alex? Or do you think we covered it?
Alex Washburne 54:10
I think this figure just, yeah, this figure shows the method and the reasons why, and how you amplify these segments in plasmids, or how you can put them all into a bacterial artificial chromosome, or BAC. Whether you're amplifying these chunks in a plasmid or a bacterial artificial chromosome, you would still have the same patterns of these types of restriction sites, equally spaced, you know, in the genome. And so this figure is just kind of going through some of the design considerations as a bioengineer when you're trying to do this, and then just telling people how it works. So yeah, basically the stuff that we've talked about here. Then figure two shows some very specific examples. In the top left here, part A, there's the MERS coronavirus. MERS spills over from camels to people and has a very high infection fatality rate, really dangerous, and they need to study this in the labs. They look at the MERS genome, and it has these restriction sites that are very randomly spaced, that are not regularly spaced at all. And so when they modified the MERS coronavirus, they removed all the pre-existing restriction sites and added in six others to create seven fragments that are more or less similarly sized. Notice they're not exactly equally sized. Because
Nick Jikomes 55:30
yeah, so what you're saying here: the top line here is basically the natural virus. It has these two sites, and they're just sort of in two spots, you know, wherever they happen to be in that virus. But then in this one, which is the one that's, you know, worked on in the lab, those two spots are removed, so we don't see something here or here. And you've got one, two, three, four, five, six of these sequences in the genome. And they're approximately evenly spaced from start to finish.
Alex Washburne 55:58
That's exactly right. And the same thing happens in panel B with a different virus. This is a bat SARS coronavirus that was engineered at the Wuhan Institute of Virology, called WIV1. WIV1 had these four restriction sites, and three of them were actually in okay spots for the researchers. So what they did, and I'll, you know, walk you through it: they removed the first restriction site, and then they added several others. Of note, they added the one at like 0.25, so a quarter of the way through the genome, and then the one slightly over halfway, and then the one at 0.75. So those are the ones they added first, and they tried to synthesize it in the lab, only to find that this third segment was unstable. So they had to add another restriction site right in the middle there. So this is kind of just showing the research process of how people look at the genome and think about it and iterate. You know, in this case, you see there are some small fragments in there that were the consequence of otherwise unstable plasmids that had to be cut in two.
Nick Jikomes 57:04
So, okay, if I'm tracking you here: if you're working on a virus in a lab, like MERS, you look at its genome, and it doesn't have restriction sites in places that are convenient for you, as the bioengineer, to do what you want to do. So you're adding a bunch of these sites to places they don't normally show up in the genome. That's this panel. If you're working with another virus, you know, it's got a few of these sites in its genome, and a couple of them or three of them might be in places that are convenient, so you leave those ones in. So that's that one, that one and that one. But then you need to add some more, and that's this one, this one, this one and this one.
Alex Washburne 57:37
That's right. And, you know, initially they started off with more evenly spaced sites, but then they put those segments into plasmids, and they found out that the plasmids were unstable. They weren't able to faithfully replicate in the E. coli and give you, you know, a larger number of these segments. So, because of the plasmid instability, they had to add a site to cut one fragment, and that ended up making things slightly less evenly spaced. And this is just pointing out that you can get some small fragments as a result. Here they called their fragments C1 and C2, because the fragments are typically, you know, alphabetized, A, B, C, D, E, F, and their fragment C was unstable. So they cut it into two components, C1 and C2, by another step of adding an additional restriction site. Now, the plasmid instability is important, because this is one of the major bioengineering constraints: the longer your segment is, the more likely it is to be unstable and to not be synthesizable. So because of that, we realized that a really good statistic for identifying infectious clones would likely be the length of the longest fragment. It carries information on the even spacing, because if the length of the longest fragment is about the length of the genome, then you have as uneven a spacing as you can get, and if the length of the longest fragment, as a fraction of the genome, is one over the number of fragments, then you have perfectly even spacing. And this longest fragment length additionally tells us something about how synthesizable it's likely to be. So that's what we use for our way of identifying these infectious clones. The length of the longest fragment is our test statistic for comparing natural versus infectious clone, or engineered, coronaviruses.
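The test statistic described above, the length of the longest fragment, is simple to compute. A minimal Python sketch follows, expressing it as a fraction of genome length so that perfectly even spacing gives one over the number of fragments. The function name, the 30,000-nucleotide genome length, and the site positions are hypothetical values chosen for illustration, not data from the paper.

```python
def longest_fragment_fraction(genome_length: int, sites: list[int]) -> float:
    """Longest fragment as a fraction of genome length for a linear genome.

    `sites` are cut positions (0 < position < genome_length); n sites
    yield n + 1 fragments.
    """
    cuts = [0] + sorted(sites) + [genome_length]
    return max(b - a for a, b in zip(cuts, cuts[1:])) / genome_length


GENOME_LENGTH = 30_000  # roughly coronavirus-sized; illustrative only

# Five evenly spaced sites -> six equal fragments -> statistic = 1/6.
even_sites = [5_000, 10_000, 15_000, 20_000, 25_000]

# Five clustered sites -> one huge fragment -> statistic close to 1.
clustered_sites = [100, 200, 300, 400, 500]

print(longest_fragment_fraction(GENOME_LENGTH, even_sites))
print(longest_fragment_fraction(GENOME_LENGTH, clustered_sites))
```

A small statistic for a given number of fragments means no single fragment is long, which is the property tied above to plasmid stability.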
Nick Jikomes 59:30
So what you're saying is, if someone is doing the type of bioengineering where you want to make a synthetic genome like this, you're going to be chopping a big, long piece of DNA up into segments using restriction enzymes. And in general, if the segments that come from chopping the DNA up are too long, they're not convenient to work with from a bioengineering perspective, because these things become unstable. And so in general, choices will be made by the scientists doing the bioengineering to put the restriction enzyme sites in certain spots of the genome so that none of the fragments are likely to be too long.
Alex Washburne 1:00:08
That's exactly right. Okay. Sometimes they iterate, where they say, oh, we tried to make them very evenly spaced, but this one fragment was unstable. So then they cut that in half, or they try to cut it some way, somehow, to make it stable. And so there are other kinds of considerations that come into making these restriction maps, which is the set of all these restriction sites in the genome and their spacing, etc. But yeah, so you can look at the length of the longest fragment as a function of the number of fragments. And if the length of the longest fragment is unusually short, given the number of fragments, then that's an indication that these sites are more regularly spaced than you might expect by chance. So in order to get an understanding of, you know, what you expect in a wild, not engineered coronavirus, we took 70 other coronaviruses from NCBI, drawn based on what we could find easily, so with no kind of picking or choosing: just every single coronavirus where we could get the full genome with a spike, you know, in this database, all lined up to help us build an evolutionary tree. We take these 70 coronaviruses, and we digest them with a whole bunch of restriction enzymes. And because these restriction sites themselves are not under selection, they're randomly spaced in the genomes of wild coronaviruses, and they form a very regular wild-type distribution, which is shown here in gray. That gives you a distribution of the length of the longest fragment as a function of the number of fragments. And infectious clones will fall into a narrow box of unusually short longest fragments, typically between five and seven fragments. Here, we have eight fragments, because, again, with WIV1 they tried to do seven fragments but found fragment C to be unstable and cut it in half to make eight. So actually, WIV1 would be shifted one to the left in their original design.
So five to seven fragments is the idealized range for an infectious clone, and sometimes they fall out of that range because of unstable plasmids. But that's a narrow range in terms of both the number of fragments and the length of the longest fragment, falling underneath the sort of box plot that you see there. So unusually short longest fragments, in the five-to-seven fragment range, that's an infectious clone. That's the range that we think is indicative of infectious clones, and it's consistent with the infectious clones in the literature.
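The comparison against a wild-type distribution can be illustrated with a small Monte Carlo sketch in Python. Note the assumption: the study described above built its reference distribution by digesting 70 real coronavirus genomes with many enzymes, whereas this sketch simply places sites uniformly at random along a genome of unit length. It is a simplified stand-in for the idea, not the paper's analysis, and the function names, trial count, and thresholds are hypothetical.

```python
import random


def longest_fraction(cut_fractions: list[float]) -> float:
    """Longest gap between consecutive cuts on a genome of unit length."""
    cuts = [0.0] + sorted(cut_fractions) + [1.0]
    return max(b - a for a, b in zip(cuts, cuts[1:]))


def prob_longest_at_most(n_sites: int, threshold: float,
                         trials: int = 20_000, seed: int = 0) -> float:
    """Estimate P(longest fragment fraction <= threshold) when n_sites
    cut positions are drawn independently and uniformly at random.

    Assumption: uniform placement is a toy null model, used here only
    to show how 'unusually short longest fragment' can be quantified.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cuts = [rng.random() for _ in range(n_sites)]
        if longest_fraction(cuts) <= threshold:
            hits += 1
    return hits / trials


# With 5 sites (6 fragments) the longest fragment can never be shorter
# than 1/6 of the genome, so thresholds just above 1/6 are rarely met.
for threshold in (0.20, 0.25, 0.35, 0.50):
    print(threshold, prob_longest_at_most(5, threshold))
```

Under this toy model, a restriction map whose longest fragment sits close to the one-over-number-of-fragments floor has a small estimated probability, which is the sense of "unusually short" used in the conversation.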
Nick Jikomes 1:02:36
I see. So again, we're looking at data here for this MERS virus and this WIV1 virus. This is past data, so we know that these viruses were engineered in the lab. You're doing this analysis showing that when you engineer viruses like this, as has been done in these two examples, you end up chopping up the genome using restriction sites such that you get, you know, five or six, or in this case seven or eight, fragments. And they tend to be fragments that are shorter, lower on the graph here, than you would find if you just sort of randomly chopped up the natural virus genome using this type of approach.
Alex Washburne 1:03:13
That's exactly right. And so we had done a meta-analysis looking at all of the infectious clones built with type II directional assembly in coronaviruses from 2000 to 2019. Almost all of them were built in this way. And so we have 10 infectious clones that we use in our study to, you know, show in the next figure, figure three, that they all cluster in that exact region. And so this is shown in, yeah, exactly, on the right, in panel C. All those infectious clones fall exactly within that box. In fact, there's only one of them that had eight fragments, and that was a weird one, because most of them tend to be in the five-to-seven range. So we can narrow that box even more, and SARS2, again, is just smack in the middle of what we expect from an infectious clone. And so this is what we did. There are a lot of different possible type II enzymes that you could look at. We, in our paper, only looked at one pair of enzymes. And this is, if you go to the left, panel B here. So we had to think, okay, we don't want to dredge the data. We don't want to just look at everything and then not be so sure that we found something, because if you, you know, test 100 hypotheses, you'd expect five of them to have p-values less than 0.05. And so we didn't want to dredge the data, but we wanted to have some bioengineering reasoning that drove our discovery here. So there's one enzyme that was commonly used pre-COVID, which was BglI. However, BglI doesn't have a lot of options in sites that already exist in coronavirus genomes. And we looked at the DEFUSE grant. This is the grant that was written by EcoHealth Alliance, the Wuhan Institute of Virology, and Ralph Baric at UNC, who's the guy who came up with this efficient reverse genetic system of cutting the genomes into five to seven fragments and stuff like that.
When you look at the DEFUSE grant, they cite very specific literature about how they're going to make chimeric coronaviruses in this lineage of SARS coronaviruses. In order to make chimeric coronaviruses, you have to have the same cutting and pasting site across different species. BglI doesn't have a lot of conserved sites, so it's not actually a good candidate for creating this backbone to help you make chimeras across this broad lineage. So we reasoned that the other two most common enzymes on the market, BsaI and BsmBI, were more likely to be used. They have a lot of conserved sites, which are all these dots you see across coronaviruses. Every one of those dots gives you a place in which you can mix and match viral parts. And again, BsaI and BsmBI, when you go to the store to buy restriction enzymes, these are the first ones they'll recommend. You know, these are the Air Jordans, these are the best ones on the market that you want to pick for this sort of type II directional assembly. So this is not an obscure set of enzymes we chose. These are the ones that were commonly used, that were used previously for this exact procedure. And then we reasoned that these are good ones to use for the purposes of the research proposed in the DEFUSE grant. So yeah, you can see for SARS2, where we show with the vertical dashed lines the BsaI and BsmBI cutting sites, those segments are fairly evenly spaced, although there is one small segment in between the last BsmBI and the first BsaI site. So, you know, again, we have small segments sometimes. But the length of the longest fragment, that's the key constraint for bioengineering. A small fragment is totally manageable: you can have that small fragment in a plasmid and keep it there. And then every other segment here is flanked by two of the same restriction sites.
So, for example, if you have segment B, the second segment in there, you could use just the BsmBI enzyme, cut out that segment, manipulate it, put it back in. And the same for, you know, A, B, C, D, E, segment E, which is the first one contained between two BsaI sites, and which contains the receptor binding domain. So if you want to insert a furin cleavage site, you would have that segment in a plasmid, surrounded by two BsaI sites. You could cut it out, modify it as you please, put a furin cleavage site in there, put it back into the plasmid, and then you can reassemble the whole thing using the same method we described before. So
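The data-dredging concern raised earlier in this answer (test 100 hypotheses and expect about five p-values below 0.05 by chance) is a standard multiple-comparisons calculation. A minimal Python sketch, assuming all tests are independent and every null hypothesis is true; the function names are hypothetical:

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests with p < alpha when every null is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 2, 10, 100):
    print(n, expected_false_positives(n), prob_at_least_one_false_positive(n))
```

This is why restricting the analysis in advance to a single, independently motivated enzyme pair matters: the fewer maps one is free to choose among, the less a small p-value can be attributed to selection.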
Nick Jikomes 1:08:10
I see. So this pattern here, of these two restriction sites being at these places, would make it very convenient to reassemble the genome and manipulate specific, very important regions of the genome, like the receptor binding domain.
Alex Washburne 1:08:26
That's exactly right. And that was a region that was proposed to be, you know, the most interesting to recombine across this lineage. And so, yeah, because that contains the spike glycoprotein, which is critical for receptor binding, and the receptor binding determines your host specificity. Are you able to bind onto a bat? Are you able to bind onto a human? So researchers were very interested in whether some of these viruses in this lineage have spike genes that make them better able to bind onto human receptors and therefore better able to infect people. And they hypothesized in the DEFUSE grant that, should one of the SARS coronaviruses, and again, none of them had a furin cleavage site, should one of them get a furin cleavage site at the S1/S2 junction of the spike protein, then perhaps, because humans have furin, and that furin in humans would cleave this furin cleavage site, that could, you know, increase the infectivity of these viruses in humans. SARS2 has a furin cleavage site at its S1/S2 junction. And so, yeah, this would make it very easy to do the stated research in the DEFUSE grant. It would make it possible to insert a furin cleavage site exactly where we see one. And as we show on that other panel, this is very consistent with an infectious clone. Then we go even further, because we say, okay, well, a lot can happen by chance, right? And it's a very low chance of having five to seven fragments and having that significantly short of a longest fragment length. In fact, in terms of the type IIS enzymes, or type II enzymes, that could be used for this method, the BsaI/BsmBI map of SARS2 was the most likely to have been engineered of 1,491 other restriction maps that we were able to look at in this set of coronaviruses. So if there was ever a coronavirus that was engineered without people saying it was engineered, it would be SARS-CoV-2.
But again, we had to look a little further. So we found that and we said, okay, that's interesting. That's a very strong statistical pattern. There's less than a 0.07% chance of seeing restriction sites that equally spaced in a wild coronavirus. So we looked at, okay, what are the odds of this mutating from a close relative? And that's what we show here: BANAL-52 and RaTG13 are two of the closest relatives of SARS-CoV-2, and there's a very low chance of random mutation generating a restriction map as extreme or more extreme than that of SARS-CoV-2. Under random evolution, we give it a 1.2% chance for RaTG13. And for the closer relative of SARS-CoV-2, which is BANAL-52, random mutation had an even lower chance of generating this significantly short of a longest fragment length. So that was one piece of evidence, again, as the follow-up: could this have happened easily from close relatives? It couldn't have happened easily from close relatives under this model of mutation. The other thing we look at is, are all the mutations silent? Because again, that's the bioengineer's trick, silent mutations. And do you have a significantly higher rate of silent mutations within these sites than in the rest of the genome? And both of those, it turns out, are true.
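The longest-fragment-length statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual pipeline: it uses the standard BsaI (GGTCTC) and BsmBI (CGTCTC) recognition sequences, treats the start of each recognition site as the cut point (real type IIS enzymes cut a few bases outside the site), and the function names and the toy sequence are hypothetical.

```python
import re

# Recognition sequences for the two type IIS enzymes named in the conversation,
# each paired with its reverse complement so both strands are covered.
SITES = {
    "BsaI": ("GGTCTC", "GAGACC"),
    "BsmBI": ("CGTCTC", "GAGACG"),
}


def site_positions(genome, enzymes=SITES):
    """Return sorted start positions of every recognition site in the genome."""
    genome = genome.upper()
    positions = set()
    for motifs in enzymes.values():
        for motif in motifs:
            # A lookahead pattern also finds overlapping occurrences.
            for match in re.finditer(f"(?={motif})", genome):
                positions.add(match.start())
    return sorted(positions)


def fragment_lengths(genome, enzymes=SITES):
    """Approximate fragment lengths, cutting at the start of each site (assumption)."""
    bounds = [0] + site_positions(genome, enzymes) + [len(genome)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def longest_fragment_fraction(genome, enzymes=SITES):
    """Longest fragment as a fraction of genome length; smaller means more even spacing."""
    return max(fragment_lengths(genome, enzymes)) / len(genome)


# Made-up toy sequence (not a real genome): one BsaI site and one BsmBI site.
toy = "A" * 10 + "GGTCTC" + "A" * 20 + "CGTCTC" + "A" * 5
print(fragment_lengths(toy))           # [10, 26, 11]
print(longest_fragment_fraction(toy))  # 26 / 47
```

Comparing this fraction across many genomes is one way to ask how unusual a given map is; the null distribution and mutation model used in the paper are not reproduced here.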
Nick Jikomes 1:11:54
So the idea is, if you're bioengineering a genome, you introduce these things called silent mutations, which are going to make it easier for you to chop up and paste together pieces of DNA to do the synthetic work, but they're not disrupting the natural function of any of the genes that are there.
Alex Washburne 1:12:13
That's exactly right. And it turns out, there were 14 mutations that separate the BsaI/BsmBI map of SARS-CoV-2 from both of its close relatives, RaTG13 and BANAL-52. All of them are silent. Now, most mutations in the virus are silent; 84% of mutations are silent. But still, for all 14 to be silent, there's only a 9% chance of that happening if you're just doing a coin toss at every mutation. But then we did one other test. We asked, is there a higher rate of silent mutations per nucleotide within these restriction sites compared to the rest of the genome, not including those restriction sites? And that's where we found an incredibly significant signal of a much higher rate of silent mutations per nucleotide within these restriction sites than in the rest of the genome. So when you combine all of this: this restriction map is very unusual in its even spacing, in a virus with a very unusual furin cleavage site coming out of Wuhan, with a very unusual pattern of spillover that doesn't have a geographic trail of infections like an animal trade outbreak typically does. And then when all the mutations separating its restriction map from close relatives are silent, and there's a higher rate of silent mutations within the sites than in the rest of the genome, that body of evidence becomes very significant altogether. And so that's, you know, our paper. Again, it looked at the restriction map as a hypothesis: if this was engineered in the lab, it probably would have been made by typical pre-COVID infectious clone technology. And we found the fingerprint of exactly that technology. And perhaps you can totally explain it by chance. It's possible that happened by chance. Some people said, you know, you can recombine parts of viruses and that could give you the same pattern,
Nick Jikomes 1:14:02
but it's a statistical argument.
Alex Washburne 1:14:06
Exactly. And every other coronavirus in our data set was also subject to recombination, and none of them had, you know, a type IIS restriction map as significantly infectious-clone-like as SARS-CoV-2.
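The 9% figure quoted for all 14 mutations being silent follows from simple arithmetic. As a minimal sketch, assuming each mutation is independently silent with probability 0.84 (the weighted coin toss described in the conversation; the function name is hypothetical):

```python
def prob_all_silent(p_silent, n_mutations):
    """Probability that n independent mutations are all silent (independence is an assumption)."""
    return p_silent ** n_mutations


# 84% of mutations silent, 14 mutations separating the restriction maps.
p = prob_all_silent(0.84, 14)
print(f"{p:.3f}")  # about 0.087, i.e. roughly 9%
```

This covers only the first of the two tests mentioned; the comparison of silent-mutation rates inside and outside the restriction sites needs per-site counts that are not given in the conversation.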
Nick Jikomes 1:14:22
So what you're arguing, based on your data here, is that yes, it's possible that a natural pattern of mutation and evolution happened to produce the patterns that you identified here. But that would be a very, very, very unusual thing to happen, statistically speaking. However, if this was an engineering event, this is not unusual at all.
Alex Washburne 1:14:47
It's exactly what we would expect from the DEFUSE grant. This is what was proposed in the DEFUSE grant: creating an infectious clone backbone in order to enable the assembly of chimeric spike proteins across this lineage of coronaviruses. And the way you do that is by finding these conserved type IIS sites that allow you to sort of mix and match viral parts.
Nick Jikomes 1:15:08
So let's explain that for people. So there are grants that were written, that were out in the world, that said literally, we would like to get research money in order to do exactly this kind of thing.
Alex Washburne 1:15:22
They said they wanted to add human-optimized furin cleavage sites. So that would explain the CGG CGG codons; those are human-optimized codons. They wanted to add human-optimized furin cleavage sites and test the infectivity of chimeric viruses, infectious clones of these viruses with human-optimized furin cleavage sites, in human airway epithelial cells.
Nick Jikomes 1:15:48
And who is they, who wanted to do that?
Alex Washburne 1:15:51
This was a collaboration between EcoHealth Alliance, with Peter Daszak, the president of EcoHealth Alliance; the Wuhan Institute of Virology, specifically Shi Zhengli; and Ralph Baric at the University of North Carolina, who created this technology of, you know, Rowsell work.
Nick Jikomes 1:16:05
So the guy, Peter Daszak, the guy at EcoHealth Alliance, who has made his career out of funding this type of research and other types of research, who has said publicly that the lab leak thing is a conspiracy theory, and basically, it can't possibly happen, has literally written grants saying he wanted to do this kind of research.
Alex Washburne 1:16:26
And he did not disclose that he wrote those grants. Those grants were pried from their unwilling, uncooperative hands. The former vice president of EcoHealth has left and released a lot of materials to the world showing that this is what they were proposing to do before COVID. And that former vice president of EcoHealth maintains that this virus was created in the collaboration between EcoHealth and the Wuhan Institute of Virology. And that's just one person's word. But that was the vice president of EcoHealth.
Nick Jikomes 1:16:56
So let's talk about what you think. And then let's talk about what other people think about this work and, you know, things that are in this orbit. So, given all of the evidence that you've seen, and this is your perspective, your opinion, based on your work and others' work that you've seen, with your paper and everything else that's out there: do you think that we have definitive evidence one way or the other for either hypothesis, or do you simply favor one hypothesis as being much more likely?
Alex Washburne 1:17:28
Yeah, I think you never get definitive evidence in science, you know, or very rarely. I mean, you get definitive evidence in math, you know, you can prove and disprove something formally with logic, but it's always a statistics game in science. You know, one theory becomes easier at explaining all the facts. Another theory just has this stack of anomalies that each requires a very specific, oh, well, maybe this happened, kind of justification. And eventually you're like, well, I don't really like that theory. It doesn't help me predict something that I haven't seen yet. So what's interesting is that this theory helped us predict the silent mutations, you know, and it helped us anticipate there could be this unusual restriction map in a coronavirus. So, you know, when you look at the origin of the virus in Wuhan, far from the hotspot of wildlife coronaviruses; when you look at the lack of animal trade outbreak trails like we saw in Guangdong province for SARS-1; when you look at the earliest cases not having a connection to the wet market; when you look at the reports suggesting that there was, you know, a substandard maintenance budget at the Wuhan Institute of Virology; when you look at the grant that the Wuhan Institute of Virology was a part of, proposing to get bat coronaviruses from Southeast Asia and put furin cleavage sites with human-optimized codons in an infectious clone; and then a SARS coronavirus shows up in Wuhan looking like an infectious clone, with a furin cleavage site with human-optimized codons. To me, I strongly believe that this likely arose from a lab. The existence of this infectious clone restriction map suggests to me that it was likely an accident, because I think if someone were doing malicious work, they would, you know, this is huge, you know, 18 million people died.
I can't imagine anyone with the expertise doing this and not additionally covering their tracks. So it looks like a lab accident of the research, you know, the innocent but risky research proposed before COVID to swap parts on viruses and make a human-infectious virus with a furin cleavage site with human-optimized codons. And someone got sick, you know. And that would also explain why the Wuhan Institute of Virology and EcoHealth have a large database of coronaviruses, which, again, if a zoonotic origin were true, that database would help us see it clearly, much more clearly than we can now. We would have a much larger sample size of coronavirus genomes to study the evolution and say, oh, wow, this totally happened in nature. Like, look, here's all these viruses recombining in Laos and Yunnan province, etc.
Nick Jikomes 1:20:22
So why don't we just look at that database?
Alex Washburne 1:20:25
So the Wuhan Institute of Virology took it offline. And China's not cooperating with investigators. Interestingly, guess who was put as the US emissary for the World Health Organization investigation into the Wuhan Institute of Virology, or into the possible COVID origins? It was Peter Daszak. Again, conflict of interest not disclosed. Like, a huge conflict of interest, massive, like the biggest conflict of interest in the history of conflicts of interest that I've seen on this sort of thing. And so, you know, I think there's just a lot of that. Then you have the funders saying, absolutely no way was there a lab leak, no way, it's a conspiracy theory. That language is highly unusual. The literature trail is highly unusual. And then you have this very concrete biological evidence: the first ever SARS coronavirus with a furin cleavage site, the first SARS coronavirus with not one but two CGG codons there that are optimized for humans, that no close relative has, and again, the most significant infectious-clone-looking SARS, sorry, most significant infectious-clone-looking coronavirus, not just a SARS coronavirus, any coronavirus. So when you put all that together, it's still possible. I mean, anything's possible by chance, right? Like, I could find a green fluorescent mouse outside of a lab in Norway where they make green fluorescent mice. And it's possible that that just happened by chance. But I would not be a good detective if I didn't start thinking like, oh, wait a minute, maybe
Nick Jikomes 1:21:56
it's the green fluorescent mouse lab that's
Alex Washburne 1:22:01
starting, like, green fluorescence. And the recombination is more common in coronaviruses. But the recombination of furin cleavage sites with human codons, and many recombination events making this infectious-clone-looking thing, that's not common.
Nick Jikomes 1:22:18
Yeah. It's kind of like that Jon Stewart joke from a few months ago where he was like, oh, yeah, where did this coronavirus come from? Well, did you ask the coronavirus lab right next door? I mean, not just ask something.
Alex Washburne 1:22:33
Yeah, I would like to ask them, you know, like, hey, let's check out your database, so we can all learn, you know, more about coronavirus evolution, to understand this and prevent a pandemic.
Nick Jikomes 1:22:43
And they say no, you definitely can't look at that.
Alex Washburne 1:22:47
Yeah, no, offline, will not cooperate. So then the lack of cooperation raises some questions.
Nick Jikomes 1:22:53
Okay. So let's just look, obviously, you did this paper, you have your perspective, and it's yours and everything. Let's, let's just try and flesh this out as fairly as possible. I have not followed the online chatter around this too, too closely. But I know that right, you put this preprint up. There's obviously chatter on Twitter and elsewhere on the internet from all the interested parties and this type of thing. Some people, it seems think that this is a very interesting analysis, that is probably mostly valid, and other people have been critical. So who have been your sort of highest profile, most critical critics, and what exactly have they criticized so far?
Alex Washburne 1:23:31
So, you know, the jovial discussions on Twitter have run a wide spectrum of, you know, legitimacy of statistical and scientific points. But I think the ones that come up: some people said, ah, you know, you're p-hacking, or you're cherry-picking genomes, or something like that. And that's just not true based on what we did. You know, we chose this statistic of longest fragment length because of the bioengineering reasons we gave. We chose these enzymes because of the bioengineering reasons we gave, in consideration of what they were trying to do with the DEFUSE grant. And we ran just standard statistical tests otherwise, and so the workflow is pretty straightforward. No p-hacking. In terms of cherry-picking, they're like, oh, what about this genome, or what about that one? And so these other genomes exist, and we've looked at them. None of them have an infectious-clone-looking type IIS restriction map; they actually increase the significance of SARS-CoV-2 in terms of its equal spacing of sites. However, some people say, look, some of these ones look like they could have been recombined with SARS wide. So recombination is common in coronaviruses, and this is probably the most significant and valid critique, saying that recombination is common; maybe recombination could have caused this. So, I mean, let's take this: if you took SARS-1, cut it up into five chunks, and put those five chunks into different viruses, then yeah, you'd say recombination explains this exactly, right? You'd say you could put these five chunks together and that exactly explains it. So it's not the silent mutation thing; that just happens by chance, they would say. I mean, it's still unusual, right? There's this high concentration of silent mutations within these sites. So just by looking at these sites, we found hotspots of silent mutations. That's interesting.
But recombination could happen. The problem is you don't actually know that recombination happened. The recombination is inferred or hypothesized, because we look at a whole bunch of different viruses over a whole bunch of different windows of the genome to see how similar the sequence is, this virus to SARS-CoV-2, that virus to SARS-CoV-2. And so when you do this, you know, I think they did it for 36 viruses across the whole length of the genome, they just found these cut points in the genome that seem to have slightly higher sequence similarity in some viruses versus others, and classified that as recombination. But of course, with statistics, we don't know that that's true. It's uncertain. There could be other explanations for the sequence similarity that is broken across clades. We don't have this very clear, you know, like our genome, whether it's chromosome one, two, or three, will be more similar to chimpanzee chromosomes one, two, and three than a gorilla's, right?
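The windowed comparison described above, which is how recombination breakpoints get inferred, can be illustrated with a short Python sketch. It assumes two already-aligned sequences of equal length and a simple percent-identity score; the function name, window size, and toy sequences are hypothetical, and real recombination analyses use more elaborate methods than this.

```python
def window_identity(seq_a, seq_b, window, step):
    """Return (start, fraction of identical positions) for each window of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(x == y for x, y in zip(a, b))
        results.append((start, matches / window))
    return results


# Made-up toy alignment: identical in the first half, different in the second.
query = "AAAAAAAA"
relative = "AAAATTTT"
print(window_identity(query, relative, window=4, step=4))  # [(0, 1.0), (4, 0.0)]
```

A sharp change in which relative scores highest from one window to the next is what gets read as a candidate breakpoint, which is why the speaker stresses that the recombination is inferred rather than observed.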
Nick Jikomes 1:26:29
Yeah. So, you know, I think we've gone into enough detail on this, we'll let people decide what they think based on our conversation, and you know, going out checking other resources, obviously, we showed the paper, it's free online, let's talk a little bit about sort of the mechanics of making a paper like this, and getting a preprint published in a peer reviewed journal, and how that interfaces with the sort of controversy and politically charged nature of this subject in particular. So you have a preprint online, let's just describe for people who don't know what that is, what is a preprint? And what are the next steps to get this peer reviewed? And what you know, after you explain that maybe talk about, you know, what do you think are maybe the extra challenges that might be involved in getting something like this, published in the peer reviewed journal, given the the nature of the subject.
Alex Washburne 1:27:23
So scientific peer review, as someone who's kind of straddled academic and private sector science, there are many forms of peer review. We have these traditional peer-reviewed journals where you submit the article to an editor, and then the editor picks reviewers. And then the reviewers, they might hate your guts, or they might love you, or maybe they want to be your collaborator, and so they're going to be really nice to you. So there's some politics that happens, in terms of, like, you don't get to pick your reviewers. Your reviewers could totally hate your guts and completely disagree. They can sit on the paper for six months, and no one ever sees it. And then they reject it ultimately, and the editor says, well, that's peer review, you know. So peer review does have some tendency to, I guess, move all scientific outputs towards the mean, you know, towards whatever three peer reviewers are going to agree to. And if you're lucky, you have very open-minded peer reviewers who, you know, accept that there's a wide range of views on this issue. And they look at your methods and they look at your logic, and they say, okay, this is reasonable, you didn't cite this paper here, you know, maybe this other statistical test is more appropriate there. And they have usually relatively minor changes. Other times peer review can be quite nasty. One paper I had was on COVID forecasts that correctly predicted the number of people that would die in 2020 from COVID. That was rejected on public health grounds, saying this would be a risk to public health, because I estimated fewer people would die than conventional models estimated. So when you say that, they're like, oh, but you could inspire complacency and kill more people.
And I thought my job as a statistician was to get the right answer, and not to, like, think about, you know, whether people are going to change their behavior based on my estimate of the size of a cow, or of the number of people that would die from COVID in an unmitigated outbreak in South Dakota, for example. So peer review is complex. Many subfields of science just use preprint servers. Once you have scientists that care about being right, and they care about, you know, contributing something meaningful to the literature, as in physics and mathematics, they often just submit their articles to a preprint server. And then if it's a good article, it gets cited straight from there. So physics
Nick Jikomes 1:29:51
and physics and math do this the most. They also just happen to be the two most quantitatively rigorous fields
Alex Washburne 1:30:00
Yeah, I mean, you know, they read the papers from preprints, they know who's writing, and they're like, oh, this person's, you know, someone I trust, this person is from CERN, they're awesome, you know, and so they read it. And you don't actually need to have this editorial filter, which some people point out can lead to gatekeeping. If the editor doesn't like you, you're not getting in, you know. If the editor has come out vocally on Twitter saying they clearly are against a lab origin hypothesis or something like that, you don't know, like, is this gonna get its day in court? You know, is this a fair judge? Sorry?
Nick Jikomes 1:30:32
So are you guys submitting this to a journal?
Alex Washburne 1:30:34
Yeah, we are. Personally, I've kind of grown disillusioned with peer review in biology. Just as a mathematical biologist, it's quite common that, you know, if you use math, people call it jargon. And they're like, no, this is a bad paper, you know, like, you shouldn't use these math words. And so there's some very deep frustration in peer review, even for topics that are not that contentious. But, you know, I have had the luxury of doing an alternative model of peer review in the private sector, both working with hedge funds and consulting for biotech companies, seeing that, like, no, we all are scientists, we care about doing this right and care about being honest in our methods. And I wrote a bunch of white papers for these funds, and they were just internally reviewed by people that cared about stuff getting done, you know: we have to get the paper done, we have to do it right, if this is wrong then we lose money, so don't be wrong, you know, but also don't stop it unnecessarily just because we disagree with some premises or, you know, possible conclusions, or, you know, the politics of the issue. So, we're gonna submit this paper somewhere. We're not sure where, because there are some journals that have shown a very clear bias in how they treat papers on this really contentious topic. I mean, this is historic in its implications. If it's true that SARS-CoV-2 arose from the lab, then scientists created a pandemic that killed 18 million people. And that's three times the number of people that died in the Holocaust. So while the intention isn't as bad, like, the stakes for human history are massive, and that's something that a lot of people have very strong opinions on, and some people are very afraid of the possibility. If it were true that biologists created a virus that killed people like this, that's devastating for trust in science, for virology research, and more.
So it's very hard, then: who's your peer reviewer? You know, how are we going to get, like, someone to look at the methods? We have a limitations section, you know, we have a whole section of the discussion saying, like, there are many limitations in our analysis. We're going to add recombination as a limitation. You know, like, we think these are important. But will we get our day in court? I don't know. And unfortunately, my experience with peer review on COVID specifically has kind of killed my belief in the system. And I hate to say that, but, like, I just don't think that we're going to get fast and accurate advances of science with this kind of entrenched interest that we have in the modern peer review system. And so I think there's better ways to do it. That's for another call, about how we can have better, more decentralized scientific systems that allow people to, you know, especially when you have two big, different paradigms and everyone disagrees strongly. If you have to convince three peer reviewers, and two out of three peer reviewers hate your theory, you're never going to get through peer review.
Nick Jikomes 1:33:41
What would, you know, given everything that you've done, and what you think right now about the likely origins of this virus, what would it look like for evidence to emerge that convinced you that this was a wildlife spillover?
Alex Washburne 1:34:01
Um, you know, it would be a progenitor virus. You'd find a close relative, and it would show that this recombination event that was hypothesized did, in fact, happen. And we'd look at the databases of researchers studying coronaviruses, so they open up the database and we look at them and are like, oh, wow, actually, look, here's a whole bunch of other SARS coronaviruses with furin sites. Or here's, you know, like, this interesting hotspot of recombination, and that's exactly where recombination happened in SARS-CoV-2. So there's a bunch of ways that I would shift my prior, especially with transparency from the labs involved. Like, that's worth emphasizing: we could disprove a lab origin with notebooks and communications and databases from the labs in question. So if the zoonotic origin were true, then the people who could be exonerated have the data that would exonerate them of this. And so it's very unusual that that hasn't been shared, given, you know, all this. And so if that were shared, and that were done in a very trusted and transparent way, you know, I would look at those data. And if people could confirm the sequences exist in nature, then we start to be like, oh, yeah, there was a lot we didn't know about SARS coronavirus evolution that was sitting in a database. And now we know, like, oh, this recombination is common. And then maybe they could find, you know, that this particular furin cleavage site, it turns out, has 100% sequence identity to some pangolin RNA or something like that. You know, or replicate this acquisition of a furin cleavage site; if that were replicated in a cell or an animal model, that would be reassuring too. There's just a bunch of stuff that could tilt the scales here, you know, which just requires explaining: how did it get a furin cleavage site? How did it get this infectious clone restriction map? How did it get to Wuhan?
And how come it didn't cause a broader geographic trail of infections in the animal trade? How come they searched all these animals in the wet market and found not a single one that was positive? You know, there's some mysteries that we have to explain. Just because of how much the odds have been stacked against a zoonotic origin, with the evidence that we have for a lab origin and the evidence we don't have for a zoonotic origin, a lot has to happen to kind of pull belief back to a zoonotic origin. But if someone found a progenitor virus that you can replicate in the lab and prove that it's true, and not some, you know, potentially fabricated sequence. Because I could do that: I could just write a sequence and say, oh, look, I found a progenitor, and submit it and say, like, this is a progenitor, you know, here it is. And it could have been, like, I just took the SARS-CoV-2 genome and, like, tweaked it on my computer and submitted it. So we have to have some real confirmation that the viruses being presented are real, and it has to be done in a way that we can trust for posterity, that 100 years from now people can look at and be like, oh, this is totally a trusted scientific process.
Nick Jikomes 1:37:20
What have, you know, the past two years of all this stuff, how has it made you think about the way that we prioritize and fund what science gets done, and why?
Alex Washburne 1:37:35
Oh, man. I mean, I left academic science just because of the heartbreak of seeing how hostile science can be to new theories and to people with different opinions. And I think that it's very easy for people to have a monopoly or oligopoly over critical nodes of power in science, whether that's peer review or funding at the NIAID and NIH. If someone has too much power, power corrupts, and I think that it's totally possible that someone who runs US biodefense funding could have an unnatural sway and, you know, potentially a lack of accountability, should they have funded something that caused the pandemic. I think we need more, you know, I don't want to go so far as to say we need to separate science and the state, you know, like we separate church and state. Science is just this critical institution that's supposed to inform and consult policymakers on real things like climate change, or, you know, weather events, or pathogen spillover, or new technology and drugs and AI and who knows what. And so science is really valuable for civilization, but it's not incorruptible. You know, science is a belief system that is deeply tied to funding systems and to, you know, very hierarchical systems of power and authority, of who gets to talk, who's the expert, who's not. And I think it's very hard sometimes to come from the bottom or come from outside, even if you're right, and say, like, oh, you know, here's a different forecast, that maybe SARS could cause a major outbreak in New York City in March 2020. When I came up with that forecast, I had people saying, you're not an epidemiologist, you know, like, stand down. You'll be responsible for the deaths of millions, I got told, if I say something, that turned out to be true.
Nick Jikomes 1:39:34
Shut up and stay in your lane.
Alex Washburne 1:39:36
Shut up and stay in your lane. Yeah. And so I think we have to, like, see science as the social system that it is, and acknowledge that it is a system with power and with personal reputational interests and skin in the game that can lead to people using their power in ways that don't serve the public good of science. You know, the public good of science is this impartial pursuit of truth and consideration of all the theories and hypotheses that are out there. But if you have someone like Anthony Fauci or Francis Collins, who has so much power in the medical science community, the health science community, they can totally, you know, nudge editors of journals in the right direction to publish an article saying, don't look at this, there's no way these guys funded this research, absolutely, it couldn't have come from a lab. So there are a lot of systems, the publication system, the funding system, the academic kind of institutions that, you know, have this very clear pyramid of professorial prestige and rank. Those are the social systems of science, and we have to examine them critically, and potentially reorganize them, in order to ensure that science is this equitable and impartial endeavor. So, you know, I've thought a bit about it. I'm working on a platform called Silva to make better scientific communication and social systems. It's designed to be a platform where PhDs are VIPs, and people can join the platform. And if you're a PhD student or a PhD, we'll provide a venue for you to share your opinion, and we're not going to filter, we're not going to have some editorial gatekeeping. If a peer reviewer disagrees with your theory, they can't stop you from publishing it. So I think there's some ways, and eLife has actually done this. They've said that they're not going to allow peer reviewers to block a paper, which is a radical idea, that authors have a right to publish.
Because we have different beliefs and different opinions, and we disagree strongly about some things in the scientific community, we have to have the right to share our side. Because if we have an editor who disagrees with us, or a peer reviewer who disagrees with us, and they're allowed to slow-walk our paper for six months or a year, or, you know, then just reject it outright, that's not good for publicly funded science. You know, that's not good for the people who want this to be a more efficient marketplace of ideas. So there's a lot that we can do to change science. And, you know, I left academia, and I'm still really interested in improving science, by making this science communication platform and trying anything to just make it better. It's totally doable, but we really have to look at science as the social system that it is, and potentially rewire how we fund it, how we find good papers and amplify them. You know, like, we use Twitter a lot. And so whoever's got the most followers on Twitter will drive the discussion in science. Is that equitable? Is that how ideas should get to bubble up in public awareness? I don't think that is the right way to do it. You know, I think it shouldn't be determined by Twitter followers, or by which institution you're at. If you have a good paper, that should stand on its own, I hope. And that's kind of the idea that we can move closer towards. And preventing, you know, someone from having too much power in the funding of science, to be able to block funding to someone they don't like, or preferentially fund people supporting their theory. In this case, there's the real conflict of interest of some people actively, you know, like, hiring their postdocs or PhD students and others in order to prop up the theory of natural origin that protects the funders who may have funded a hypothesized lab origin. Are they equitably funding other researchers who are investigating a possible lab origin?
I don't think so. We would have to look at that more carefully to understand, but, you know, those are the sorts of issues that we get with this, again, kind of unaccountable power of science funding in the US as it currently is. And especially with large international philanthropic funders, there's just, like, a $30 billion pot of money, you know, that someone sits on, and they're like, oh, we're gonna fund whoever the heck we want. And they can benefit their own interests, but not necessarily the public interest of advancing science. So should they fund something that causes the pandemic, they could also fund people who cover it up. And that's something that we have to think about, is that that's science funding. And that determines who gets tenure and rises in the rank of professor, who becomes the expert that everyone consults at the New York Times when they want to ask whether this came from a lab or not. So these are the social systems in science that we have to be explicitly aware of in order to make science better for the 21st century. And that's what COVID has taught me. I think we can do it, and it just takes more explicit intention in our design of the scientific communication and social systems.
Nick Jikomes 1:44:42
Well, Alex, I want to thank you for your time and sharing all this with us. Are there any final thoughts or anything that you want to reiterate before we go?
Alex Washburne 1:44:51
Um, keep an open mind, you know. Like, I'm eager to hear if someone has ideas about future research or, you know, things we may have overlooked within our own work. We're really interested in hearing that. We care a lot about the truth. And I hope we can find out the answer to that sooner rather than later.
Nick Jikomes 1:45:10
All right, Alex Washburne, thank you very much.
Alex Washburne 1:45:13
Thank you too. Take care.