The over-dispersion parameter k is estimated at around 0.1, which indicates strong over-dispersion. With k at this value, about 80% of infections are caused by about 10% of infected individuals.
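To make the 80%/10% statement concrete, here is a minimal sketch of how such a proportion can be computed from a negative binomial offspring distribution. It ranks individuals from most to least infectious and accumulates their share of secondary cases. The function names, the truncation point `x_max` and the value R0 = 2.5 are assumptions made for illustration; they are not taken from the article or its code.

```python
import math

def nb_pmf(x, r0, k):
    """P(X = x) for a negative binomial with mean r0 and dispersion k."""
    log_p = (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
             + k * math.log(k / (k + r0)) + x * math.log(r0 / (k + r0)))
    return math.exp(log_p)

def proportion_for_share(r0, k, share=0.8, x_max=5000):
    """Fraction of infected individuals, ranked from most to least
    infectious, whose secondary cases add up to `share` of transmission."""
    pmf = [nb_pmf(x, r0, k) for x in range(x_max + 1)]
    people = 0.0  # cumulative fraction of individuals counted so far
    cases = 0.0   # cumulative fraction of secondary cases they caused
    for x in range(x_max, 0, -1):
        contrib = x * pmf[x] / r0  # share of all cases caused by class x
        if cases + contrib >= share:
            # Only part of this class is needed to reach the target share.
            needed = (share - cases) / contrib
            return people + needed * pmf[x]
        people += pmf[x]
        cases += contrib
    return people

# R0 = 2.5 is an assumed value chosen for illustration only.
p80 = proportion_for_share(r0=2.5, k=0.1)
print(f"Share of infected behind 80% of transmission: {p80:.1%}")
```

With these assumed inputs the result lands in the neighbourhood of the 10% quoted above; a larger k (less over-dispersion) gives a noticeably larger proportion.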
The distribution of the number of secondary cases seems very over-dispersed: a very small number of infected people are responsible for a large number of secondary infections. Thus, to control the epidemic, avoiding situations that can lead to "super-spreading" events could be effective. (comments on version 1 of the article)
The results are in line with other studies (strong over-dispersion; important role of super-spreading events in the spread of the epidemic).
However, it would be useful to replicate the analysis with other, more reliable data: the number of introductions and the number of cases were probably greatly underestimated at the end of February.
More explanation of the equations, and of the rationale behind them, would have been appreciated.
To estimate the distribution of the number of secondary cases, and in particular its over-dispersion (parameter k of a negative binomial distribution).
The authors use a formula from a previous paper (Blumberg et al. 2014 https://doi.org/10.1371/journal.ppat.1004452) to calculate cluster size as a function of the number of initial cases. The formula assumes that the numbers of secondary cases are independent and identically distributed, drawn from a negative binomial distribution with mean R0 and over-dispersion parameter k.
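For readers who want the equation behind this step, the standard final-size result for a branching process with negative binomial offspring can be written as below. This is the textbook form, given as an aid and not copied from the authors' code; the notation (x for the final cluster size including the initial cases, s for the number of initial cases) is chosen here for illustration.

```latex
P(X = x \mid s) = \frac{s}{x}\,
\frac{\Gamma(kx + x - s)}{\Gamma(kx)\,\Gamma(x - s + 1)}\,
\frac{(R_0/k)^{x-s}}{(1 + R_0/k)^{kx + x - s}},
\qquad x \ge s .
```

For x = s the expression reduces to (1 + R0/k)^(-ks), the probability that none of the s initial cases infects anyone.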
The authors deduce a likelihood function. The code is available online at https://zenodo.org/record/3741744#.XwLbuy2w0W8.
The authors first estimate k (over-dispersion parameter of a negative binomial distribution) for a fixed R0 (between 0 and 5), then estimate k and R0 jointly.
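The first of these two steps can be sketched as a simple grid search over k at a fixed R0. This is only an illustration under stated assumptions: it uses the standard final-size probability for a branching process with negative binomial offspring, treats each country as one independent cluster with s ≥ 1 imported cases and x ≥ s total cases, and ignores any further adjustments the authors may have made. The function names and the toy numbers are hypothetical; the toy numbers are not the WHO data.

```python
import math

def log_cluster_pmf(x, s, r0, k):
    """Log-probability that s initial cases lead to x cases in total."""
    return (math.log(s / x)
            + math.lgamma(k * x + x - s) - math.lgamma(k * x)
            - math.lgamma(x - s + 1)
            + (x - s) * math.log(r0 / k)
            - (k * x + x - s) * math.log(1 + r0 / k))

def log_likelihood(clusters, r0, k):
    """Sum of log-probabilities over independent (s, x) clusters."""
    return sum(log_cluster_pmf(x, s, r0, k) for s, x in clusters)

def best_k(clusters, r0, k_grid):
    """Grid value of k with the highest likelihood at a fixed r0."""
    return max(k_grid, key=lambda k: log_likelihood(clusters, r0, k))

# Made-up (imported cases, total cases) pairs, NOT the WHO figures.
toy_clusters = [(1, 1), (2, 2), (1, 1), (3, 40), (1, 1), (2, 3)]
k_grid = [0.01 * i for i in range(1, 201)]  # 0.01 .. 2.00
k_hat = best_k(toy_clusters, r0=2.5, k_grid=k_grid)
print(f"k maximising the toy likelihood at R0 = 2.5: {k_hat:.2f}")
```

A joint estimate of k and R0 would extend the same idea to a two-dimensional grid or to a numerical optimiser or sampler over both parameters.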
Data: WHO "Situation report 38", from the end of February 2020, giving, by country, the total number of identified cases, and among them, those corresponding to imported cases, local cases, or undetermined cases.