Jeremy has been defending this paper on Twitter so unless he has changed his opinion on this paper since then I’m expecting a plot twist at the end of this explanation
EDIT: to be clear, you probably still can’t draw proper conclusions on the graph alone, but the graph isn’t the whole paper
The approach of teaching seems very different this time compared to last time. There is less talk of DL and more general conversation in modelling. Is this intentional?
Isn’t there another statistic, beta?, that measures the discriminative power of the test? Would that with p help determine whether the data show an effect or not?
p-values guide quite a lot of research in biology and medicine still. Unfortunately so.
Lots of research dollars are spent to produce results whose significance is determined by a p-value!
No equivalent, easy & popular alternatives exist though.
For those wondering: the p-value is the probability that random data, generated under the null hypothesis, could give a result at least as surprising as the one you observed, i.e. one that is at least as far from the result you would expect under the null hypothesis.
The null hypothesis is simply the proposition that there is no effect. For a linear model, the null hypothesis is there is no linear trend, i.e. the slope is zero and the data is consistent with a constant value.
Small p-values are good. A small p-value means your result may be real. A large p-value means your data are consistent with the null hypothesis, so you cannot claim an effect; it does not prove that there is no effect.
For example, in the case of a linear model, a p-value of 0.01 means that 1% of the time, randomly generated data (i.e., noise) would give rise to a slope at least as steep (i.e. at least as inconsistent with the null hypothesis, which implies a slope of zero) as the slope fitted by your model.
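To make that concrete, here is a minimal sketch of a permutation test for the slope of a linear fit. It is only an illustration, not the method used in the paper: the data are synthetic, the helper names `slope` and `permutation_p_value` are made up for this post, and shuffling `y` is just one way to generate data with no trend. It counts how often shuffled data gives a slope at least as steep (in either direction) as the fitted one.

```python
import random

def slope(x, y):
    # Ordinary least-squares slope of y on x.
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def permutation_p_value(x, y, n_perm=2000, seed=0):
    # Hypothetical helper: shuffling y breaks any link between x and y,
    # which simulates the null hypothesis of "no linear trend".
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(x, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Synthetic data (an assumption for illustration only).
rng = random.Random(42)
x = [i / 3 for i in range(30)]
y_trend = [0.5 * xi + rng.gauss(0, 1) for xi in x]  # real slope of 0.5 plus noise
y_noise = [rng.gauss(0, 1) for _ in x]              # pure noise, true slope is 0

p_trend = permutation_p_value(x, y_trend)
p_noise = permutation_p_value(x, y_noise)
print("p-value with a real trend:", p_trend)
print("p-value with pure noise:  ", p_noise)
```

The data with a real trend should give a very small p-value, while the pure-noise data can land anywhere between 0 and 1.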
Many scientists argue that a significant result should have a p-value of less than 0.01. However, a threshold of 0.05 has long been standard practice, particularly across some of the softer scientific disciplines.
R0 describes how many cases of a disease an infected person will go on to cause, on average. So an R0 of 2 means that each infected person will infect 2 more people on average.
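As a rough worked example of what that implies, the sketch below counts new cases per generation of infection under a deliberately idealized assumption: every infected person infects exactly R0 others and nobody becomes immune. The function name `cases_by_generation` is made up for this post, and real epidemics do not grow this cleanly.

```python
def cases_by_generation(r0, generations, initial_cases=1):
    # Idealized assumption: each case causes exactly r0 new cases
    # in the next generation, with an unlimited susceptible population.
    cases = [initial_cases]
    for _ in range(generations):
        cases.append(cases[-1] * r0)
    return cases

print(cases_by_generation(2, 5))  # [1, 2, 4, 8, 16, 32]
```

With an R0 below 1 the same loop shrinks instead of growing, which is why pushing R0 under 1 matters so much.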
Given that so much academic research is still based on p-values, that “research” is used to support policies, and that clinical treatment is based on “evidence”, this information about p-values is very important. For example, I believe Texas and Idaho just put forward policies that medication being used to treat coronavirus can only be used for diagnoses “consistent with the evidence for its use”.
If R0 is the degree of reproduction (essentially the spread), then according to the equation an increase in relative humidity will decrease R0. But in the earlier discussion Jeremy said that the spread of the virus increases with moisture in the environment, since the virus mainly resides in droplets.
A confidence interval is a good way to see how significant the result is. For example, if the confidence interval on the slope were to include zero, you would not believe that the effect was significant.
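Here is a minimal sketch of that check. It uses a bootstrap percentile interval, which is a swapped-in technique chosen because it needs only the standard library; it is not the classical t-based interval a stats package would report, and not necessarily what the paper did. The data are synthetic and the helper names `slope` and `bootstrap_slope_ci` are made up for this post.

```python
import random

def slope(x, y):
    # Ordinary least-squares slope of y on x.
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def bootstrap_slope_ci(x, y, n_boot=2000, level=0.95, seed=0):
    # Hypothetical helper: resample (x, y) pairs with replacement,
    # refit the slope each time, and take percentiles of the results.
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples where the slope is undefined
        slopes.append(slope(xs, ys))
    slopes.sort()
    lo_i = int((1 - level) / 2 * len(slopes))
    hi_i = int((1 + level) / 2 * len(slopes)) - 1
    return slopes[lo_i], slopes[hi_i]

# Synthetic data (an assumption for illustration only): true slope 0.5 plus noise.
rng = random.Random(42)
x = [i / 3 for i in range(30)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]

ci_low, ci_high = bootstrap_slope_ci(x, y)
print("fitted slope:", slope(x, y))
print("95% bootstrap interval:", (ci_low, ci_high))
print("interval includes zero:", ci_low <= 0 <= ci_high)
```

If the printed interval includes zero, the data cannot rule out "no trend", which is the situation described above.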
One of my favorite videos EVER for those of you that want to get a more intuitive understanding of concepts like p-values and statistical significance: