# Poor Man’s Logistic Regression

We own a B2B online store, and we want to boost conversion rates by improving
our search functionality. We run an A/B test with a SaaS provider, giving us
the following results:^{1} On the first row are sessions that went to the SaaS
provider. On the second row are sessions that went to our existing solution.
The first column holds sessions that led to a purchase, and the second column
those that did not.

| Sessions | Purchase | No purchase |
|---|---|---|
| SaaS | 1957 | 7898 |
| Current | 3299 | 14152 |

Is the SaaS solution better?

We sent roughly a third of the sessions to the SaaS solution.^{2} This is an
experimental smell! It should have been much closer to a third for us to be
confident in our randomisation. Conversion rates with the current solution
are high already, at almost 19 %. However, the SaaS solution led to a
conversion rate 5.04 % higher in relative terms. Great success! Or is it?

# Effect Size

An epidemiologist would look at the above as a contingency table. Hang on to your seat, because this is black magic.

Multiply the two counts that support our hypothesis (1957 × 14152) and divide by the product of the two that oppose it (7898 × 3299). This gives us

\[\frac{1957 \times 14152}{7898 \times 3299} = 1.06\]

which is the *odds ratio* of the effect on conversion from the SaaS solution.
This means that the odds in favour of conversion are 6 % larger under the SaaS
solution than the existing solution. Great success! Or is it?
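The cross-product is the same thing as forming each group's odds of purchase
and dividing one by the other; a minimal check:

```python
# Odds of purchase in each arm, from the table above.
odds_saas = 1957 / 7898        # SaaS: purchases per non-purchase
odds_current = 3299 / 14152    # current solution

# The odds ratio: identical to the cross-product (1957 × 14152) / (7898 × 3299).
odds_ratio = odds_saas / odds_current  # ≈ 1.06
```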

# Significance

Crank up the black magic to 11. Take the logarithm of the odds ratio to get the
log-odds difference: \(\log{1.06} = 0.06\).^{3} For small odds ratios, you can do
this in your head: \(\log(1 + x/100)\) is approximately \(x/100\). The relative error
is less than a twentieth for \(x\) up to 10, and less than a tenth for \(x\) up to 20.
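The logarithm and the mental shortcut, checked in Python (stdlib `math` only):

```python
import math

odds_ratio = (1957 * 14152) / (7898 * 3299)
log_odds_diff = math.log(odds_ratio)   # ≈ 0.06

# Rule of thumb: log(1 + x/100) ≈ x/100 for small x.
shortcut = 0.06                         # head maths for an odds ratio of 1.06
exact = math.log(1.06)                  # ≈ 0.058; relative error under 5 %
```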

Following this, we take *the square root of the sum of the reciprocals of all
counts* in the contingency table:

\[\sqrt{\frac{1}{1957} + \frac{1}{14152} + \frac{1}{7898} + \frac{1}{3299}} = 0.032.\]

This is an estimate of the standard error of the log-odds difference. Comparing the improvement (a log-odds difference of 0.06) to the standard error (0.032), we see that the improvement is not even two standard errors away from zero.
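The standard error and the comparison, in code. (The 1.96 cut-off for a
two-sided test at the 5 % level is my addition; the text only says "two
standard errors".)

```python
import math

# Standard error of the log odds ratio: square root of the sum of the
# reciprocals of the four cell counts.
counts = [1957, 7898, 3299, 14152]
standard_error = math.sqrt(sum(1 / n for n in counts))    # ≈ 0.032

log_odds_diff = math.log((1957 * 14152) / (7898 * 3299))  # ≈ 0.06
z = log_odds_diff / standard_error                        # ≈ 1.92
# 1.92 falls short of 1.96, so the result is not significant at 5 %.
```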

This means we should be skeptical of the significance of the experimental result. We should instead make the decision based on the opinion of the highest paid person in the room, as is customary.

But stop to appreciate what we did here. It’s basically a logistic regression
except with easy maths. You can do this in a *spreadsheet*!