Search Results for: Latex
Pitfalls of using Pearson’s correlation for comparing model performance
Pearson’s R (correlation coefficient) is a measure of the linear correlation between two variables, giving a value between -1 and 1, where 1 is total positive linear correlation, 0 is no linear correlation, and -1 is total negative linear correlation. While it’s a useful statistic for understanding the relationship between two variables, it is often […]
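As a rough illustration of the three reference values in the excerpt above, here is a minimal standard-library sketch. The helper name `pearson_r` and the toy data are assumptions made for this example and are not taken from the post.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: sum of products of deviations from the means.
    cov = sum(a * b for a, b in zip(dx, dy))
    # Denominator: product of the root sums of squared deviations.
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("undefined when either input is constant")
    return cov / norm

x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_pos = pearson_r(x, [2 * v + 1 for v in x])   # increasing straight line
r_neg = pearson_r(x, [-3 * v for v in x])      # decreasing straight line
# y = x**2 on a symmetric range: fully dependent, yet no linear correlation.
r_zero = pearson_r([-2.0, -1.0, 0.0, 1.0, 2.0], [4.0, 1.0, 0.0, 1.0, 4.0])
```

The last case shows that a value of 0 only rules out a linear relationship, not dependence in general.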
In defence of chaos
I commend you on your skepticism, but even the skeptical mind must be prepared to accept the unacceptable when there is no alternative. If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family Anatidæ on our […]
Converting pandas DataFrames into Publication-Ready Tables
Analysing, comparing and communicating the predictive performance of machine learning models is a crucial component of any empirical research effort. Pandas, a staple in the Python data analysis stack, not only helps with the data wrangling itself, but also provides efficient solutions for data presentation. Two of its lesser-known yet incredibly useful features are df.to_markdown() […]
Am I better? Performance metrics unravelled
What’s the deal with all these numbers? Accuracy, Precision, Recall, Sensitivity, AUC and ROCs. The basic stuff: Given a method that produces a numerical outcome, either categorical (classification) or continuous (regression), we want to know how well our method did. Let’s start simple: True positives (TP): You said something was a cow and it was […]
How to estimate the inestimable
Back-of-the-envelope calculations are one of our chief tools as scientists. When you spend most of your time wondering if your latest measurement is correct, having a tool to check if the numbers make sense is simply priceless. If you are lucky, a good estimate might just avoid a costly or laborious measurement — this is […]
Non-linear Dependence? Mutual Information to the Rescue!
We are all familiar with the idea of a correlation. In the broadest sense of the word, a correlation can refer to any kind of dependence between two variables. There are three widely used tests for correlation: Pearson’s r: Used to measure a linear relationship between two variables. Requires linear dependence and each marginal distribution […]
List comprehension: an elegant Python feature inspired by mathematical set theory
Even though I have now deeply entered into the fascinating world of statistical machine learning and computational chemistry, my original background is very much in pure mathematics. Having spent some of my intellectually formative years in this highly purified and abstract universe, I still love to think in terms of sets, ordered tuples and well-defined […]
Chained or Unchained: Markov, Nekrasov and Free Will
Markov chains are simple probabilistic models of sequences of related events through time. In a Markov chain, the event at the present time depends only on the previous event in the sequence. The example above shows a model of a dynamical system with two states A and B and the events are either moving between states […]
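A two-state chain like the one described above might be simulated as follows. This is only a sketch: the figure the excerpt refers to is not reproduced here, so the transition probabilities in `TRANSITIONS` are hypothetical values chosen for illustration, and `simulate` is a helper name made up for this example.

```python
import random

# Hypothetical transition probabilities (not from the post):
# TRANSITIONS[current][next] is the probability of moving current -> next.
TRANSITIONS = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.4, "B": 0.6},
}

def simulate(start, n_steps, rng):
    """Sample a path of n_steps transitions starting from `start`."""
    state = start
    path = [state]
    for _ in range(n_steps):
        options = TRANSITIONS[state]
        # The next state depends only on the current one (the Markov property).
        state = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(state)
    return path

path = simulate("A", 10, random.Random(42))
```

Each step either stays in the current state or moves to the other one, and nothing earlier than the current state influences the draw.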
Uniformly sampled 3D rotation matrices
It’s not as simple as you’d think. If you want to skip the small talk, the code is at the bottom. Sampling 2D rotations uniformly is simple: rotate by an angle drawn from the uniform distribution on [0, 2π). Extending this idea to 3D rotations, we could sample each of the three Euler angles from the same uniform […]
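The 2D case mentioned above can be sketched in a few lines, assuming the angle is drawn uniformly from [0, 2π). The function name `random_rotation_2d` is made up for this example, and this is not the code from the post, which concerns the 3D case.

```python
import math
import random

def random_rotation_2d(rng):
    """Return a 2x2 rotation matrix for an angle drawn uniformly at random."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    # Standard counter-clockwise rotation matrix for angle theta.
    return [[c, -s],
            [s, c]]

R = random_rotation_2d(random.Random(0))
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]  # should be 1 for a rotation
```

In 2D a single angle parametrises every rotation, which is why uniform sampling of that angle is enough here.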