Category Archives: Python

Functional Programming in Python

Introduction

The difficulty of reasoning about the behaviour of stateful programs, especially in concurrnent enviroments, has led to increased in intrest in a programming paradigm called functional programming. This style emphasises the connection between programs and mathematics, encouraging code that is easy to understand and, in some critical cases, even possible to prove properties of.

Continue reading

How to be a Bayesian – ft. a completely ridiculous example

Most of the stats we are exposed to in our formative years as statisticians are viewed through a frequentist lens. Bayesian methods are often viewed with scepticism, perhaps due in part to a lack of understanding over how to specify our prior distribution and perhaps due to uncertainty as to what we should do with the posterior once we’ve got it.

Continue reading

GitHub Link to Text Mining Tool

I have created a GitHub page to share some of the codes that I used to conduct text mining to extract HBV-related genetic information from PubMed Central. This code is easily adaptable to search through sentences that satisfy your keyword search, so please take a look if you are interested: https://github.com/angoto/HBV_Code.

Note: GitHub page is currently unavailable online, but will be accessible in due course.

A Gentle Introduction to the GPyOpt Module

Manually tuning hyperparameters in a neural network is slow and boring. Using Bayesian Optimisation to do it for you is slightly less slower and you can go do other things whilst it’s running. Susan recently highlighted some of the resources available to get to grips with GPyOpt. Below is a copy of a Jupyter Notebook where we walk through a couple of simple examples and hopefully shed a little bit of light on how the algorithm works.

Continue reading

Three things to help you get started on Bayesian Optimisation

In this blog post I will share with you the materials that I found most useful when I started doing some Bayesian Optimisation in my research. Bear in mind, I am a Chemist by training, so I approached this topic from a non-mathematical background (my eyes have to be persuaded to look at mathematical equations). Out of all the materials I have come across, I found these to be the most accessible. 

Continue reading

Should scientists learn C++?

Conventional wisdom dictates that compiled languages are slow to develop, can be slow to compile, but are fast to run. Interpreted languages are easy to use and do not require compilation but have sluggish performance. Like most people in scientific computing, the first two languages I learned were C++ and Python; I use Python every day but when, if ever, would I use C++?

Continue reading

Quick Python tricks

It’s always fun when you stumble across something in your programming toolkit that you had never noticed. Here are three things I’ve recently enjoyed learning.

  • Ternary syntax
    a = int(raw_input())
    is_even = True if a % 0 == 0 else False
  • Enumerate

I’ve been looping over the length of my list, all these years, like a chump. It turns out you can do this:

for index, item in enumerate(some_list):
    # now the index of each item is available as well as the item    
# Don't do do this
for index in range(len(some_list)):
    item = some_list[index]
  • for… else

Every so often, you really need to know that a for loop has run to completion. That’s what for…else is for!

for item in iterable:
if item % 0 == 0:
first_even_number = item
else:
raise ValueError('No even numbers')

Some useful tools

For my blog post this week, I thought I would share, as the title suggests, a small collection of tools and packages that I found to make my work a bit easier over the last few months (mainly python based). I might add to this list as I find new tools that I think deserve a shout-out.

Biopandas

Reading in .pdb files for processing and writing your own parser (while being a good exercise to familiarize yourself with the format) is a pain and clutters your code with boilerplate.

Luckily for us, Sebastian Raschka has written a neat package called biopandas [1] which enables quick I/O of .pdb files via the pandas DataFrame class.

Continue reading