Author Archives: Maranga Mokaya

Drug Discovery Tools, but they’re Olympic sports…

The Olympic Games may have come and gone, but I’m sure that, like me, you’re all wondering which Olympic sport your favourite drug discovery tool would compete in. Fear not, I have taken it upon myself to answer this pressing question. In this blogpost, we’ll match some of the most popular tools in our field with their Olympic counterparts. Before we begin, let me clarify that I’m using the term ‘tool’ rather loosely here; I’ve included a variety of resources. I don’t claim these to be the most popular, just the ones I thought were most sport-like.

RDKit: Athletics. I’m biased, but we must start with the big one. Like track and field events at the heart of the Olympics, RDKit is at the centre of many other tools in our field. It’s versatile, essential, and it’s hard to imagine our work without it. RDKit does it all.

Continue reading

Under-rated or overlooked, these libraries might be helpful.

Discovering a library that massively simplifies the exact thing you just did, right after you’ve finished doing it, has to be one of the top 14 worst things about writing code. You might think it’s a part of the life we’ve all chosen, but it doesn’t have to be. Beyond the popular libraries you already know lies a treasure trove of underappreciated packages waiting to be wielded. Being the saint I am, I’ve scoured the depths of pypi.org to find some underrated and hopefully useful packages to make your life a little easier.

Continue reading

OPIG: A decade of Scientific Shenanigans. What’s changed?

2013 was a big year: Andy Murray clinched the Wimbledon title, NASA’s Curiosity Rover discovered water-bearing minerals on Mars, and ‘twerk’ and ‘selfie’ made their way into the dictionary. Something equally significant also happened: the birth of BLOPIG.com. Intrigued by how the group has changed over the last decade, I set out on a journey to unearth some of the publications from then till now, questioning their focus, their methods, and the evolution of the group’s research over the past decade. This blog post is what I found.

While delving into each publication of the past decade genuinely seemed like an interesting idea, the imminent threat to my PhD progress forced me to adopt the most 2023-appropriate approach: outsourcing the task to AI. After collecting abstracts from all the group’s papers, I enlisted the help of everyone’s favourite hallucinator to summarise the works and (hopefully) highlight the shifts in their research.

So, after a relatively long sequence of prompts, this is (apparently) what we do?

Continue reading

Academic Reading? There’s an AI for that.

AI tools are literally everywhere. Recently, I stumbled across an AI aggregator website (theresanaiforthat.com) that, given a task, will find an AI solution. At the time of writing this article, there are 4871 AIs across 1369 tasks, with solutions ranging from scribes to polygraph examiners. I also came across SciSpace (formerly typeset – https://typeset.io), an “AI assistant to understand scientific literature.” So, of course, I tested it out. In this blog post, we will explore the capabilities of SciSpace and discuss how it can potentially enhance your literature review process.

The user experience of a tool can make or break its adoption. Thankfully, SciSpace isn’t bad. Its main website offers basic search functionality, enabling you to find specific papers, topics, or authors within their database. I did notice that it is missing many new papers in its database; however, users have the option to upload a PDF for analysis. Additionally, each search result includes a TL;DR summary, providing a concise overview of the paper’s contents at a glance. As expected, this summary serves as a helpful reminder for familiar papers, but I often found it inadequate in providing enough information to grasp the main arguments or story of a paper. One interesting feature of SciSpace is the ability to “trace” papers in their database. By following the citations of a paper, users can navigate through related works, authors, and topics. I think this feature would be helpful during exploration and makes finding connections between related topics a little easier.

The best thing about SciSpace is the Copilot Chrome extension. Available whenever you open a paper’s PDF or journal link, it offers text analysis, summarization, and mathematical or table comprehension. It provides a set of common template prompts, which I found helpful. For example, “What were the key contributions of that paper?”, “What data and methods have been used in this paper?”, or “What are the limitations of this paper?” I found these prompts helpful in getting a quick overview of the work faster than reading the abstract, figures, and conclusion.

To put SciSpace Copilot to the test, I used it on my recent publication. The extension provided an accurate summary of the abstract and introduction. It effectively extracted the key results and arguments, and highlighted the main contributions of the work well. To be honest, it also offered a fair and accurate summary of the limitations of the study. It was helpful; however, it does not replace the need to read the full paper.

Tools like SciSpace are clearly becoming more popular and could potentially play a larger role in how we write, read, and understand research output. In the meantime, I’ve found that it significantly improves the efficiency and effectiveness of my academic reading. Its clean, user-friendly interface, TL;DR summaries, and the impressive Copilot Chrome extension save me time. Plus, it’s completely free! I do expect that at some point it will become a paid tool. Until then, it’s a great way to stay on top of published work and build an understanding of related, but unfamiliar, fields.

Visualise with Weights and Biases

Understanding what’s going on when you’ve started training your shiny new ML model is hard enough. Will it work? Have I got the right parameters? Is it the data? Probably. Any tool that can help with that process is a godsend. Weights and Biases is a great tool to help you visualise and track your model throughout your production cycle. In this blog post, I’m going to detail some basics on how you can initialise and use it to visualise your next project.

Installation

To use Weights and Biases (wandb), you need to make an account. It is free for individuals; however, you will have to pay for team-oriented features. Wandb can then be installed using pip or conda.

$ conda install -c conda-forge wandb

or

$ pip install wandb

To initialise your project, import the package, sign in, and then use the following command using your chosen project name and username (if you want):

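import wandb

# Authenticate with your W&B account (asks for an API key if you are not already logged in)
wandb.login()

# Start a new run under your chosen project name
wandb.init(project='project1')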

In addition to your project, you can also initialise a config dictionary with starting parameter values:
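As a minimal sketch of what such a config dictionary might look like: the hyperparameter names and values below are purely illustrative assumptions, and start_run is a hypothetical helper, not part of wandb. The only wandb call used is wandb.init with its project and config arguments.

```python
# Hypothetical starting hyperparameters; the names and values are illustrative only.
config = {
    "learning_rate": 1e-3,
    "batch_size": 32,
    "epochs": 10,
}


def start_run(project, config):
    """Start a W&B run that records `config` alongside the project name.

    `start_run` is a hypothetical helper written for this example.
    """
    # Imported here so the dictionary above can be built without wandb installed.
    import wandb

    return wandb.init(project=project, config=config)
```

After wandb.login(), calling run = start_run('project1', config) should attach these values to the run, where they can be read back through run.config.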

Continue reading

Simplify your life with SLURM and sync

For my first blog post of the year, we’re talking about SLURM, everyone’s favourite job manager. If, like me, you have the joy of running a literal boat-load of jobs with all kinds of parameters and command-line arguments, you’ll know there are a few tips and tricks that make the process of managing these tasks and results as painless as possible. Now, I do expect most people reading this will already be aware of these tricks, but for those who aren’t, I hope this is helpful. After all, it’s impossible to know what you don’t know you need to know, you know? Any alternatives, improvements, or suggestions are welcome!

Array Jobs

Job arrays are perfect for the times you want to run the same job several times with slight differences each time. Imagine you need to repeat a job 10 times with slightly different arguments for each run. Rather than submit 10 (slightly different) batch scripts, you can submit 1 script with all the information needed to complete all 10 jobs.
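One way this could look from the job’s side, as a rough sketch: when a script is submitted with sbatch --array=0-9, SLURM sets the SLURM_ARRAY_TASK_ID environment variable to a different index for each task, and the job can use that index to pick its arguments. The parameter grid and the params_for_task helper below are hypothetical, made up for illustration.

```python
import os

# Hypothetical grid of 10 slightly different argument sets, one per array task.
PARAMETERS = [
    {"seed": seed, "dropout": dropout}
    for seed in range(5)
    for dropout in (0.1, 0.5)
]


def params_for_task(env=os.environ):
    """Return the argument set for this array task.

    Falls back to task 0 when SLURM_ARRAY_TASK_ID is not set,
    for example when running outside SLURM.
    """
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", 0))
    return PARAMETERS[task_id]
```

Each of the 10 tasks then runs the same script but calls params_for_task() to get its own seed and dropout value.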

Continue reading

Why all academics should be on TikTok

Recently I have had the opportunity to get a closer look at the submission, review and promotion cycle for a typical academic paper. It was a great learning experience and led to an increase in the number of research papers, news articles, and reviews I read in preparation. However, on multiple occasions, I did think “I wish I could watch a 2 min video to explain this”. That got me thinking: why couldn’t I, and should I be able to?

Continue reading

6 things I’ve learnt in my first year as a PhD student

Despite spending only four weeks working in the department, this month roughly marks a year since I started my unlikely career as a statistician and was inaugurated into the hall of opiglets (if you account for my foray into the magic of quantum computing last summer). The past year has been filled with learning opportunities, some of which I ought to take note of and others that are probably worth forgetting. Nonetheless, here is a short list of things I’ve learned in my first year as a DPhil student, which you may find helpful in what I hope are more precedented times.

Simple and stupid first

When it comes to deciding how to tackle your next scientific problem or which lesson to start your blog post with, often the simplest and sometimes most ‘stupid’ idea is the way to go. Keeping things simple gives you the time to better understand your question without getting lost in the details of a complex solution. Plus, the results will inform your next steps.

Continue reading

A new Graduate student’s (inexperienced) guide to academic literature.

Given this is my first ever attempt at a blog post, let alone one on such a highly regarded platform, I feel it’s proper that I introduce myself. Hi, my name is Maranga, I am a second-year SABS student starting my DPhil project in small molecules, and honestly, I really don’t like reading. Especially scientific journals. Now, I can appreciate this does not bode well given my chosen career path; however, my aversion to reading is not new (shoutout to Biff, Chip and Kipper) and hopefully not permanent.

Continue reading