Some molecular pose generation methods benefit from an energy relaxation post-processing step.
Here is a quick way to do this using OpenMM via a short script I prepared:
One of the most annoying parts of ML research is keeping track of all the various different experiments you’re running – quickly changing and keeping track of changes to your model, data or hyper-parameters can turn into an organisational nightmare. I’m normally a fan of avoiding too many different libraries/frameworks, as they often break down if you try to do anything even a little bit custom, and days are often wasted trying to adapt yourself to a new framework or adapt the framework to you. However, my last codebase ended up straying pretty far into the chaotic side of things, so I thought it might be worth trying something else out for my next project. In my quest to instil a bit more order, I’ve started using Hydra, which strikes a nice balance between giving you more structure to organise a project while not rigidly insisting on it, and I’d highly recommend checking it out yourself.
Do you use pandas for your data processing/wrangling? If you do, and your code involves any data-heavy steps such as data generation, exploding operations, featurization, etc, then it can quickly become inconvenient to test your code.
On 5th April 2024, over 60 researchers braved the train strikes and gusty weather to gather at Lady Margaret Hall in Oxford and engage in a day full of scientific talks, posters and discussions on the topic of adaptive immune receptor (AIR) analysis!
When training a machine learning (ML) model, our main aim is usually to get the ‘best’ model out the other end in an unbiased manner. Of course, there are other considerations such as quick training and inference, but mostly we want to be good at predicting the right answer.
A number of factors will affect the quality of our final model, including the chosen architecture, optimiser, and – importantly – the metric we are optimising for. So, how should we pick this metric?
You might have created the most aesthetic figures for your last presentation with a beautiful colour scheme, but have you considered how these might look to someone with colourblindness? Around 5% of the general population suffer from some kind of colour vision deficiency, so making your figures more accessible is actually quite important! There are a range of online tools that can help you create figures that look great to everyone.
I commend you on your skepticism, but even the skeptical mind must be prepared to accept the unacceptable when there is no alternative. If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family Anatidæ on our hands.
Douglas Adams
It’s not every day that someone recommends a new whizzbang note-taking software. It’s every second day, or third if you’re lucky. They all have their bells and whistles: Obsidian turns your notes into a funky graph that pulses with information, the web of complexity of your stored knowledge entrapping your attention as you dazzle in its splendour while also the little circles jostle and bounce in decadent harmony. Notion’s aesthetic simplicity belies its comprehensive capabilities, from writing your notes so you don’t need to, to exporting to the web so that the rest of us can read what you didn’t write because you didn’t need to. To pronounce Microsoft OneNote requires only five syllables, efficiently cramming in two extra words while only being one bit slower to say than the mysterious rock competitor. Apple notes can be shared with all the other Apple people who live their happy Apple lives in happy Apple land – and sometimes this even works!
Training a large transformer model can be a multi-day, if not multi-week, ordeal. Especially if you’re using cloud compute, this can be a very expensive affair, not to mention the environmental impact. It’s therefore worth spending a couple of days trying to optimise your training efficiency before embarking on a large-scale training run. Here, I’ll run through three strategies you can take which (hopefully) shouldn’t degrade performance, while giving you some free speed. These strategies will also work for any other models using linear layers.
I won’t go into too much of the technical detail of any of the techniques, but if you’d like to dig into any of them further I’d highly recommend the Nvidia Deep Learning Performance Guide.
Training with mixed precision can be as simple as adding a few lines of code, depending on your deep learning framework. It also potentially provides the biggest boost to performance of any of these techniques. Training throughput can be increased by up to three-fold with little degradation in performance – and who doesn’t like free speed?
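To give a sense of how few lines this takes, here is a minimal sketch in PyTorch. The toy model, sizes and learning rate are made up for illustration; fp16 with loss scaling is used on CUDA, falling back to bfloat16 (which needs no scaler) on CPU:

```python
import torch
import torch.nn as nn

# A toy model standing in for your transformer's linear layers
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

device = "cuda" if torch.cuda.is_available() else "cpu"
# fp16 needs gradient scaling on CUDA; bfloat16 on CPU does not
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))
model.to(device)

x = torch.randn(32, 64, device=device)
y = torch.randn(32, 1, device=device)

for _ in range(3):
    optimizer.zero_grad()
    # Forward pass runs in reduced precision where it is numerically safe
    with torch.autocast(device_type=device, dtype=amp_dtype):
        loss = nn.functional.mse_loss(model(x), y)
    # Scale the loss to avoid fp16 gradient underflow, then step as usual
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Compared to a plain training loop, only the `autocast` context and the scaler calls are new; everything else is unchanged.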
As a mother currently pursuing my doctorate, I often encounter the belief that higher education is not the ideal time for parenthood. In this post, I want to share my personal experience, offering a different perspective.
A year ago, I began my doctorate with a two-and-a-half-month-old baby. When I received the acceptance email from Oxford, I was thrilled – a dream come true. However, this raised a question: could I pursue this dream while pregnant? I believed in balancing motherhood and academic aspirations, and my advisor’s encouragement reinforced this belief. We, as a family, moved from Israel to England, adjusting to this new chapter.
It hasn’t been easy. Physically, post-pregnancy recovery and sleepless nights were tough. Emotionally, I constantly struggle with guilt over balancing academic and maternal responsibilities. If I focus on my daughter, I worry about neglecting my research; if I concentrate on my studies, I feel like a bad mother. The logistics of managing a household, especially when being the primary caregiver, added another layer of complexity. Motherhood often feels isolating, as not everyone around me can relate to my situation.
Yet, doctoral studies offered unexpected advantages. The flexibility allows me to align my work with my daughter’s schedule, often during nights or weekends. This means I can compensate for lost time without impacting others, unlike in a regular job. Interestingly, this flexibility leads to more time spent with my daughter than if I had a typical job. Moreover, the challenges of motherhood put academic obstacles into perspective. The best part of my day is always the hug from my daughter after a day of work.
As I keep moving forward with my PhD, here are some key tips that have helped me so far:
In the race of life, there never seems to be a “right” time for children. Whether it’s career progression or personal aspirations, the timing is always challenging. However, if you feel ready, that is the right time for you.
So the servers you use have Slurm as their job scheduler? Blopig has very good resources to help you navigate a Slurm environment.
If you are new to SLURMing, I highly recommend Alissa Hummer’s post. There, she explains in detail everything you will need to submit, check or cancel a job, and even how to run a job with more than one script in parallel by dividing it into tasks. Her post is so good that by reading it you will also learn how to move files across the servers, create and manage SSH keys, and set up Miniconda and GitHub on a Slurm server.
And Blopig has even more to offer, with Maranga Mokaya’s and Oliver Turnbull’s posts as nice complements for more advanced use of Slurm. They cover array jobs, more efficient file copying and creating aliases (shortcuts) for frequently used commands.
So… What could I possibly have to add to that?
Well, suppose you are concerned that you or one of your mates might flood the server (not that it has ever happened to me, but just in case).
How would you go about figuring out how many cores are active? How much memory is left? Which GPU does that server use? Fear not, as I have some basic tricks that might help you.
A pretty straightforward way of getting some basic information on Slurm servers is the command:
sinfo -M ALL
which will give you information on partition names, whether each partition is available, how many nodes it has, its usage state and a list of those nodes.
CLUSTER: name_of_cluster
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
low up 7-00:00.0 1 idle node_name.server.address
The -M ALL argument is used to show every cluster. If you know the name of the cluster you can use:
sinfo -M name_of_cluster
But what if you want to know not only whether it is up and being used, but how much of its resources are free? Fear not, there is much more to learn.
You can use the same sinfo command followed by some arguments that will give you what you want. The magic command is:
sinfo -o "%all" -M all
This will show you a lot of information about every partition of every cluster:
CLUSTER: name_of_cluster
AVAIL|ACTIVE_FEATURES|CPUS|TMP_DISK|FREE_MEM|AVAIL_FEATURES|GROUPS|OVERSUBSCRIBE|TIMELIMIT|MEMORY|HOSTNAMES|NODE_ADDR|PRIO_TIER|ROOT|JOB_SIZE|STATE|USER|VERSION|WEIGHT|S:C:T|NODES(A/I) |MAX_CPUS_PER_NODE |CPUS(A/I/O/T) |NODES |REASON |NODES(A/I/O/T) |GRES |TIMESTAMP |PRIO_JOB_FACTOR |DEFAULTTIME |PREEMPT_MODE |NODELIST |CPU_LOAD |PARTITION |PARTITION |ALLOCNODES |STATE |USER |CLUSTER |SOCKETS |CORES |THREADS
Which is a lot.
So, how can you make it more digestible and filter only the info that you want?
Always start with:
sinfo -M ALL -o "%n"
And inside the quotation marks you should add the info you would like to know. The %n argument tells sinfo to show the hostname of every node in each cluster. If you want to know how much free memory there is on each node, you can use:
sinfo -M ALL -o "%n %e"
In case you would like to know how the CPUs are being used (how many are allocated, idle, other and total), you should use:
sinfo -M ALL -o "%n %e %C"
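If you want to act on this information programmatically (say, to pick the node with the most idle CPUs), the output of that command is easy to parse. Here is a small Python sketch; the node names and numbers in the sample output are made up for illustration, standing in for a real call to sinfo:

```python
# Parse the output of: sinfo -M ALL -o "%n %e %C"
# Columns: hostname, free memory (MB), CPUs as allocated/idle/other/total

# Hypothetical sample output, standing in for a subprocess call to sinfo
sample = """CLUSTER: name_of_cluster
HOSTNAMES FREE_MEM CPUS(A/I/O/T)
node01.server.address 51200 24/8/0/32
node02.server.address 4096 30/2/0/32
"""

def parse_sinfo(text):
    """Return a list of (hostname, free_mem_mb, idle_cpus) tuples."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the "CLUSTER:" line and the header row
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        host, free_mem, cpus = parts
        alloc, idle, other, total = (int(x) for x in cpus.split("/"))
        nodes.append((host, int(free_mem), idle))
    return nodes

# Pick the node with the most idle CPUs
nodes = parse_sinfo(sample)
best = max(nodes, key=lambda n: n[2])
print(best)  # node01 wins with 8 idle CPUs vs node02's 2
```

In practice you would replace the hard-coded `sample` with the output of a `subprocess.run` call to sinfo, but the parsing logic stays the same.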
Well, I could give more and more examples, but it is more efficient to just leave the table of possible arguments here. They come from the Slurm documentation.
| Argument | What does it do? |
|---|---|
| %all | Print all fields available for this data type, with a vertical bar separating each field. |
| %a | State/availability of a partition. |
| %A | Number of nodes by state in the format “allocated/idle”. Do not use this with a node state option (“%t” or “%T”) or the different node states will be placed on separate lines. |
| %b | Features currently active on the nodes, also see %f. |
| %B | The maximum number of CPUs per node available to jobs in the partition. |
| %c | Number of CPUs per node. |
| %C | Number of CPUs by state in the format “allocated/idle/other/total”. Do not use this with a node state option (“%t” or “%T”) or the different node states will be placed on separate lines. |
| %d | Size of temporary disk space per node in megabytes. |
| %D | Number of nodes. |
| %e | The total memory, in MB, currently free on the node as reported by the OS. This value is for informational use only and is not used for scheduling. |
| %E | The reason a node is unavailable (down, drained, or draining states). |
| %f | Features available on the nodes, also see %b. |
| %F | Number of nodes by state in the format “allocated/idle/other/total”. Note that using this format option with a node state format option (“%t” or “%T”) will result in the different node states being reported on separate lines. |
| %g | Groups which may use the nodes. |
| %G | Generic resources (GRES) associated with the nodes (e.g. the GPUs a node uses). |
| %h | Print the OverSubscribe setting for the partition. |
| %H | Print the timestamp of the reason a node is unavailable. |
| %i | If a node is in an advanced reservation, print the name of that reservation. |
| %I | Partition job priority weighting factor. |
| %l | Maximum time for any job in the format “days-hours:minutes:seconds”. |
| %L | Default time for any job in the format “days-hours:minutes:seconds”. |
| %m | Size of memory per node in megabytes. |
| %M | PreemptionMode. |
| %n | List of node hostnames. |
| %N | List of node names. |
| %o | List of node communication addresses. |
| %O | CPU load of a node as reported by the OS. |
| %p | Partition scheduling tier priority. |
| %P | Partition name followed by “*” for the default partition, also see %R. |
| %r | Only user root may initiate jobs, “yes” or “no”. |
| %R | Partition name, also see %P. |
| %s | Maximum job size in nodes. |
| %S | Allowed allocating nodes. |
| %t | State of nodes, compact form. |
| %T | State of nodes, extended form. |
| %u | Print the user name of who set the reason a node is unavailable. |
| %U | Print the user name and uid of who set the reason a node is unavailable. |
| %v | Print the version of the running slurmd daemon. |
| %V | Print the cluster name if running in a federation. |
| %w | Scheduling weight of the nodes. |
| %X | Number of sockets per node. |
| %Y | Number of cores per socket. |
| %z | Extended processor information: number of sockets, cores, threads (S:C:T) per node. |
| %Z | Number of threads per core. |
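Since those format codes are easy to forget, one low-tech trick is to keep a small mapping from readable names to codes and build the format string from it. A minimal sketch (the friendly field names here are my own labels, not Slurm’s):

```python
# My own readable labels for a few sinfo format codes from the table above
FIELDS = {
    "hostname": "%n",     # list of node hostnames
    "free_mem": "%e",     # free memory in MB
    "cpu_states": "%C",   # allocated/idle/other/total CPUs
    "gres": "%G",         # generic resources, e.g. GPUs
    "node_state": "%T",   # node state, extended form
}

def sinfo_command(*fields, cluster="ALL"):
    """Build a sinfo command line for the requested fields."""
    fmt = " ".join(FIELDS[f] for f in fields)
    return f'sinfo -M {cluster} -o "{fmt}"'

print(sinfo_command("hostname", "free_mem", "cpu_states"))
# sinfo -M ALL -o "%n %e %C"
```

Remember the caveat from the table: avoid combining "%C" with the node state options ("%t"/"%T"), or states get split across separate lines.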
And there you have it! Now you know what is going on in your Slurm clusters and can avoid job-blocking your peers.
If you want to know more about slurm, keep an eye on Blopig!