Author Archives: Charlotte Deane

Thinking of going to a conference

As so many members of the group have never attended an in-person conference, I thought it might be worth answering the question “why do people attend conferences?”

First up, we should remember that flying around the world is not cost-free for the planet, so all of us lucky enough to be able to travel should think hard every time before we choose to do so.

This means it’s really important to make sure that we know why we are going to any conference and maximise the benefits from attendance. Below are a few things to think about in terms of why you attend a conference and what to do when you are there, but this is definitely not a complete list, more a starter for four.


Writing Papers in OPIG

I’m dedicating this blog post to something I spend a great deal of my time doing – reading the manuscripts that members of OPIG produce.

As every member of OPIG knows, we often go through a very large number of drafts as I inexpertly attempt to pull the paper into a shape that I think is acceptable.

When I was a student I was not known for my ability to write; in fact, I would say the opposite was probably true. Writing a paper is a skill that needs to be learnt and, just as with giving talks, everyone needs to find their own style.

Before you write or type anything, remember that a good paper starts with researching how your work fits into existing literature. The next step is to craft a compelling story, whilst remembering to tailor your message to your intended audience.

There are many excellent websites/blogs/articles/books advising how to write a good paper, so I am not going to attempt a full guide; instead, here are a few things to keep in mind.

  1. Have one story, no more and no less – when you write the paper, look at every word/image to see how it helps to deliver your main message.
  2. Once you know your key message it is often easiest to not write the paper in the order the sections appear! Creating the figures from the results first helps to structure the whole paper, then you can move on to methods, then write the results and discussion, then the conclusion, followed by the introduction, finishing up with the abstract and title.
  3. Always place your work in the context of what has already been done, and make clear what makes your work significant or original.
  4. Keep a consistent order – the order in which ideas come in the abstract should also be the same in the introduction, the methods, the results, the discussion etc.
  5. A paper should have a logical flow. In each paragraph, the first sentence defines context, the body is the new information, the last sentence is the take-home message/conclusion. The whole paper builds in the same way from the introduction setting the context, through the results which give the content, to the discussion’s conclusion. 
  6. Papers don’t need cliffhangers – main results/conclusions should be clear in the abstract.
  7. State your case with confidence.
  8. Papers don’t need to be written in a dry/technical style…
  9. …but remove the hyperbole. Any claims should be backed up by the evidence in the paper.
  10. Get other people to read your work – their comments will help you (and unless it’s me you can always ignore their suggestions!)

One of my other hats – Covid-19 Response Director for UK Research and Innovation

The group asked me if I would tell them a little bit about one of my other hats at our regular Tuesday meeting, and this blog is about that.

In October 2019 I was seconded part-time to UKRI as the Deputy Executive Chair of the Engineering and Physical Sciences Research Council (EPSRC). What is UKRI (UK Research and Innovation)? It’s a non-departmental public body that funds research and innovation. It is made up of the seven disciplinary research councils (acronyms to please Tom – AHRC, BBSRC, EPSRC, ESRC, NERC, STFC and MRC), Research England, and the UK’s innovation agency, Innovate UK.

As Deputy Executive Chair of EPSRC I was helping with UKRI strategy, learning how a spending review round works, visiting universities to talk about how they could work better with UKRI – pretty much everything I was expecting to be doing. But like everyone, my world changed in early 2020.


OPIGTREAT

On the 19th of March OPIG set off on our group retreat – henceforth referred to as the OPIGTREAT.

We kicked off a little late, as apparently Saulo and check-in times are not a good combination (though he is an expert at reversing on an icy road).

Jin and Flo gave the first talk, on web programming, specifically Flask and D3. If I understood correctly, Flask is a web development framework for Python that runs everything on the server side, whereas D3 stands for Data-Driven Documents, which appears to be a way of making very pretty things.

Garrett then gave us an impressive overview of the area of docking, thinking about whether docking had improved in the last 10 years. He discussed how docking can be used to predict both the binding mode (the orientation and conformation) and the binding affinity. The state of the art appears to be that, if we are docking a small molecule into approximately the correct binding site, a native-like pose can be identified, but binding affinity prediction remains challenging in all cases.

Mark then attempted the impossible: he tried to give a talk explaining how to give a good talk, in this case in the context of public engagement and taking our work out to schools. I am now versed in the 4 Ms: Manageable, Measurable, Made first and Most Important. I am also weirdly aware that my head shouldn’t move when I am teaching.

Elliot then took us through how we should judge a PDB structure, a really useful skill for everyone in the group. He described measures such as resolution, B factors, Rfree, clash score, Ramachandran outliers, sidechain outliers and RSRZ outliers. Interesting facts that I collected: the average resolution of an X-ray structure in the PDB is ~2 Å and the average Rfree is 0.25. I also learnt of the existence of PDB-REDO, a service that re-refines datasets in the PDB.

Saulo and the Fergi were up next, and they treated us all to a short talk and then a Jupyter notebook practical on machine learning. They discussed supervised, unsupervised and reinforcement learning, giving examples of each and of how and when they should/could be used. Claire and I then learnt a great deal about Jupyter notebooks, the most important thing being to press shift enter. Useful fact: “out-of-bag” is a method for measuring the error of random forests, in which each tree is scored using all the data points apart from those used to make that tree.

The evening finished with a film about the evil iniquities of smoking (very high brow stuff!?!).

The second day began with Bernhard (a visitor from the far-off land of Barcelona these days) talking to us about his latest research project. As this is his story – no details in the blog.

Claire then gave an update of the talk she gave at the last OPIGTREAT – how to make “stuff” pretty. Obviously a popular topic, as we all wish to display our data and findings in a way that is easily interpretable as well as visually appealing. Claire took us through some of the tools to use, like ggplot and PyMOL, showed us where to find the lists of useful commands, and then showed us the types of images you could make if you really put some thought into it.

Anne was up next. She discussed the challenges and opportunities of integrating heterogeneous data sources, and she came up with a lot of data sources to think about, running from protein structures, protein interactions, small molecule structures, drug safety and drug targets to functional annotation and pathways. One thing to remember: probably don’t tell your boss when she should or shouldn’t be taking notes……

It was then the turn of team networks (Javi, James and Lyuba), who walked us through the basics of networks and expanded on their uses across multiple data types in biology. They mentioned areas from simple motifs to protein structure, MD simulations, ontologies, disease prediction, drug target identification…. We then had a practical to check we had understood the power of networks! The networks under consideration were dolphins, myoglobin structure, Facebook data and the mystery voter network (where we discovered that Fergus the first in no way tried to rig the vote for what film to watch).

That afternoon I visited the bird sanctuary just down the road; others went to a gin distillery or on a walk. Top quote of the afternoon was from James: “I want the birds to eat from my pants”. I believe he is from one of those countries that has the misguided belief that pants means trousers. Actually, I could have a different top quote, from Alex, about somebody being a cheap ride in his dreams, but I think I should pass over that one.

That evening we were treated to a fragment-based drug discovery extravaganza headed up by Hannah, Susan and Joe. They took us through the use of fragments for drug discovery and then we attempted a practical. I seem to remember that Claire and I once again excelled at shift enter on the Jupyter notebook.

That evening we had a pub quiz, which apparently ended in a draw between all the teams playing. I feel that Claire and Flo as quizmasters might have made a minor miscalculation. I was happy though as I ended up with the minions bowl and cup. I also managed to persuade several grown men to jump and smash chocolate eggs on their heads on the ceiling.

Next morning Alex and Matt were up first. In their talk they demonstrated not only their knowledge of the future immunotherapy repertoire but also their ability to finish each other’s sentences. They gave a really excellent overview of current immunotherapies, where the field is moving and what the future might be. Facts to store in the head: the first ever approved antibody therapeutic was Muromonab (1986). Currently the most successful is Humira (adalimumab) from AbbVie, worth 18.4 billion dollars in 2017; this is a fully human antibody for autoimmune diseases that binds to a mediator of inflammation (TNF-alpha).

Next up were Catherine and Lucian, who discussed distributed computing in PySpark. They started by explaining why distributed computing is going to become so important. Basic info: by 2025, 100 million to 2 billion human genomes will have been sequenced, which is 2 – 40 exabytes of data. They discussed distributed vs centralised computing and PySpark compared to Hadoop. There was a practical, but Mark had to solo perform for the audience, leading to one of the top photos of the whole OPIGTREAT.
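As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It assumes decimal units (1 exabyte = 10^18 bytes, 1 gigabyte = 10^9 bytes), which is my assumption rather than something stated in the talk; the function name is purely illustrative.

```python
# Decimal (SI) units are assumed here; the talk did not specify them.
EXABYTE = 10**18
GIGABYTE = 10**9


def gigabytes_per_genome(total_exabytes, genomes):
    """Average storage per genome, in gigabytes, for a given total."""
    return total_exabytes * EXABYTE / genomes / GIGABYTE


# Lower bound: 100 million genomes in 2 exabytes.
low = gigabytes_per_genome(2, 100_000_000)
# Upper bound: 2 billion genomes in 40 exabytes.
high = gigabytes_per_genome(40, 2_000_000_000)

print(low, high)
```

Under that assumption, both ends of the range work out to the same storage per genome, so the quoted figures are at least internally consistent.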

As a punishment for being in charge, I gave the final talk, where I discussed future research directions and how you decide what they might be.

So, with thanks to all of the group, that concludes the OPIGTREAT report.

Conformational diversity analysis reveals three functional mechanisms in proteins


This paper was published recently in PLoS Computational Biology and looks at the conformational diversity (flexibility) of protein structures by comparing solved structures of identical sequences.

The premise of the work is that different crystal structures of the same protein represent instances of the conformational space of the protein. These different instances are identical in amino acid sequence but often differ in other ways: they could come from different crystal forms, or the protein could have different co-factors bound or have undergone post-translational modifications.

The data set used in the paper came from the CoDNaS (Conformational Diversity of the Native State) database (URL: http://ufq.unq.edu.ar/codnas).

Only structures solved using X-ray crystallography to a resolution better than 2.5 Å were used, and only proteins for which at least 5 conformers were available (an average of 15.53 conformers per protein). Just under 5000 different protein chains made up the set. Each protein chain was described by its maximum conformational diversity (the maximum RMSD between any pair of conformers of that chain).
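To make the measure concrete, here is a minimal sketch of how a maximum pairwise RMSD could be computed. It is not the authors' code. It assumes each conformer is a list of (x, y, z) coordinates for the same atoms in the same order, and that the conformers have already been superimposed; a real calculation would need an optimal superposition step first. The function names are my own.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the two conformers are already superimposed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def max_conformational_diversity(conformers):
    """Return (max RMSD, (i, j)) over all pairs of conformers.

    (i, j) are the indices of the pair giving the maximum; the first
    such pair is kept if several pairs tie.
    """
    best_value, best_pair = 0.0, None
    for (i, a), (j, b) in combinations(enumerate(conformers), 2):
        value = rmsd(a, b)
        if best_pair is None or value > best_value:
            best_value, best_pair = value, (i, j)
    return best_value, best_pair


# Toy example: three two-atom "conformers".
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
diversity, pair = max_conformational_diversity(toy)
```

Returning the pair as well as the value is a deliberate choice, since the grouping of proteins described further on depends on which two conformers give the maximum.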

The authors describe a relationship between this maximum conformational diversity and the presence or absence of intrinsically disordered regions (IDRs). An IDR was defined as a segment of at least 5 contiguous residues with missing electron density (the first and last 20 residues of the chain were not included).
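The IDR definition can be sketched in a few lines. This is an illustration under my own assumptions, not the authors' implementation: the input is a list of booleans, one per residue, that is True where electron density is missing, and I read "not included" as meaning that the first and last 20 positions are simply ignored when looking for runs. The function and parameter names are hypothetical.

```python
def find_idrs(missing, min_length=5, terminal_skip=20):
    """Return (start, end) index pairs, end exclusive, of candidate IDRs.

    missing: list of bools, True where a residue has no electron density.
    Residues within terminal_skip of either end of the chain are ignored
    (an assumed reading of the definition).
    """
    idrs = []
    start = None
    lo, hi = terminal_skip, len(missing) - terminal_skip
    for i in range(lo, hi):
        if missing[i]:
            if start is None:
                start = i  # a new run of missing residues begins
        else:
            if start is not None and i - start >= min_length:
                idrs.append((start, i))
            start = None
    # Close a run that reaches the end of the considered window.
    if start is not None and hi - start >= min_length:
        idrs.append((start, hi))
    return idrs


# Toy chain of 100 residues.
chain = [False] * 100
for i in range(0, 10):    # missing N-terminal residues: ignored
    chain[i] = True
for i in range(30, 36):   # six missing residues: counts as an IDR
    chain[i] = True
for i in range(50, 53):   # three missing residues: too short
    chain[i] = True

regions = find_idrs(chain)
```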

The proteins were divided into three groups.

Rigid

  • No IDRs

Partially disordered

  • IDRs in at least one conformer
  • IDR present in the pair of conformers with the maximum RMSD

Malleable

  • IDRs in at least one conformer
  • No IDR in the pair of conformers with the maximum RMSD
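The three groups above amount to a simple decision rule, sketched here. The inputs are assumptions for illustration: a list of booleans saying whether each conformer contains an IDR, and the indices of the two conformers with the maximum RMSD. I have also assumed that an IDR in either member of that pair is enough to count as partially disordered; the paper may define this more precisely.

```python
def classify_protein(has_idr, max_rmsd_pair):
    """Assign a protein to one of the three groups.

    has_idr: list of bools, one per conformer.
    max_rmsd_pair: (i, j) indices of the conformers with the maximum RMSD.
    """
    if not any(has_idr):
        return "rigid"
    i, j = max_rmsd_pair
    # Assumption: an IDR in either conformer of the pair is sufficient.
    if has_idr[i] or has_idr[j]:
        return "partially disordered"
    return "malleable"


examples = {
    "no IDRs anywhere": classify_protein([False, False, False], (0, 2)),
    "IDR in the max pair": classify_protein([True, False, False], (0, 2)),
    "IDR outside the max pair": classify_protein([False, True, False], (0, 2)),
}
```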

In general, rigid proteins have lower conformational diversity than partially disordered proteins, which in turn have lower conformational diversity than malleable proteins. The authors describe how these differences are not due to crystallographic conditions, protein length, number of crystal contacts or number of conformers.

The authors then go on to compare other properties based on these three types of protein chains including amino acid composition, loop RMSD and cavities and tunnels.

They summarise their findings with the figure below.