Author Archives: JP Ebejer

Remove all LaTeX generated files

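I am going to leave this here and bookmark it, because I am fed up with looking this up every time, not finding it and having to `history | fgrep rm`.  Use it to delete all the files generated by LaTeX (and pdfLaTeX).  Beware that `*.pdf` matches every PDF in the directory, including any figures you include in the document, not just the compiled output.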

rm *.aux *.out *.toc *.log *.synctex.gz *.bbl *.blg *.pdf

Use at your own risk!
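For a slightly less risky variant, here is a minimal Python sketch that targets the same extensions but defaults to a dry run and leaves PDFs alone unless asked.  The function name `clean_latex` and its parameters are my own invention, not part of any LaTeX tool.

```python
from pathlib import Path

# Same extensions as the rm one-liner, minus .pdf (which is opt-in below).
AUX_SUFFIXES = (".aux", ".out", ".toc", ".log", ".synctex.gz", ".bbl", ".blg")


def clean_latex(directory=".", include_pdf=False, dry_run=True):
    """Return the generated files found in `directory`; delete them unless dry_run."""
    suffixes = AUX_SUFFIXES + ((".pdf",) if include_pdf else ())
    targets = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.endswith(suffixes)
    )
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


if __name__ == "__main__":
    # Dry run: only lists what would be deleted in the current directory.
    for path in clean_latex():
        print(path)
```

Call `clean_latex(dry_run=False)` once the listing looks right, and add `include_pdf=True` only if the directory holds no PDFs you want to keep.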

 

3035 – OPIG Beer & Cycling

Other than my customary Saturday morning hangover, today I woke up with aching limbs – so I am feeling rather poor indeed.  The source of this pain, other than the five and a half pints of premium quality Lager, is the OPIG beer and cycling tour of Oxford (aptly named, “Le Tour de Farce”).  What can I say, over here they make you work for your beer.



We started off at the Medawar building, where we sit on weekdays and occasionally on weekends. Then we cycled to The Fishes, The White Hart, The Trout (the Carne pizza is a must), yet another White Hart, and the vegetarian and vegan Gardener Arms (I know, I know – I was drunk and they forced me in here).  For those not familiar with Oxford custom – these are some of the most beautiful (and pricey) pubs this land has to offer.

OPIG members love a laugh and their beer on the cycling trip


The bike hike (devised by Charlotte) was 9.535 miles.  On 5.5 pints of beer (Nick spilt his pint, like an amateur – so I gave him half of mine), that means I run on 13.87 miles per gallon (9.535 / 0.6875, since 5.5 pints is 0.6875 of a gallon).  So I roughly have the fuel economy of a 2013 Ferrari 458 Spider.  Say what you want about these blog posts, but you cannot say I am not thorough about the research which goes on behind them.
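For the sceptics, here is the arithmetic as a quick Python sketch.  It assumes 8 pints to the gallon, which holds as long as the pints and the gallon come from the same system (imperial or US); the function name is purely illustrative.

```python
PINTS_PER_GALLON = 8  # assumption: pints and gallons from the same unit system


def miles_per_gallon(miles, pints):
    """Fuel economy of a cyclist running on beer."""
    gallons = pints / PINTS_PER_GALLON
    return miles / gallons


print(round(miles_per_gallon(9.535, 5.5), 2))  # 13.87
```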

9.5 miles in Malta

This is how far yesterday’s bike ride would have got me in my country (Malta).

This morning I have also noticed my calves are an inch thicker in diameter, which of course means I will go and throw away all my socks and make that long overdue trip to Bicester Village.  Perhaps I will cycle there.

Citing R packages in your Thesis/Paper/Assignments

If you need to cite R, there is a very useful function called citation().

> citation()

To cite R in publications use:

  R Core Team (2013). R: A language and environment for statistical
  computing. R Foundation for Statistical Computing, Vienna, Austria.
  URL http://www.R-project.org/.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2013},
    url = {http://www.R-project.org/},
  }

We have invested a lot of time and effort in creating R, please cite it
when using it for data analysis. See also ‘citation("pkgname")’ for
citing R packages.

To cite a specific package, pass its name as a parameter, e.g.:

> citation(package = "cluster")

To cite the R package 'cluster' in publications use:

  Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik,
  K.(2013).  cluster: Cluster Analysis Basics and Extensions. R package
  version 1.14.4.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {cluster: Cluster Analysis Basics and Extensions},
    author = {Martin Maechler and Peter Rousseeuw and Anja Struyf and Mia Hubert and Kurt Hornik},
    year = {2013},
    note = {R package version 1.14.4 --- For new features, see the 'Changelog' file (in the package source)},
  }

This gives you BibTeX entries you can copy and paste into your BibTeX (.bib) file.  If you are using M$ Word, good luck to you.

[Publication] Cloud computing in Molecular Modelling – a topical perspective


My ex-InhibOx colleagues (Simone Fulle, Garrett Morris, Paul Finn) and I have recently published a topical review on “The emerging role of cloud computing in molecular modelling” in the Journal of Molecular Graphics and Modelling.  The paper starts with a gentle but in-depth introduction to the field of cloud computing.  The second part covers how it applies to molecular modelling (and the sort of tasks we can run in the cloud).  The third and last part presents two practical case studies of cloud computations, one of which describes how we built a virtual library to use in virtual screening on AWS.

We hope that after reading this article the cloud will become a less nebulous affair! *pun intended*

As an addendum, I recently came across this paper “Teaching cloud computing: A software engineering perspective” (2013) on how to teach cloud computing at a graduate level.  This work is relevant, because lots of universities are presently including cloud computing in their curricula.

 

Antimicrobial Drug Discovery Conference (Madrid)

I am a big fan of taking something, either a poster or a talk, to a conference, and getting something back – other than a €6 box of airport chocolates.  This blog post is in that spirit.

On the plane to the “Antimicrobial Drug Discovery” conference in Madrid I was reading the Cassandra Project (a novel on smallpox, how apt) instead of the stack overflow of scientific papers I planned to read.  Classic JP.

The conference had a mix of experienced, invited speakers and early stage researchers.  It was very “biological” for a computational scientist, so quite removed from what I normally do – but an opportunity to learn nonetheless.

The keynote lecture was by Julian Davies, a fantastic speaker who gave a general overview of antibiotics and antibiotic resistance.  Antibiotic resistance is a real concern (even those politicians in the G8 noticed a few hours ago!) and there is a fear we might return to a pre-antibiotic era, when you could not cure common diseases like bacterial pneumonia.  Pharmaceutical companies all got out of antibiotic research years ago, and there have been no new antibiotic scaffolds for more than a decade.  I found this surprising, as you would think there was a truckload of money to be made from finding the new penicillin.  Apparently, there is little return in anti-infectives because the pathogen mutates rapidly and the drugs are only used short-term (curing the infection, as opposed to taking your medication for life, such as beta-blockers for hypertension).  Bacteria should be considered not only at the individual cell level but also as a population with complex signalling between the individuals (which may offer a way to stop bacterial infection).  To combat infections and increasing resistance, sick patients are now given combinations of drugs – this is still dangerous due to possible (toxic) drug-drug interactions.

Natural products, e.g. some toxins, are good antibiotics but it is very hard to optimize such compounds to improve their drug profile (chemical synthesis of natural products is difficult).  Also, a lot of people at the conference were talking about how antimicrobial peptides will save the day.  The attendees with drug discovery experience raised an eyebrow at this, knowing how hard it will be to make a 30-residue peptide into a drug.

Some antibiotics work by having a hydrophilic part (e.g. carboxyl) and a hydrophobic part (e.g. an alkane chain).  The hydrophobic part sits in the membrane and disrupts it, creating a “leak” from the bacterium which eventually kills the pathogen.  There are other mechanisms of action, such as blocking transporter or signalling channels.

There was a brilliant, energetic talk by Bruno Gonzalez-Zorn with the audience paying rapt attention.  He showed how bacteria have these multiple, small plasmids offering antibiotic resistance.  He discovered there was a common two-part theme to antibiotic resistance, where a particular gene is always present.

Paul Finn gave a much needed talk on why drug discovery is hard (e.g. target selection, the difficulty of getting drugs into the therapeutic area, potency, toxicity, having to optimize several variables at once, etc.).  Unknowingly proving this point, an earlier talk described a whole optimization series which took a small molecule inhibitor of a viral infection from 150 µM down to 1 µM (IC50) – a great result in itself, but when the investigators tested this ligand in vivo rather than in vitro it simply did not have any effect on the virus.

Cele Abad-Zapatero, one of the main investigators of AtlasCBS, made the point that, today, we do not know where we are in drug discovery.  He argued we need to move to chemical-biology space instead of simply chemical space and recommended the use of ligand efficiency indices (e.g. BEI, SEI).
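For reference, a minimal sketch of those two indices as I understand the published definitions: BEI is the potency (pKi or pIC50) divided by the molecular weight in kDa, and SEI is the potency divided by the polar surface area in units of 100 Å².  The molecular weight and PSA values below are made up for illustration, and the function names are my own.

```python
import math


def p_ic50(ic50_molar):
    """Convert an IC50 given in mol/L to pIC50 (-log10 of the value)."""
    return -math.log10(ic50_molar)


def bei(potency, mol_weight_da):
    """Binding efficiency index: potency per kDa of molecular weight."""
    return potency / (mol_weight_da / 1000.0)


def sei(potency, psa_sq_angstrom):
    """Surface efficiency index: potency per 100 square angstroms of PSA."""
    return potency / (psa_sq_angstrom / 100.0)


# Hypothetical compound: 350 Da, PSA of 70 square angstroms (illustrative only).
for ic50 in (150e-6, 1e-6):  # a 150 uM -> 1 uM optimization series
    potency = p_ic50(ic50)
    print(f"pIC50={potency:.2f}  BEI={bei(potency, 350):.1f}  SEI={sei(potency, 70):.1f}")
```

Holding size and polarity fixed like this is unrealistic (optimization usually adds both), which is exactly why the indices are worth tracking alongside raw potency.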

Having fun in Madrid

Madrid was way too much fun.  Zidane (and a few thousand others) kissed this Champions League Cup in exactly the same place. Talking about microbes.

And what did I take to the conference?  I took a poster, the design of which is based on Dunbar’s stylish template.  Marta, Ana and I won a “highly commendable” poster prize, with the best poster going to Laura (Synthetic inhibitors of bacterial cell division targeting the GTP binding site of FtsZ, since you asked).  There were 24 posters in all, and mine was the only computational study in a room otherwise filled with phages, bacteria and plasmids (literally as well as metaphorically).  There is a sinister heart-warming joy in winning a bottle of wine, instead of a cheque or a certificate.  James deserves a sip or two.

 

Poster Prize Presentation

Cheekily asking for a corkscrew during the poster prize award

 

Making small molecules look good in PyMOL

Another largely plagiarized post for my “personal notes” (thanks Justin Lorieau!) and following on from the post about pretty-fication of macromolecules.  For my slowly-progressing confirmation report I needed some beautiful small molecule representation.  Here is some PyMOL code:

# ball-and-stick look: thin sticks plus small spheres
show sticks
set ray_opaque_background, off
set stick_radius, 0.1
show spheres
set sphere_scale, 0.15, all
set sphere_scale, 0.12, elem H
color gray40, elem C
# smooth, fully opaque geometry
set sphere_quality, 30
set stick_quality, 30
set sphere_transparency, 0.0
set stick_transparency, 0.0
# flat, shadow-free rendering at 1024x768
set ray_shadow, off
set orthoscopic, 1
set antialias, 2
ray 1024,768

And the result:

ligand

Beautiful, no?

Good looking proteins for your publication(s)

Just came across a wonderful PyMOL gallery while creating some images for my (long overdue) confirmation report.  A fantastic resource to draw sexy proteins – especially useful for posters, talks and papers (unless you are paying extra for coloured figures!).

It would be great if we had our own OPIG “pymol gallery”.

An example of one of my proteins (1tgm) with aspirin bound to it:

Good looking protein

 

A JavaScript function to validate FASTA sequences

I was more than a bit annoyed at not finding this out there in the interwebs, being a strong encourager of googling (or in Jamie’s case duck-duck-going) and re-use.

So I proffer my very own FASTA validation JavaScript function.

/*
 * Validates (true/false) a single fasta sequence string
 * param   fasta    the string containing a putative single fasta sequence
 * returns boolean  true if string contains single fasta sequence, false 
 *                  otherwise 
 */
function validateFasta(fasta) {

	if (!fasta) { // check there is something first of all
		return false;
	}

	// immediately remove leading and trailing whitespace
	fasta = fasta.trim();

	// split on newlines... 
	var lines = fasta.split('\n');

	// check for header
	if (fasta[0] === '>') {
		// drop the header line (the first element of the array)
		lines.splice(0, 1);
	}

	// join the array back into a single string without newlines and 
	// trailing or leading spaces
	fasta = lines.join('').trim();

	if (!fasta) { // nothing left once the header is removed?
		return false;
	}

	// note that the empty string is caught above
	// allow for Selenocysteine (U)
	return /^[ACDEFGHIKLMNPQRSTUVWY\s]+$/i.test(fasta);
}

Let me know, by comments below, if you spot a no-no.  Please link to this post if you find use for it.

p.s. I already noticed that this only validates one sequence.  This is because this function is taken out of one of our web servers, Memoir, which specifically only requires one sequence.  If there is interest in multi-sequence validation I will add it.
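In the meantime, a multi-sequence version could look something like the sketch below.  It is in Python rather than JavaScript and is not the code running in Memoir; the function name is hypothetical.  It keeps the same alphabet (the 20 standard amino acids plus U, case-insensitive, whitespace ignored) but, unlike the single-sequence function, it assumes every record must start with a `>` header line.

```python
import re

# Same alphabet as the JavaScript function: 20 standard residues plus U.
_RESIDUES = re.compile(r"[ACDEFGHIKLMNPQRSTUVWY]+", re.IGNORECASE)


def validate_multi_fasta(text):
    """True if `text` holds one or more records, each a '>' header plus residues."""
    if not text or not text.strip():
        return False
    text = text.strip()
    if not text.startswith(">"):
        return False
    # Every record begins with '>' at the start of a line.
    records = re.split(r"^>", text, flags=re.MULTILINE)[1:]
    for record in records:
        _header, _, body = record.partition("\n")
        sequence = "".join(body.split())  # drop newlines and spaces
        if not _RESIDUES.fullmatch(sequence):
            return False  # empty sequence or a disallowed character
    return True
```

A record with a header but no residues is rejected, mirroring the empty-string check in the original function.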

 

On being cool: arrows on an R plot

Recently I needed a schematic graph of traditional vs on-demand computing (don’t ask) – and in this hand-waving setting I just wanted the axes to show arrows and no labels.  So, here it is:

x <- c(1:5)
y <- rnorm(5)
plot(x, y, axes = FALSE)
u <- par("usr") 
arrows(u[1], u[3], u[2], u[3], code = 2, xpd = TRUE) 
arrows(u[1], u[3], u[1], u[4], code = 2, xpd = TRUE)

And here is the output

Arrowed plot

(I pinched this off a mailing list post, so this is my due reference)

The next thing I am toying with is xkcd-like graphs in R.

A free, sweet, valid HTML4 “Site Maintenance” page

So today we moved from the cloud to a physically local server, and we needed a “Site Maintenance” page.  A few google searches turned up a simple HTML5 template, which I converted to HTML4 and reproduce below (could not find the original source, aargh):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>
    <meta http-equiv="Content-type" content="text/html;charset=UTF-8">
    <title>Site Maintenance</title>
    <style type="text/css">
      body { text-align: center; padding: 150px; }
      h1 { font-size: 50px; }
      body { font: 20px Helvetica, sans-serif; color: #333; }
      #article { display: block; text-align: left; width: 650px; margin: 0 auto; }
      a { color: #dc8100; text-decoration: none; }
      a:hover { color: #333; text-decoration: none; }
    </style>

</head>
<body>
    <div id="article">
    <h1>We&rsquo;ll be back soon!</h1>
    <div>
        <p>Sorry for the inconvenience but we&rsquo;re performing some maintenance at the moment. If you need to you can always contact us on <b>opig AT stats.ox.ac.uk</b>, otherwise we&rsquo;ll be back online shortly!  Site should be back up on Friday 1st March 2013, 16:00 GMT.</p>
        <p>&mdash; OPIG</p>
    </div>
    </div>
</body>
</html>

And here is what it looks like… (nothing glamorous, you have been warned)

SiteMaintenance