The Sprawl: Slogs in Scribing and Software

“Dead shopping malls rise like mountains beyond mountains. And there’s no end in sight.”

Régine Chassagne

Sometimes I wonder whether my PhD would have been simpler if I had broken up the findings into three smaller papers. In the end there were 7 main figures, 7 supplementary figures, 5 supplementary tables and one supplementary data section in one solitary publication: the contents of a 3 year, 3 month tour through the helper T cell response to the inner proteins of the flu virus. The experimental work comprised crystal structures, cell assays, tetramer staining and TCR sequencing. During the following years, as it was batted back and forth between last authors, different journals and reviewers, I continually reworked the figures and added extra bioinformatic analyses. I was fortunate that others in the lab kindly performed some in vivo experiments, which helped cement the findings. It all started in January 2014, but the paper wasn’t published until July 2020. There are many terms which could be used to describe how the process of writing and re-writing felt as it dragged on through my 3 year postdoc. For the purpose of this very public blog I will refer to it as “a slog.”

Right now I am finishing up a similarly sprawling project. While the timelines are much shorter, keeping track of the many threads, the changes to the codebase, updating the figures, staring at the latest results and worrying whether there was a better way to do something all take me back…

There is always an element of chaos when trying to finish up a complicated project and tie it together into one coherent narrative. However, is it possible to reduce this chaos and make the process of writing up smoother? Some projects are definitely easier than others. Some have a nice story, a simple message and a minimal amount of analysis; others have no clear direction, shifting goal posts and models or analyses that simply will not play ball. So with hindsight and rose-tinted spectacles I will note down some thoughts on avoiding and then navigating these scenarios:

Thought 1. Know exactly where you are going before you set off.

The ideal scenario is to begin a research project with a clear vision of how the paper will look in the end. This keeps you on track and ensures you are not wasting time lost in the wilderness. If you can lay out the sections and figures as placeholders and see your day-to-day work as filling in these parts, then it will make life much easier… Do this right from the start and watch as each experiment and analysis you do fits neatly into its right place and everything works as it should. Publish and move on.

The above scenario is both boring and unrealistic. It is one end of a spectrum, on the other side of which exists a lone researcher messing around with experiments and data having no thought for how they will fit this into a clear story (but having a lot of fun).

Aim for order, but be prepared for chaos.

Thought 2. Draw a line under non-interdependent sections and move on.

So you started out thinking you knew where you were going, but now you are lost, trying to make that model work, finish that killer experiment or massage some data to work it into your story. You are tired and frustrated with a lack of progress, but you need to finish the paper and get on with your life. At this point you need to divide up your work into non-interdependent parts, write them up separately, tell yourself they are done, put them out of your mind and focus all brain power on what you have left to do.

If you are a perfectionist or an over-thinker, this is hard to do. It is tempting to revisit old parts of your manuscript in the quiet hours of your day, to rework them and try some new visualization. Best to have a combination of discipline and detachment and say no to your bad self.

Now this is nice if you have a project where things do separate out and do not need revisiting, but what if you are working on something that is highly interdependent, a language model project for instance? Or perhaps multiple language models and a software suite that allows users to run each model on different types of data and produce multiple output formats. You have been testing and fine-tuning the models as well as checking the code. It is all changing, and every figure needs to be updated when things change. How do you stay sane?

Thought 3. Kanban boards and software tests.

There is something about having all of the many things you have to remember and keep track of on a giant Kanban board, right in front of you, that gives one some sense of control. “It is written down, so I don’t need to remember it.” When the project becomes too big and your head is swirling with all the things that need to be done before you can submit that pre-print, the Kanban helps.

In line with this, every time you make a change to your copious codebase, you need automated tests to flag when something is messed up. Knowing that pytest is going to catch your code errors and alert you with a huge red banner across your terminal is mildly comforting.
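As a rough illustration, here is a minimal sketch of the kind of test file I mean. The helper `format_predictions`, the file name and the example scores are all made up for this post (they are not from the actual project); the only thing taken from real life is that pytest collects functions whose names start with `test_`.

```python
# test_output_formats.py  (hypothetical file name)
import csv
import io
import json


def format_predictions(predictions, fmt="json"):
    """Hypothetical helper: serialise {sequence_id: score} as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(predictions, sort_keys=True)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["sequence_id", "score"])
        # Sort the ids so the output is stable between runs.
        for sequence_id in sorted(predictions):
            writer.writerow([sequence_id, predictions[sequence_id]])
        return buffer.getvalue()
    raise ValueError(f"unsupported output format: {fmt!r}")


# Made-up scores, purely for illustration.
EXAMPLE = {"seq_b": 0.25, "seq_a": 0.9}


def test_json_round_trip():
    # Parsing the JSON output should give back the original scores.
    assert json.loads(format_predictions(EXAMPLE, "json")) == EXAMPLE


def test_csv_has_header_and_sorted_rows():
    lines = format_predictions(EXAMPLE, "csv").splitlines()
    assert lines == ["sequence_id,score", "seq_a,0.9", "seq_b,0.25"]


def test_unknown_format_is_rejected():
    # An unsupported format should fail loudly rather than return nothing.
    try:
        format_predictions(EXAMPLE, "xml")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown format")
```

Running `pytest` in the same directory should pick up all three tests. If a later change to the formatting code breaks one of the output formats, the failing test is what produces the red banner.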

You are essentially outsourcing the checklists in your head to give you some room to think through those final parts of the paper, or get into the zone so you can write more code. It is still a mess of experiments and analyses, but you are fighting to keep it under control, with a Kanban and a test suite in each hand.

However, some days it gets too much, and while you have outsourced both your project milestones and software checklists, there is one final thing to offload.

Thought 4. Embrace the chaos and let go.

From time to time it is helpful to remind yourself that it is only research, and it doesn’t always go to plan. You will never have a perfect model, you may never understand the complexity of our biological systems, and that is okay. Outsource your attachment to being perfect, and being in control, and just try to do what you can.

The project where everything works is boring; people go to the wilderness to find themselves. If you are in the research wilderness, embrace it. When it is all over, there can be some inner satisfaction in knowing that you navigated through the jungle and lived to fight another day.

Thought 5. Finish it.

The end.
