Bioinformatics training at the Norwegian Veterinary Institute

Earlier this month, I got something that most academics regard as something akin to a miracle – I got steady employment. I now have a work contract without an end date!

I’ve now been working at the Norwegian Veterinary Institute for two years. These have been two very good years, and I’ve had the opportunity to work on many interesting projects. The Institute has quite a broad portfolio of responsibilities – it is tasked with diagnostics, surveillance, monitoring, risk assessments and giving scientific advice to the government on a whole slew of issues. Considering the inroads into various fields that bioinformatics and sequencing have been making over the last 10 years, it is not surprising that they have now come to the Institute as well. However, it has also gradually become noticeable that I am the only one at the Institute who can do the things that I do. I have at times become a roadblock.

So, in good Software Carpentry tradition, I’ve been working on training people. This fall we are able to put this training on a more formal footing – we are starting up a year-long course aimed at training around 20 people to become, for the most part, self-driven bioinformaticians. And, again in good Software Carpentry tradition, I am writing about it in case somebody has feedback and/or would like to do something similar themselves.

The course is slated to run for about a year, and we’ve estimated that people will spend approximately one day a week on it. The course will proceed in two distinct stages, with an introductory lecture series followed by a hands-on practical course.

Open introductory lectures
The first part of this course consists of a lecture series that is open to all at the Institute. This series is meant to give people unfamiliar with bioinformatics and sequencing an introduction to what bioinformatics is, what kinds of problems you can solve with it, how to design studies, the methods that are involved in analysis, and so forth. This part has already started; this week we are continuing with the 3rd lecture in the series.

After the open introductory part, around 20 people will be selected for further participation in the practical part.  This part of the course will proceed in four stages:

Bioinformatics infrastructure workshops
The Institute is a pure Windows shop. Thus, most of the people here have little to no experience working within the kind of infrastructure that is commonly used for bioinformatics. Because of this, we have chosen to use the HPC resources at the University of Oslo (UiO) for analyses. To interact with the UiO systems, we have opted for Windows laptops running VirtualBox with a Bio-Linux image. That will put the learners on a Unix platform with tools that can be used when/if needed, while also enabling easy interaction with the HPC cluster. However, to use this kind of setup effectively, our learners need to know quite a few things. Thus we will be introducing them to VMs, how to use the shell, and how to interact with the UiO computers (including the queueing system). We will also teach them a bit of basic R and Python programming. Fortunately, we don’t have to reinvent the wheel ourselves here – Software and Data Carpentry have a lot of existing material that we will use for this part. This part is planned as whole-day hands-on workshops.

We will subsequently divide the participants into three working groups. Not all people are interested in the same kinds of things, and it takes time to learn something properly. Through discussions with people at the Institute, we have figured out that people are primarily interested in comparative genomics, transcriptomics and metagenomics. Each separate group will have their own mentor. From this point onwards the workshops will be one half-day every week, with homework.

Case study workthrough
Once the common infrastructure part is done, we will have the students working in their separate groups. They will first work through a case study that is well known to their mentor, and in doing so become familiar with the common analyses within their field.

Working with their own data
They will then proceed to working with their own data, solving their own specific problems. This work will be done under the guidance of their mentor.

Writing up a paper
We have also decided to include writing up a paper based on their own data in this course. Getting the figures and the tables created is one thing; writing it up and formulating the results is something else. It can be hard to figure out exactly what your results are saying, and thus we decided to include that in the course too.

As is evident from the description above, this is quite an extensive program. I am very happy that I will not be doing this on my own. The Institute has hired two people in 20% positions to work on this with me. The others are Arvind Sundaram, who works at the Norwegian Sequencing Centre, and Thomas Haverkamp, who is at the Biology department at the University of Oslo. During this year, I will be mentoring the comparative genomics group, Arvind will be mentoring the transcriptomics group, while Thomas will be mentoring the metagenomics group.

All in all, I am very excited that we are able to start this course now, and I look forward to helping upskill people at my Institute. However, I also have to admit that it is a bit terrifying being responsible for the further education of this many people. But we have to start somewhere, and I am fortunately not going it alone (another thing that Software Carpentry has taught me). I’d also very much like to hear from people who have done similar things, even if on a smaller scale. I also aim to write about this regularly here, to get more feedback and possibly help and inspire others who would like to do similar things.

Teaching Software Carpentry workshops – some tricks of the trade

These days I am gearing up to teach two more Software Carpentry workshops, one in Wageningen, Netherlands, and one in Oslo, Norway. In the Netherlands workshop I will be teaching a module that I haven’t even looked at before. This led me to think about the things I do to prepare for a workshop. So, here is a list (in no particular order) of things that I do that others might find useful.

  • Go through the instructor checklist. Have a look at the other checklists too; that helps with figuring out what you can expect of the other parties involved in the workshop.
  • Recently, all of the workshop modules have been put into their own GitHub repos. Go sign up for notifications for those that you are teaching. It is highly likely that discussions about the material will prove useful. These can contain both information about technical issues and about how to teach that particular module.
  • Go through your module(s) on as many platforms as are available to you. If you are thusly inclined, consider creating a virtual machine or two and go through both the installation procedure and the module there. Remember, this takes time, so start before you think you have to; there will always be weird hiccups.
  • Print out a copy of the lessons on paper and make notes on them as you go along. Take them with you to the workshop. During my first workshop I did not have a printout, and it was not a pleasant experience trying to switch back and forth between windows. I don’t know if I or the students ended up being the more confused.
  • Have a look at the wiki for technical issues and familiarize yourself with the latest technical annoyances.  Ensure that you have an easy way to get back to it again during the workshop. I have forgotten where it is a couple of times, and it was equally annoying each time having to spend time figuring out where it was.
  • Make sure that the host supplies stickies, and consider taking a backup stash with you in case the host misplaces them or simply did not get them because they didn’t believe in them. Ensure that you have at least twice as many stickies as students, sometimes they lose them, sometimes they spill coffee on them, sometimes they distractedly end up tearing them into tiny tiny little pieces. You get the picture.
  • During the workshop – USE THE STICKIES! They are a lifesaver. If you have not taught with them before, just give them one single go and that should be enough to convince you. It is a lot easier to keep track of where people are with them than without, you can keep a higher speed through the material without losing anybody, and it is a lot easier to see who needs help. It also saves students sore shoulders since they don’t have to keep their hands up in the air until they fall off. On a more serious note, I suspect students ask for help more quickly with stickies since the overhead cost associated with it is reduced – it is not very taxing to put a stickie on your screen.
  • When you are live coding (typing on your computer) for the entire lesson it is tempting to sit down. Consider teaching standing up instead. It helps with speaking clearly and loudly enough so that people can hear. I also suspect that instructors may be quicker to go and help people when teaching standing up, because you don’t actually have to get up first. If you decide to teach standing up, tell the organizer so that they can fix something to have the computer on.
  • Bring good walking shoes. If you enjoy wearing heels, leave them at home. You are likely to do a lot of standing up and walking about, both during the workshop and in the evenings. You do not want to end up teaching with blisters. Also, you are likely to be walking around in a room with a lot of extension cords and leads lying on the floor. The risk of tripping over something is already higher than normal.
  • Bring throat lozenges or cough drops or whatever they are called, and a bottle of water. You will end up speaking a lot more than you are used to, which might lead to a sore throat, coughing and in a worst case scenario, losing your voice. I once got a coughing fit while teaching and it was not a fun experience.
  • If you can, try to get together with the helpers, the other instructors and the organizers the evening before the workshop. It really helps to have met before the workshop. Everybody, especially the helpers, is bound to have questions about things, questions that won’t have occurred to them until they are actually talking with others involved in the workshop. This is also good for giving last minute information, ensuring that everybody knows where and when to show up, organizing transport etc.
  • Ensure that you get to the workshop in plenty of time in the morning. The building you are teaching in might be confusing to navigate, so give yourself enough time to get there. You will then also have time to set up your own computer, sort out your papers etc.
  • Last but not least: have fun!

So there you have it!

The one where I went to Sweden

I spent some days two weeks ago in Stockholm, Sweden. Lex Nederbragt and I were invited by SciLifeLab to teach a Software Carpentry workshop there. This coincided with the very first PyCon Sweden Conference, and as the organizers would have it, I got to present a talk.

The workshop

The workshop went very well. Lex and I taught the by-now fairly well known novice workshop (if you want one at your institution, let them know!). Oxana Sachenkova, the local organizer, had also set up an intermediate workshop. The teachers in that one were Konrad Hinsen and Nelle Varoquaux, both flying in from Paris. Their workshop focused more on object oriented programming and intermediate git use. It was great meeting them; the only sad thing is that I could not sit in on their workshop.

The division of labor between Lex and me has until now been that he teaches shell and unit testing, while I teach git and Python. This time I taught both these parts from the new lesson material that has been developed. I had taught the git lesson once before, so that material was well known to me. I think this lesson is reasonably easy to teach; the real challenge is to convey to the students why version control is useful at all. At this stage I am leaning towards most people not really understanding the need for version control before they have either messed up their work pretty badly, or have become involved in a joint development project.

I had not taught the Python lessons before. These now take place entirely in the IPython Notebook. The first time I went through them, I actually wondered if I should return to the old lesson material, if nothing else because on the printout I had somewhere around 50 pages to go through. On the second run through, however, I realized that the notebook is a game changer. With the notebook, I could have the students edit and copy-paste code from earlier in the lesson, which would reduce the typing time and hence the teaching time dramatically. There were still things that I cut from this lesson – I did for example not go through the Python call stack, simply because I still think this is too complicated for novices. Instead, I teach them the basic tenet “What happens in a function, stays in a function”, and that does seem to stick.
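For readers who have not met that rule before, here is a minimal sketch of what it means in practice. The function and variable names are made up for illustration and are not taken from the lesson material.

```python
def normalise(sequence):
    # 'cleaned' is a local name: it only exists while the function runs.
    cleaned = sequence.strip().upper()
    return cleaned


cleaned = "untouched"            # a separate name, defined outside the function
result = normalise("  atgc\n")   # the function only hands back its return value

print(result)    # the cleaned-up sequence returned by the function
print(cleaned)   # the outer name, which the function did not change
```

The assignment to cleaned inside the function does not leak out; the only thing that leaves the function is what it returns.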

The conference and my talk

Due to teaching I only got to attend the last day of the conference. The programme looked really nice, and I got to see some really great talks. The morning of the last day opened with Laurens Van Houtven speaking about cryptography, and Jackie Kazil speaking about how she started using programming in her journalism and how that led her to new pastures. After lunch there were several other talks, most of which were pretty technical. Such talks can be really good, but to me they lose their value when they don’t even have a 3 minute “subject of my talk for dummies” intro.

My talk was at the end of the day, and was entitled “Python and Biology: a shotgun wedding” (pardon the pun, when the title appeared in my head, resistance was futile). The background for the talk was that I have several times during the last couple of years helped people – primarily biologists – start programming. Naturally, as opinionated as I am, I have ended up with some do’s and don’ts on where to start. I also included a bit of background on why life scientists have had to get into this game, and also showed some examples. I have included the slides below.

[gview file=””]

The talk seemed to be fairly well received – it was however aimed at novices, and there did not seem to be too many of those in attendance. I did however see some people nodding vigorously in the front, and got some really nice questions at the end, so all in all I think it went over well.



Basic bioinformatics python course, part II

This is the second part of the Python bioinformatics course that I have taught to biologists. This module is about control flow and how to handle input and output. Control flow is needed in mainly two situations: when a decision has to be made based on data, or when a piece of code should be repeated. In Python, decisions are made using an if statement, while iteration (repeating code) is done with either a for loop or a while loop. How to handle input and output from files is also described – in most cases that is where the data in question is to be found, and it is easier to keep track of results if the program writes them to a file.
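To give a flavour of these constructs, here is a small sketch that is not taken from the slides. It assumes Python 3, and the file names, the sequences and the two helper functions are invented for illustration. It uses an if statement, a for loop and a while loop, reads sequences from one file and writes a small report to another.

```python
import os
import tempfile


def gc_fraction(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    if len(sequence) == 0:           # decision: avoid dividing by zero
        return 0.0
    gc_count = 0
    for base in sequence.upper():    # iteration: look at every base once
        if base in "GC":
            gc_count += 1
    return gc_count / len(sequence)


def first_stop_position(sequence):
    """Return the index of the first in-frame stop codon, or -1 if none."""
    stops = ("TAA", "TAG", "TGA")
    position = 0
    while position + 3 <= len(sequence):   # repeat while a full codon is left
        if sequence[position:position + 3] in stops:
            return position
        position += 3
    return -1


# Hypothetical input: one sequence per line, written to a temporary directory
# so that the example does not depend on any existing file.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "sequences.txt")
out_path = os.path.join(workdir, "gc_report.txt")

with open(in_path, "w") as handle:
    handle.write("ATGCGC\nATGAAATAA\n\nGGGG\n")

report_lines = []
with open(in_path) as infile, open(out_path, "w") as outfile:
    for line in infile:
        sequence = line.strip()
        if not sequence:             # skip empty lines
            continue
        report = "{}\t{:.2f}\t{}".format(
            sequence, gc_fraction(sequence), first_stop_position(sequence)
        )
        outfile.write(report + "\n")
        report_lines.append(report)
```

Writing the report to a file rather than only printing it means the results are still there after the program has finished, which is the point made above.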

[gview file=””]

Basic bioinformatics python course, part I


I have on several occasions had the privilege of teaching basic programming to biology students. My preferred language in this situation, as in many others, is Python. I have also been fortunate enough to find a book which I think does a fairly good job of teaching basic Python in a way that biologists find useful. In this context that mostly means dealing with sequences in a sensible way. The book in question is “Python for Bioinformatics” by Sebastian Bassi.

My only reservations are that it contains some spelling mistakes and that it is from 2009. Python has since progressed to version 3, whereas the book uses version 2. However, for a beginning programmer, this should not make too much of a difference.

Here I am sharing the slides that I used for a one-day intro course for biologists. The course is very interactive, meaning that the slides contain many short exercises, each followed by its answer. In this post I am sharing the first set of slides, which deals with the basics; the rest will follow during the next couple of weeks.

Note: I have tried to ensure that these slides are bug free, but there are bound to be some mistakes somewhere. Please let me know if you spot any!


Part 1: The basics

The first lesson begins with a short discussion of programming and of the two modes in which Python can be used – interactively and in batch mode. I then go through the basic datatypes in Python, i.e. what kinds of “things” are available. I cover how to use Python as a calculator, how to work with strings, and what lists and dictionaries are and how to use them.
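As a rough illustration of the kind of material this covers (a sketch, not an excerpt from the slides; it assumes Python 3, and all the numbers and sequences are invented):

```python
# Python as a calculator: expected sequencing coverage of a genome.
genome_length = 5000000
read_count = 1000000
read_length = 150
coverage = read_count * read_length / genome_length

# Strings: a DNA sequence is simply text.
dna = "ATGGCCATTGTA"
dna_length = len(dna)            # number of bases
first_codon = dna[0:3]           # slicing out the first three bases
rna = dna.replace("T", "U")      # a new string with every T replaced by U
g_count = dna.count("G")         # how many times G occurs

# Lists: an ordered collection that can grow.
codons = ["ATG", "GCC", "ATT"]
codons.append("GTA")

# Dictionaries: look up a value (amino acid) by its key (codon).
codon_table = {"ATG": "M", "GCC": "A", "ATT": "I", "GTA": "V"}
second_amino_acid = codon_table[codons[1]]
codon_table["TAA"] = "*"         # adding a new key-value pair

print(coverage, dna_length, first_codon, rna, g_count)
print(codons, second_amino_acid, len(codon_table))
```

Each of these lines can also be typed one at a time at the interactive prompt, which is a convenient way to see what every datatype does.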

[gview file=”” save=”0″]