Biologists Should Not Learn How to Code

OK, that’s a little extreme, but over the Thanksgiving weekend, I came across some advice which essentially amounted to biologists should learn how to code. I’m not sure that’s great advice. Since I’m in the genomics bidness, that probably sounds weird, but one of the few advantages of being…not young is that you have seen multiple rounds of similar advice.

It used to be (yoostabee!) that people were told to learn PCR and sequencing, statistics, and advanced mathematics, to which we can now add coding. There’s nothing wrong with learning to code–if you’re more than a dilettante, you would have other career options if you want or need to leave biology (but you do have to have some credentials; a semester of intro Python isn’t going to cut it). But what we’ve seen with those other supposed must-have skills is that they are now either outsourced or so ‘productionized’ that you don’t really need to learn them. Mind you, it’s always good to understand how things work–and knowing things can come in handy–but you can probably do very good biology without knowing these things.

When it comes to biologically relevant coding, we’re already starting to see publicly available pipelines (e.g., Galaxy) that don’t really require much more than some very basic file manipulation and knowing how to use that system, neither of which qualifies as coding (any more than being an Excel maven means you’re good at coding). My hunch, and I could be wrong, is that in five to ten years, much of what constitutes the daily work of bioinformatics will be much more standardized, in the same way most researchers order a Qiagen kit for DNA extraction instead of doing it old school.

What we do need are biologists who know biology, and you can never know too much biology. Again, if you’re thinking about career backups, there’s nothing wrong with becoming an expert (or close to one) in a given skill set (e.g., statistics, coding), and it also will be a good thing as a biologist. But I don’t think coding will be the must-have skill set a few years from now (except for those who develop new tools).

Put this another way: if coding is still a really good skill for biologists to have a decade from now, then we might have done something very wrong.

We do need some people who are good at these things, but you can’t learn everything (at some point, the goal of graduate school is to graduate).

5 Responses to Biologists Should Not Learn How to Code

  1. bks says:

    Biologists need to learn to write down the protocols for their experiments, create a sample data set, and analyze the sample data set, all in advance of an experiment. If they refuse to do that, all the rest is moot, and most of them refuse to do that.

  2. Matthias Z. says:

    The headline is a bit misleading, since the text itself is carefully worded and quite balanced. I agree with many statements, but not with its conclusion. Of course, nobody expects a biologist to be an expert coder, but I advocate that some basic coding skills are part of any higher education. It is not about the actual coding skills, but about the skills one learns while learning how to code.

    We encourage kids to play sports and join a team – not because we necessarily want them to become pro athletes, but for the (physical) exercise. Along the way, they happen to learn sportsmanship, the value of teamwork, the power of endurance and the joy of improvement.

    Similarly, coding requires one to thoroughly analyze a problem, to break it down into more manageable tasks, and to develop a strategy for the solution (an algorithm!), all before the first line of actual code is written. Considering how much one will have to google around, one also learns how to make good inquiries and how to familiarize oneself independently with new knowledge. Admittedly, all of those steps are also part of a good experimental plan in the lab, so a biologist should know them already, but it can never hurt to practice as often as possible.

  3. vdauwera says:

    Learning to code is a great skill for a biologist who is interested in helping software engineers develop tools and interfaces that will actually be useful for and usable by biologists 😉

  4. Sub-Boreal says:

    I’m a late-career environmental scientist, though not a biologist, who happened on this discussion, and found myself nodding my head.

    I’m old enough to have been in awe of my math-ier classmates who produced stacks of punched cards to do stuff back in the mid-70s that probably didn’t amount to much more than a few Excel functions.

    And – go ahead and shoot me – it seemed a lot more fun and interesting to spend my time learning about the things in my own (and allied) fields that I could master and make a contribution to, not stumbling around trying to be a half-assed coder.

    Then in the fullness of time, doing computer stuff became a lot easier and approachable for normal people who preferred to spend their efforts on things that they liked and were good at. That was roughly the stage that things were at when I had to do some stats in grad school, and then occasionally as a practitioner in government.

    Much later, returning to academia in mid-life, I was baffled by the popularity of things like R. It seemed like such a throwback to the era of do-it-yourself coding and the reasons for its popularity – apart from cost – seemed invisible to me.

    What am I missing here?

  5. Jonathan Jacobs says:

    Agree 100% with this, Mike. As an “older” scientist who didn’t even enter college until I was in my mid-20s, I’ve seen this kind of advice in multiple fields. Bioinformatics is largely reaching an asymptote of methods based on the current data type most of it is focused on. Because of that maturity, a large slice of biofx methods are being wrapped up into commodity solutions you can just install and use. No coding required. QIAGEN’s CLC Genomics Workbench, Geneious, and BioNumerics are all commercial examples that have been around for over a decade. GALAXY, UGENE, QIIME2, etc. are free open-source examples too. The need is there. For biologists who are eyeballs deep in too much work, the last thing they want to hear is: learn Python. And then get good at it to build your own pipelines. Oh, you have to learn Linux, Snakemake, GitHub, R, some Perl maybe (if you’re lucky 😉) and also know which of the 100 different competing open-source tools are your best option.

    No. They aren’t going to do that. Nor should they.

    Now – for elective career/professional development or as a minor (in undergrad) etc – then yes of course by all means: Never Stop Learning More Skills.

    For me personally – I would rather collaborate with someone who already has those skills and then spend my spare time with my family playing games, backpacking, or learning some new skill unrelated to work.