Cloud Application Bioinformatician
EBI - European Bioinformatics Institute
Hinxton near Cambridge, United Kingdom
EMBL-EBI seeks an experienced ‘cloud’ Bioinformatician to contribute to the bioinformatic analysis of cancer genomes. Despite the widespread use of whole genome sequencing on cancer samples, the analysis of the data is still a challenging task. This is due in large part to the size of the data, which requires substantial capabilities in terms of storage and compute, especially when analysing large patient cohorts, which in addition are generally protected by privacy rules. Processing these large datasets requires robust software solutions that can be deployed in a wide variety of compute infrastructures.
The purpose of the project is to develop a suite of cancer genome analysis workflows on EMBL-EBI’s Embassy OpenStack cloud compute infrastructure, which comprises more than 6000 vCPUs and 4.5PB of storage. You will run large-scale analyses for our own research purposes and provide training for external collaborators in using these tools. You will set up these tools and run them on thousands of samples provided by collaborators. In particular, you will:
- Establish a reusable deployment and monitoring methodology in an OpenStack environment;
- Curate and automate bioinformatic pipelines and other workflows for cancer genome analysis available at Dockstore;
- Manage data access and download the data;
- Run the pipelines, QC and curate results;
- Link the results to other EBI resources;
- Train other users in the use of the cloud-based pipelines.
You will work within the Genome Analysis team (led by Daniel Zerbino) and collaborate closely with the Computational Cancer Biology research group (led by Moritz Gerstung), which will provide scientific expertise for algorithmic development, and the Systems Application group, which will provide technical expertise for OpenStack cloud computing. The Genome Analysis team is itself part of Ensembl (led by Paul Flicek).
At EMBL-EBI, we help scientists realise the potential of ‘big data’ in biology by enabling them to exploit complex information to make discoveries that benefit mankind. Working for EMBL-EBI gives you an opportunity to apply your skills and energy for the greater good. As part of the European Molecular Biology Laboratory (EMBL), we are a non-profit, intergovernmental organisation funded by 22 member states and two associate member states, and we are proud to be an equal-opportunity employer. We are located on the Wellcome Genome Campus near Cambridge in the UK, and our 600 staff are engineers, technicians, scientists and other professionals from all over the world.
Qualifications and Experience
You will have a degree or equivalent qualification/experience in the Computational, Physical or Biological Sciences and the ability to work effectively in English. In addition, you are required to have:
- At least 3 years’ experience as a user of OpenStack, AWS, GCP, Azure or equivalent cloud management frameworks running virtualised workloads (either in virtual machines or containers);
- Experience of configuring and managing Linux or equivalent systems (and their applications) in the cloud so that they meet the needs of scientific applications;
- Experience of communicating and working with expert technical users (e.g. in a science community);
- Experience of working on large scientific projects, especially in the field of (cancer) genomics, would be strongly preferred.
You must have extensive experience with computers and be fully fluent with:
- OpenStack, AWS, GCP, Azure or equivalent cloud management frameworks;
- Scripting languages (Perl, Python).
Knowledge of DevOps methodology would be highly advantageous.
You must be capable of autonomously leading your work in a thorough, timely and detail-oriented fashion, as you will have to keep track of many parallel tasks. Communication and presentation skills within multidisciplinary teams will be highly valued.