Lead Senior Research Data Management System Developer

Posted to IASSIST on: 2015-12-17

Employer: Harvard Medical School

Employer URL: http://www.harvard.edu/


A joint project between the SBGrid Consortium at Harvard Medical School and the Dataverse Team at the Institute for Quantitative Social Science at Harvard University has an immediate opening for a lead developer to help us build a next-generation data publication system for large biomedical datasets. We aim to make biomedical datasets publicly available through a federated data grid to facilitate access, citation, and data analysis by scientists. Our pilot collection includes datasets generated using X-ray crystallography, computer modeling, lattice light sheet microscopy, and microED diffraction. This collection is currently replicated to computing centers in the US, Europe, Asia, and South America. The project is supported by the Helmsley Charitable Trust and was recently selected as a pilot of the U.S. National Data Service. To learn more about the environment, please visit our current implementation at data.sbgrid.org and our group websites at sbgrid.org, slizlab.org, and http://datascience.iq.harvard.edu/team.

The lead developer will be responsible for migrating our in-house research data management system, written in Python, to Dataverse (http://dataverse.org). The migration first requires extending Dataverse, with the full support of the Dataverse development team, to include the necessary features. The candidate will develop a final set of requirements based on the feedback and experience of the end-user community using our current pilot system. Examples of features that must be added to Dataverse include better support for large (~100 GB) datasets, automatic data validation pipelines, and other functionality relevant to specific biomedical data types. The lead developer will also help evaluate data transfer, upload, and management technologies, such as Globus, that can integrate with Dataverse to support larger datasets and provide direct computing on the data. The developer will work with our team to ensure that all new functionality developed under this project is merged into the Dataverse open-source project and shared with the community.

As a senior member of our team, this individual will also help train junior members, collaborate with collection specialists, and present outcomes of the project at meetings and conferences.

A bachelor’s degree in computer science or engineering is essential, as are 5-8 years of strong programming experience, preferably in Java and Python and ideally in the context of web applications.

Our team welcomes candidates with diverse technical backgrounds, but the successful candidate will have experience handling large datasets and leading software development projects. A working knowledge of Linux, shell scripting, databases, and distributed version control systems (Git, Mercurial, etc.) is also necessary. The ideal candidate will also be familiar with data management software and the handling and analysis of large datasets.

Archived on: 2016-01-08