IASSIST 2025: IASSIST at 50! Bridging oceans, harbouring data & anchoring the future


Big Data and The Dataverse Project

Research data is growing exponentially in size and complexity, and funders are increasingly calling upon researchers to share their data. To meet these challenges, the Harvard Dataverse Repository now offers flexible, sustainable solutions for the stewardship and archiving of large-scale research data. We combine the comprehensive data stewardship features of Harvard Dataverse with storage solutions from Amazon Web Services (AWS) and the Northeast Storage Exchange (NESE) and with file transfer mechanisms such as S3 and Globus to provide cost-effective options that meet today's large data-sharing challenges.

Large data project outputs are diverse. They run the gamut from datasets with many small files, to datasets with a few extremely large files, to collections that start small but grow quickly over time. Furthermore, data re-users expect to access large datasets on demand or asynchronously, through graphical user interfaces (GUIs) or through scripts that call application programming interfaces (APIs). Our service offerings, from data management planning to archiving, address these diverse use cases. We offer pre-deposit consultations, quotes, and letters of support; curation assistance with arranging and describing datasets for usability; and long-term storage, access, and retention options for researchers' large research data collections.

Harvard Dataverse Repository's Large Data Services empower the research community by ensuring reliable long-term access to large datasets, supporting compliance with funder and institutional data-sharing requirements, and providing cost-effective alternatives to commercial storage solutions. We ensure that data remains accessible, preserved, and valuable for years to come. We look forward to engaging with the community at this conference to share our approach, learn from others, and explore potential collaborations.

Ceilyn Boyd
The Dataverse Project, Harvard University
United States

Sonia Barbosa
The Dataverse Project, Harvard University
United States