Hands-on Big Data

Presenter 1
Ryan Womack
Rutgers University

This workshop is for those of you who, having read about Big Data and seen some of its results in academic studies and the commercial world, would like to get a sense of what actually working with Big Data entails.

The workshop will provide an overview of key technologies for handling and analyzing large-scale datasets, including Hadoop/MapReduce, the RHadoop package, other R packages used for large-scale analysis, and Big Data environments such as Cloudera, Hortonworks, Tessera, and Amazon Web Services. We will also discuss some of the primary challenges in analyzing large-scale data, such as integrating and structuring heterogeneous data, handling sparse matrices, and devising effective analytical routines that split data and process it in parallel. Participants will work in a live demonstration environment that provides a realistic introduction to Big Data analytics, using scripts that run both on a scaled-down demonstration dataset and on truly large-scale data.
