Training Calendar
HBSGrid Next Steps Seminar: Troubleshooting Jobs/Applications on the HBSGrid Cluster
- 03 OCT 2023 11:00 AM - 11:50 AM
- Online via Zoom
Doing research and getting started on the HBSGrid cluster is easier now than in the past. But if things aren’t working correctly, your applications won’t start, or you are new to using a compute cluster, it may not be obvious how to troubleshoot the situation or resolve problems quickly.
This combined seminar + demo will offer guidance to help you quickly and efficiently troubleshoot the following situations:
- PENDing jobs
- Failing or crashing jobs
- Slow or poorly performing jobs
- Software and software installation issues
Please join us: this seminar is aimed at any and all users, especially those who are new or have never used a cluster before. For the Zoom meeting information, please register.
Future seminar topics will include:
- Increasing code/job performance and throughput
- Writing and running parallelized code on the HBSGrid
- Advanced cluster features and running workflows
Research Data Management
- 09 FEB 2022 9:30 AM - 11:30 AM
- Online via Zoom
Want to be more efficient and save time doing your research and collaborating with others? Looking for new ways to promote your work and make a worldwide impact? Then come to this workshop to learn techniques and services to help you manage your research data. You will learn practices that ensure that your research is documented, reproducible, and accessible long-term. Topics include how to acquire specialized data for your research, resources and tools that support your use of data throughout the research lifecycle, compliance with internal and external data policies and regulations, and ways to make data from Harvard researchers available to others where feasible.
This class, a combination of seminar and discussion, will highlight robust data management and documentation practices to help you, your future self and fellow researchers be successful in these areas.
Pandas in Python
- 10 DEC 2021 1:00 PM - 3:00 PM
- Online via Zoom
Pandas is an incredibly useful Python library that facilitates working with data. This workshop will provide a basic introduction to pandas including reading and writing files, selecting and filtering, producing summary statistics, data manipulation, and plotting. Basic experience with Python is recommended.
Large Data Handling in R
- 03 DEC 2021 10:00 AM - 12:00 PM
- Baker Library B82 OR Online via Zoom
REGISTER FOR IN PERSON ATTENDANCE
REGISTER FOR VIRTUAL ATTENDANCE
This workshop will teach you techniques and tools for working with data that is too big to comfortably fit in your computer's memory. This is a common problem because most popular data analysis software uses in-memory data structures for speed and convenience. Historically, tools for working with out-of-memory data in R were clunky and limited, but newer solutions are more user-friendly and powerful. While the workshop examples focus on R, these tools and techniques can also be used from Python or other languages or environments, and we expect this workshop to be useful for R, Python, and other users.
Pre-Requisites: Basic familiarity with R; familiarity with the tidyverse packages and/or SQL will be helpful.
Additional Information: We may use the HBS Grid research computing cluster for demonstrations and examples, but this is not required, and all examples can be run on your local machine if you prefer. If using your own machine we recommend installing the "tidyverse", "arrow" and "duckdb" R packages.
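As a rough illustration of the out-of-memory idea in Python, the sketch below streams a CSV one row at a time and keeps only a running sum and count per group, so memory use does not grow with file size. It is a simpler stand-in technique, not the arrow or duckdb tooling named above, and the function name, column names, and sample data are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def grouped_means(lines, group_col, value_col):
    """Stream rows one at a time, keeping only a running sum and count per group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(lines):
        totals[row[group_col]] += float(row[value_col])
        counts[row[group_col]] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Tiny in-memory stand-in for a file too large to load at once (made-up data).
sample = io.StringIO("region,sales\neast,10\nwest,4\neast,20\nwest,6\n")
means = grouped_means(sample, "region", "sales")
print(means)
```

Because `csv.DictReader` accepts any iterable of lines, the same function should also work on a file opened with `open(path, newline="")`.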
Matching Methods in R
- 19 NOV 2021 12:00 PM - 3:00 PM
- Online via Zoom
This is the applied part of a series of workshops on principles of causal inference and matching methods (attending the previous parts is recommended but not required). During this workshop we will practice implementing matching methods in R using several packages, including MatchIt, cem, RItools, and cobalt. Please contact research@hbs.edu with any questions.
Matching Methods for Causal Inference
- 17 NOV 2021 12:00 PM - 1:30 PM
- Online via Zoom
This is the second part of a series of seminars on principles of causal inference. During this seminar we will go into the details of matching, one of the most popular quasi-experimental methods in causal inference. Please contact research@hbs.edu with any questions.
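To give a flavor of what matching involves, here is a minimal Python sketch of one common variant: nearest-neighbor matching with replacement on a single covariate, used to estimate the average treatment effect on the treated (ATT). It is not necessarily the approach covered in the seminar, and the function name and data are invented for illustration.

```python
def att_nearest_neighbor(units):
    """Estimate the average treatment effect on the treated (ATT).

    Each unit is a tuple (treated, covariate, outcome). Every treated unit is
    matched, with replacement, to the control unit whose covariate is closest.
    """
    treated = [u for u in units if u[0]]
    controls = [u for u in units if not u[0]]
    diffs = []
    for _, x, y in treated:
        # Pick the control with the smallest covariate distance to this unit.
        _, _, y_match = min(controls, key=lambda c: abs(c[1] - x))
        diffs.append(y - y_match)
    return sum(diffs) / len(diffs)


# Made-up data for illustration only: (treated, covariate, outcome).
units = [
    (True, 1.0, 5.0),
    (True, 3.0, 9.0),
    (False, 1.1, 3.0),
    (False, 2.9, 6.0),
    (False, 5.0, 8.0),
]
print(att_nearest_neighbor(units))
```

Real analyses typically match on many covariates or on a propensity score and check covariate balance afterwards; this sketch skips both for brevity.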
Principles of Causal Inference
- 10 NOV 2021 12:00 PM - 1:30 PM
- Online via Zoom
The seminar introduces principles of causal inference for randomized experiments and observational studies. We discuss the potential outcomes framework and, through its lens, give an overview of modern quasi-experimental methods, including regression and matching. We end with a brief literature review. Please contact research@hbs.edu with any questions.
Scaling Up Work with Batch (Non-interactive) Jobs
- 11 AUG 2021 12:00 PM - 1:00 PM
- Online via Zoom
In our series of lunchtime technical trainings, RCS staff members and guest speakers discuss topics that can be used to enhance one's research, with a focus on research methods, statistical approaches, and computing tools or workflows.
Working with code through GUI tools such as NoMachine is the default choice for many users. However, as datasets grow larger, tasks grow more complex, or the number of necessary repetitions increases, this approach no longer scales. Running batch (or background) jobs, where you do not interact with the running program, allows you to scale up your work. On the HBSGrid cluster, for example, you can run hundreds of scripts at once, analyzing numerous data files simultaneously or performing other parallelizable or automatable jobs.
This transition to background-only, non-GUI work may seem daunting: how does the program know what files to work on or write out? How do I monitor its progress? How do I even start the program, let alone hundreds of them? This session will demystify the process of transitioning to running batch jobs, give you several approaches to make this transition, and highlight a few useful tools.
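One possible answer to the first two questions, sketched in Python under stated assumptions: the program receives its input and output paths as command-line arguments instead of through a GUI, and it prints progress messages that end up in the job's log. The function names, the `--column` option, and the CSV layout are hypothetical and not taken from the session.

```python
import argparse
import csv
from pathlib import Path


def summarize(in_path: Path, out_path: Path, column: str) -> float:
    """Read one CSV file, average one numeric column, and write the result."""
    with in_path.open(newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    mean = sum(values) / len(values)
    out_path.write_text(f"{in_path.name},{mean}\n")
    return mean


def main(argv=None) -> None:
    """Parse the file names from the command line, so no interaction is needed."""
    parser = argparse.ArgumentParser(description="Summarize one data file.")
    parser.add_argument("input", type=Path)
    parser.add_argument("output", type=Path)
    parser.add_argument("--column", default="value")
    args = parser.parse_args(argv)
    mean = summarize(args.input, args.output, args.column)
    # Progress goes to standard output, which a scheduler usually saves to a log.
    print(f"{args.input} -> {args.output}: mean={mean}", flush=True)
```

Saved as a script with a final `main()` call, the same code could be started many times, each copy with different file names, which is the pattern that lets a scheduler run hundreds of such jobs at once.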
*Guest users may register by contacting us at research@hbs.edu.
Practical Considerations for Managing Sensitive Data
- 19 MAY 2021 12:00 PM - 1:00 PM
- Online via Zoom
In our series of lunchtime technical trainings, RCS staff members and guest speakers discuss topics that can be used to enhance one's research, with a focus on research methods, statistical approaches, and computing tools or workflows.
We are excited to invite Julie Goldman, Countway Research Data Services Librarian with the Harvard Library, to present this workshop on working with sensitive data.
Data sharing is becoming an increasingly prevalent and expected part of the research process. Researchers may be hesitant to share datasets about human subjects, and rightfully so. While some data can be shared respecting Institutional Review Board (IRB) and federal restrictions, other data is ultimately not publicly shareable.
This webinar will address conflicts that can arise when attempting to balance the protection of data with expectations for open data, such as restrictive language in data use agreements, IRB protocols, and consent forms.
Learning Objectives:
- Understand types of sensitive data and terminology
- Develop curation skills
- Identify institutional-level efforts to ensure safe sharing of protected data
Introduction to Tableau
- 21 APR 2021 12:00 PM - 1:00 PM
- Online via Zoom
In our series of lunchtime technical trainings, RCS staff members and guest speakers discuss topics that can be used to enhance one's research, with a focus on research methods, statistical approaches, and computing tools or workflows.
We are excited to invite Jess Cohen-Tanugi, Visualization Specialist at the Harvard FAS Lamont Library, to present this introduction to the data visualization software Tableau. In this hour-long, hands-on workshop you'll use a sample dataset to create several different types of visualization using Tableau. You'll also learn how to combine graphs together to create interactive dashboards and how to create stories from the data.
Research Data Management
- 24 FEB 2021 1:00 PM - 12:00 AM
- Online via Zoom
Want to be more efficient and save time doing your research and collaborating with others? Looking for new ways to promote your work and make a worldwide impact? Then come to this workshop to learn techniques and services to help you manage your research data. You will learn practices that ensure that your research is documented, reproducible, and accessible long-term. Topics include how to acquire specialized data for your research, resources and tools that support your use of data throughout the research lifecycle, compliance with internal and external data policies and regulations, and ways to make data from Harvard researchers available to others where feasible.
This class, a combination of seminar and discussion, will highlight robust data management and documentation practices to help you, your future self and fellow researchers be successful in these areas.
Introduction to Version Control with Git & GitKraken
- 20 NOV 2020 1:00 PM - 4:00 PM
- Online via Zoom - meeting info to follow
Version control software allows you to save “versions” of files (scripts, text files, web pages, data, etc.) that show the changes made to the files over time, and allows you to backtrack and undo those changes if necessary. The ability to compare two versions or reverse changes is invaluable on its own when working on larger projects, and even more so when collaborating in research groups.
This hands-on workshop will take you through the steps of using Git and GitHub to track changes, revert to older versions, and share your files with other people. The goal is to keep you organized, reduce clutter, and maintain an intelligible history of the files in your projects.
This workshop is being conducted in partnership with the Data Science Services group at IQSS.