Wintersession Course Recommendations from the CDH

10 November 2021

Use your Wintersession to learn, practice, and polish your DH skills!

Wintersession is back!

Princeton’s two-week term of workshops, skillshares, collaborative experiments, and other forms of non-graded learning for students, faculty, and staff will take place from January 10 to January 23. Registration is open until November 19.

See below for a list of sessions that we recommend if you’re looking for a place to learn, practice, or polish your DH and related skills over winter break. Click through on each session for more information about Knowledge Prerequisites, Hardware/Software Prerequisites, and Session Format.

An Archival Treasure Hunt in the Chicago Daily Tribune’s “Varied Activities of Women” Column

Did you know that only 17% of biographies on Wikipedia are of women? Join the effort to build a fuller record of women’s contributions to history by contributing to the Varied Activities of Women Project. In this two-hour workshop, you will have a chance to explore the Chicago Daily Tribune’s bi-weekly “Varied Activities of Women” column, which ran from June 1913 to March 1928. The column featured a batch of one- to two-sentence summaries of news made by or considered relevant to American women. These tantalizing peeks into the lives of historical women hold endless possibilities. Why was Mme. Dieulafay the only woman in France legally permitted to wear male attire? How old was Mrs. Alice S. Blount when she first owned and edited a newspaper? What was Mrs. Walter Hancock’s first name and in what subject did she receive her master of arts degree from Temple University? Did Catherine Kline win in her run for mayor of Cleveland, Ohio? Together we’ll embark on an archival treasure hunt using U.S. Federal census data to discover biographical details about some of the fascinating individuals mentioned in the Varied Activities of Women column. We’ll also spend time pondering what questions and future research directions the summaries suggest. Following along with the guided treasure hunt, by the end of the two hours you will produce a short biographical overview of one of the individuals mentioned in the column. Your work will be published online on the Varied Activities of Women website.

This is a single-session workshop facilitated by Emma Sarconi.

Level Up Your Python

This session will cover tips, tricks, tools, and techniques to make more effective and productive use of Python in research computing. It will cover aspects of Python in more detail than is typical in introductory treatments, with emphasis on best practices for making one’s code more “Pythonic”, on avoiding common pitfalls when using Python in research computing, and on understanding available tools within the base Python language and the broader Python ecosystem. An assortment of topics will be covered, including Python’s underlying object model, what decorators are and how they work, and useful modules and tools. The target audience is current users of Python who know the basics but would like to be more effective and professional in their Python code development. This session will be heavily hands-on.

Participants will come away with a stronger foundation of how Python works “under the hood” and of some best practices for Python programming, both in and out of scientific contexts.
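
As a small taste of one topic named above, here is a minimal sketch of what a decorator is and how it works. It is not taken from the workshop materials; the names `timed`, `word_count`, and `last_elapsed` are hypothetical and chosen only for illustration.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call to `func` took."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Store the elapsed time as an attribute on the wrapper itself.
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timed  # equivalent to writing: word_count = timed(word_count)
def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


if __name__ == "__main__":
    print(word_count("learn practice polish"))
    print(f"{word_count.__name__} took {word_count.last_elapsed:.6f} s")
```

The `@timed` line is only shorthand for rebinding the function name to the wrapper, which is why a decorator can add behavior without changing the body of the function it wraps.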

This is a single-session workshop facilitated by Henry Schreiner.

R Data Wrangling: tidyverse packages tidyr & dplyr

This workshop introduces two modern R packages, both written by Hadley Wickham and part of R’s “tidyverse,” that provide intuitive tools for handling common data management tasks. The first package, tidyr, provides functions that reshape data so it conforms to a specific “tidy” structure where each variable is saved in its own column, each observation is saved in its own row, and each type of observational unit is stored in a separate table. The second package, dplyr, provides a set of functions (referred to as “verbs”) that allow you to easily subset observations, reorder observations, select specific variables, add new variables, group observations, and summarize groups of observations.

Participants will walk away with both a general understanding of “tidy” representations of data and practical knowledge of how to leverage it in R.
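
The workshop itself uses R, but the “tidy” idea can be sketched in any language. The plain-Python example below is only an illustration under assumed data: the table contents and the helper `pivot_longer` are hypothetical (the helper borrows its name from a tidyr function but is not that function). It reshapes a wide table so that each observation has its own row, then applies a filter step and a summarize step of the kind dplyr's verbs provide.

```python
# Hypothetical "wide" table: one row per country, one column per year.
wide = [
    {"country": "A", "1999": 10, "2000": 12},
    {"country": "B", "1999": 7, "2000": 9},
]


def pivot_longer(rows, id_col, names_to, values_to):
    """Reshape wide rows into tidy rows: one observation per row."""
    tidy_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_col:
                continue  # the identifier is repeated on every tidy row
            tidy_rows.append({id_col: row[id_col], names_to: key, values_to: value})
    return tidy_rows


tidy = pivot_longer(wide, id_col="country", names_to="year", values_to="cases")

# A "filter" step followed by a "summarize" step, written as plain Python.
cases_2000 = [r for r in tidy if r["year"] == "2000"]
total_2000 = sum(r["cases"] for r in cases_2000)

if __name__ == "__main__":
    for r in tidy:
        print(r)
    print("total cases in 2000:", total_2000)
```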

This is a single-session workshop facilitated by Dawn Koffman.

Introduction to Programming Using Python

Python is a programming language used for a wide variety of applications including scientific computation, image processing, text processing, file handling, graphics, database handling, and web interfaces. It is designed to be elegant, concise, and easy to learn, while offering many advanced features. This workshop is an introduction to Python, and to the resources you need to start learning and using Python, for those with little or no programming experience. Programming is best learned by doing, so the workshop is participatory, with many short, simple exercises.

Participants will become familiar with basic programming concepts, some general and some specific to Python. These will include various data types such as strings, integers, floats, lists, and dictionaries; and statements such as import, if/else, for, and try/except. They will also be made aware of various add-on modules for Python such as numpy for numerical calculations and matplotlib for plotting.
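
To give a sense of how these pieces fit together, here is a short sketch that uses each of the data types and statements listed above. It is not one of the workshop's exercises, and the sample `readings` values are made up for illustration.

```python
import math  # `import` makes a standard-library module available

readings = ["3.5", "4", "n/a", "10"]   # a list of strings
parsed = {"values": [], "skipped": 0}  # a dictionary

for item in readings:                  # `for` visits each element in turn
    try:
        number = float(item)           # convert the string to a float
    except ValueError:                 # raised for text such as "n/a"
        parsed["skipped"] += 1
        continue
    if number.is_integer():            # `if/else` chooses between two branches
        parsed["values"].append(int(number))
    else:
        parsed["values"].append(number)

total = math.fsum(parsed["values"])

if __name__ == "__main__":
    print(parsed)
    print("total:", total)
```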

This is a double-session workshop facilitated by Matthew Cahn.

Data Analysis and Visualization for Beginners: R, Python and Stata

The Data Analysis and Visualization for Beginners workshops are hands-on and meant to introduce students to the use of R, Python, and Stata for data analysis. These workshops are intended for beginners, for those who do not have any experience with the use of statistical software, and/or for those who need a refresher. This intensive session provides the unique opportunity to see how the same type of analysis is done across different environments. We will focus on importing data, cleaning, preparing, and merging. We will go over some basics of data visualization, quantitative methods (descriptive statistics, mean comparison, linear regression), and text analysis (for R and Python: word frequencies, wordclouds, and sentiment analysis). This is an encore of the sessions taught in Wintersession 2021.

  • Remove the fear of using statistical software for data analysis.
  • Get started on using statistical software.
  • Have a basic understanding of how the software works.
  • Perform the same analysis using different environments.
  • Understand the basic components of the “Anatomy of Data Analysis.”

This is a single-session workshop facilitated by Oscar Torres-Reyna.

Getting Started with the Research Computing Clusters

This workshop introduces the research computing ecosystem at Princeton: the computing clusters (Nobel, Adroit, Della, Tiger, Stellar, and Traverse), the storage system, and the data visualization machines (such as Tigressdata). After an overview of the different systems and the sorts of tasks each is geared toward, the course gives users a hands-on introduction to technical topics including: how to connect to the clusters, how to manage file storage, how to access or install additional software, and how to launch jobs through our scheduling software (SLURM). Participants will also learn the basic civics of working on Princeton’s shared systems.

Attendees will come away with the basic skills needed to connect to a research computing cluster, navigate its environment and file system, install and manage their software environment, and run programs through the SLURM scheduler. Participants will also get a very high-level overview of different parallel computing paradigms and guidance on how to assess their computing needs in order to use the Princeton resources judiciously.

This is a single-session workshop facilitated by Carolina Roe-Raymond.

What is Machine Learning and Can it Help Advance My Research?

The Center for Statistics and Machine Learning proposes a three-hour Wintersession workshop (including lunch). The workshop aims to increase awareness of how machine learning could aid faculty, postdoc, and student research. No detailed prior knowledge of machine learning is assumed. The workshop will begin with an overview of crucial machine learning ideas and address three questions: What is machine learning? Where has it been particularly successful? What can it not do well (yet)? Then five faculty members from various parts of the university will give 20- to 30-minute presentations on how they are incorporating machine learning into their research. The workshop will then move into a question, answer, and discussion session with a boxed lunch provided. Several data scientists and a research software engineer will attend this part of the session to answer questions concerning datasets, dataset curation, and software tools for machine learning. The session will target faculty, postdocs, and advanced students wondering if machine learning can help their research program. However, space permitting, the session is open to all Wintersession participants.

This is a single-session workshop facilitated by Peter Ramadge.

Data Storage & Transfer: Basics and Best Practices

It is important that researchers not spend too much time and effort transferring datasets from one place to another. This session will introduce the basics of data transfer so that participants can identify bottlenecks that slow down a transfer job and learn how to overcome them. The session will also introduce various data transfer tools that researchers can use in their daily workflow. The hands-on part will focus on the Globus transfer tool (https://www.globus.org/).

Attendees will be able to select the best data transfer tool for different transfer jobs. Attendees will also learn the best way to transfer large and small datasets from and to the Research Computing clusters (Perseus, Della, Tiger, etc.) at Princeton.

This is a mini-workshop facilitated by Joon Kim and Rishi Joshi.

Database 101: May I Join You?

You’ve read through hundreds of books for research and have a terribly long Excel sheet with data that’s hard to manage; you receive a request for a report but struggle to get all of the data as fast as you’d like; you’ve heard about big tech companies getting rich off your data and you want to know how it’s done. You have been needing databases all along; you just haven’t realized it yet. In this course, you will learn the basics of databases and how they structure your data. You will learn about tables and how linking them with relationships adds meaning to what you collect. At the end of the course, you will have constructed your own data schematic that you can take to any database platform, even if it’s simply multiple sheets in an Excel file.
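
To make the idea of tables linked by relationships concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not part of the course materials; the `authors` and `books` tables and their contents are hypothetical, invented only to show how a join answers a question that neither table can answer alone.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)  -- the relationship
    );
    INSERT INTO authors (id, name) VALUES (1, 'Author One'), (2, 'Author Two');
    INSERT INTO books (title, author_id) VALUES
        ('First Book', 1), ('Second Book', 1), ('Third Book', 2);
    """
)

# Join the two tables through the relationship and count books per author.
rows = conn.execute(
    """
    SELECT authors.name, COUNT(books.id) AS n_books
    FROM authors
    JOIN books ON books.author_id = authors.id
    GROUP BY authors.id
    ORDER BY authors.name
    """
).fetchall()
conn.close()

if __name__ == "__main__":
    for name, n_books in rows:
        print(name, n_books)
```

Storing each author once and pointing to that row from `books` avoids retyping the name on every line of a spreadsheet, which is where long Excel sheets tend to become inconsistent.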

This is a double-session workshop facilitated by Jeff Heller.

Data Visualization in Python

This session provides an introduction to effective data visualization in Python. Several plotting packages will be discussed, including Matplotlib, Seaborn, and Plotly. Examples may include simple static 1D plots, 2D contour maps, heat maps, violin plots, and box plots. The session may also touch on more advanced interactive plots.

Attendees will be exposed to different plotting packages in Python, along with how to integrate them with NumPy and Pandas, at least at a basic level. After the session, participants will know the basic mechanics of how to generate research-quality plots using Python.

This is a single-session workshop facilitated by Jose Garrido Torres.

Data Visualization in R, using ggplot2

This workshop provides an introduction to effective data visualization in R, primarily using the graphics package ggplot2. We will discuss the main concepts of the grammar that defines the graphical building blocks of that package, and we will use hands-on examples to explore ggplot2’s layered approach to creating basic and more complex graphs. The workshop will emphasize the many choices we have in creating data visualizations, and how to choose among those to make a clear and concise point to your audience. Participants should have at least basic experience with R and feel comfortable working with R data frames, but those relatively new to R may still find value in the workshop and are welcome to attend.

Attendees will come away with the ability to use the R package ggplot2, along with an iterative, layering approach, to construct polished visualizations of data that is stored in well-structured tables.

This is a single-session workshop facilitated by Jake Hofman.

Introduction to Version Control using Git 

A Version Control System (VCS) records changes to files automatically to allow for easy recall of different versions. Git is a leading modern VCS that allows users to manage and navigate the history of their files across time, across collaborators, and even across parallel versions, all in a comprehensive and consistent manner. It is easy to set up, is used across research and industry, and has grown an expansive community thanks in part to services such as GitHub, Bitbucket, and GitLab. And Git is not just for code – it can track any type of plain-text file (including documents) on any computer. This workshop introduces the fundamentals of Git in an exercise-driven, hands-on format. Even though the emphasis will be on using Git (and GitHub) for a solo workflow, the material covered will equip users with the necessary background to start using Git collaboratively as well. It is geared toward anyone looking to learn the basics of using Git to organize their work (and, conversely, how to make their workflows Git-friendly).

Participants will leave with a solid understanding of Git foundations and a grasp of useful Git workflows. They will learn to use Git locally on their own computers, as well as in tandem with GitHub. This workshop will mainly focus on using Git individually and will only touch briefly on team workflows.

This is a double-session workshop facilitated by Dev Dabke.

Good Practices for Research Software Engineering

This session introduces simple yet time-tested practices and methodologies that can have long-term impacts on your productivity as a programmer and help ensure the sustainability of the code you write. These practices are approachable and adoptable by experienced developers and novices alike. Some examples of practices to be discussed include: writing programs for people, not computers; making incremental changes; and avoiding repetition.

Participants will leave knowing the landscape of some general practices and approaches they can immediately adopt to be more productive when writing, editing, or developing research software.

This is a mini-workshop facilitated by Ian Cosden.

UX Demystified

In a time when technology is everywhere, user experience is everywhere as well. In this workshop, we will explore big ideas that are fundamental to user experience and suggest ways they might apply to projects you are involved with. The goal is to develop definitions and concepts that are straightforward enough to appeal to those outside the UX profession while being nuanced enough for those who specialize in it.

This is a single-session workshop facilitated by Charlie Kreitzberg.

Debugging & Profiling Code, in Python and R

Looking for best practices to find bugs in your code? Looking for ways to improve the performance of the code that you write? Fortunately, there are tools available to speed up these tasks that are more robust than merely inserting and deleting print statements. This session will cover best practices for intermediate-level code debugging and profiling to identify bugs, as well as bottlenecks that consume more resources than expected. We will primarily focus on Python and R with some hands-on exercises. Participants will need to install some tools in advance to participate in the exercises; the facilitator will contact you in advance of the session to let you know how to install the needed software.

This workshop is geared toward computational researchers interested in learning debugging tools and best practices. Attendees will learn the best practices for debugging code and gain hands-on experience using debugging tools.
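
For readers curious what profiling looks like in practice, here is a minimal sketch using Python's standard-library `cProfile` and `pstats` modules. The session's own tool choices are not stated above, so this is only an assumed illustration, and `slow_sum_of_squares` is a hypothetical function written to have something to measure.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive: builds a full list in memory before summing."""
    squares = [i * i for i in range(n)]
    return sum(squares)


# Collect timing data only for the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(100_000)
profiler.disable()

# Write the report into a string, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()

if __name__ == "__main__":
    print(report)
```

The report lists how many times each function was called and how much time it accounted for, which points to bottlenecks more reliably than scattering print statements through the code.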

This is a single-session workshop facilitated by Abhishek Biswas.

Heroes Get Remembered, Legends Never Die: Preserving Your Digital Legacy

How do you want to be remembered? Who will tell your story? Digital files are more at risk of being lost than any other format -- and sooner than you’d think. Whether you are new to organizing your digital life or a seasoned Marie Kondo with your stuff, knowing where your content lives and how to access it in the far future is important. This session will help you identify and preserve your most important digital records, including your social media feeds (because social media is not an archive!). Your legacy, whether from your time at Princeton or, most importantly, beyond, requires forethought so that those who come after you can understand your digital lifestyle and personal records or organization’s work without sifting through empty folders. You will be introduced to the resources and tools available to help you or your organization get started in creating your history and documenting your legacy. This session is designed to help anyone who creates records -- whether you’re a student, a student organization, or a faculty or staff member. If you have too many files on your desktop, a crowded Google Drive, or a phone full of photographs, this is the session for you.

This is a single-session workshop facilitated by Valencia Johnson.

Academic Power Tools: Zotero and Overleaf! 

Does assembling and managing your bibliographies take hours that you could better spend improving your papers (or binge-watching your favorite shows)? This session will show you how Zotero, Overleaf, and other related tools can streamline your research and writing process--whether you're a physicist or a philologist! Attendees will learn the three main functions of Zotero: collecting, organizing, and deploying citations. Approximately half of the session will be devoted to hands-on time where students will be on their way to pain-free footnotes and bibliographies!

This is a mini-workshop facilitated by Audrey Welber.

Individual Professional Development Plans: a Product and a Process for Humanities Graduate Students

We'll use ImaginePhD together to:

  • assess career-related skills, interests, and values
  • explore career paths appropriate to humanities and social science disciplines
  • create self-defined goals
  • map out next steps for career and professional development success

Why is creating a plan so important? Having a structured plan and incremental goals aligned with your academic milestones has been shown to increase productivity and satisfaction throughout graduate school. As you move through your graduate studies, your interests and goals will naturally evolve, and therefore you may need multiple mentors and resources to support you along the way. This session will assist you in beginning the process of reflecting, exploring, and developing your plan aligned with your goals through this early phase of your graduate studies. We will explore ImaginePhD as one possible tool to track your goals and progress. Pre-work: sign up for your free ImaginePhD account.

This is a single-session workshop facilitated by James M. Van Wyck.

Website Trainwrecks: Get Back on Track

Maybe you have a website that’s a total wreck or simply needs some improvement. Maybe you don’t have a website at all and you’re thinking of building one. Whether your website is for a club, for personal use, or for something else (surprise us!), there is something in this session for you. We’ll cover website best practices, design trends, usability, and accessibility. You’ll critique websites and walk away with practical guidance and steps for either improving an existing website or getting off to the right start with a new one. This will include a combination of presentations, group discussions, and plenty of website surfing. No technical skills are required.

This is a double-session workshop facilitated by Web Development Services.

Archives for Historical Research 

Are you planning to work in the historical professions? Are you planning to do research in an archive? Archival research skills are invaluable to a professional historian. Librarians from Firestone Library will be providing a systematic introduction to a variety of topics in archival research including:

  • Identifying archives relevant to your area of research
  • Understanding finding aids, shelf lists, and other tools used in archives and special collections
  • Creating research plans to increase the efficiency of a visit to one or more archives or special collections
  • Finding funding opportunities for a research trip

This is a double-session workshop facilitated by Alain St. Pierre.

Introduction to Machine Learning

This course will introduce students to the conceptual foundations of machine learning (ML) and will describe a range of modern supervised and unsupervised ML methods. We will discuss the advantages, limitations, and appropriate uses of each and learn how to implement them using the Python Keras ML library.

This course is appropriate for students with some exposure to coding and will require a small amount of initial setup (installing Python and Keras). The material will be especially useful for students who want to implement ML methods for research or quantitative projects, but is open to all who are interested.

This is a multi-day program facilitated by Savannah Thais.

Solidarity-Based Organizing in the Digital Age: Building Power from the Bottom-Up

Community organizers around the world are seeking to re-ignite public support for besieged democratic institutions and to strengthen solidarity within divided societies. While navigating the complexity created by disinformation, authoritarian politics, and the emergence of new existential threats, like the COVID-19 pandemic and the climate crisis, organizers pull from a variety of tactics and strategies that leverage both the digital world and traditional power-building.

In this workshop, two organizers at the frontlines of the pro-democracy struggle in Brazil will share learnings from their 10+ years of experience working with young activists and community leaders as well as elected officials. Through a mix of hands-on exercises and reflections on activism and social change theory, they will lead Princeton students into an exploration of their own collective power and ability to influence the various communities they are a part of.

Instructors: Alessandra Orofino and Miguel Lago

This is a double-session workshop facilitated by João Biehl, Kimberly de los Santos, Yi-Ching Ong, and Miqueias Mugge.

INTERFACE Boot Camp: Workshops in Tech Ethics

Modern-day computing technology has advanced to the point that it encompasses nearly all aspects of everyday life. Furthermore, with each new computing innovation, software update, and tech product, comes a whole slew of new, complicated opportunities and issues that society needs to grapple with. These manifest in news articles about Apple’s child safety photo-scanning program, bias in machine learning, AI-based art, and so on. We at INTERFACE want to talk about it.

Discussions of tech ethics are interesting in themselves. They are also existentially crucial. In the real world, responsibility for regulating and making judgements on new technologies is widely dispersed. Government representatives often lag behind the activities in tech hubs like Silicon Valley. Committees and engineers at tech companies are put into powerful decision-making positions. Private individuals leave their mark on a collective tech-based consciousness through their activities both on and off the Internet, from making a Wikipedia post, to taking a selfie, to clicking on an ad.

INTERFACE exists to help us bridge this murky gap in understanding between technology and social impact. Our vision for this boot camp is that INTERFACE officers will give two presentations and run roundtable discussions on topics of our choosing, both as a model and a starting point. Afterwards, we will guide participants through researching and making a presentation of their own. At the end, we will have a mini-presentation symposium with boba!

This is a double-session workshop facilitated by Sabrina Reguyal and Hien Pham of INTERFACE.

Carousel Photo by Tim Gouw on Unsplash