Middlebury

MATH 1230

Data Science Across Disciplines

In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Biology, Geography, History, Mathematics/Statistics and Sociology. This course will use the R programming language. No prior experience with R is necessary.

BIOL 1230: Students enrolled in Professor Casey’s afternoon section will use the tools of data science to investigate the drivers of tick abundance and tick-borne disease risk. To do this, students will draw on a nationwide ecological database.

GEOG 1230: In this section, we will investigate human vulnerability to natural hazards in the United States using location-based text data about hurricane and flood disasters from social media. We will analyze data qualitatively, temporally, and spatially to gain insights into the human experience of previous disasters and disaster response. We will present findings using spatial data visualizations with the aim of informing future disaster preparedness and resilience.

HIST 1230: In U.S. history, racial differences and discrimination have powerfully shaped who benefited from land and farm ownership. How can historians use data to understand the history of race and farming? Students will wrangle county- and state-level data from the U.S. Census of Agriculture from 1840 to 1912 to create visualizations and apps that allow us to find patterns in the history of race and land, to discover new questions we might not know to ask, and to create tools to better reveal connections between race, land, and farming for a general audience.

MATH/STAT 1230: Students will explore pediatric healthcare data to better understand the risks correlated with various childhood illnesses, with an emphasis on the intuition behind statistical and machine learning techniques. We will practice making informed decisions from noisy data and work through the steps that lead from messy data to a final report. Students will become proficient in R and gain an understanding of various statistical techniques.

SOCI 1230: Do sports fans care about climate change? Can sports communication be used to engage audiences on environmental sustainability? In this section of the course, students will use the tools of data science to examine whether interest in sports is associated with climate change knowledge, attitudes and behaviors, as well as other political opinions. Participants will use survey data to produce visualizations and exploratory analyses about the relationship between sports fandom and attitudes about environmental sustainability.
Subject: Mathematics
Department: Mathematics
Division: Natural Sciences
Requirements Fulfilled: DED SCI WTR

Sections in Spring 2024, School Abroad Italy (Florence)