
How Do I Find ~ Statistics and Data

Data Science Tutors

What is the Data Science Tutor Program?

Data Science Tutors (DSTs) are R&ID Research Tutors who specialize in data science. During their shifts, they help students understand and interpret data, provide peer research assistance, and teach the use of data analysis tools. We have four Data Science Tutors in Spring '24: Andrew Gou, Brian Hu, Adam Koplik, and Cynthia Yang. See their friendly faces here!

How do I contact Data Science Tutors?

You can contact them in the following ways during their shifts:

  • Visit the Research & Design service desk on the first floor of Burke Library.
  • Start an Ask Us chat (the "Ask Us" button on the right side of this webpage).
  • Call 315-859-4735.
  • Send an email to:


Which data analysis tools do they support?

The Data Science Tutors specialize in different data analysis tools, complementing each other's skill sets. See their specialties below:

  • C++ (for data science): Adam
  • Git (for data science): Andrew
  • MS Excel: Adam, Brian, Cynthia
  • Python (for data science): Andrew
  • R: Adam, Andrew, Brian, Cynthia
  • SPSS: Andrew
  • SQL: Andrew
  • Stata: Andrew, Brian, Cynthia

Questions? Send us an email at

When are the Data Science Tutors available in Spring '24?

Our DSTs are available on the following days during the academic semester for drop-in consultations.

Tae Hyun Lim (he/him)
Data Science/Analysis Research Librarian

Burke Library 354 (3rd Floor):
Tuesday 9:00 am to noon

Adam Koplik (he/him)

Burke Library:
Sunday 7:00-10:00 pm
Wednesday 8:00-10:00 pm

Andrew Gou (he/him)

Burke Library: 
Monday 6:00-10:00 pm
Wednesday 6:00-7:00 pm

Brian Hu (he/him)

Burke Library:
Sunday 6:00-7:00 pm
Tuesday 7:00-10:00 pm
Wednesday 7:00-8:00 pm

Cynthia Yang (she/her)

Burke Library:
Tuesday 6:00-7:00 pm
Thursday 6:00-10:00 pm


How to Develop a Research Question

1. Define Your Research Question

State your research question without describing the sources or data. This will help you identify a variable or variables.


Q1. Which characteristics of voters explain their vote choices in the 2020 presidential election?

Q2. Do free trade agreements (FTAs) promote the members' bilateral international trade?


2. Define Your Measurements

Find specific language that best describes your concepts. This will help you find the data set that best captures your interest.


Q1, characteristics of voters:

  • education: the highest degree that the person completed
  • income: annual household income, in USD
  • race: the person's self-identified race

Q2, countries' bilateral international trade:

  • Annual bilateral trade volume, including imports and exports, in USD
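
As a minimal sketch of the Q2 measurement, the snippet below adds imports and exports for one country pair in one year. The record layout, the field names, and the figures are all made up for illustration; a real data set will have its own column names and units.

```python
from dataclasses import dataclass


@dataclass
class DyadYear:
    """One observation: a pair of countries in a single year (hypothetical layout)."""
    country_a: str
    country_b: str
    year: int
    exports_usd: float  # exports from country_a to country_b, in USD
    imports_usd: float  # imports of country_a from country_b, in USD


def trade_volume(obs: DyadYear) -> float:
    """Annual bilateral trade volume: imports plus exports, in USD."""
    return obs.exports_usd + obs.imports_usd


# Made-up figures, for illustration only.
example = DyadYear("A", "B", 2020, exports_usd=1_500_000.0, imports_usd=2_250_000.0)
print(trade_volume(example))
```

Writing the measurement down this precisely makes it easier to check whether a candidate data set reports the same quantity, in the same unit, that your question needs.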

3. Identify Population, Unit of Analysis, and Unit of Observation

Who are you interested in studying? Who or what is being described by your variable(s)? What is the unit in your data set? This will also help you choose a data set. 

(Micro-level) <-----Individuals, households, cities, states/provinces, countries -----> (Macro-level)


Q1: People who are eligible to vote in the 2020 presidential election, at the individual level

Q2: Country dyads in the world, at the country level
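
To make the idea of a country dyad concrete, here is a small sketch that lists every unordered pair from a set of countries. The three country names are chosen only for illustration; they are not taken from any particular data set.

```python
from itertools import combinations

# Illustrative list of countries (an assumption, not a real sample).
countries = ["Canada", "Mexico", "United States"]

# Each unordered pair of countries is one dyad, so n countries give n*(n-1)/2 dyads.
dyads = list(combinations(countries, 2))

for a, b in dyads:
    print(f"{a} - {b}")
```

With dyads as the unit of observation, each row of the data set describes a pair of countries rather than a single country.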

4. Identify Time Frame and Frequency

At what point in time do you want to know this about the people, institutions, or products you identified? How often do you want to know about them?


Q1: The data are collected in 2020 (before the election) and are not repeated over time.

Q2: As many years as possible, repeated every year

5. Identify the Group Structure of Your Data

Are you looking for data collected at regular intervals over time? Identifying which of the following structures you need may be helpful as you search for data.

  • Cross-sectional: data collected at a single point in time for several individuals
  • Longitudinal/Panel: data collected at a sequence of time points for each of a sample of individuals
  • Time Series: data collected at a sequence of time points, usually at a uniform frequency
  • Pooled cross-sectional time series: a mixture of time series data and cross-section data
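
The structures listed above can be sketched as a rough check on a list of (unit, time point) pairs, one pair per observation. The function name and the rule used to separate panel data from pooled cross-sections (whether any unit is observed more than once) are simplifying assumptions made for this example, not a formal definition.

```python
from collections import defaultdict


def classify_structure(observations):
    """Guess the group structure from (unit, time_point) pairs.

    Assumption: data count as panel if any unit appears at more than one
    time point, and as pooled cross-sectional if none does.
    """
    times_by_unit = defaultdict(set)
    for unit, time_point in observations:
        times_by_unit[unit].add(time_point)
    if not times_by_unit:
        raise ValueError("no observations given")

    all_times = set().union(*times_by_unit.values())
    if len(all_times) == 1:
        return "cross-sectional"      # many units, one time point
    if len(times_by_unit) == 1:
        return "time series"          # one unit, many time points
    if any(len(times) > 1 for times in times_by_unit.values()):
        return "longitudinal/panel"   # units followed over time
    return "pooled cross-sectional"   # different units at each time point


# Q1-style data: individual voters, surveyed once in 2020 (made-up IDs).
q1 = [("voter1", 2020), ("voter2", 2020), ("voter3", 2020)]
# Q2-style data: country dyads, observed every year (made-up IDs).
q2 = [("A-B", 2019), ("A-B", 2020), ("A-C", 2019), ("A-C", 2020)]

print(classify_structure(q1))
print(classify_structure(q2))
```

Under these assumptions, the Q1 example comes out as cross-sectional and the Q2 example as longitudinal/panel, which matches the time frames described in step 4.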


6. Think about Your Data Analysis Methods and Tools

Don't know which data analysis method to choose? The model choice depends on your variable types and your data structure. Use this flowchart and UCLA IDRE's guide to find a method for your analysis, along with example code.
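
As a rough illustration of how the outcome variable's type narrows the choice of method, the sketch below maps a few common variable types to a model that is often used for each. The table and the function name are assumptions made for this example; they are a simplification and do not reproduce the flowchart or the UCLA IDRE guide, which also account for data structure and the number of variables.

```python
# Simplified lookup from outcome variable type to a commonly used model.
# Assumption: this table is illustrative, not the content of the flowchart or guide.
COMMON_METHODS = {
    "continuous": "linear regression",
    "binary": "logistic regression",
    "ordinal": "ordered logistic regression",
    "nominal": "multinomial logistic regression",
    "count": "Poisson regression",
}


def suggest_method(outcome_type: str) -> str:
    """Return a commonly used model for the given outcome variable type."""
    key = outcome_type.strip().lower()
    if key not in COMMON_METHODS:
        raise ValueError(f"unknown outcome type: {outcome_type!r}")
    return COMMON_METHODS[key]


print(suggest_method("binary"))
```

Treat the result as a starting point for a conversation with a tutor or librarian rather than a final answer.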

Also, choose a data analysis tool. For example, for quantitative analyses, researchers often use R, SPSS, Stata, Python, MATLAB, or Mathematica. For qualitative analyses, NVivo and MAXQDA are often used. All of these programs are available to Hamilton students, faculty, and staff free of charge.

We also have online training resources for these programs. Contact AskUs if you have any questions about the resources.

* Adapted from Nicole Scholtz' guide to Finding Data at the University of Michigan

Hamilton College, 198 College Hill Road, Clinton, NY 13323 • 315-859-4735 • Copyright © 2017 The Trustees of Hamilton College. All rights reserved.