July 14: Introduction to Data Mining

Readings


Morning (9am to 12pm)

  • Introduction to data and digital methods for measuring, describing, and analyzing text and numeric datasets, with Lisa Rhody.
  • Hands-on Session 1: Google Books Ngram Viewer and Bookworm
  • Demo Session 1: Identifying and preparing specific datasets: Looking together at example datasets, we will identify three types of data we want to work with and the attributes that make a dataset usable for description, measurement, and analysis.
    • Anatomy of tabular data
    • Anatomy of textual data
  • Break for lunch

Afternoon (1pm to 4pm)

  • Demo Session 2: Finding the right tool for your questions: A quick overview followed by hands-on activities
  • Hands-on Session 2: Using Voyant, participants will perform word frequency, corpus grid, corpus summary, and keyword-in-context analysis.
  • Demo Session 3: Examining large corpora of texts to detect trends and patterns.
  • Close 4pm. Bus: 4:15pm
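
For participants who want to see what two of the Hands-on Session 2 analyses compute, here is a minimal Python sketch of word frequency and keyword-in-context counting. It is not how Voyant itself works; the function names, the simple tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(top_n)


def keyword_in_context(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Hypothetical sample text, invented for this example.
sample = (
    "The whale surfaced. The crew watched the whale dive, "
    "and the sea closed over the whale."
)

print(word_frequencies(sample, top_n=2))
for left, word, right in keyword_in_context(sample, "whale", window=2):
    print(f"{left:>15} [{word}] {right}")
```

Swapping the sample string for the contents of your own text file is a quick way to check whether a dataset is clean enough for this kind of analysis.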

Homework

  • Think about the kinds of data you have, or might want to work with, and what steps are needed to get it ready for the appropriate method of analysis you learned about today.

Sites

Tools

Reference
