Quick link to course material (please use your Bowdoin Google account).
How can digital techniques enhance our textual interpretations? How can we make sense of an increasing number of textual sources in a timely manner? What kind of new interpretative questions can be raised and answered by computer-based text analysis?
In this course, we will apply a set of text analysis techniques to various liberal arts disciplines in order to gain insight into how these tools are used in their specific areas of interest. Examples will include using distant reading on vast literary corpora, investigating authorship of Supreme Court opinions, retracing the history of mobile devices by “reading” 30 years of newspapers, and interpreting Raphael's masterpiece “School of Athens”.
No previous computer programming experience is required. Parallel labs will be provided for students who want to put their programming skills to use.
Acquire and apply a set of techniques to organize, analyse, interpret and present arguments based on digital textual analysis
Understand how digital text analysis techniques can be integrated into several liberal arts disciplines
Critically evaluate shortfalls and limitations of each technique
Extract quantified features of texts in order to augment the explanatory capabilities of text-based inquiries
Jan 27
Ignatow - Text Mining - Introduction
Miner - Text Mining Methodology
Jan 29
Jockers - Text Mining and the Humanities
Feb 3
Ignatow - Acquiring Data
Ignatow - Scraping and Crawling
Feb 5
Ignatow - Scraping and Crawling (same as last class)
Feb 10
Rockwell and Sinclair - Hermeneutica - Introduction
Feb 12
Rockwell and Sinclair - Measured Words
Feb 17
Rockwell and Sinclair - From Concordance to Analytics
Feb 19
Changsoo - How Are Immigrant Workers Represented in Korean Newspapers
Juola - Using N-gram to Measure Cultural Complexity
Feb 24
Hermeneutica Essays - Chapters 4, 6, 8 and 10 - Available online: http://hermeneuti.ca/
Feb 26
Bird, Klein, Edwards - NLP with Python - Introduction to NLTK
Mar 2
Ignatow - Basic Text Processing
1st Dimension - Liberal Arts Context
a.) Politics and Rhetoric - How can presidential inaugural addresses be analysed through basic digital techniques?
b.) Literary Studies - Analyze central motifs of literary works, e.g., Sherlock Holmes, Around the World in 80 Days, Harry Potter
c.) History of Philosophy - Assess the dominant interpretation of Raphael's School of Athens.
How can Plato's and Aristotle's works be compared with regard to their metaphysical and epistemological positions? (Clustering)
d.) Foreign Language Literary Studies - A critique of clustering techniques applied to Galileo's texts
e.) Geography - Assessing the drugs quest in the U.S. through text analysis of congressional hearings - Topic Modeling
f.) Government and Legal Studies - Investigate the authorship of the Bush v. Gore Supreme Court case - Classification/Machine Learning.
2nd Dimension - Text Analysis phases
Corpus Gathering
Corpus cleaning and normalization
Basic textual metrics
Introduction to Natural Language Processing
Document Clustering
Document Classification
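To give a rough sense of what the early phases look like in practice, here is a minimal Python sketch covering corpus cleaning and normalization plus a few basic textual metrics. It is an illustration only, not the lab material: the two-document corpus and the helper names `normalize` and `basic_metrics` are hypothetical, and it uses only the Python standard library rather than NLTK.

```python
import re
from collections import Counter

# Hypothetical two-document corpus, standing in for the corpus gathering phase.
corpus = {
    "doc_a": "The quick brown fox jumps over the lazy dog. The dog sleeps.",
    "doc_b": "A fox is quick; a dog is lazy!",
}

def normalize(text):
    """Lowercase the text and keep only runs of the letters a-z as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def basic_metrics(tokens):
    """Return token count, type count, type-token ratio and the top 3 words."""
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        "top_words": counts.most_common(3),
    }

# Compute the metrics for every document in the corpus.
metrics = {name: basic_metrics(normalize(text)) for name, text in corpus.items()}

for name, m in metrics.items():
    print(name, m)
```

The type-token ratio (distinct words divided by total words) is one simple measure of lexical variety. Note that this tokenizer is deliberately crude: it drops digits and splits on accented characters and apostrophes, so a real corpus would need a more careful normalization step.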
20% In-class Quizzes: Information Retrieval practices.
20% Practical Assignments: Problem sets based on required readings and techniques discussed during classes. You are expected to submit answers by 8:00 am on the due date (typically before class). These assignments are intended to be self-guided study opportunities and focus on the digital techniques we will be discussing.
20% Midterm Exam (Individual): It will cover both theoretical and practical content.
40% Final Project (Groups of 2 or 3 students): You will write a digital analysis and interpretation that investigates an aspect of your major or minor field of study (or other interests you may have). The grading criteria breakdown is as follows:
10%: Corpus gathering and preparation
15%: Context and problem definitions
35%: Digital text analysis techniques and theoretical concepts applied to the project
25%: Integration of digital analysis into the interpretation
15%: Final presentation
Project Timeline:
Initial Corpus Definition - Meet during office hours by Feb. 20th
Context and Possible Problems to Be Explored - Submitted by Midterm
Digital Essay Draft - Apr. 2nd
Final written submission - May 6th
Final project submission - Finals period
Collaboration
One of the principal components of a DCS course is collaboration. However, you should always be clear on what part of the work you hand in is your own, what parts come from other sources, and what parts are collaborative. As a rule of thumb, we distinguish between interacting with another student using any written medium (e.g. pencil and paper, email, looking at their code) and having broad discussions with them. Unless you work with another student in a group, you are not allowed to exchange information with them through a written medium or to provide answers to problem sets through conversation. This is a zero-tolerance policy.
It is permissible to use materials available from other sources such as the Internet (understanding that you get no credit for using the work of others) as long as: 1) You acknowledge explicitly which aspects of your assignment were taken from other sources and what those sources are. 2) The materials are freely and legally available. 3) The material was not created by a student at Bowdoin as part of this or another course this year or in prior years. To be absolutely clear, if you turn in someone else's work you will not receive credit for it; on the other hand, if you acknowledge it, at least you will not violate the Honor Code. All write-ups, reviews, documentation and other written material must be original and may not be derived from other sources.
Academic Honesty
We assume that every student is abiding by the Code of Academic and Social Conduct to which they agreed upon matriculation at Bowdoin (http://www.bowdoin.edu/studentaffairs/student-handbook/college-policies/). If you find yourself overwhelmed, unsure about what to do, without materials to use, or otherwise feeling like you have no choice but to copy from the web and call it your own work, contact Professor Nascimento before you make a very bad decision. DO NOT jeopardize your academic career for an assignment in this course. Talk to us first, because if we suspect any kind of cheating or plagiarism we will pursue the matter to the fullest extent allowable by Bowdoin policy.
Religious Holidays
No student is required to take an examination or fulfill any other scheduled course requirement on recognized major religious holidays. Students are expected to declare their intention to observe religious holidays at the beginning of the semester. Please contact Professor Nascimento within the first week of classes to make arrangements if you will be missing classes due to religious holidays.
Electronic Responsibility
“My computer died” and “I only saved to the lab laptop” are not valid excuses for having nothing to show for your work on the day an assignment is due. We are going to discuss and practice responsible management of electronic files so that you are never caught without a very recent copy of your very important work. Having nothing to show on the due date will be a sign that you started the assignment very late.
Accessibility
Bowdoin College is committed to ensuring access to learning opportunities for all students. Students seeking accommodations based on disabilities must register with the Student Accessibility Office. Please discuss any special needs or accommodations with me at the beginning of the semester or as soon as you become aware of your needs; I am eager to work with you to ensure that your approved accommodations are appropriately implemented.