BIA660 Web Analytics

From Hanlon Financial Systems Lab Web Encyclopedia

Course Catalog Description


The course covers:
  • Introduction to Python
  • Data collection from the Web
  • Parsing and cleaning of structured and unstructured text
  • Text mining
  • Introduction to Natural Language Processing (NLP)
  • Topic Modeling
  • Supervised and unsupervised learning algorithms

Students will be organized into teams of 4-5 people. Each team will work on a large project that will determine the largest percentage of the class grade. A teammate evaluation survey will be conducted twice during the semester and will contribute to the class grade.

Lecture structure:

For the first 30-40 minutes of each lecture, the instructor presents a new concept, typically by presenting and discussing a Python script that solves a practical problem. The students are then given a related assignment that they must complete in class. During this time, the instructor assists the students and provides hints toward the solution. These assignments contribute to the class grade.

Campus Fall Spring Summer
On Campus X X
Web Campus


Professor Email Office
Theodoros Lappas Babbio 639

Course Resources


Readings will be assigned each week. Links will be provided on the course website.


Grading Policies

In-class Assignments        60 points
Final Team Project          35 points
Peer Evaluations            25 points
TOTAL                       120 points

A student needs 93 points for an A. Team evaluations are mapped to a multiplier in [0, 1], which is then applied to the team's project grade to compute each student's individual grade.
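
The multiplier rule can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the function name `individual_project_points` and the example numbers are hypothetical, and the syllabus does not specify how evaluation survey results are mapped to the multiplier, so the multiplier is taken here as a given input.

```python
def individual_project_points(team_project_points: float, multiplier: float) -> float:
    """Scale a team's project points by one student's evaluation multiplier.

    Assumption: the multiplier has already been derived from the teammate
    evaluation surveys; that mapping is not described in the syllabus.
    """
    if not 0.0 <= multiplier <= 1.0:
        raise ValueError("multiplier must lie in [0, 1]")
    return team_project_points * multiplier


# Hypothetical numbers: the team earned 30 of the 35 project points,
# and one student's evaluations mapped to a multiplier of 0.5.
print(individual_project_points(30.0, 0.5))  # 15.0
```

A multiplier of 1 leaves the team's project grade unchanged, while a multiplier of 0 removes the project points entirely for that student.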

Lecture Outline

Topic Reading
Week 1 Orientation Week
Week 2 Introduction to Python I
Week 3 Introduction to Python II
Week 4 Using Python for web scraping I
Week 5 Using Python for web scraping II
Week 6 Text Mining I (Regex)
Week 7 Text Mining II (NLTK)
Week 8 Text Mining III (Opinion Mining)
Week 9 Supervised Learning I
Week 10 Supervised Learning II
Week 11 Supervised Learning III
Week 12 Special Topic I (Unsupervised Learning, Clustering, Visualization, Topic Modeling)
Week 13 Special Topic II
Week 14 Project Presentations