BIA656 Statistical Learning and Analytics

From Hanlon Financial Systems Lab Web Encyclopedia



Course Catalog Description

Introduction

The large amount of corporate information available requires a systematic, analytical approach to select the most important information and anticipate major events. Statistical learning algorithms facilitate this process by helping to understand, model, and forecast the behavior of major corporate variables.
Campus Fall Spring Summer
On Campus X X
Web Campus

Instructors

Professor Email Office
Dragos Bozdog dbozdog@stevens.edu Babbio 429A
German Creamer german.creamer@stevens.edu Babbio 637



More Information

Course Description

This course introduces time series, statistical, and graphical models used for inference and prediction. The course emphasizes the learning capability of the algorithms and their application to several business areas.

Prerequisites: A basic course in probability and statistics at the level of MGT 620 or BIA 654 Multivariate Data Analytics.

 

Course Outcomes

Students will:

• Learn the fundamental concepts of time series analysis and statistical learning algorithms.
• Explore existing and new applications of time series and statistical learning methods to business problems and to generic classification problems.
• Learn to solve analytical problems in groups and to communicate the results effectively.


Course Resources

Textbook

Foster Provost and Tom Fawcett, Data Science for Business, O’Reilly, 2013.

Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning, Springer-Verlag, New York, 2010 (downloadable at http://www-stat.stanford.edu/~tibs/ElemStatLearn/).

Hal Daumé III, A Course in Machine Learning (downloadable at http://ciml.info/)

 

Case: Pilgrim Bank A (602104), Harvard Business School

 

Additional References

Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2013 (downloadable at http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.pdf) (ISLR)

Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification, John Wiley & Sons, 2001.
Tom M. Mitchell, Machine Learning, McGraw-Hill Series in Computer Science, 1997.
A. Rajaraman and J. Ullman, Mining of Massive Datasets (very useful for big data problems).
Mohammed Zaki and Wagner Meira Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms (draft).

 

Software

R and Weka (http://www.cs.waikato.ac.nz/ml/weka) are the main software packages that will be used. No prior knowledge of Weka is required.



Grading

Grading Policies

Assignments

The course will have a main project, several data analysis assignments/cases, and several labs. Assignments must be submitted electronically through the course website before the start of class on the due date. Each student must submit his/her own report. If you used a script or wrote a program, also include the Readme, log, and code files. E-mail submissions will not be accepted.

Project:
The project requires participants to build a decision support system (DSS) based on one of the methods explored in this course. Each project must be developed by a group of three students, who should present a project proposal in the middle of the semester. PhD students should instead prepare an academic paper, which counts as the final project for this course. The paper should be oriented to conferences such as the "Innovative Applications of Artificial Intelligence Conference" or the "International Conference on Information Systems." It should be based on a theoretical or applied exploration of one of the methods studied in this course or of any other data analysis method approved by the instructor.

Grades:
Assignments: 11%
Team project: 25%
Participation: 4%
Midterm: 30%
Final: 30%
Total Grade: 100%

Software: Python is the main software package that will be used. If you are not proficient in Python, you should attend the Python bootcamp offered by the school at the beginning of the semester.

Class policy: No late homework will be accepted.

Re-grades: If you dispute the grade received for an assignment, you must submit, in writing, a detailed and clearly stated argument for what you believe is incorrect and why. This must be submitted by the beginning of the next class after the assignment was returned. Requests for a re-grade submitted after the beginning of that class will not be accepted. A written response indicating your final score will be provided by the following class. Be aware that a request for a re-grade of a specific problem can result in a re-grade of the entire assignment. The re-grade and written response are final; no additional re-grades or debate on that assignment will be considered.

Ethics and Cooperation: You are allowed to discuss lecture and textbook materials, and how to approach assignments.

You cannot share ideas in any written form: code, pseudocode, or solutions. You cannot submit someone else's work found on the internet or through any other source, or a modification of that work, with or without that person's knowledge, regardless of the circumstances under which it was obtained, copied, or modified. No cooperation is allowed during exams.


Lecture Outline

Topic Reading
Week 1
Week 2
Week 3
Week 4
Week 5
Week 6
Week 7
Week 8
Week 9
Week 10
Week 11
Week 12
Week 13
Week 14