Data Mining and Business Analytics in Knowledge Services

Instructor

Dr James G. Shanahan

Location

Campus (JB 156); SVC (Room 303)

Time

6:00PM-9:30PM Thursdays

From January 9, 2014 to March 12, 2014

with a final exam during week 11 of the quarter (Week of March 17, 2014)

Recorded Lectures

Click here 

Instructor Office Hours:

By appointment only

Email:

James.Shanahan_AT_gmail.com and Shanahan_AT_soe.ucsc.edu

In your emails, please use the following subject line format; otherwise, responses might be late or overlooked:

"TIM 209 Winter 2014: topic of email", e.g., "TIM 209 Winter 2014: location of final exam"

 

Introduction

Welcome to TIM 209 (formerly known as ISM 209). TIM 209 is scheduled for Thursdays, 6:00-9:30pm (first class on Thursday, January 9, 2014).

Have you ever wondered what data science is, or about the following big data analytics questions:

  • How $1 million was won for a movie recommender (Netflix Prize)
  • How to win $3 million for a healthcare prediction problem
  • How online advertising (a $110 billion industry worldwide) works
  • How a search engine (such as Google or Bing) figures out which documents to show in response to a search query
  • How Facebook suggests new friends, and which items get posted to your wall
  • How data science helped Obama win the 2012 election

Objectives

This class will focus on connecting theory with reality in the field of big data analytics, with a particular emphasis on developing theories and engineering solutions that solve some of the above problems. The class will refresh skills in linear algebra, statistics, and optimization, which form a foundation for studying the following areas and topics in depth:

  • supervised machine learning techniques (such as linear regression, logistic regression, and support vector machines (SVMs))
  • ensemble methods, e.g., gradient boosted decision trees
  • unsupervised machine learning (clustering, k-means, expectation maximization)
  • matrix factorizations, SVD, PCA, recommender systems
  • graph mining, social network analysis, Personalized PageRank, Hubs and Authorities, Web spam and TrustRank
  • metrics, performance curves, bias-variance tradeoff

The objective of this course is to go deep on a number of representative techniques in each of the above areas, some of which are workhorses of industries such as online advertising and healthcare. Each lecture will combine theory, geometry, code (primarily R), and example problems. You will end up coding many of the algorithms that we cover (as homework and as class projects).

You will learn some of the following skills:

  • Learn the core subjects of optimization theory: gradient descent and convex optimization. See how they are used every day in machine learning and in online advertising.
  • Learn core workhorse techniques in supervised and unsupervised machine learning that are used every day in online advertising, marketing, and healthcare
  • Analyze intelligent support systems for marketing decisions and develop mathematical models for optimizing sales, marketing, and pricing decisions in high tech
  • Stochastic recommenders: review classical approaches to collaborative filtering while also looking at recent developments in the field of stochastic recommenders, with applications to ecommerce and online advertising
  • Learn the basics of graph mining
  • Time permitting: learn the basics of dynamic programming and Markov decision processes (MDPs), and look at applications to setting policies for online advertising that optimize various business objectives

The course emphasis will be tuned to the class composition and interest.

Prerequisites: Students are expected to be mathematically mature and to have had prior exposure to undergraduate linear algebra at the level of MATH21 or AMS10, and to probability/statistics at the level of AMS 131 or MPE 107. We will provide a refresher in the form of a "boot camp" early in the course, to enable students to relearn the basics required for the course.

 

Course Logistics

The UCSC campus classroom is Engineering 2, room 156. 
Directions to UCSC campus:   http://www.soe.ucsc.edu/about/directions

The class will be telecast between the UCSC campus classroom and the new UCSC Silicon Valley location, 2505 Augustine Drive, Santa Clara (Conference Room 303). A Yahoo Maps link can be found here: http://maps.yahoo.com/#mvt=m&lat=37.43137&lon=-122.168924&zoom=12&tt=ucsc%20extension&tp=1&ioride=us

UCSC (Engineering 2, room 156); UCSC Silicon Valley (Room 303)

 

Instructors and Assistants