John Elder Workshop

When:

October 2nd, 9:00am - 5:00pm

Location:

Herbert Suite on the ground floor of Herbert Park Hotel, Ballsbridge, Dublin 4

Tickets:

Tickets available in cart. €645 + VAT. Includes John Elder's ebook on data mining, Handbook of Statistical Analysis and Data Mining Applications (Book of the Year in math).

Intended Audience:

Anyone interested in the true nuts and bolts of predictive modeling.

Knowledge Level:

Attendees should be familiar with the basics of predictive modeling.

Predictive analytics has proven capable of enormous returns across industries. But with so many core methods for predictive modeling, there are some tough questions that need answering.

Workshop Introduction from John Elder

What you will learn:

1: The tremendous value of learning from data.
2: How to create valuable predictive models for your business.
3: Best practices, learned by seeing their flip side: worst practices.

This one-day session surveys standard and advanced methods for predictive modeling.

Dr. Elder will describe the key inner workings of leading algorithms, demonstrate their performance with business case studies, compare their merits, and show you how to pick the method and tool best suited to each predictive analytics project. Methods covered include classical regression, decision trees, neural networks, ensemble methods, uplift modeling and more.

The key to successfully leveraging these methods is to avoid “worst practices”. It's all too easy to go too far in one's analysis and “torture the data until it confesses” or otherwise doom predictive models to fail where they really matter: on new situations.

Dr. Elder will share his (often humorous) stories from real-world applications, highlighting the Top 10 common, but deadly, mistakes. Come learn how to avoid these pitfalls by laughing (or gasping) at stories of barely averted disaster.

If you'd like to become a practitioner of predictive analytics, or if you already are one and would like to hone your knowledge across methods and best practices, this workshop is for you.

Course Outline:

I. Pattern Discovery: An Executive Summary

Data Mining or Data Dredging?
Computer vs. Human: Mining and Visualization
Example Projects from Science and Business
Ingredients for Success
Modern Modeling Algorithms
Bundling Models to Increase Accuracy
Example: Identify Bat Species

II. Getting Going

Technical disciplines contribute
Stages of an analytic project
Setting up the data file
Example project: Fraud Detection
Lift Charts to display model quality
Decision Trees to fit data
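
For readers who have not met lift charts before, here is a minimal sketch of the numbers behind one. It is not taken from the workshop material. The function name `cumulative_lift`, the bin count, and the toy fraud scores are all illustrative assumptions. The idea: rank cases by model score, then compare the hit rate in the top-scored slice with the overall hit rate.

```python
def cumulative_lift(scores, labels, n_bins=10):
    """Cumulative lift for score-ranked data (hypothetical helper).

    scores: model scores, higher means "more likely positive".
    labels: 1 for a positive case (e.g. fraud), 0 otherwise.
    Returns one lift value per cumulative bin (top 1/n_bins, top 2/n_bins, ...).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    # Sort labels by descending score.
    ranked = [y for _, y in sorted(zip(scores, labels),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    base_rate = sum(ranked) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    lifts = []
    for b in range(1, n_bins + 1):
        cutoff = max(1, round(b * n / n_bins))  # size of the top slice
        top = ranked[:cutoff]
        lifts.append((sum(top) / len(top)) / base_rate)
    return lifts


# Toy data (made up for illustration): 10 cases, 3 of them positive.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
for depth, lift in enumerate(cumulative_lift(toy_scores, toy_labels, n_bins=5), 1):
    print(f"top {depth * 20}% of cases: lift {lift:.2f}")
```

A lift above 1 in the top slices means the model concentrates positives there. By construction the lift at 100% depth is exactly 1.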

III. Clustering and Nearness

Commercial Products’ Algorithms
Unsupervised Learning
Clustering
Principal Components
Nearest Neighbor

IV. Neural Networks

Logistic (sigmoidal) transformation
Example

V. Re-Sampling - essential for validation

The danger of over-fit and over-search
Cross-Validation
Bootstrap
Target Shuffling
Example: find sweet spot for strikes in baseball
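
To give a flavor of the target shuffling item above, here is a minimal sketch under stated assumptions. It is not the workshop's own code. The "model search" is simplified to picking the single feature with the largest absolute correlation with the target, and the helper names (`pearson`, `best_abs_correlation`, `target_shuffle_p_value`) are hypothetical. The target is shuffled many times, the same search is repeated on each shuffle, and the result estimates how often chance alone matches the real finding.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_abs_correlation(features, target):
    """Stand-in for a model search: the strongest single-feature correlation."""
    return max(abs(pearson(column, target)) for column in features)


def target_shuffle_p_value(features, target, n_shuffles=1000, seed=0):
    """Estimate how often a shuffled target yields a result as strong as the real one."""
    rng = random.Random(seed)
    observed = best_abs_correlation(features, target)
    shuffled = list(target)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between features and target
        if best_abs_correlation(features, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed result itself among the permutations.
    return observed, (hits + 1) / (n_shuffles + 1)


# Illustration with pure noise: 20 candidate features, 30 rows, random target.
rng = random.Random(42)
noise_features = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(20)]
noise_target = [rng.gauss(0, 1) for _ in range(30)]
observed, p_value = target_shuffle_p_value(noise_features, noise_target, n_shuffles=500)
print(f"best |correlation| found: {observed:.2f}, shuffle p-value: {p_value:.3f}")
```

Because the search itself is repeated on every shuffle, the estimate accounts for having tried many candidate features. This is the over-search danger the section title refers to.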

VI. Visualization

Projections and projection pursuit
Visualizing numbers, text, and links
Density graphs: Drug discovery application

VII. Ensembles

Bagging (with CART example)
Boosting
Bundling different models (with Credit Scoring example)
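
As a rough illustration of the bagging item above, the sketch below fits many small models on bootstrap resamples and averages their predictions. It is an assumption-laden toy, not the workshop's CART example. For brevity it uses one-split regression "stumps" rather than full CART trees, and every function name is hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump; returns (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the overall mean
        mean = sum(ys) / len(ys)
        return (pairs[0][0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Average the predictions of all bagged models."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Toy data (made up): a noisy step from about 0 to about 1 at x = 10.
rng = random.Random(7)
toy_x = list(range(20))
toy_y = [(1.0 if x >= 10 else 0.0) + rng.gauss(0, 0.2) for x in toy_x]
bagged = fit_bagged_stumps(toy_x, toy_y)
print([round(predict_bagged(bagged, x), 2) for x in (2, 9, 10, 17)])
```

Each stump sees a slightly different resample, so the split points vary. Averaging smooths out that variation, which is the usual motivation for bagging unstable learners such as trees.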

VIII. Top 10 Data Mining Mistakes

Lack data
Focus on Training
Rely on 1 technique
Ask the wrong question
Listen (only) to the data
Future leakage
Discount pesky cases
Extrapolate
Answer every inquiry
Sample without care
Believe the best model

Core Machine Learning and Data Science Techniques
