Prerequisite: familiarity with the basics of predictive modeling.
Predictive analytics has proven capable of enormous returns across industries, but with so many core methods for predictive modeling, there are some tough questions that need answering.
Workshop Introduction from John Elder
What you will learn:
1: The tremendous value of learning from data.
2: How to create valuable predictive models for your business.
3: Best practices, learned by seeing their flip side: worst practices.
This one-day session surveys standard and advanced methods for predictive modeling.
Dr. Elder will describe the key inner workings of leading algorithms, demonstrate their performance with business case studies, compare their merits, and show you how to pick the method and tool best suited to each predictive analytics project. Methods covered include classical regression, decision trees, neural networks, ensemble methods, uplift modeling and more.
The key to successfully leveraging these methods is to avoid “worst practices”. It's all too easy to go too far in one's analysis and “torture the data until it confesses”, or otherwise doom predictive models to fail where they really matter: in new situations.
Dr. Elder will share his (often humorous) stories from real-world applications, highlighting the Top 10 common, but deadly, mistakes. Come learn how to avoid these pitfalls by laughing (or gasping) at stories of barely averted disaster.
If you'd like to become a practitioner of predictive analytics, or if you already are one and would like to hone your knowledge of methods and best practices, this workshop is for you.
Course Outline:
I. Pattern Discovery: An Executive Summary
Data Mining or Data Dredging?
Computer vs. Human: Mining and Visualization
Example Projects from Science and Business
Ingredients for Success
Modern Modeling Algorithms
Bundling Models to Increase Accuracy
Example: Identify Bat Species
II. Getting Going
Technical disciplines contribute
Stages of an analytic project
Setting up the data file
Example project: Fraud Detection
Lift Charts to display model quality
Decision Trees to fit data
III. Clustering and Nearness
Commercial Products’ Algorithms
Unsupervised Learning
Clustering
Principal Components
Nearest Neighbor
IV. Neural Networks
Logistic (sigmoidal) transformation
Example
V. Re-Sampling - essential for validation
The danger of over-fit and over-search
Cross-Validation
Bootstrap
Target Shuffling
Example: find sweet spot for strikes in baseball
VI. Visualization
Projections and projection pursuit
Visualizing numbers, text, and links
Density graphs: Drug discovery application
VII. Ensembles
Bagging (with CART example)
Boosting
Bundling different models (with Credit Scoring example)
VIII. Top 10 Data Mining Mistakes
0: Lack data
1: Focus on Training
2: Rely on 1 technique
3: Ask the wrong question
4: Listen (only) to the data
5: Future leakage
6: Discount pesky cases
7: Extrapolate
8: Answer every inquiry
9: Sample without care
10: Believe the best model
When:
October 2nd, 9:00am - 5:00pm
Location:
Herbert Suite on the ground floor of Herbert Park Hotel, Ballsbridge, Dublin 4
Tickets:
Tickets available in cart. €645 + VAT. Includes John Elder's ebook on Data Mining.