References

Online Books & Educational Tools

| Category | Resource | Description |
|---|---|---|
| Modeling | An Introduction to Statistical Learning | Statistical and machine learning approaches to learning from data. Includes a companion website for the Python examples. |
| Explainability | Interpretable Machine Learning | A practical overview of techniques for making ML models more transparent, including SHAP. |
| Visualization | UW Interactive Data Lab Curriculum | Book on statistical visualization using Vega-Lite and Altair. |
| Visualization | Fundamentals of Data Visualization | Principles and examples of clear, effective visual communication. |
| Time Series | Forecasting: Principles and Practice (R, Python) | Comprehensive introduction to forecasting methods, including exponential smoothing and ARIMA. Available in R and Python versions. |
| Data Imputation | Flexible Imputation of Missing Data | Methods for handling missing data, with an emphasis on multiple imputation. |
| Fraud Detection | Fraud Detection Handbook | Applied techniques for detecting fraud in highly imbalanced datasets. Includes instructions on using a fraud data simulator. |
| Statistics & Probability | OpenIntro Statistics | Introduction to statistics and probability. Also includes links to YouTube videos explaining the concepts. |
| Statistics & Probability | SeeingTheory | Visual introduction to statistics and probability. |

Tools

  • SDV - Python library for creating tabular synthetic data.
  • permetrics - Python library for performance metrics of machine learning models. Documentation site includes quick explanations of each metric.

Resources used to make this guide

  • Quarto: Extensive publishing system. Supports Jupyter notebooks and Markdown.
  • Bootswatch: Collection of free themes for Bootstrap-based sites.