SVM
UFPE
Notation: uppercase symbols denote random variables (r.v.) and lowercase symbols their realizations; the same uppercase/lowercase convention, in bold, is used for random vectors and their realizations (e.g., \(\mathbf{x}\) is a realization of a random vector), and an analogous convention is used for random matrices and their realizations.
- \(p\): dimension (number of features, variables, or parameters);
- \(n\): sample size;
- \(i\): index of the \(i\)-th observation or instance (\(i = 1, \dots, n\));
- \(j\): index of the \(j\)-th feature, variable, or parameter (\(j = 1, \dots, p\));
- \(k\): number of folds in \(k\)-fold cross-validation (CV).
People often loosely refer to all three, the maximal margin classifier (MMC), the support vector classifier (SVC), and the support vector machine (SVM), as "support vector machines". We will distinguish between them.
Suppose we have \(n\) training observations \(\mathbf{x}_1, \dots, \mathbf{x}_n\) in \(p\)-dimensional space, with class labels \(y_1, \dots, y_n \in \{-1, 1\}\).
A separating hyperplane perfectly separates the training observations according to their class labels: \(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} > 0\) whenever \(y_i = 1\), and \(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} < 0\) whenever \(y_i = -1\).
This can be written concisely as: \[ y_i (\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}) > 0 \quad \text{for all } i=1, \dots, n \]
If a separating hyperplane exists, we can use it as a classifier: a new observation \(\mathbf{x}^*\) is assigned to class \(1\) if \(f(\mathbf{x}^*) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j^* > 0\), and to class \(-1\) if \(f(\mathbf{x}^*) < 0\).
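For concreteness, here is a minimal NumPy sketch of this decision rule; the coefficients \(\beta_0, \beta_1, \beta_2\) and the new observations are invented purely for illustration.

```python
import numpy as np

# Minimal sketch: classify new points by the sign of
# f(x*) = beta_0 + sum_j beta_j * x*_j.
# The coefficient values below are illustrative, not fitted to any data.
beta_0 = -0.5
beta = np.array([1.0, 2.0])          # beta_1, ..., beta_p (here p = 2)

X_new = np.array([[0.3, 0.4],        # each row is one new observation x*
                  [-1.0, 0.1]])

f = beta_0 + X_new @ beta            # decision values f(x*)
y_hat = np.where(f > 0, 1, -1)       # class +1 if f(x*) > 0, class -1 otherwise

print(f)      # [ 0.6 -1.3]
print(y_hat)  # [ 1 -1]
```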
The MMC is the solution to the optimization problem:
Maximize \(M\) (the width of the margin) over \(\beta_0, \beta_1, \dots, \beta_p, M\), subject to
\[
\sum_{j=1}^{p} \beta_j^2 = 1,
\qquad
y_i\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right) \ge M \quad \text{for all } i = 1, \dots, n.
\]
The normalization \(\sum_{j=1}^{p} \beta_j^2 = 1\) makes \(y_i(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip})\) the perpendicular distance from \(\mathbf{x}_i\) to the hyperplane, so the constraints require every training observation to lie on the correct side of the hyperplane, at distance at least \(M\) from it.
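As a sketch (an assumption of this example, not part of the formulation above), the MMC on separable data can be approximated numerically with scikit-learn's linear-kernel `SVC` by making the cost parameter `C` very large, so that margin violations are effectively forbidden; the two-cluster data below is synthetic.

```python
import numpy as np
from sklearn.svm import SVC

# Sketch: approximate the maximal margin classifier on linearly separable data
# with a linear-kernel SVC and a very large cost C (near hard margin).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[2, 2], scale=0.3, size=(20, 2))
X_neg = rng.normal(loc=[-2, -2], scale=0.3, size=(20, 2))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 20 + [-1] * 20)

mmc = SVC(kernel="linear", C=1e10)   # huge C: essentially no slack allowed
mmc.fit(X, y)

print(mmc.coef_, mmc.intercept_)     # (beta_1, ..., beta_p) and beta_0, up to scaling
print(mmc.support_vectors_)          # the observations that define the margin
```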
The SVC is the solution to the optimization problem:
Maximize \(M\) over \(\beta_0, \dots, \beta_p, \epsilon_1, \dots, \epsilon_n, M\), subject to
\[
\sum_{j=1}^{p} \beta_j^2 = 1,
\qquad
y_i\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right) \ge M(1 - \epsilon_i),
\qquad
\epsilon_i \ge 0, \quad \sum_{i=1}^{n} \epsilon_i \le C,
\]
where the slack variables \(\epsilon_i\) allow individual observations to fall on the wrong side of the margin (\(\epsilon_i > 0\)) or even of the hyperplane (\(\epsilon_i > 1\)), and \(C \ge 0\) is a tuning parameter that bounds the total slack budget.
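A hedged sketch of the soft-margin classifier with scikit-learn follows; note that scikit-learn's `C` is a cost on margin violations and therefore behaves roughly like the inverse of the slack budget \(C\) in the formulation above. The synthetic, non-separable data is an assumption of the sketch.

```python
import numpy as np
from sklearn.svm import SVC

# Sketch: soft-margin support vector classifier on non-separable toy data.
# In scikit-learn, small C -> wide margin, many violations tolerated;
# large C -> narrow margin, few violations tolerated.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] + rng.normal(scale=0.8, size=100) > 0, 1, -1)

for C in (0.01, 1.0, 100.0):
    svc = SVC(kernel="linear", C=C).fit(X, y)
    # number of support vectors per class and training accuracy for each C
    print(C, svc.n_support_, svc.score(X, y))
```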
Standard SVMs handle binary classification. How can we extend them to \(K > 2\) classes?
Two main approaches (a code sketch of both follows the list):
- One-versus-one (OVO): fit \(\binom{K}{2}\) binary classifiers, one for each pair of classes, and assign a test observation to the class that wins the largest number of pairwise comparisons.
- One-versus-all (OVA, also called one-versus-rest): fit \(K\) binary classifiers, each comparing one class against the remaining \(K - 1\), and assign the test observation to the class \(k\) with the largest value of \(f_k(\mathbf{x}^*)\).
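As an illustration (not from the slides), both strategies can be run with scikit-learn's wrapper classes; the iris dataset and the linear kernel are arbitrary choices for this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Three-class toy problem (iris) used only to illustrate the two strategies.
X, y = load_iris(return_X_y=True)

# One-versus-one: K(K-1)/2 = 3 binary SVMs, prediction by pairwise voting.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

# One-versus-all (one-versus-rest): K = 3 binary SVMs, prediction by the
# largest decision-function value.
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

print(ovo.predict(X[:5]))
print(ovr.predict(X[:5]))
```

Note that scikit-learn's `SVC` already applies the one-versus-one strategy internally when given more than two classes, so the explicit wrappers above mainly serve to make the two strategies visible side by side.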
Aprendizado de Máquina: uma abordagem estatística [Machine Learning: A Statistical Approach], Izbicki, R. and Santos, T. M., 2020, link: https://rafaelizbicki.com/AME.pdf.
An Introduction to Statistical Learning: with Applications in R, James, G., Witten, D., Hastie, T. and Tibshirani, R., Springer, 2013, link: https://www.statlearning.com/.
Mathematics for Machine Learning, Deisenroth, M. P., Faisal, A. A. and Ong, C. S., Cambridge University Press, 2020, link: https://mml-book.com.
An Introduction to Statistical Learning: with Applications in Python, James, G., Witten, D., Hastie, T., Tibshirani, R. and Taylor, J., Springer, 2023, link: https://www.statlearning.com/.
Matrix Calculus (for Machine Learning and Beyond), Bright, P., Edelman, A. and Johnson, S. G., 2025, link: https://arxiv.org/abs/2501.14787.
Machine Learning Beyond Point Predictions: Uncertainty Quantification, Izbicki, R., 2025, link: https://rafaelizbicki.com/UQ4ML.pdf.
Mathematics of Machine Learning, Petersen, P. C., 2022, link: http://www.pc-petersen.eu/ML_Lecture.pdf.
Statistical Machine Learning - Prof. Jodavid Ferreira