Research
I work on making second-order optimization practical for large-scale machine learning.
My research develops structured Gauss-Newton approximations, matrix-free solvers, and
JAX-native training methods that reduce the compute and memory costs of curvature-aware
optimization while preserving useful second-order information.
Second-Order, First-Class: A Composable Stack for Curvature-Aware Training
Mikalai Korbit, Mario Zanon
Under Review
code
/
arXiv
We introduce Somax, a systems framework for curvature-aware training that treats a
second-order update as a planned JIT-compiled step. It exposes curvature operators, solvers,
preconditioners, damping policies, and telemetry as composable modules, and shows that
planning and module choices materially affect overhead and time-to-accuracy.
Exact Gauss-Newton Optimization for Training Deep Neural Networks
Mikalai Korbit, Adeyemi D. Adeoye, Alberto Bemporad, Mario Zanon
Neurocomputing, 2025
code
/
arXiv
We present EGN, a stochastic second-order optimizer that computes the Gauss-Newton
direction by solving an exact low-rank system in mini-batch space, making second-order
updates practical when the parameter dimension is much larger than the batch size.
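The batch-space trick behind this kind of update can be illustrated with a standard linear-algebra identity. The sketch below (plain NumPy, not the EGN implementation; all variable names are illustrative) shows that for a damped least-squares Gauss-Newton step, solving a small system of batch size b is equivalent to solving the full system of parameter dimension n:

```python
import numpy as np

# Illustrative sketch of a Gauss-Newton direction computed in mini-batch
# ("dual") space. For a least-squares loss 0.5*||r(w)||^2 with residuals
# r in R^b and Jacobian J in R^{b x n}, b << n, the damped GN step
#   d = -(J^T J + lam*I_n)^{-1} J^T r
# equals, by the push-through identity,
#   d = -J^T (J J^T + lam*I_b)^{-1} r
rng = np.random.default_rng(0)
b, n, lam = 8, 1000, 1e-2            # batch size much smaller than params
J = rng.standard_normal((b, n))
r = rng.standard_normal(b)

# Naive primal solve: an n x n system.
d_primal = -np.linalg.solve(J.T @ J + lam * np.eye(n), J.T @ r)

# Batch-space solve: a b x b system plus matrix-vector products.
d_dual = -J.T @ np.linalg.solve(J @ J.T + lam * np.eye(b), r)

assert np.allclose(d_primal, d_dual, atol=1e-6)
```

The dual solve replaces an O(n^3) factorization with an O(b^3) one, which is what makes second-order updates tractable when n greatly exceeds b.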
Incremental Gauss-Newton Descent for Machine Learning
Mikalai Korbit, Mario Zanon
Under Review
code
/
arXiv
We propose IGND, a scale-invariant, easy-to-tune, fast-converging stochastic
optimization algorithm that uses approximate second-order information at nearly
the same per-iteration cost as stochastic gradient descent.
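The scale-invariance property claimed above can be checked directly for a simple Gauss-Newton step. The snippet below (a generic illustration, not the IGND algorithm itself) shows that rescaling a least-squares loss by a constant changes the gradient but leaves the Gauss-Newton direction untouched:

```python
import numpy as np

# For L(w) = 0.5 * c * ||r(w)||^2 the gradient c * J^T r scales with c,
# but the (undamped) Gauss-Newton direction (c*J^T J)^{-1} (c*J^T r)
# is independent of c: the curvature matrix absorbs the rescaling.
rng = np.random.default_rng(1)
b, n = 32, 5                         # overdetermined so J^T J is invertible
J = rng.standard_normal((b, n))
r = rng.standard_normal(b)

def gn_direction(c):
    g = c * J.T @ r                  # gradient of the scaled loss
    H = c * J.T @ J                  # Gauss-Newton matrix of the scaled loss
    return np.linalg.solve(H, g)

d1 = gn_direction(1.0)
d100 = gn_direction(100.0)
assert np.allclose(d1, d100)         # GN step unchanged by loss scaling
```

This invariance is one reason such methods need less learning-rate tuning than plain SGD, whose step scales linearly with the loss.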
Software
I develop and maintain Somax,
an open-source JAX library for curvature-aware training.
It provides a composable framework for implementing, studying, and benchmarking a wide range
of second-order methods.
Somax: Stochastic Second-Order Optimization in JAX
Mikalai Korbit
code
/
arXiv
Somax is an Optax-native JAX library for building and running curvature-aware optimizers. It
provides modular curvature operators, estimators, solvers, preconditioners, and damping
policies behind a common step interface, making second-order methods easier to compose,
benchmark, and extend.
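The composability idea can be conveyed with a toy chain-of-transformations pattern in plain Python. This is a hypothetical sketch of the design principle, not Somax's actual API: every function name below is invented for illustration.

```python
import numpy as np

# Toy version of the Optax-style pattern: each stage maps an update vector
# to a new update vector, so preconditioning, damping, and scaling can be
# stacked in any order behind one interface.

def precondition(diag):
    # Elementwise preconditioner, standing in for a curvature estimate.
    return lambda u: u / diag

def add_damping(lam):
    # Shrinks the update, standing in for a damping policy.
    return lambda u: u / (1.0 + lam)

def scale(lr):
    # Final learning-rate scaling.
    return lambda u: lr * u

def chain(*stages):
    # Compose stages left to right into a single update transformation.
    def apply(u):
        for stage in stages:
            u = stage(u)
        return u
    return apply

opt = chain(precondition(np.array([4.0, 1.0])), add_damping(1.0), scale(0.5))
step = opt(np.array([8.0, 2.0]))     # -> array([0.5, 0.5])
```

Because each module only sees an update vector, swapping one damping policy or preconditioner for another requires no changes elsewhere in the chain.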