Verified Deep Learning with Lean 4

2 Tensor: Calculus Foundations

The algebraic framework: 1D \(\mathsf{Vec}\), 2D \(\mathsf{Mat}\), 3D \(\mathsf{Tensor3}\) types, the partial-derivative function \(\operatorname {pdiv}\) and its structural rules, and three VJP record types (\(\mathsf{HasVJP}\), \(\mathsf{HasVJPMat}\), \(\mathsf{HasVJP3}\)) that package a backward function together with its correctness claim.

Axiom 1 Partial derivative
#

The partial derivative function. For \(f : \mathbb {R}^{m} \to \mathbb {R}^{n}\), \(\operatorname {pdiv}\, f\, x\, i\, j\) is the \((i, j)\)-entry of the Jacobian at \(x\).

Axiom 2 Chain rule
#

\(\operatorname {pdiv}(g \circ f)\, x\, i\, k = \sum _j \operatorname {pdiv}f\, x\, i\, j \cdot \operatorname {pdiv}g\, (f\, x)\, j\, k\).

Axiom 3 Sum rule
#

Linearity of the derivative.

Axiom 4 Product rule
#

Pointwise Leibniz rule for elementwise products.

Axiom 5 Identity Jacobian
#

\(\operatorname {pdiv}(\mathrm{id})\, x\, i\, j = \delta _{ij}\).

Axiom 6 Constant has zero Jacobian
#
Axiom 7 Gather / reindex Jacobian
#

Covers permutations, reshapes, slicing. Generalizes pdiv_id.

Axiom 8 Row-independence for matrices
#

Applying a vector function per-row to a matrix has block-diagonal Jacobian. The one genuinely-new primitive at the \(\mathsf{Mat}\) level.

Theorem 9 Finite-sum rule
#

Linearity extended to arbitrary finite sums by induction.

Theorem 10 VJP chain rule
#

Given \(\mathsf{HasVJP}\, f\) and \(\mathsf{HasVJP}\, g\), get \(\mathsf{HasVJP}\, (g \circ f)\).

Theorem 11 Additive fan-in VJP
#

VJP of \(f + g\). Used for residual connections.

Theorem 12 Multiplicative fan-in VJP
#

VJP of elementwise product. Used for Squeeze-and-Excitation.

Theorem 13 Identity VJP
#
Theorem 14 Matrix-level chain rule
#

Lifts vjp_comp via the \(\mathsf{Mat.flatten}\) bijection.

Theorem 15 Matrix-level additive fan-in
#
Theorem 16 Matrix-level identity
#
Theorem 17 Scalar-scale Jacobian
#

Phase 6 derivation.

Theorem 18 Transpose Jacobian
#
Theorem 19 Matmul Jacobian, left factor fixed
#

Phase 6: derived, not axiomatized.

Theorem 20 Matmul Jacobian, right factor fixed
#
Theorem 21 Matmul VJP, left factor fixed
#
Theorem 22 Matmul VJP, right factor fixed
#
Theorem 23 Scalar-scale VJP
#
Theorem 24 Transpose VJP
#
Theorem 25 Row-wise lifting of any \(\mathsf{HasVJP}\)
#

Phase 8 generic helper: lifts any \(\mathsf{HasVJP}\, (g : \mathbb {R}^{n} \to \mathbb {R}^{p})\) to a \(\mathsf{HasVJPMat}\) on \(\mathbb {R}^{m \times n} \to \mathbb {R}^{m \times p}\).

Theorem 26 3D chain rule
#
Theorem 27 3D VJP chain rule
#
Theorem 28 3D additive fan-in
#