Math-heavy fixture
Display equations and inline math interleave with prose. KaTeX or MathJax may render math *after* widget mount; the walker should treat math containers as no-mark zones (they don't contain natural-language prose).
Per plan/01_renderer.md, math blocks render as <div class="math-display" data-block-id="…"> and inline math is wrapped in <span class="math-inline">. Both should be skipped by the term walker.
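The skip rule above can be sketched with Python's standard-library HTML parser. This is an illustration under stated assumptions, not the widget's actual walker: the names `ProseExtractor` and `prose_text` are hypothetical, and the sample markup (including the `data-block-id` value `b1`) is made up for the example.

```python
from html.parser import HTMLParser

# Class names taken from the renderer convention described above.
MATH_CLASSES = {"math-display", "math-inline"}
# Void elements never get a closing tag, so they must not affect depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ProseExtractor(HTMLParser):
    """Collects text outside math containers (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a math container
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        # Enter a no-mark zone, or go one level deeper inside one.
        if self.skip_depth or classes & MATH_CLASSES:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def prose_text(html):
    parser = ProseExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


sample = ('<p>The gradient <span class="math-inline">\\nabla f</span> '
          'is a vector.</p>'
          '<div class="math-display" data-block-id="b1">'
          '<span>x</span>gradient</div>')
print(prose_text(sample))
```

Only the prose outside the two math containers survives, so a term matcher run on the result would never see text that lives inside rendered math.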
Gradients and Jacobians
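The gradient of a scalar function $f: \mathbb{R}^n \to \mathbb{R}$ is the vector of partial derivatives:
$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)$$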
The Jacobian generalizes the gradient to vector-valued functions $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$:
$$J_{ij} = \frac{\partial f_i}{\partial x_j}$$
The Hessian is the matrix of second-order partials. In the LaTeX source above, the words "gradient", "Jacobian", and "Hessian" do not appear — they're only in the surrounding prose, which is where they should be marked.
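As a rough numerical illustration of the two definitions in this section, the sketch below approximates partial derivatives with central differences. The function names and the step size `h` are assumptions made for the example, and finite differences stand in for whatever differentiation method a real system would use.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of scalar f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def numerical_jacobian(f, x, h=1e-6):
    """Central-difference estimate with J[i][j] = d f_i / d x_j."""
    columns = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = f(up), f(down)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_up, f_down)])
    n_outputs = len(columns[0])
    # Transpose: each entry of `columns` holds one column of J.
    return [[columns[j][i] for j in range(len(x))] for i in range(n_outputs)]


scalar = lambda x: x[0] ** 2 + 3 * x[1]        # exact gradient: (2*x0, 3)
vector = lambda x: [x[0] * x[1], x[0] + x[1]]  # exact rows: (x1, x0), (1, 1)

print(numerical_gradient(scalar, [2.0, 5.0]))
print(numerical_jacobian(vector, [2.0, 3.0]))
```

For a scalar function the Jacobian has a single row, which is the gradient written as a row vector.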
Eigenvalues and SVD
An eigenvalue $\lambda$ and eigenvector $v$ of a square matrix $A$ satisfy $Av = \lambda v$. The SVD factors any matrix $A$ as:
$$A = U \Sigma V^\top$$
where $\Sigma$ holds the singular values. PCA is the special case where we use the SVD of a centered data matrix to find directions of maximum variance.
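To make the eigenvalue relation concrete, here is a small sketch that finds the dominant eigenpair of a symmetric matrix by power iteration and then checks the residual of $Av = \lambda v$. Power iteration is chosen only because it fits in a few lines; the function names, the start vector, and the iteration count are assumptions for this example.

```python
import math


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def power_iteration(A, iters=200):
    """Dominant eigenpair of A. Assumes the start vector is not
    orthogonal to the dominant eigenvector."""
    v = [1.0] + [0.0] * (len(A) - 1)
    for _ in range(iters):
        w = matvec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v (v has unit length here).
    lam = sum(a * b for a, b in zip(v, matvec(A, v)))
    return lam, v


A = [[2.0, 1.0],
     [1.0, 2.0]]            # eigenvalues are 3 and 1
lam, v = power_iteration(A)
residual = [av - lam * vi for av, vi in zip(matvec(A, v), v)]
print(lam, v, residual)
```

The same loop applied to the covariance matrix of centered data would give the first principal direction, which is the connection to PCA mentioned above.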
Norms
The L2 norm of $x$ is $\|x\|_2 = \sqrt{\sum_i x_i^2}$. The L1 norm is $\|x\|_1 = \sum_i |x_i|$. The generic "norm" in prose refers to whichever is in scope.
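A minimal sketch of the two norms for a plain Python list; the function names are hypothetical.

```python
import math


def l2_norm(x):
    """Euclidean length: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in x))


def l1_norm(x):
    """Sum of absolute values."""
    return sum(abs(v) for v in x)


x = [3.0, -4.0]
print(l2_norm(x))  # 5.0
print(l1_norm(x))  # 7.0
```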
Softmax and cross-entropy
The softmax of a vector $z$ is:
$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$$
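The inverse maps probabilities back to logits, which are determined only up to an additive constant. Cross-entropy loss between a target distribution $p$ and a prediction $q$ is $H(p, q) = -\sum_i p_i \log q_i$. KL divergence is $D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \log \frac{p_i}{q_i}$.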
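The three quantities in this section can be sketched in a few lines of standard-library Python. The function names are hypothetical, and subtracting the maximum before exponentiating is a common stability trick assumed here rather than something the text prescribes.

```python
import math


def softmax(z):
    m = max(z)  # shift for numerical stability; the result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i log q_i, skipping terms with p_i = 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i log(p_i / q_i), skipping terms with p_i = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


q = softmax([1.0, 2.0, 3.0])
p = [0.0, 0.0, 1.0]  # one-hot target
print(sum(q))
print(cross_entropy(p, q), kl_divergence(p, q))
```

With a one-hot target the entropy of $p$ is zero, so cross-entropy and KL divergence coincide. Adding a constant to every logit leaves the softmax output unchanged, which is why the inverse is only defined up to that constant.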
Attention
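In self-attention, queries $Q$, keys $K$, and values $V$ combine via scaled dot product:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{Q K^\top}{\sqrt{d_k}} \right) V$$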
The numerator $QK^\top$ reduces to cosine similarity when the rows of $Q$ and $K$ are unit-norm. The whole operation acts on tensors of shape (batch, heads, seq, d_k). In einsum notation, the score computation compresses to: bhqd, bhkd -> bhqk.
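A minimal single-head, single-batch sketch of scaled dot-product attention in plain Python. It drops the batch and heads axes for brevity, so the inner dot product corresponds to the `qd,kd->qk` part of the einsum above. All names here are assumptions made for the illustration.

```python
import math


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v) as nested lists."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # One row of Q K^T / sqrt(d_k): a score per key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [1.0, 0.0]]  # identical keys give uniform weights
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

Because both keys are identical, each query scores them equally and the output for every query is the mean of the value rows.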
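Positional encoding adds a sinusoidal vector to each token embedding so the dot-product attention can distinguish positions:
$$PE_{(pos,\, 2i)} = \sin\left( \frac{pos}{10000^{2i/d_{\text{model}}}} \right), \qquad PE_{(pos,\, 2i+1)} = \cos\left( \frac{pos}{10000^{2i/d_{\text{model}}}} \right)$$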
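One common sinusoidal scheme can be sketched as follows. The base of 10000 and the sine/cosine interleaving follow the original Transformer formulation and are assumptions here, as is the function name; the fixture text itself only says "sinusoidal".

```python
import math


def positional_encoding(pos, d_model, base=10000.0):
    """Sinusoidal vector for one position: sin on even indices,
    cos on odd indices, sharing one frequency per pair."""
    pe = []
    for i in range(0, d_model, 2):
        angle = pos / base ** (i / d_model)
        pe.append(math.sin(angle))
        if i + 1 < d_model:
            pe.append(math.cos(angle))
    return pe


print(positional_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0]
print(positional_encoding(1, 4))
```

Each pair of dimensions oscillates at a different frequency, so distinct positions receive distinct vectors before they are added to the token embeddings.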
Inline math density
Inline math in dense paragraphs: when $y = f(u)$ and $u = g(x)$, the chain rule gives $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$. The terms "gradient" and "attention" appear in this very paragraph and should be marked in the prose, but the math expressions $y = f(u)$, $u = g(x)$, $\frac{dy}{dx}$ should NOT be touched by the walker.
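A quick numerical check of the chain rule, using a made-up composition ($f(u) = u^2$, $g(x) = 3x + 1$) chosen only for illustration. The analytic product of the two derivatives is compared with a central-difference estimate of the composite.

```python
def f(u):
    return u ** 2       # dy/du = 2u


def g(x):
    return 3 * x + 1    # du/dx = 3


def chain_rule_derivative(x):
    u = g(x)
    return (2 * u) * 3  # dy/du * du/dx


def finite_difference(x, h=1e-6):
    return (f(g(x + h)) - f(g(x - h))) / (2 * h)


x = 2.0
print(chain_rule_derivative(x))  # 42.0
print(finite_difference(x))
```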