Just some quick notes on orthogonal projections onto regular level sets from a linear-algebraic and a differential point of view.
$\newcommand{\R}{\mathbb{R}}$ $\newcommand{\from}{:}$ $\newcommand{\pin}[1]{\tilde{#1}}$ $\DeclareMathOperator{\im}{im}$ $\DeclareMathOperator{\grad}{grad}$ $\DeclareMathOperator{\id}{I}$
If $A \from U \to V$ is a linear map between Euclidean spaces, there exists a powerful and simple way to build the orthogonal projections onto its kernel and onto its image: the Moore–Penrose inverse. Indeed there always exists[^1] a linear map $\pin{A} \from V \to U$ characterized by

\[
A\pin{A}A = A, \qquad
\pin{A}A\pin{A} = \pin{A}, \qquad
\left(A\pin{A}\right)^T = A\pin{A}, \qquad
\left(\pin{A}A\right)^T = \pin{A}A.
\]
If $A$ is invertible then $\pin{A} = A^{-1}$.
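As a quick numerical sanity check, here is a minimal NumPy sketch (a random matrix of my choosing; `numpy.linalg.pinv` computes the Moore–Penrose inverse) verifying the four Penrose conditions and the invertible case:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))     # a generic linear map A: R^5 -> R^3
A_pinv = np.linalg.pinv(A)          # Moore–Penrose inverse, shape (5, 3)

# The four Penrose conditions characterizing the pseudoinverse
assert np.allclose(A @ A_pinv @ A, A)
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)
assert np.allclose((A @ A_pinv).T, A @ A_pinv)   # A A~ is symmetric
assert np.allclose((A_pinv @ A).T, A_pinv @ A)   # A~ A is symmetric

# For an invertible A the pseudoinverse is just the inverse
B = rng.standard_normal((4, 4))
assert np.allclose(np.linalg.pinv(B), np.linalg.inv(B))
```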
Note that $\pin{A}A \from U \to U$ and $A\pin{A} \from V \to V$ are endomorphisms. They are the maps we are interested in:
$A\pin{A} \from V \to V$ is the orthogonal projection
- onto $\im{A\pin{A}} = \im{A} =\left(\ker{\pin{A}}\right)^{\perp}$
- along $\ker{A\pin{A}} = \ker{\pin{A}} = \left(\im{A}\right)^{\perp}$
$\pin{A}A \from U \to U$ is the orthogonal projection

\[\tag{1}
\begin{split}
\text{onto } & \im{\pin{A}A} = \im{\pin{A}} = \left(\ker{A}\right)^{\perp} \\
\text{along } & \ker{\pin{A}A} = \ker{A} = \left(\im{\pin{A}}\right)^{\perp}
\end{split}\]
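The same kind of numerical sketch (random full-rank $A$, Euclidean inner products) also checks these two projections:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))          # A: U = R^5 -> V = R^3
A_pinv = np.linalg.pinv(A)

P_im = A @ A_pinv                        # projection of V onto im A
P_ker_perp = A_pinv @ A                  # projection of U onto (ker A)^perp

# Orthogonal projections are idempotent and symmetric
for P in (P_im, P_ker_perp):
    assert np.allclose(P @ P, P) and np.allclose(P, P.T)

# P_im fixes vectors of the form A u; I - P_ker_perp maps U into ker A
u = rng.standard_normal(5)
assert np.allclose(P_im @ (A @ u), A @ u)
assert np.allclose(A @ ((np.eye(5) - P_ker_perp) @ u), 0)
```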
The Moore–Penrose inverse can be computed in closed form, and the computation is particularly simple when $A$ has full rank. Let $\dim{U} = n$ and $\dim{V} = m$ with $A \from U \to V$, so $A \in \R^{m \times n}$. If $A$ is injective (full column rank, $m \geq n$) then $\pin{A} = \left(A^TA\right)^{-1}A^T$, while if $A$ is surjective (full row rank, $m \leq n$) then

\[\tag{2} \pin{A} = A^T\left(AA^T\right)^{-1}.\]
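Here is a minimal check of the two closed forms against `numpy.linalg.pinv`, assuming random (hence almost surely full-rank) matrices:

```python
import numpy as np

rng = np.random.default_rng(2)

# Surjective case (full row rank, m <= n): A~ = A^T (A A^T)^{-1}, Eq. (2)
A = rng.standard_normal((3, 5))
assert np.allclose(np.linalg.pinv(A), A.T @ np.linalg.inv(A @ A.T))

# Injective case (full column rank, m >= n): A~ = (A^T A)^{-1} A^T
B = rng.standard_normal((5, 3))
assert np.allclose(np.linalg.pinv(B), np.linalg.inv(B.T @ B) @ B.T)
```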
All these notions generalize neatly to generic (as opposed to Euclidean) inner product spaces;[^2] at the moment I’m interested in the full-rank, surjective, non-Euclidean case.
Let $A \from U \to V$ be a full-rank linear map between vector spaces, with $n = \dim{U} > \dim{V} = m$ (so $A$ is surjective and not injective).
Equip $U$ with a Riemannian metric $g$. This means that $g$ is a smooth field of inner products: for each $x \in U$ we have an inner product $g_x \from U \times U \to \R$.
For each $x \in U$ we want to build the orthogonal projection $P_x \from U \to U$ onto the kernel of $A$ with respect to the inner product $g_x$. This turns out to be[^2][^3]
\[\tag{3} P_x = \id - g_x^{-1}A^T \left( A g_x^{-1} A^T \right)^{-1} A\]

We don’t prove this here, but just perform some consistency checks and develop an intuition of where this might come from.
If $g_x = \id$ (the Euclidean case), then $(3)$ becomes $P = \id - A^T\left(AA^T\right)^{-1}A = \id - \pin{A}A$, which by $(1)$ and $(2)$ is exactly the Euclidean orthogonal projection onto $\ker{A}$. Thus $(3)$ is the non-Euclidean generalisation of $(1)$ in the full-rank, surjective case $(2)$ ✅.
Dimensionally, a vector in $U$ has shape $n \times 1$ (column vector), so $P_x$ must be $n \times n$. Indeed $g_x$ and $g_x^{-1}$ are $n \times n$, $A$ is $m \times n$, $A^T$ is $n \times m$, so all the products in $(3)$ are well defined and $P_x$ is indeed $n \times n$.
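A small numerical sketch of $(3)$, with a random full-rank $A$ and a random symmetric positive-definite matrix standing in for $g_x$: the resulting $P_x$ is idempotent, maps into $\ker{A}$, and is self-adjoint with respect to $g_x$.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 5
A = rng.standard_normal((m, n))          # full-rank, surjective A: R^n -> R^m

L = rng.standard_normal((n, n))
g = L @ L.T + n * np.eye(n)              # symmetric positive-definite metric g_x
g_inv = np.linalg.inv(g)

# Eq. (3): projection onto ker A, orthogonal with respect to g_x
P = np.eye(n) - g_inv @ A.T @ np.linalg.inv(A @ g_inv @ A.T) @ A

assert np.allclose(P @ P, P)             # idempotent
assert np.allclose(A @ P, 0)             # image of P lies in ker A
assert np.allclose(g @ P, (g @ P).T)     # g P = P^T g: self-adjoint w.r.t. g_x

# x - P x is g_x-orthogonal to every vector in ker A
x, y = rng.standard_normal(n), rng.standard_normal(n)
assert np.isclose((P @ y) @ g @ (x - P @ x), 0)
```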
Different point of view now. Let $S$ be a hypersurface (a submanifold[^4] of codimension $1$) of a Riemannian manifold $(M,g)$. If $X$ is a vector field on $M$ then its orthogonal projection with respect to $g$ onto $S$ is the vector field $X_S$ along $S$ given by
\[X_S = X - \frac{g(X,n)}{g(n,n)} \, n,\]

where $n$ is any nowhere-vanishing vector field on $M$ that is normal to $S$.
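For a concrete instance, here is a minimal sketch (a toy example of mine, not from the notes above) with $M = \R^3$, the Euclidean metric, $S$ the unit sphere, and the outward normal $n(x) = x$:

```python
import numpy as np

def project_onto_sphere_tangent(x, X):
    """Project X onto the tangent plane of the unit sphere at x,
    i.e. X_S = X - g(X, n)/g(n, n) * n with n = x and g the Euclidean metric."""
    n = x                                # outward normal to the sphere at x
    return X - (X @ n) / (n @ n) * n

x = np.array([0.0, 0.0, 1.0])            # north pole
X = np.array([1.0, 2.0, 3.0])            # an ambient vector at x
X_S = project_onto_sphere_tangent(x, X)

print(X_S)                               # [1. 2. 0.] -- tangent to the sphere
assert np.isclose(X_S @ x, 0.0)          # orthogonal to the normal at x
```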
If $F \from M \to N$ is a smooth submersion between smooth manifolds, that is a smooth map whose differential is surjective at every point,[^5] then its level sets $S_y = \{x \in M : F(x) = y\}$ are smooth submanifolds of $M$ of dimension $\dim{M} - \dim{N}$, for all $y \in N$.
If $N = \R$ then each level set $S = S_y$ has codimension $1$, and the tangent space to $S$ at a point $x$ is the kernel of the differential of $F$, $T_{x}S = \ker{d_xF}$, so that the gradient of $F$ is a vector field on $M$ normal to $S$:
\[g_x(\grad{F}_x, u) = d_xF\cdot{u} = 0 \quad \text{ for all } u \in T_{x}S.\]

Thus if $X$ is a vector field in $(M,g)$, its orthogonal projection $X_S$ onto the level set $S$ of a smooth submersion $F \from M \to \R$ is
\[\tag{4} X_S = X - \frac{g(X,\grad{F})}{g(\grad{F},\grad{F})} \, \grad{F}.\]

Now let $A \from M = \R^n \to \R$ be a full-rank linear surjection, and let $S = \ker{A}$ be its zero level set. In this case, Eq. $(4)$ should reduce to Eq. $(3)$.
In matrix form we have that $A \in \R^{1 \times n}$, and by linearity the differential of $A$ can be identified with $A$ itself, and thought of as a row vector: $d_xA \cong A: T_{x}\R^n \cong \R^n \to T_{Ax}\R \cong\R$. The gradient of $A$ at a point $x$ is the $n\times 1$ column vector
\[\grad_{x}A = g_x^{-1} \left( d_{x}A \right) = g_{x}^{-1}A^T,\]

where in the second term the $(2,0)$ tensor $g_{x}^{-1}$ acts on the $(0,1)$ tensor $d_x{A}$ by canonical duality pairing, and the last equality is to be read as a matrix equation; dimensions are consistent since $g_x^{-1}$ is $n \times n$ and $A^T$ is $n \times 1$.
By $(4)$ the orthogonal projection of a vector field $X$ in $M$ onto $S$ with respect to $g$ is
\[\begin{split} X_S & = X - \frac{g(X,\grad{A})}{g(\grad{A},\grad{A})} \, \grad{A} \\ & = X - \frac{AX}{Ag^{-1}A^T} \, g^{-1}A^T, \end{split}\]

which is dimensionally consistent (the fraction is a scalar, and $X$ and $g^{-1}A^T$ are both $n \times 1$), and is precisely Eq. $(3)$ applied to $X$ in the case $m = 1$ ✅.
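Finally, a minimal numerical confirmation of this last computation, assuming a random row vector $A$ and a random symmetric positive-definite $g$: evaluating $(4)$ with $\grad{A} = g^{-1}A^T$ agrees with applying the matrix $(3)$ to $X$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((1, n))          # A: R^n -> R, a full-rank row vector
L = rng.standard_normal((n, n))
g = L @ L.T + n * np.eye(n)              # symmetric positive-definite metric
g_inv = np.linalg.inv(g)

X = rng.standard_normal(n)               # value of the vector field at x
grad_A = (g_inv @ A.T).ravel()           # gradient of A w.r.t. g: g^{-1} A^T

# Eq. (4): project X onto ker A using the gradient as normal field
X_S_4 = X - (X @ g @ grad_A) / (grad_A @ g @ grad_A) * grad_A

# Eq. (3): the projection matrix onto ker A, orthogonal w.r.t. g
P = np.eye(n) - g_inv @ A.T @ np.linalg.inv(A @ g_inv @ A.T) @ A

assert np.allclose(X_S_4, P @ X)
assert np.isclose(A @ X_S_4, 0.0)        # the projected vector lies in ker A
```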
[^1]: S. Roman, Advanced Linear Algebra, 3rd ed. Springer, 2008.
[^2]: K. Kamaraj and K. C. Sivakumar, “Moore–Penrose inverse in an indefinite inner product space,” JAMC, vol. 19, no. 1, pp. 297–310, Mar. 2005.
[^3]: F. Alvarez, J. Bolte, and O. Brahic, “Hessian Riemannian gradient flows in convex programming,” SIAM J. Control Optim., vol. 43, no. 2, pp. 477–501, Jan. 2004, doi: 10.1137/S0363012902419977.
[^4]: J. M. Lee, Introduction to Smooth Manifolds, 2nd ed., Graduate Texts in Mathematics. Springer-Verlag New York, 2012.
[^5]: This can be relaxed to regular level sets of smooth functions that are not necessarily submersions, by requiring the level sets not to contain critical points; cf. An exercise on regular level sets.