clean up:
http://en.wikipedia.org/wiki/Spectral_theorem
...and say something or other about Jordan algebras
(connection to spectral theorem,
interpretation of positives as quadratic form / semipositive spectral values)
The Rayleigh quotient is basically $Tv^2$, so the derivative is $2Tv$.
[it's just notation and non-commutativity that cause pain]
*** split off Jordan into separate section ***
(on hermitian operators)
. just say "they are Jordan algebras"
(and maybe "real spectral theorems are about formally real Jordan
algebras";
dunno what complex etc. spectral theorems are about)
. formally real: semipositive are positive semidefinite
...which is an intrinsic characterization
(it's b/c self-adjoints can be viewed as real-valued functions,
which have a natural formally real structure, viz, are your values all semipositive?)
in spectral theorem terms, has semipositives on diagonal
[it's pretty clear from spec thm that hermitians are formally real:
it's just diagonal in some basis]
(BTW, used in physics b/c want commutative, power-associative, partially ordered
observables -- which is exactly a formally real Jordan algebra.
Get a logic structure from the ordering, but note that it's not distributive,
hence "quantum logic" (quantum b/c comes from (hermitian) operators))
the behavior of addition and Jordan product on the principal axes is
*weird*
(rather like the behavior of composing rotation on axis of rotation in
SO(3), say)
...dunno if we can get insight.
multiplication of quadratic forms:
the quad form pov is $(\Sym^2 V)^* < (V \otimes V)^*$
interpreting as operator is
$(V \otimes V)^* = V^* \otimes V^* = \Hom(V,V^*) \isoto \Hom(V,V)$
(the last step is the weird one; in Hilbert space world, it's
that every functional is dot product with a vector)
An intrinsic product is:
$$(V \otimes V)^* \otimes (V \otimes V)^* \to (V \otimes V)^*$$
...and corresponds to product of operators;
it's contraction on the middle two factors
(corresponding to composition $W^* \otimes W \to K$)
which in this case is exactly the existing inner product;
$V^* = V$ requires the inner product to be non-degenerate.
...so I really think it's a sort of multiplication.
$q(v)r(v)/\norm{v}$
A Jordan POV on the spectral theorem(s)?
. it's a property of representations of Jordan algebras
. ...and good things are a Jordan algebra?
BTW, normal aren't even closed under *sum*
...so you're not going to get much of an algebra structure out of them.
[it's a *messy* set]
EG:
$\begin{psmallmatrix}0 & 1\\ -1 & 0\end{psmallmatrix}
+ \begin{psmallmatrix}0 & 1\\ 1 & 0\end{psmallmatrix}
= \begin{psmallmatrix}0 & 2\\ 0 & 0\end{psmallmatrix}$:
both summands are normal (skew and symmetric), but the sum is nilpotent, hence not normal.
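A quick numerical sanity check of this (using NumPy; my choice of matrices -- a skew-symmetric plus a symmetric one):

```python
import numpy as np

def is_normal(M):
    # M is normal iff it commutes with its conjugate transpose.
    return np.allclose(M @ M.conj().T, M.conj().T @ M)

A = np.array([[0., 1.], [-1., 0.]])   # skew-symmetric, hence normal
B = np.array([[0., 1.], [1., 0.]])    # symmetric, hence normal
S = A + B                             # [[0, 2], [0, 0]]: nilpotent

assert is_normal(A) and is_normal(B)
assert not is_normal(S)               # normal matrices aren't closed under sum
```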
\chapter{Spectral theorems}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Abstract}
I give conceptual proofs, with detail and background,
for the real and complex spectral theorems in finite dimensions.
The key ingredient in both proofs is the existence of an eigenvector;
beyond that they are quick and formal.
I discuss both the key steps and the formality in detail, from several points of view.
The spectral theorems are key results in mathematics and its applications,
both in that they are \emph{used} crucially in many areas
(Principal Component Analysis, quantum mechanics, principal curvatures)
and that the proofs \emph{use} important parts of other areas of math
(Morse-Bott functions, the Nullstellensatz).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Statement}
The spectral theorem characterizes which operators
on an inner product space are orthonormally diagonalizable.
\begin{thm}[Real spectral theorem]
An operator $T$ on a real inner product space
is orthonormally diagonalizable
iff it's self-adjoint (aka, symmetric: $T=T^*$).
\end{thm}
\begin{thm}[Complex spectral theorem]
An operator $T$ on a complex inner product space
is orthonormally diagonalizable
iff it's normal ($[T,T^*]=0$).
\end{thm}
I do not know the statements for quaternions or octonions;
I believe that for quaternions it's ``orthodiagonalizable iff normal'',
as for complexes, and by essentially the same proof.
\begin{defn}
By \Def{inner product space} we mean \emph{finite dimensional} inner
product space. Lovers of acronyms may call these the spectral theorems
for a fdips.
\end{defn}
\begin{defn}
\Def{Orthonormally diagonalizable} means
``has an orthonormal basis of eigenvectors''
(equiv., orthogonal).
Module-theoretically, $V$ breaks up as an \emph{orthogonal} direct sum of
$1$-dimensional $K[T]$-modules.
Linear-algebraically, $T$
``preserves an orthogonal set of axes'', meaning it
preserves a set of $n$ orthogonal lines ($1$-dimensional subspaces).
\end{defn}
\begin{defn}
An operator $T$ is \Def{self-adjoint} if it equals its adjoint: $T=T^*$.
For a real operator, adjoint agrees with transpose, so self-adjoint means $T=T^t$,
i.e., it's symmetric.
\end{defn}
\begin{defn}
\Def{Normal}\footnote{An overused term; note the conflict with ``orthonormal'',
where normal means ``unit norm''. Usually nominalized as \emph{normality} (rarely as normalcy).}
here means ``commutes with adjoint: $TT^*=T^*T$''.
More Lie-theoretically, the bracket with the adjoint is zero: $[T,T^*]=0$.
\end{defn}
These are both real, \emph{closed} conditions,
because they are defined by an equality involving complex conjugation.
They are both preserved under unitary transformations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Proofs}
We present brief proofs, then give details.
\begin{proof}[Real]
A self-adjoint operator is orthogonally semisimple:
if $T$ preserves a subspace $W < V$, $T=T^*$ also preserves $W^\perp$.
$T$ has an eigenvector (take a critical point of the Rayleigh quotient
and look at the derivative there).
By induction, it's orthonormally diagonalizable.
``Only if'' is trivial.
\end{proof}
\begin{proof}[Complex (abstract)]
Given two commuting operators, they preserve a common flag.
So if $T$ is normal, $T,T^*$ preserve a flag -- but since $T^*$ preserves
the flag, $T$ preserves the perp flag also, so $T$ preserves the orthogonal
set of axes of the flag, i.e., $T$ is orthonormally diagonalizable.
``Only if'' is trivial.
\end{proof}
\begin{proof}[Complex (matrix)]
Two commuting operators are simultaneously upper-triangulizable
by unitary change of basis.
$T$ is upper triangular iff $T^*$ is lower triangular, so $T^*$ (and likewise $T$)
is both upper and lower triangular, hence diagonal.
``Only if'' is trivial: a diagonal matrix commutes with its adjoint, which is also diagonal.
\end{proof}
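A numerical illustration of the theorem (NumPy; my example, a matrix with distinct eigenvalues so the eigenvector matrix is forced to be unitary):

```python
import numpy as np

# A normal matrix: its normalized eigenvectors (for distinct eigenvalues)
# form a unitary matrix, i.e. it is orthonormally diagonalizable.
T = np.array([[0., 1.], [-1., 0.]])            # T T* = T* T = I, so normal
assert np.allclose(T @ T.conj().T, T.conj().T @ T)

vals, V = np.linalg.eig(T)                     # columns of V are unit eigenvectors
assert np.allclose(V.conj().T @ V, np.eye(2))            # V is unitary
assert np.allclose(V @ np.diag(vals) @ V.conj().T, T)    # T = V D V*
```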
\section{Proof details and lemmata}
The two theorems strike me as distinct, and describing different phenomena,
as self-adjoint operators and normal operators are quite different:
I don't have a unified point of view on them both.
The key ingredient of both proofs is the existence of an eigenvector:
\begin{itemize}
\item Complex operators always have an eigenvector.
\item Real operators do not generally have an eigenvector;
one needs to use self-adjointness.
\item A pair of complex operators does not generally have a common eigenvector;
one needs to use commuting.
\end{itemize}
Beyond that, it's formal: it follows immediately from algebra,
similarly for both.
So in the real/self-adjoint case the question is
``why do self-adjoint operators have an eigenvector?'',
which I prove using topology \& calculus, and (complex) algebra.
In the complex/normal case the question is
``why do commuting operators have a common eigenvector?'',
which I prove using algebraic geometry, linear algebra,
and Lie theory.
\subsection{Rayleigh quotient}
\begin{proof}[Sketch of topological/calculus proof that self-adjoint operators have an eigenvector]
Critical points of the Rayleigh quotient (as a function on projective space) are eigenvectors,
(we relate $Tv$ to the derivative),
and the Rayleigh quotient has critical points (for instance its max).
\end{proof}
Working backwards:
[What's going on is that at a principal axis,
the function is (locally -and- globally) a quadratic form,
so the derivative vanishes,
so principal axes are critical points.]
We use the Rayleigh quotient to -find- the eigenvectors.
The Rayleigh quotient is just the quadratic form
divided by scale:
it's homogeneous of degree 0 (instead of degree 2)
and (thus) is constant on lines.
It thus descends to a function on projective space
and is the quintessential example of a function on a quotient
(it's a function upstairs, invariant on the fibers).
If you prefer, it's a function on the sphere
(which is suspiciously equal at antipodes)
...but this is just using the sphere mod antipodes as a model for
projective space.
It is a nice function, a Morse-Bott function; if the spectral values
are distinct, it's a Morse function, and yields the standard (regular)
CW-complex structure on $\RP^{n-1}$.
(which lifts to the standard (2-fold symmetric) regular CW-complex
structure on the sphere)
[the ``regular'' is b/c the standard CW-complex structure on $\RP^{n-1}$ is
regular, while the usual structure on $S^n$ (one $0$-cell and one $n$-cell) is small, not regular]
The principal axes are the critical points:
the critical points are the projectivization of the principal axes
(note that if two spectral values agree, then you don't get a
principal axis for them, you get a principal plane -- likewise if
there is a 3-fold or higher coincidence; in this case the critical set
is the projectivization of the principal plane etc., and you get a
Morse-Bott function).
[This is all easy to see for a quadratic form in standard form.]
(It's also a potential for an infinitesimal dynamical system;
...of which T is the discrete version (?)
eigenvectors are the fixed points.)
Topology tells us that there are at least $n$ critical points
(that's the minimum number of cells to generate the homology of
$\RP^{n-1}$), and in fact there are exactly $n$
(I don't see an easy way to see this),
but all we need to know is that there is at least 1, which there
is because $R_T$ is a continuous function on a compact space,
hence has a maximum.
\begin{lem}
If $T$ is self-adjoint,
$v$ is an eigenvector of $T$ iff it is a critical point of $R_T$.
\end{lem}
\begin{proof}
Compute the derivative of $R_T$ at $v$ in the $w$ direction;
take\footnote{Ok, so we are using the sphere model.}
$\norm{v}=1$ and denote $R_T(v)$ by $c$:
\begin{align*}
\partial_w R_T (v)
&= \lim_{\epsilon \to 0} \frac{1}{\epsilon}
	\left(R_T(v+\epsilon w) - R_T(v)\right)\\
&= \lim_{\epsilon \to 0} \frac{1}{\epsilon}
	\left(\frac{\left\langle T(v+\epsilon w), v+\epsilon w\right\rangle}
	{\left\langle v+\epsilon w, v+\epsilon w\right\rangle}
	- c\right)\\
&= \lim_{\epsilon \to 0} \frac{1}{\epsilon}
	\left(\left\langle T(v+\epsilon w), v+\epsilon w\right\rangle
	-c \left\langle v+\epsilon w, v+\epsilon w\right\rangle
	\right)\\
&= \lim_{\epsilon \to 0} \frac{1}{\epsilon}
	\left( \left\langle Tv, v\right\rangle + 2\epsilon\left\langle Tv, w\right\rangle
	+ \epsilon^2\left\langle Tw, w\right\rangle
	-c \left( \left\langle v, v\right\rangle + 2\epsilon\left\langle v, w\right\rangle
	+ \epsilon^2\left\langle w, w\right\rangle \right)
	\right)\\
&= \lim_{\epsilon \to 0} \frac{1}{\epsilon}
	\left( 2\epsilon\left\langle Tv, w\right\rangle
	-2\epsilon c\left\langle v, w\right\rangle
	+ \epsilon^2\left\langle Tw, w\right\rangle
	- \epsilon^2 c\left\langle w, w\right\rangle
	\right)\\
&= 2 \left( \left\langle Tv, w\right\rangle - c\left\langle v, w\right\rangle \right)
\end{align*}
(In the third line we dropped the denominator, which tends to $1$;
in the fourth we used self-adjointness to write
$\left\langle Tw, v\right\rangle = \left\langle Tv, w\right\rangle$;
in the last we cancelled $\left\langle Tv, v\right\rangle = c\left\langle v, v\right\rangle$.)
Note that this is zero in the $v$ direction (as we'd expect,
since $R_T$ is constant on lines),
so we just look in the perpendicular direction ($w \perp v$);
since we're really on projective space, that's what we
should be doing anyway.
Then the second term disappears and the derivative simplifies to:
$$\partial_w R_T (v) = 2 \left\langle Tv, w\right\rangle$$
(Not bad! It's also exactly what you'd expect from a matrix,
which makes the computation more transparent.)
If $v$ is a critical point, then the derivative vanishes,
so $\left\langle Tv, w\right\rangle = 0$ for all $w \perp v$, so
$Tv \in (v^\perp)^\perp = \operatorname{span}(v)$, i.e., $v$ is an eigenvector;
conversely, if $v$ is an eigenvector then $Tv \parallel v$ and the derivative vanishes.
(Yes, we've taken a derivative with domain projective space!
We can do this because it's a subquotient of a vector space,
and the function is presented upstairs,
so we can work in the vector space.)
\end{proof}
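A hypothetical finite-difference check of the derivative formula $\partial_w R_T(v) = 2(\left\langle Tv,w\right\rangle - c\left\langle v,w\right\rangle)$ (NumPy; random symmetric $T$ is my choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
T = (M + M.T) / 2                      # a random symmetric operator

def R(v):                              # Rayleigh quotient <Tv,v>/<v,v>
    return (T @ v) @ v / (v @ v)

v = rng.standard_normal(n); v /= np.linalg.norm(v)   # unit vector
w = rng.standard_normal(n); w /= np.linalg.norm(w)
c = R(v)

eps = 1e-6
numeric = (R(v + eps * w) - R(v)) / eps              # finite difference
formula = 2 * ((T @ v) @ w - c * (v @ w))            # the computed derivative
assert abs(numeric - formula) < 1e-3
```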
Can also take a Rayleigh quotient for complex operators (on a Hermitian
space);
for self-adjoint operators it's real, and the same analysis goes through,
even to getting the cell structure (on $\CP^{n-1}$).
distinguish the function on the quotient
(notationally?)
$R_T$ is linear in $T$
$R_T$ vanishes on skew-sym
it injects self-adjoint operators into $\cO_{\RP^{n-1}}$
[This is a *topological/calculus* proof:
we translated $T$ to being a function, which we then do calculus on.]
the above computation can be made more concise
{ped}to get an elementary proof,
take a big proof and take out a crucial and sufficient computation
{ped}abuses of notation
using the same symbol for two distinct but related concepts
(EG a function, and the induced function on a quotient)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
dynamics:
actually, it's the maximum -magnitude- that's the attracting fixed
point
...and if you have +c and -c, you get a periodic orbit!
(er, that's over the reals)
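A sketch of this dynamics (NumPy; my example): iterating $T$ and renormalizing is power iteration, and it converges, projectively, to the axis of largest-magnitude spectral value.

```python
import numpy as np

# Iterating T and renormalizing: the attracting fixed point on projective
# space is the axis with largest |eigenvalue| (here the first axis, value 3).
T = np.diag([3.0, 1.0, -2.0])
v = np.array([1.0, 1.0, 1.0])
for _ in range(100):
    v = T @ v
    v /= np.linalg.norm(v)

assert abs(abs(v[0]) - 1.0) < 1e-8     # converged to the first axis
```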
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Rayleigh quotient}
The Rayleigh quotient is
\begin{align*}
R_T(v) &:= \frac{\left\langle Tv, v\right\rangle}{\left\langle v, v\right\rangle}\\
&=: \frac{\norm{v}_T^2}{\norm{v}^2}
\end{align*}
(more discussion)
$R_T$ is a sort of distance: $R_T(v) = \frac{\norm{Tv}}{\norm{v}}\cos\theta$,
where $\theta$ is the angle between $v$ and $Tv$
(so roughly ($1$ \emph{minus} the distance between $\operatorname{span} v$ and
$\operatorname{span} Tv$) times a scale factor).
BTW, recall that standard cell structure on P^n is Schubert cells
corresponding to standard flag.
%%%%%%%%%%%%%%%%%%%%%
comments on other proofs
The statement:
$T$ is upper-triangular, so $TT^*=T^*T$ implies that $T$ is diagonal
is too rough and glib: why does upper-triangular \& commuting with the adjoint imply diagonal?
- the only upper triangular matrices that are normal are the diagonal ones
...but this isn't obvious to me!
- real case
+ ortho semisimple is immediate and formal (as written above!)
Pf 1: maximize Rayleigh quotient
(NB: also note that if orthonormally diagonalized,
top eigenvector maximizes Rayleigh quotient)
[this is really looking at it from the POV of quadratic forms;
we use operator POV to see ortho semi-simple,
and form POV to get maximum]
%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Algebraic (Galois theory) proof}
The Galois theory proof is to first prove for complex operators,
then do real operators as the complex operators that are fixed under conjugation.
The spectral theorem (ortho-diagonalizable)
for complex \emph{self-adjoint} operators is easy (see below);
the eigenvalues are real because it's self-adjoint.
Now the operator is actually diagonalizable over the reals by Galois theory:
a complex vector is an eigenvector iff its conjugate is,
so real part is (as is the imaginary part, as a real vector; mult by i!).
More concretely, diagonalize using real part of diagonalizing matrices.
(A priori you may be worried that the real part of purely imaginary eigenvector
yields a zero (so you don't get a non-zero eigenvector),
but you can take the imaginary part instead.)
Alternatively, you can just show that it has an eigenvector using Galois theory,
and then use semisimplicity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Complex: commuting operators}
I don't understand normal operators geometrically or topologically;
I do understand commuting operators though.
A commuting pair of operators, or more generally a commutative algebra,
behaves in many ways like a single operator (resp., a $1$-dimensional algebra).
These statements and proofs all work for larger commuting family,
meaning both ``$k$ commuting operators'',
and more abstractly, a commutative $K$-algebra,
without distinguished basis, but the notation gets ickier,
so we will usually prove just for a pair.
Note that the inductive proofs assume a choice of basis for the algebra.
For all of these, we need only common eigenvector,
as we can then induct to get common flag
(commuting is preserved under quotients).
Sometimes we can prove that they preserve a common flag directly?
\subsubsection{Structure of a single operator}
Recall structure of a single operator.
\begin{thm}[Existence of eigenvector]
An operator $T$ on a complex finite dimensional vector space $V$
has a non-trivial eigenvector $v$.
\end{thm}
\begin{thm}[Orthonormal basis for a flag]
Given a Hermitian form, a complete flag has a unique orthogonal set of axes,
and a unique orthonormal basis adapted to the flag
(up to multiplication by $U(1)\times \dots \times U(1)$).
\end{thm}
indeed, preserves a flag:
\begin{thm}[Schur decomposition (abstract)]
An operator $T$ on a complex finite dimensional vector space $V$
preserves a complete flag.
\end{thm}
\begin{proof}
$T$ has an eigenvector $v$; pass to the quotient $V/\langle v \rangle$ and induct.
\end{proof}
\begin{thm}[Schur decomposition (matrix)]
Every square matrix over the complex numbers is unitarily upper-triangulizable:
it is conjugate to an upper triangular matrix by a unitary matrix.
\end{thm}
The matrix proof follows immediately from the abstract form
by translating everything into matrix terms.
\subsubsection{Common eigenvector}
A common eigenvector for $T$ and $U$ is a vector $v$ such that
$Tv=\lambda v$ and $Uv = \mu v$; the eigenvalue is $(\lambda,\mu)$.
Indeed, $v$ is an eigenvector for the entire algebra $A$ generated by
$T$ and $U$, so we get a map $\lambda\from A \to K$, and we call this the
eigenvalue\footnote{Though this term is used for a rather different problem.}.
\begin{defn}
Given a module $M$ over a $K$-algebra $A$, an eigenvector is a vector $v$
such that for all $a \in A$, $av = \lambda(a)v$ for some map of $K$-algebras\footnote{Meaning it restricts to the identity on $K < A$ and respects addition and multiplication; it's not just a map of vector spaces.}
$\lambda\from A \to K$; we call $\lambda$ the eigenvalue.
More naturally, it's a cyclic (rank 1) submodule.
Note that we haven't taken $A$ to be commutative, but since $K$ is commutative, the eigenvalue $\lambda$ vanishes on the derived algebra $[A,A]$, and descends to the abelianization.
\end{defn}
In general there are no common eigenvectors:
for instance, if
$A=\begin{psmallmatrix}1 & 0\\ 0 & -1\end{psmallmatrix}$
and $B=\begin{psmallmatrix}0 & 1\\1 & 0\end{psmallmatrix}$
then they simply have different eigenspaces, with no overlap.
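A numerical check of this non-example (NumPy), verifying that an eigenvector of $A$ is not one of $B$:

```python
import numpy as np

A = np.array([[1., 0.], [0., -1.]])
B = np.array([[0., 1.], [1., 0.]])
assert not np.allclose(A @ B, B @ A)        # they don't commute

def parallel(u, v):
    # two vectors in the plane are parallel iff their 2D cross product is 0
    return np.isclose(float(np.cross(u, v)), 0.0)

e1 = np.array([1., 0.])
assert parallel(A @ e1, e1)                 # e1 is an eigenvector of A...
assert not parallel(B @ e1, e1)             # ...but B sends it to e2
```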
\subsubsection{Proofs of common eigenvector}
\begin{proof}[Algebraic geometry]
A commutative algebra of operators $A$ on a vector space $V$
means that $V$ is a module over $A$.
Since $V$ is finite-dimensional over $K$ and $A$ (say, a polynomial algebra) isn't,
the module has a non-trivial annihilator $I < A$.
Eigenvalues on $A$ ($K$-algebra maps $A \to K$)
correspond to closed points of $\Spec A$ (maps $\Spec K \to \Spec A$):
$\MSpec A \leftrightarrow \Hom_{K\text{-alg}}(A, K)$ for $K$ algebraically closed.
By the weak Nullstellensatz, $V(I) := \Spec A/I \subset \Spec A$ is non-empty,
and this point is an eigenvalue.
\end{proof}
This is a generalization of the usual (characteristic polynomial)
proof of the existence of an eigenvector: the fundamental theorem of algebra
is the 1-dimensional form of the weak Nullstellensatz.
\begin{proof}[Linear algebra]
\begin{description}
\item[They preserve each other's eigenspaces]
Let $V_\lambda$ be an eigenspace for $A$.
$B$ preserves $V_\lambda$:
given $w \in V_\lambda$,
$A(Bw)=B(Aw)=B(\lambda w) = \lambda(Bw)$.
\item[They have a simultaneous eigenvector]
$A$ has an eigenvector $v$, with eigenvalue $\lambda$,
thus $B$ restricted to $V_\lambda$ has an eigenvector,
which is a simultaneous eigenvector.
(Note that you can't conclude that if $v$ is an eigenvector
for $A$, it is also an eigenvector for $B$.)
\end{description}
\end{proof}
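A sketch of the linear-algebra recipe in NumPy (my example pair: polynomials in the same matrix always commute; I take a simple eigenvalue, so the eigenspace is a line and restriction is automatic):

```python
import numpy as np

# Two commuting matrices: B is a polynomial in A.
A = np.array([[2., 1., 0.], [0., 3., 1.], [0., 0., 5.]])
B = A @ A + np.eye(3)
assert np.allclose(A @ B, B @ A)

# Step 1: an eigenvector v of A.
vals, V = np.linalg.eig(A)
v, lam = V[:, 0], vals[0]
assert np.allclose(A @ v, lam * v)

# Step 2: B preserves the eigenspace of lam; here lam is simple,
# so that eigenspace is the line through v, and v is a common eigenvector.
mu = (B @ v) @ v / (v @ v)
assert np.allclose(B @ v, mu * v)
```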
Further: they also preserve each other's (generalized) eigenspaces,
so these interlace. By this I mean that if we write
$\tilde V_\lambda$ and $\tilde W_\mu$ for the generalized eigenspaces for $A$
and $B$, then the vector space is a direct sum of
$$\tilde U_{(\lambda,\mu)} = \tilde V_\lambda \cap \tilde W_\mu$$
which follows by breaking up each $V_\lambda$ into generalized eigenspaces for $B$.
\begin{proof}[Topological Generator]
``For closed statements,
true for one operator implies true for a commuting family.''
Given a commuting family of operators,
take a topological generator.
It stabilizes a flag, so the whole family does.
\end{proof}
[related point:
the normalizer of Borel is Borel or something like that;
but note that a particular upper triangular matrix
may commute with lower triangular ones
(trivial EG is scalar matrices)]
\begin{proof}[Morse theory]
Analogously to the real case's Rayleigh quotient:
use $\left\langle Tv, Tv\right\rangle / \left\langle v, v\right\rangle$
(this is always real; presumably a critical point
is an eigenvector for a normal operator?)
I don't understand the meaning of normality from this POV.
\end{proof}
= = =
(doesn't seem to help)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Algebro-geometric picture}
The algebro-geometric picture for the classification of an operator
(or a family of commuting operators) is rather pretty:
$V$ is (roughly) a collection of points in $\Spec A$.
Properly, it's a coherent sheaf over $\Spec A$ (geometric picture of a module);
the stalk over eigenvalues is the generalized eigenspace, while the reduced part is the eigenspace. You can somewhat think of an eigenspace of dimension $d$ as a collection of $d$ points over $\lambda$, except that writing $d$ points implies a basis: it's just a $d$-dimensional space over $\lambda$ (and fat/unreduced points correspond to generalized eigenspaces).
For a topological generator: given a rank~1 subalgebra $L < A$
(where $L = K[T]$), this induces a map $\Spec A \to \Spec L$,
which generically sends the points of $V$ to different points of $\Spec L$.
\subsubsection{Just need common eigenvector}
In the above proofs,
. instead of using "commuting have common flag", can more prosaically
say "commuting have common eigenvector" and induct;
this is a bit easier than talking about flags b/c you can use perps.
The "commuting have common flag" goes pretty far with just commuting,
*then* uses normality;
this eigenvector proof doesn't go far with commuting,
and uses normality early
\subsubsection{Proofs I don't know}
I don't know a direct proof (i.e., without using the whole complex spectral theorem) that
``normal implies semisimple'' (this seems to be true in the real case too).
This would yield a very quick proof of the spectral theorem (namely:
it's semisimple and has an eigenvector; take the perp and induct).
There may be some other topological proof that commuting operators stabilize a common flag by thinking of commuting operators as a torus action on the flag manifold.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Discussion}
\subsection{Field vs.~theorem}
\begin{thm}[``Real'' spectral theorem]
An operator $T$ on a (real or complex) inner product space
is orthonormally diagonalizable with real eigenvalues
iff it's self-adjoint (aka, symmetric: $T=T^*$).
\end{thm}
I don't state/think of it this way b/c the proof is rather different.
What I mean is:
Complex case very easy directly (like real self-adjoint):
- self-adjoint is ortho semisimple
- there's an eigenvector
- by induction, ortho diagonalizable
- On each component, it's real (b/c self-adjoint on line iff real)
(no messing with flags)
...as a corollary to complex normal,
just add:
- On each component, it's real (b/c self-adjoint on a line iff real)
A complex self-adjoint operator
...can also be proved just like the real self-adjoint case,
...then note that $1$-dimensional self-adjoint means real
relation between 'em:
neither is a special case of the other
\begin{tabular}{lll}
& $\bR$ & $\bC$\\
self-adjoint & orthogonally diagonalizable & orthogonally diagonalizable, with real eigenvalues \\
normal & (messy) & orthogonally diagonalizable \\
\end{tabular}
classification of real normal operators
is a messy statement, exactly as with ``Jordan'' form over the reals.
Simply, you again get ortho semisimplicity,
but the simple modules are $\begin{psmallmatrix}a\end{psmallmatrix}$ and
$\begin{psmallmatrix}a & -b\\ b & a\end{psmallmatrix}$
(with negative discriminant)
[you should think of rotation by an angle other than $k\pi$;
this is real and normal but not self-adjoint,
and the block $\begin{psmallmatrix}a & -b\\ b & a\end{psmallmatrix}$
is basically the only thing that happens (up to scaling)]
\subsection{General orthogonal classification of operators}
A generic operator stabilizes a single flag.
For the classification of operators generally (on a fdips),
Schur form is the best you can do.
\subsection{Jordan algebras}
What algebraic structure does the set of self-adjoint (or normal) operators have?
I don't know of any for normal operators\footnote{They are the kernel of the diagonal
$A \to (A,A^*)$, followed by the Lie bracket, but I don't know what that tells us.},
but self-adjoint operators form a Jordan algebra,
and indeed are most of the simple formally real Jordan algebras.
That is, define the Jordan product on an associative algebra as:
$$A.B := \half (AB + BA)$$
This is the symmetrization of the product (by analogy with the Lie bracket,
which is the skew-symmetrization); conventionally we divide by $2$
in the Jordan product (so that $a.a=a^2$) but not in the Lie bracket.
Note that self-adjoint operators are a sub-Jordan algebra, as
for self-adjoint operators $A,B$,
$$(A.B)^*=\half (AB + BA)^* = \half (B^*A^*+A^*B^*)=\half(AB+BA)=A.B$$
(more generally, $(A.B)^*=B^*.A^*=A^*.B^*$).
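A quick numerical illustration of this closure (NumPy; my choice of symmetric matrices): the Jordan product of self-adjoints is self-adjoint, while the ordinary product generally isn't.

```python
import numpy as np

A = np.array([[1., 0.], [0., -1.]])       # symmetric (self-adjoint)
B = np.array([[0., 1.], [1., 1.]])        # symmetric (self-adjoint)

jordan = (A @ B + B @ A) / 2              # A.B = (AB + BA)/2

assert np.allclose(jordan, jordan.T)      # Jordan product is again symmetric
assert not np.allclose(A @ B, (A @ B).T)  # ordinary product is not
```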
Define a Jordan algebra axiomatically as an algebra that satisfies:
\begin{description}
\item[JA1: commutativity] $a.b=b.a$
\item[JA2: the Jordan identity]\footnote{Yes, this looks weird.
In the context of formally real algebras,
it's equivalent to power associative.} $(a.b).(a.a)=a.(b.(a.a))$
\end{description}
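Both axioms can be verified numerically for the symmetrized matrix product (NumPy; arbitrary example matrices of mine -- the Jordan identity holds for the symmetrization of any associative product):

```python
import numpy as np

def jp(a, b):                             # Jordan product a.b = (ab + ba)/2
    return (a @ b + b @ a) / 2

a = np.array([[1., 2.], [2., 0.]])
b = np.array([[0., 1.], [1., 3.]])

# JA1: commutativity
assert np.allclose(jp(a, b), jp(b, a))

# JA2: the Jordan identity (a.b).(a.a) = a.(b.(a.a))
lhs = jp(jp(a, b), jp(a, a))
rhs = jp(a, jp(b, jp(a, a)))
assert np.allclose(lhs, rhs)
```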
The \Def{simple formally real Jordan algebras}\footnote{Simple means what it should;
formally real means that if $\sum a_i^2 = 0$ then all $a_i=0$;
it means that sums of squares can be thought of as ``positive'';
you may have heard of formally real fields.} are:
\begin{itemize}
\item self-adjoint operators over $\bR,\bC,\bH$ (denoted $\fh_n(\bK)$)
\item self-adjoint $3 \times 3$ matrices over the octonions $\bO$ ($\fh_3(\bO)$);\\
this is the\footnote{only simple formally real}
exceptional\footnote{the rest are called \Def{special}} Jordan algebra:
it doesn't embed in an associative algebra
\item the spin factors\footnote{The free Jordan algebra on an inner product space $V$,
subject to $v.v=\left\langle v,v\right\rangle$ -- very like Clifford algebras, which are free associative algebras mod the same relation. Indeed, the spin factors have the same representations
as the corresponding Clifford algebras.}
\end{itemize}
This list is naturally analogous to the classification of simple Lie algebras,
but I don't know the exact correspondence.
I don't know what to make of ``self-adjoints are Jordan, and most Jordans are self-adjoint'',
but it's nice to know.
\subsection{Intuitive?}
Should the statements of the spectral theorems strike us as mysterious?
The adjoint map $A \to A^*$ (in matrix terms, the transpose)
corresponds to the orthogonal structure,
so it's reasonable to expect that orthogonal classification of operators
should be related to the behavior of the adjoint,
and that ``good'' operators should be well-behaved with respect to their adjoint.
Thus that self-adjoint operators should be good (the real spectral theorems)
is reasonable.
The complex spectral theorem is mysterious to me because I don't have a good feel
for ``normal''.
From the matrix point of view, self-adjoint and normal are basis-independent
(symmetric isn't obviously basis-independent from just staring at the matrix
definition),
so we'd expect them to have other descriptions or good classifications.
If you look at a flag that an operator stabilizes (and an orthogonal set of axes for it;
equivalently, the matrix of the operator in an orthonormal basis adapted to the flag),
then (because adjoint means transpose, or perp flag), the theorems look reasonable:
Schur decomposition makes them pretty clear.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Semisimplicity}
Be careful:
don't say "semisimple" for "diagonalizable";
this is true over the complexes,
(b/c only simple are 1D)
but is *not* true over the reals
(a rotation in 2D is simple but not diagonalizable)
oh! unitary/orthogonal operators are also semisimple:
$T^*=T^{-1}$ means semi-simple
...and that's why orthogonal operators are all rotations!!!
\begin{proof}[Abstract proof (real)]
Self-adjoint operators are semisimple, because if $T$ preserves $W < V$,
then $T=T^*$ preserves $W^\perp$.
\end{proof}
\Def{Orthogonally semisimple} is a non-standard term;
I use it to mean ``semisimple as an operator on an orthogonal space'',
i.e., the representation breaks up as an \emph{orthogonal} direct sum of simple modules.
As a consequence of the complex spectral theorem,
we see that normal operators are semisimple
...but that's not a direct proof.
For the real case, we show semisimplicity first,
then existence of an eigenvector.
semisimplicity:
. for complex, semisimple iff diagonalizable
. for real, can also preserve planes (rotation etc.)
Semisimple means every invariant subspace has an invariant complement.
Key idea:
- complements
with an inner product, there is a natural choice of complement (the perp)
[the same idea is used for compact groups:
$K[G]$ is semisimple]
If $T=T^*$ preserves $W < V$, then it preserves $W^\perp$.
EG of semisimple but not ortho semisimple:
regular semisimple (diagonal with distinct eigenvalues)
w/r/t the basis $(1,0), (1,1)$.
Concretely, $A=\begin{psmallmatrix}1 & 1\\ 0 & 2\end{psmallmatrix}$
has eigenvectors $(1,0)$ and $(1,1)$, with eigenvalues $1$ and $2$,
but the eigenlines are not perpendicular.
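A quick check of one such instance (NumPy; the eigenvalues $1,2$ are my choice, any distinct pair works):

```python
import numpy as np

# Diagonalizable w/r/t the non-orthogonal basis (1,0), (1,1):
# semisimple, but not orthogonally semisimple.
A = np.array([[1., 1.], [0., 2.]])
v1 = np.array([1., 0.])
v2 = np.array([1., 1.])

assert np.allclose(A @ v1, 1 * v1)     # eigenvalue 1
assert np.allclose(A @ v2, 2 * v2)     # eigenvalue 2
assert v1 @ v2 != 0                    # the two eigenlines are not perpendicular
```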
\section{Examples}
Normal operators include:
- self-adjoint/Hermitian: $T^*=T$
- skew-Hermitian: $T^*=-T$
- Unitary: $T^*=T^{-1}$
...as in all cases $T^*$ lies in the algebra generated by $T$
(it's $T$, $-T$, or $T^{-1}$, a polynomial in $T$ by Cayley-Hamilton),
so $T,T^*$ generate a commutative algebra
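Each family can be spot-checked numerically (NumPy; my example matrices):

```python
import numpy as np

def is_normal(M):
    return np.allclose(M @ M.conj().T, M.conj().T @ M)

H = np.array([[2., 1.], [1., 3.]])      # Hermitian: H* = H
S = np.array([[0., 1.], [-1., 0.]])     # skew-Hermitian: S* = -S
U = np.array([[0., -1.], [1., 0.]])     # unitary (a rotation): U* = U^{-1}

for M in (H, S, U):
    assert is_normal(M)
```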
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
urk...
one round-about proof idea of real:
(sorta)
complexify, diagonalize,
then note that eigenvalues are real b/c self-adjoint.
Problem: why is it *real* ortho diagonalizable?
Real case requires hard work to show that there is an eigenvector.
\begin{lem}
A self-adjoint operator has an eigenvector.
\end{lem}
\begin{proof}
For complex, all operators have an eigenvector (don't need self-adjoint).
For real,
(hard work)
\end{proof}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Key to understanding real self-adjoint operators
is to think of them as quadratic forms.
(Since we know the answer, we can work backwards from it:
we know that it can be put in standard form,
so the question is just how to find the axes.)
(copy this to discussion of eigenvectors;
an eigenvector is a fixed point of map on projective space;
Euler char tells you about existence of complex/odd real,
and that is a *non*constructive proof;
don't see how to interpret self-adjoint as map of proj space)
Key question:
- how do you find an eigenvector?
algebraic:
in algebraically closed (complex) case, use char poly
and existence of eigenvalue.
geometric:
- (Axler gives some)
- for self-adjoint, looking at Rayleigh quotient
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
in good basis, complex case says $T^t=T$
...but not in other bases
BTW, what about symmetric (not self-adjoint) operator on complex space?
not much to say?
space of all self-adjoints is a Jordan algebra,
(but not an algebra; the Jordan structure is *weird*)
but space of normal matrices is not
(sum of normals isn't normal)
...so algebraically less structured (afaict)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Pf:
(slightly trickier in real case: need to prove that have real
eigenvalues)
As above, but a conjugate pair of complex eigenvalues yields
(if you only conjugate by real orthogonal matrices) a
non-symmetric $2\times 2$ block on the diagonal (which is all you get if
you just assume normal), so there can't be any.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
hmmm....the Jordan algebra structure;
actually,
$\{A,B\} := AB+B^*A^*$
maps into self-adjoint operators
(you might want a $\half$ in there)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Elementary proof of real case:
- can do 2D by hand
(it's just homogeneous completing the square!)
- 3D case gets messy:
given x,y,z, can eliminate xy cross term,
but then if you want to eliminate xz, you reintroduce xy
(b/c you have yz, so z -> ax + bz)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Geometrically (more globally),
can also look at the orbits under orthogonal group of quadratic forms;
spectral theorem says you hit a diagonal one.
[stabilizer is O(n) on each eigenspace;
for regular ones, this is just $O(1)^n = C_2^n$]
IE, Cartan algebra conjugated by orthogonal group (maximal compact)
yields all symmetric matrices.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
BTW, this stuff really doesn't work in finite characteristic;
EG, consider
$\begin{psmallmatrix}1 & 1\\ 1 & 1\end{psmallmatrix}$
in characteristic 2.
Symmetric, but not diagonalizable.
Indeed, nilpotent!
(Over $\bQ$ it is diagonalizable, orthogonally (natch).)
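The nilpotence is one line of arithmetic mod 2 (a quick check in Python):

```python
import numpy as np

# Over GF(2), the all-ones 2x2 matrix is symmetric yet nilpotent:
# its square is [[2,2],[2,2]] = 0 mod 2.
M = np.array([[1, 1], [1, 1]])
M2 = (M @ M) % 2
assert np.array_equal(M2, np.zeros((2, 2), dtype=int))
```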
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Applications}
\subsection{Real spectral theorem}
All applications I know use the \emph{form} point of view,
not the \emph{operator} point of view.
You can call the diagonal entries the ``eigenvalues'',
but that's confusing, because that's interpreting the form as an operator,
which is opaque.
Instead, I prefer to call them the spectrum or principal axes or principal weights.
(The eigenvalue point of view is probably useful for computations.)
\subsubsection{Statistics: Principal Component Analysis}
Covariance and variance are a symmetric bilinear form and the corresponding quadratic form on random variables with mean zero (it's semipositive\footnote{Aka, positive semidefinite.}
because variance is semipositive\footnote{Aka, non-negative.}).
Given $n$ random variables (each with mean zero), you can define the covariance matrix,
which is the matrix for the covariance form.
Apply the spectral theorem to orthodiagonalize it,
which yields $n$ uncorrelated (but not in general independent) variables.
Ordering the spectrum in decreasing order, the vectors corresponding to the higher weights
are the principal components (the main factors): the main causes of variance in the data.
% FIXME: is this the right name for diagonalizing the covariance matrix?
If you do this for empirical data (where you have more observations than dimensions of data),
you instead apply the SVD (Singular Value Decomposition, v.i.).
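A minimal PCA sketch along these lines (numpy; the mixing matrix and sizes are arbitrary choices of mine, not from any particular dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
mixing = np.array([[2., 1., 0.],
                   [0., 1., 0.],
                   [0., 0., .1]])
X = rng.standard_normal((500, 3)) @ mixing   # 500 correlated observations
X = X - X.mean(axis=0)                       # center: mean-zero variables

C = X.T @ X / (len(X) - 1)                   # covariance matrix (a symmetric form)
weights, axes = np.linalg.eigh(C)            # spectral theorem: orthodiagonalize
order = np.argsort(weights)[::-1]            # decreasing spectrum
weights, axes = weights[order], axes[:, order]

Y = X @ axes                                 # principal components
C_Y = Y.T @ Y / (len(Y) - 1)
print(np.allclose(C_Y, np.diag(weights)))    # True: components are uncorrelated
```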
\subsubsection{Principal axes of an ellipsoid or quadratic form}
As far as I know, this is the original result, and the source of geometric intuition:
every ellipsoid (positive definite quadratic form) has orthogonal principal axes.
Indeed, you can do this for all quadratic forms (as for hyperbolic forms):
they are (orthogonally) diagonalizable.
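For instance, the hyperbolic form $q(x,y) = 2xy$ orthodiagonalizes to $u^2 - v^2$ along axes at $45$ degrees; a numerical check (numpy):

```python
import numpy as np

S = np.array([[0., 1.], [1., 0.]])   # matrix of the form q(x, y) = 2xy
vals, Q = np.linalg.eigh(S)          # Q orthogonal, vals the spectrum
print(vals)                          # [-1.  1.]: a hyperbolic form
print(np.allclose(Q.T @ Q, np.eye(2)))           # principal axes are orthonormal
print(np.allclose(Q.T @ S @ Q, np.diag(vals)))   # diagonalized form
```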
\subsubsection{Differential Geometry: Principal Curvatures}
The spectral theorem yields the principal curvatures and directions on Riemannian manifolds
(extrinsically, for a surface in $\bR^3$,
as the principal values and axes of the second fundamental form;
intrinsically, I think as the principal values and axes of the metric tensor?).
(Note that the shape operator (Weingarten map) is the operator interpretation
of the second fundamental form.)
\subsubsection{Sylvester's law}
inertia, aka signature, is (positive,negative,zero)
rank and signature
Pseudo-Riemannian geometry (Minkowski 3+1 space-time)
- conjugate by diagonal matrices with all 1s except one entry,
which yields multiplication by $\alpha^2$ in that entry
...and get Sylvester's theorem
ellipsoids, hyperboloids (1 or 2 sheets)
BTW, many results go by the name Sylvester's theorem
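A numerical illustration (numpy; the Minkowski form and the random change of basis are my own choices): congruence $S \mapsto M^t S M$ by an invertible $M$ rescales the spectral values but preserves their signs:

```python
import numpy as np

def inertia(S, tol=1e-9):
    """(positive, negative, zero) counts of the spectrum of a symmetric form."""
    w = np.linalg.eigvalsh(S)
    pos = int((w > tol).sum())
    neg = int((w < -tol).sum())
    zero = int((np.abs(w) <= tol).sum())
    return (pos, neg, zero)

S = np.diag([1., 1., 1., -1.])         # Minkowski 3+1 form
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))        # a generic (invertible) change of basis
print(inertia(S))                      # (3, 1, 0)
print(inertia(M.T @ S @ M))            # (3, 1, 0): same inertia
```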
\subsection{Complex spectral theorem}
- Complex Hilbert space version key in quantum mechanics
- spectra in physics: atom, simple harmonic oscillator, etc.
{ped}the proof is often not useful! (duh; elaborate:
if you just want to use a theorem, state it and apply it;
the point of proving something is both certainty and *understanding*
(same as the point of rigor).
For certainty, an elementary proof is fine;
for understanding, you really want to get into the details!)
NB: shape operator is an operator POV on curvature?
[math applications of spectral theorem:
Laplacian -> orthogonality of sin/cos; (also d'Alembertian?)
Fourier theory]
BTW, the natural operation on quadratic forms is the Jordan product;
basically $\half\bigl((x+y)^2 - x^2 - y^2\bigr)$
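I.e., the Jordan product is what you get by polarizing the squaring map; a check on random symmetric matrices (numpy, my own setup):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)); A = (A + A.T) / 2   # random symmetric
B = rng.standard_normal((3, 3)); B = (B + B.T) / 2

# polarize squaring: ((A+B)^2 - A^2 - B^2)/2 = (AB + BA)/2
jordan = ((A + B) @ (A + B) - A @ A - B @ B) / 2
print(np.allclose(jordan, (A @ B + B @ A) / 2))  # True
print(np.allclose(jordan, jordan.T))             # stays symmetric
```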
Spectrum of an operator vs. Spec of a ring:
$\Spec_{\text{linalg}} T = \Spec_{\text{alggeom}} K[T]$
(be a bit careful: V is a module over K[x],
so what exactly do you mean by "spectrum" of a module?)
[not related to spectrum in algebraic topology]
...and yes, it *is* related to atomic spectra!
Note on genericity:
a generic matrix is regular semisimple
(distinct eigenvalues)
still true for normal/self-adjoint
(proof: the characteristic polynomial generically has distinct roots)
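A quick empirical check of genericity (numpy; one random symmetric matrix, so only suggestive):

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((6, 6))
S = (S + S.T) / 2                 # random symmetric matrix
w = np.linalg.eigvalsh(S)         # sorted ascending
gaps = np.diff(w)
print(gaps.min() > 1e-8)          # regular semisimple: all eigenvalues distinct
```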
\section{Generalizations}
General spectral theorems often give \emph{sufficient} conditions
for an operator to be orthodiagonalizable:
``If you satisfy the following, you are orthodiagonalizable'',
and these are sufficient\footnote{No pun intended} for many applications.
Most satisfying are complete characterizations: if and only if.
\subsection{Singular Value Decomposition}
Singular Value Decomposition is the analog for a map between two inner product spaces;
it holds with no hypotheses at all, and is (partly thus) very useful in numerical applications.
It's straightforward:
combine the classification of maps between different vector spaces
with the classification of inner products on a single vector space:
- take the kernel, take its perp
- take the image, take its perp
- $\ker^\perp T \isoto \im T$; pull back the inner product on the target
Note that there are no conditions on $T$.
[Actually, you don't need to take a kernel;
just pull back the inner product.]
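The construction above can be sketched numerically (numpy; for simplicity I assume $T$ has full column rank, so the kernel step can be skipped):

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((5, 3))    # arbitrary map between inner product spaces

w, V = np.linalg.eigh(T.T @ T)     # orthodiagonalize the pulled-back form
order = np.argsort(w)[::-1]
sigma = np.sqrt(w[order])          # singular values
V = V[:, order]
U = T @ V / sigma                  # orthonormal since T has full column rank

print(np.allclose(U.T @ U, np.eye(3)))     # True
print(np.allclose((U * sigma) @ V.T, T))   # True: T = U diag(sigma) V^t
```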
\subsection{Hilbert spaces}
Generalizations:
Real form.
Also true for a *compact* self-adjoint operator on a Hilbert space
(compactness is necessary);
the key issue is the existence of an eigenvector.
[the generalization of ``diagonalizable'' is ``multiplication operator'']
More generally, this leads to spectral theory.
? spectral decomposition
BTW,
T^*T yields a semipositive operator, positive iff $T$ is injective,
as $\langle T^*Tv, v \rangle = \langle Tv, Tv \rangle$
BTW, the Rayleigh quotient is useful numerically!
(while the characteristic polynomial isn't,
as the problem is often ill-conditioned)
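One reason the Rayleigh quotient is numerically robust: it is stationary at eigenvectors, so an $O(\epsilon)$ error in the eigenvector costs only $O(\epsilon^2)$ in the eigenvalue. A sketch (numpy; random symmetric matrix of my choosing):

```python
import numpy as np

def rayleigh(S, v):
    """Rayleigh quotient v^t S v / v^t v of a symmetric matrix S at v."""
    return (v @ S @ v) / (v @ v)

rng = np.random.default_rng(5)
S = rng.standard_normal((8, 8))
S = (S + S.T) / 2                          # random symmetric matrix

w, Q = np.linalg.eigh(S)
v = Q[:, -1]                               # eigenvector for the top eigenvalue
print(np.isclose(rayleigh(S, v), w[-1]))   # quotient recovers the eigenvalue

# perturb by eps along another eigenvector: the quotient moves only O(eps^2)
eps = 1e-5
v_pert = v + eps * Q[:, 0]
print(abs(rayleigh(S, v_pert) - w[-1]) < 1e-7)  # True
```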
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{References}
Real case via Rayleigh quotient:
http://www.math.harvard.edu/~elkies/M55a.99/spectral.html
The complex case is essentially the proof in ``Linear Algebra Done Right'', by Axler, pp.~132--137.
Expository reference:
Farenick, Douglas R.; Pidkowich, Barbara A. F.,
``The spectral theorem in quaternions'',
Linear Algebra Appl. 371 (2003), 75--102. [ISSN 0024-3795]
http://dx.doi.org/10.1016/S0024-3795(03)00420-8