Preliminaries

Maps on Euclidean Space

Suppose \(U \subset \mathbb{R}^n\) is an open set and \(F: U\rightarrow \mathbb{R}^m\) is a map, written \(F(x)=y\), where \(x=(x^1, \cdots, x^n)\) and \(y=(y^1, \cdots, y^m)\). Let \(\pi^\alpha: \mathbb{R}^m \rightarrow \mathbb{R}\) denote the projection onto the \(\alpha\)-th coordinate, i.e. \(\pi^\alpha (y^1, \cdots, y^m)=y^\alpha\). Then \(y=F(x)\) can be written as

\[ y=F(x)=(f^1(x), \cdots, f^m(x)),\quad x\in U \]

where \(f^\alpha=\pi^\alpha \circ F: U\rightarrow \mathbb{R}\) is called the \(\alpha\)-th component function of \(F\).

If each component function of \(F\) is differentiable (respectively \(C^k, C^\infty, C^\omega\)) at \(a \in U\), then we say that \(F\) is differentiable (respectively \(C^k, C^\infty, C^\omega\)) at \(a\).

If \(F\) is differentiable on \(U\), then

\[ \frac{\partial (f^1, \cdots, f^m)}{\partial (x^1, \cdots, x^n)}= \left[ \begin{array}{ccc} \frac{\partial f^1}{\partial x^1}& \cdots & \frac{\partial f^1}{\partial x^n}\\ \vdots & \ddots & \vdots\\ \frac{\partial f^m}{\partial x^1} & \cdots & \frac{\partial f^m}{\partial x^n} \end{array} \right] \]

each entry of which is a function on \(U\). This matrix is called the Jacobian matrix of \(F\), denoted \(DF\). When \(F\) is \(C^k\) with \(k\geq 1\), \(DF\) is \(C^{k-1}\).
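As a small worked illustration (the map below is chosen only for this example and is not part of the development above), take the polar coordinate map \(F: (0,\infty)\times \mathbb{R}\rightarrow \mathbb{R}^2\), \(F(r,\theta)=(r\cos\theta, r\sin\theta)\), so that \(f^1=r\cos\theta\) and \(f^2=r\sin\theta\). Its Jacobian matrix is

\[ DF(r,\theta)= \left[ \begin{array}{cc} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{array} \right], \]

whose entries are \(C^\infty\) functions of \((r,\theta)\).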

The following theorem parallels the corresponding statement for functions of one variable.

Theorem

Suppose \(U \subset \mathbb{R}^n\) is an open set and \(a\in U\). A map \(F: U\rightarrow \mathbb{R}^m\) is differentiable at \(a\) if and only if there exist a linear map \(A: \mathbb{R}^n\rightarrow \mathbb{R}^m\) and a map \(R(x, a)=(r^1(x,a), \cdots, r^m(x,a))\) such that

\[ F(x)=F(a) + A(x-a) + \|x-a\| R(x,a),\quad \lim_{x\rightarrow a}\|R(x,a)\|=0. \]

Proof

The proof shows that the linear map \(A\) above is given by the Jacobian matrix \(DF(a)\).

Suppose \(U\) and \(V\) are open subsets of \(\mathbb{R}^n\) and \(\mathbb{R}^m\) respectively, and \(F: U\rightarrow V\), \(G:V\rightarrow \mathbb{R}^p\) are maps. The map \(H=G\circ F: U\rightarrow \mathbb{R}^p\) is called the composition of \(F\) and \(G\). As for the composition of one-variable functions, we have the following chain rule.

Chain rule

Suppose \(F, G, H\) are defined as above. If \(F\) is differentiable at \(a\in U\) and \(G\) is differentiable at \(F(a)\in V\), then \(H\) is differentiable at \(a\) and

\[ DH(a)=DG(F(a)) \cdot DF(a). \]

Proof

This follows directly from the definitions.

The reader can verify that if \(F\) and \(G\) are \(C^k\) maps, then \(H=G\circ F\) is also a \(C^k\) map.
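To see the chain rule at work, here is a sketch with maps chosen purely for illustration: let \(F(r,\theta)=(r\cos\theta, r\sin\theta)\) and \(G(u,v)=u^2+v^2\), so that \(H(r,\theta)=G(F(r,\theta))=r^2\). Then

\[ DG(F(r,\theta))\cdot DF(r,\theta)= \left[ \begin{array}{cc} 2r\cos\theta & 2r\sin\theta \end{array} \right] \left[ \begin{array}{cc} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{array} \right] = \left[ \begin{array}{cc} 2r & 0 \end{array} \right], \]

which agrees with \(DH(r,\theta)=\left[\begin{array}{cc} 2r & 0\end{array}\right]\) computed directly from \(H=r^2\).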

Inverse Function Theorem

Example. Suppose \(F:\mathbb{R}^m\rightarrow \mathbb{R}^m\) is a homogeneous linear transformation, i.e.

\[ F(x)=A\cdot x,\quad x\in \mathbb{R}^m \]

or

\[ f^i = \sum_{j=1}^m a^i_j x^j. \]

It is easy to show that \(DF(x)=A\). If \(A\) is invertible, then \(F\) is a diffeomorphism.

Inverse function theorem

Suppose \(U\subset \mathbb{R}^m\) is an open set and \(F: U\rightarrow \mathbb{R}^m\) is a \(C^k\) map with \(k\geq 1\). If \(DF(a)\) is invertible for some \(a\in U\), then there exists an open neighborhood \(W\subset U\) of \(a\) such that \(F: W\rightarrow F(W)=V\) is a \(C^k\) diffeomorphism. Furthermore, if \(x\in W\) and \(y=F(x)\), then the differential of \(F^{-1}\) at \(y\) is

\[ DF^{-1}(y)=(DF(x))^{-1}. \]
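For a concrete instance (an illustrative example, not taken from the text above), consider \(F: (0,\infty)\times \mathbb{R}\rightarrow \mathbb{R}^2\), \(F(r,\theta)=(r\cos\theta, r\sin\theta)\). Here \(\det DF(r,\theta)=r\neq 0\), so \(F\) is a local \(C^\infty\) diffeomorphism near every point, with

\[ DF^{-1}(y)=(DF(r,\theta))^{-1}= \left[ \begin{array}{cc} \cos\theta & \sin\theta\\ -\frac{\sin\theta}{r} & \frac{\cos\theta}{r} \end{array} \right],\quad y=F(r,\theta). \]

The neighborhood \(W\) cannot be taken to be the whole domain, since \(F(r,\theta)=F(r,\theta+2\pi)\): the theorem is genuinely local.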

Without loss of generality, in the following proof we assume \(a=0\), \(F(0)=0\) and \(DF(0)=I\).

Lemma 1

There exists an open neighborhood \(W\) of \(a\) such that \(F|_W: W\rightarrow \mathbb{R}^m\) is injective. Furthermore, for all \(x, y\in W\),

\[ 2\|F(x)-F(y)\|\geq \|x-y\|. \]

Proof for Lemma 1

Define \(G:U\rightarrow \mathbb{R}^m\) by \(G(x)=x-F(x)\), which satisfies \(G(0)=0\) and \(DG(0)=\mathbf{0}\). Since \(F\in C^1(U)\), \(DG(x)\) is continuous. Therefore there exists a real number \(r>0\) such that \(\overline{B}_r(0) \subset U\) and each entry of \(DG(x)\) has absolute value at most \(1/(2m)\) for \(x\in \overline{B}_r(0)\). Thus

\[ \operatorname{Tr} (DG(x)^T DG(x)) \leq \frac{1}{4m^2} \cdot m^2= \frac{1}{4}\ \Rightarrow\ \|DG(x)\|\leq \frac{1}{2},\quad x\in \overline{B}_r(0). \]

For all \(x_1,x_2\in \overline{B}_r (0)\), by the fundamental theorem of calculus and the chain rule,

\[ G(x_2)-G(x_1)=\int_0^1 \frac{d}{dt} [G(x_1+(x_2-x_1)t)]dt=\int_0^1 DG(x_1+(x_2-x_1)t) (x_2-x_1)dt \]

Taking norms, we have

\[ \|G(x_2)-G(x_1)\|\leq \int_0^1 \|DG(x_1+(x_2-x_1)t)\| \|x_2-x_1\|dt\leq \frac{1}{2} \|x_2-x_1\|. \]

Finally, by the triangle inequality, we have

\[ 1/2 \|x_2-x_1\| \geq \|(x_2-x_1)-(F(x_2)-F(x_1))\| \geq \|x_2-x_1\| -\|F(x_2)-F(x_1)\| \]

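Rearranging gives \(2\|F(x_2)-F(x_1)\|\geq \|x_2-x_1\|\), which also shows that \(F\) is injective on the open ball \(W=B_r(0)\).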

\(\square\)

Lemma 2

Suppose \(W\subset \mathbb{R}^m\) is an open set and \(F:W\rightarrow \mathbb{R}^m\) is a \(C^1\) map such that \(DF(x)\) is invertible for all \(x\in W\). Then \(F(W)\) is an open set and, applying this to open subsets of \(W\), \(F\) is an open map. If \(F\) is one-to-one, then \(F^{-1}\) is continuous.

Proof for Lemma 2

We only need to prove that for any \(a\in W\), \(F(W)\) contains a neighborhood of \(F(a)\). More specifically, with the normalization \(a=0\), \(F(0)=0\), \(DF(0)=I\) used above, it suffices to find \(r>0\) with \(\overline{B}_r(0) \subset W\) such that \(\overline{B}_{r/2}(0)\subset F(\overline{B}_r(0))\subset F(W)\).

By the proof of Lemma 1, there exists \(r>0\) such that \(\overline{B}_r(0)\subset W\) and \(\|G(x_2)-G(x_1)\|\leq \frac{1}{2} \|x_2-x_1\|\) for \(x_1, x_2\in \overline{B}_r(0)\), where \(G(x)=x-F(x)\). Taking \(x_2=0\), for which \(G(x_2)=0\), gives \(\|G(x_1)\|\leq \frac{1}{2}\|x_1\|\). For a given \(y\in \overline{B}_{r/2}(0)\), define \(T: \overline{B}_{r}(0)\rightarrow \mathbb{R}^m\) by \(T(x)=y+G(x)\), so that \(T(x)=x\) if and only if \(F(x)=y\). Since

\[ \|y+G(x)\|\leq \|y\|+ \|G(x)\|\leq r/2 + r/2=r, \quad \forall x\in \overline{B}_{r}(0) \]

\(T\) maps \(\overline{B}_{r}(0)\) into itself. Since \(\|T(x_2)-T(x_1)\|= \|G(x_2)-G(x_1)\|\leq \frac{1}{2} \|x_2-x_1\|\), \(T\) is a contraction map on a complete metric space, so there exists a unique fixed point \(x\in \overline{B}_{r}(0)\) with \(T(x)=x\). This means \(y=F(x)\), so \(y\in F(W)\). If \(F\) is one-to-one, then \(F^{-1}\) is continuous because \(F\) maps open sets to open sets. This contraction mapping argument is a standard way to prove that an equation has a solution, i.e. to determine the range of a function.

\(\square\)
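The fixed-point construction can be followed by hand in one dimension. As a sketch, assume \(m=1\) and \(F(x)=x+x^2\) (a function chosen only for illustration), so \(G(x)=x-F(x)=-x^2\) and \(|G'(x)|=2|x|\leq 1/2\) on \([-r,r]\) with \(r=1/4\). For \(|y|\leq r/2=1/8\), the equation \(F(x)=y\) is equivalent to \(x=y+G(x)=y-x^2\), and iterating from \(x_0=0\) gives

\[ x_1=y,\quad x_2=y-y^2,\quad x_3=y-(y-y^2)^2=y-y^2+2y^3-y^4. \]

The exact solution in \([-r,r]\) is \(x=\frac{-1+\sqrt{1+4y}}{2}=y-y^2+2y^3-5y^4+\cdots\), and \(x_3\) already agrees with it through order \(y^3\).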

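Proof for Theorem

Write \(x_0=a\). Since \(DF(x_0)\) is invertible and \(DF\) is continuous, there exists an open neighborhood \(W\) of \(x_0\) such that \(\det DF(x)\neq 0\) for \(x\in W\). By Lemma 1, after shrinking \(W\) if necessary, \(F|_W\) is injective. By Lemma 2, \(F(W)\) is open and \(F^{-1}\) is continuous, so \(F: W\rightarrow F(W)\) is a homeomorphism.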

Denote by \(H\) the inverse of \(F\). We first show that \(H\) is \(C^1\). For any \(x\in W\), let \(y = F(x)\) and \(y_0=F(x_0)\), so that \(x=H(y)\) and \(x_0=H(y_0)\). Expanding \(F\) at \(x_0\), we have

\[ \begin{align*} F(x)&=F(x_0) + DF(x_0) (x-x_0) + o(\|x-x_0\|),\\ \Rightarrow y&=y_0 + DF(H(y_0))(H(y)-H(y_0)) + o(\|H(y)-H(y_0)\|) \end{align*}\]

Multiplying both sides by \(DF(x_0)^{-1}\), which exists since \(\det DF(x_0)\neq 0\), we have

\[ H(y)=H(y_0) + DF(x_0)^{-1} (y-y_0) + o(\|H(y)-H(y_0)\|) \]

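The remainder satisfies \(o(\|H(y)-H(y_0)\|)=o(\|y-y_0\|)\), since \(\|x-x_0\|\leq 2\|F(x)-F(x_0)\|\) by Lemma 1.

So we have \(DH(y_0)=DF(x_0)^{-1}\). The same argument applies at every point of \(W\), so \(DH=(DF\circ H)^{-1}\) on \(F(W)\); since \(DF\), \(H\) and matrix inversion are continuous, \(DH\) is continuous and \(H\) is \(C^1\).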

Now we show that \(H\) is \(C^k\) by induction. Assume \(H\) is \(C^{l}\) for some \(1\leq l\leq k-1\). In the identity

\[ DH = (DF \circ H)^{-1} \]

\(DF\) is \(C^{k-1}\), \(H\) is \(C^l\) and matrix inversion is smooth, so \(DH\) is \(C^l\) and thus \(H\) is \(C^{l+1}\). Repeating this up to \(l=k-1\) shows that \(H\) is \(C^k\).

\(\square\)

We have the following corollaries.

Implicit function theorem

Suppose \(U\) and \(V\) are open subsets of \(\mathbb{R}^m\) and \(\mathbb{R}^n\) respectively, and the map \(f:U\times V\rightarrow \mathbb{R}^n\) is \(C^k\). Let \(x_0\in U\), \(y_0 \in V\), and suppose the differential at \(y_0\) of the map \(y\mapsto f(x_0,y)\), i.e. \(D_2 f(x_0,y_0)\), is invertible. Then there exist a neighborhood \(U_0\subset U\) of \(x_0\) and a uniquely determined \(C^k\) map \(g: U_0\rightarrow \mathbb{R}^n\) such that \(g(x_0)=y_0\). Furthermore, for each \(x\in U_0\), we have

\[ f(x,g(x))= f(x_0,y_0). \]

Proof

As in differential geometry, we consider the higher-dimensional map \(F: U\times V\rightarrow \mathbb{R}^m\times \mathbb{R}^n\), \((x,y)\mapsto (x, f(x,y))\). Then

\[ DF(x_0,y_0)=\left[\begin{array}{cc} I & 0\\ D_1 f(x_0,y_0)& D_2 f(x_0,y_0) \end{array}\right], \]

since \(D_2 f(x_0,y_0)\) is invertible, \(DF(x_0,y_0)\) is invertible. Then by the inverse function theorem, there exists a small neighborhood \(U_0\times V_0\) of \((x_0,y_0)\) on which \(F\) has a unique \(C^k\) inverse map \(F^{-1}\). Define the projection \(\pi: \mathbb{R}^m\times \mathbb{R}^n\rightarrow\mathbb{R}^n\), \((x,y)\mapsto y\), and, shrinking \(U_0\) if necessary so that \((x, f(x_0,y_0))\in F(U_0\times V_0)\) for all \(x\in U_0\), let

\[ g(x)=\pi \circ F^{-1}(x, f(x_0,y_0)), \quad x\in U_0 \]

Then \(g\) is the desired map: since \(F\) preserves the first coordinate, \(F^{-1}(x, f(x_0,y_0))=(x,g(x))\), and hence \(F(x, g(x))=(x, f(x, g(x)))= (x, f(x_0, y_0))\).

\(\square\)
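A standard example, given here only as an illustration with \(m=n=1\): let \(f(x,y)=x^2+y^2\) and \((x_0,y_0)=(0,1)\). Then \(D_2 f(x_0,y_0)=2y_0=2\neq 0\), and on \(U_0=(-1,1)\) the map

\[ g(x)=\sqrt{1-x^2},\quad g(0)=1,\quad f(x,g(x))=x^2+(1-x^2)=1=f(x_0,y_0) \]

is the one given by the theorem. At \((x_0,y_0)=(1,0)\), by contrast, \(D_2 f(1,0)=0\) and no such \(g\) exists on any neighborhood of \(x_0=1\), since \(x^2+y^2=1\) has no solution \(y\) for \(x>1\).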

Rank Theorem

Suppose \(A\) and \(B\) are open sets in \(\mathbb{R}^m\) and \(\mathbb{R}^n\) respectively, \(F: A\rightarrow B\) is a \(C^k\) map, and \(\operatorname{rank} DF(x)=r\) for all \(x\in A\). Let \(a\in A\) and \(b=F(a)\in B\). Then there exist open neighborhoods \(A_0\subset A\) of \(a\) and \(B_0\subset B\) of \(b\), and \(C^k\) diffeomorphisms \(u: A_0\rightarrow U \subset \mathbb{R}^m\), \(v: B_0\rightarrow V\subset \mathbb{R}^n\), such that \(v\circ F\circ u^{-1}: U\rightarrow V\) has the following form

\[ v\circ F\circ u^{-1}(x^1, \cdots, x^m)=(x^1, \cdots, x^r, 0,\cdots, 0) \in \mathbb{R}^n. \]

Proof

\(\square\)
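As a hedged illustration of the normal form (the maps below are chosen for this example only), let \(F:\mathbb{R}^2\rightarrow \mathbb{R}^2\), \(F(x,y)=(x+y,(x+y)^2)\). Its Jacobian matrix

\[ DF(x,y)= \left[ \begin{array}{cc} 1 & 1\\ 2(x+y) & 2(x+y) \end{array} \right] \]

has rank \(r=1\) everywhere. Taking \(u(x,y)=(x+y,y)\) and \(v(p,q)=(p,q-p^2)\), both \(C^\infty\) diffeomorphisms of \(\mathbb{R}^2\), we get \(u^{-1}(s,t)=(s-t,t)\) and

\[ v\circ F\circ u^{-1}(s,t)=v(s,s^2)=(s,0), \]

which is the form stated in the theorem with \(m=n=2\).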