
Solutions of Equations in One Variable

基本想法 | Basic Ideas for Solving Equations

To find the solution of an equation, we hope to have an iteration method that makes good use of computer resources, such as

\[ \begin{equation} \pmb{x}^{k} = f(\pmb{x}^{k-1}) \label{eq: iterative eq} \end{equation} \]

for \(k = 1, 2, \cdots, n\). We write \(\pmb{x}\) instead of \(x\) because the same iteration scheme also applies to solving linear systems.

Hopefully, the above iteration converges, that is, \(\pmb{x}^{k-1}\rightarrow \pmb{x}^*\) and \(\pmb{x}^{k}\rightarrow \pmb{x}^*\) as \(k \rightarrow \infty\), and the equation becomes

\[ \pmb{x}^* = f(\pmb{x}^*) \]

where \(\pmb{x}^*\) is the solution of the equation to be solved.

If this idea works, we can then consider the speed of convergence of the iterative process, which matters greatly in practical applications. That is, for a given notion of distance, we want

\[ \frac{\|\pmb{x}^{k+1} - \pmb{x}^*\|}{\|\pmb{x}^{k}-\pmb{x}^*\|^\alpha} \]

to be as small as possible for each \(k\).

二分法 | The Bisection Method

This method is quite intuitive. Choosing two endpoints \(a, b\) with \(f(a)f(b) < 0\), we compute a new point (the midpoint)

\[ p = a+\frac{b-a}{2} \]

Then we update \(a\) or \(b\) according to the sign of \(f(p)\), narrowing down the interval containing the root.

What is interesting is the stopping criterion. Interested readers can consider the following question.

When we compute the new point \(p\) from \(a, b\), we need to judge whether \(p\) is an acceptable answer. Apart from \(f(p)=0\), which stopping condition do you think is best?

  1. \((b-a)/{|\min{(a, b)}|}<\epsilon\)
  2. \(|p-p_{prev}|=(b-a)/2 < \epsilon\)
  3. \(f(p)<\epsilon\)

Choose 1: it is close to a relative error criterion and is currently the best choice.

For 2: consider \(p_n=\sum\limits_{k=1}^{n}\frac{1}{k}\); the differences \(|p_n - p_{n-1}| = 1/n\) tend to zero, yet the sequence diverges.

For 3: \(|f(p)|\) can be small even when \(p\) is far from the root, for example when \(f\) is very flat near \(p\).

The convergence rate can be described as follows:

\[ |x_n- x^*| < \frac{(b-a)}{2^{n}} \]
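A minimal Python sketch of the method (the function name and signature are my own; for simplicity the stopping criterion is the interval half-width, rather than the relative form preferred above):

```python
def bisect(f, a, b, tol=1e-8, max_iter=100):
    """Bisection method: requires f(a) and f(b) to have opposite signs."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        p = a + (b - a) / 2              # midpoint, written to limit rounding error
        if f(p) == 0 or (b - a) / 2 < tol:
            break
        if f(a) * f(p) > 0:              # root lies in [p, b]
            a = p
        else:                            # root lies in [a, p]
            b = p
    return p

print(bisect(lambda x: x**2 - 2, 0.0, 2.0))  # approximately 1.41421356
```

Note that each iteration halves the bracketing interval, which is exactly the \((b-a)/2^n\) error bound above.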

不动点法 | Fixed-Point Iteration

As we said previously in Basic ideas for solving equations, we hope to find an iterative relation whose limit \(x^*\) is exactly what we want, which in this case means that

\[ f(x^*) = 0 \]

So, intuitively, we ask: can we derive a relation from

\[ \begin{equation} f(x) = 0 \label{zero-equation} \end{equation} \]

to

\[ x = g(x) \]

for iterative method?

The answer is, of course, YES!

One of the easiest transformations is to add \(x\) to both sides of equation \(\ref{zero-equation}\), but in most cases this does not work, because convergence depends on \(f(x)\) itself!

Thus, it is necessary to find conditions under which the iteration \(x = g(x)\) converges. The following theorem gives a sufficient condition.

不动点存在定理 | Fixed-Point Theorem

Let \(g \in C[a, b]\) be such that \(g(x) \in [a, b]\), for all \(x\) in \([a, b]\). Suppose, in addition, that \(g'\) exists on \((a, b)\) and that a constant \(0 < k < 1\) exists with

\[ |g'(x)|\leq k \quad \forall x \in (a, b) \]

Then for any initial number \(p_0 \in [a, b]\), the sequence \(\{p_n\}_{n=0}^{\infty}\) defined by

\[ p_n = g( p_{n−1}) \quad n \geq 1 \]

converges to the unique fixed point \(p \in [a, b]\).

The proof is straightforward.

  • Use the Mean Value Theorem.

For all \(n \geq 1\), there exists \(\zeta_n\) between \(p_{n-1}\) and \(p\), hence \(\zeta_n \in (a, b)\), such that

\[ |p_n-p| = |g(p_{n-1}) - g(p)| = |g'(\zeta_n)|\,|p_{n-1}-p|\leq k|p_{n-1}-p| \]

by induction, we have

\[ |p_n-p|\leq k^{n}|p_0-p| \]

Letting \(n \rightarrow \infty\), we get \(|p_n-p| \rightarrow 0\); that is, \(p_n\) converges to \(p\).

The estimate used in this proof will also help us identify the speed of convergence.
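The iteration itself is a few lines of Python; a sketch (names are mine), using \(g(x)=\cos x\) on \([0,1]\), where \(|g'(x)| = |\sin x| \leq \sin 1 < 1\), so the theorem applies:

```python
import math

def fixed_point(g, p0, tol=1e-10, max_iter=200):
    """Iterate p_n = g(p_{n-1}) until successive iterates are close."""
    p = p0
    for _ in range(max_iter):
        p_new = g(p)
        if abs(p_new - p) < tol:     # stop when the update is tiny
            return p_new
        p = p_new
    return p

p = fixed_point(math.cos, 0.5)       # converges to the fixed point of cos
print(abs(math.cos(p) - p))          # residual is tiny
```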

牛顿法 | Newton's Method

This method is also a fixed-point method. There are two perspectives from which to derive it.

  • Version 1: shrink the derivative of the iteration function \(g(x)\).
  • Version 2: use the Taylor expansion.

We know that for an arbitrary function \(f(x)\), the iteration \(x^{k} = x^{k-1} + f(x^{k-1})\) may fail to converge to a point \(x^*\) in a given interval. So the question is: can we construct a function \(g(x)\) such that \(x^{k} = g(x^{k-1})\) converges?

The answer is, again, YES!

The following content tells us we can take \(g(x)= x - f(x)/f'(x)\), so that \(|g'(x)| < 1\) in a suitable interval.

Readers can easily see that

\[ \begin{align*} g'(x) &= 1 - \frac{f'^2(x)-f''(x)f(x)}{f'^2(x)}\\ &=\frac{f''(x)f(x)}{f'^2(x)} \end{align*} \]

If we add some restrictions, it becomes easy to make \(|g'(x)|<1\).

Here we make use of the Taylor expansion (i.e., the derivatives) of the target function.

Suppose that \(f \in C^2[a, b]\). Let \(x_0 \in [a, b]\) be an approximation to \(x^*\) such that \(f'(x_0) \neq 0\) and \(|x_0-x^*|\) is "small". Consider the first Taylor polynomial of \(f(x)\) expanded about \(x_0\):

\[ f(x) = f(x_0) + (x-x_0)f'(x_0) + \frac{(x-x_0)^2}{2}f''(\zeta). \]

Setting \(x = x^*\) and using \(f(x^*)=0\), we get

\[ 0 = f(x_0) + (x^*-x_0)f'(x_0) + \frac{(x^*-x_0)^2}{2}f''(\zeta) \]

Neglecting the quadratic term, we get

\[ 0 \approx f(x_0) + (x^*-x_0)f'(x_0) \]

Solving for \(x^*\), we get

\[ x^* \approx x_0 - \frac{f(x_0)}{f'(x_0)} \]

Then we can define the iterative relation as

\[ x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})} \]
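A direct Python sketch of this iteration (the stopping rule and names are my choices):

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1})."""
    x = x0
    for _ in range(max_iter):
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished during iteration")
        x_new = x - f(x) / dfx
        if abs(x_new - x) < tol:     # successive iterates agree
            return x_new
        x = x_new
    return x

print(newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0))  # sqrt(2)
```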

The following statement guarantees the convergence of the above iterative method.

牛顿法收敛条件 | Conditions for Convergence of Newton's Method

Let \(f \in C^2[a, b]\). If \(p \in (a, b)\) is such that \(f (p) = 0\) and \(f'( p) \neq 0\), then there exists a \(\delta > 0\) such that Newton’s method generates a sequence \(\{p_n\}_{n=1}^{\infty}\) converging to \(p\) for any initial approximation \(p_0 \in [p − \delta, p + \delta]\).

Prove it.

  • Make use of the conditions \(f(p) = 0\) and \(f'(p)\neq 0\).

For \(x \in (a, b)\), we aim to find a narrower interval \([x^*-\delta, x^*+\delta]\) on which \(g\) maps into itself and, for some \(k \in (0,1)\),

\[ |g'(x)|\leq k \quad \forall x \in [x^*-\delta, x^*+\delta] \]

Firstly, \(f'(x^*)\neq 0\) implies that \(\exists \delta_1 >0\) such that \(f'(x)\neq 0\) for all \(x \in [x^* - \delta_1, x^*+\delta_1]\subset [a, b]\).

Thus, the expression

\[ g'(x) = \frac{f(x)f''(x)}{f'^2(x)} \]

is well defined on this interval, since the denominator is nonzero.

Since \(f\in C^2[a,b]\), we have \(g \in C^1[x^*-\delta_1, x^*+\delta_1]\). For the exact solution \(x^*\), we have \(f(x^*)=0\), so

\[ g'(x^*) = 0 \]

which, by continuity of \(g'\), implies that for any \(k \in (0, 1)\) there exists \(0<\delta < \delta_1\) such that

\[ |g'(x)|\leq k \quad \forall x \in [x^*-\delta, x^*+\delta] \]

By the Mean Value Theorem, for \(x \in [x^*-\delta, x^*+\delta]\) there exists \(\zeta\) between \(x\) and \(x^*\) such that

\[ |g(x)-g(x^*)|=|g'(\zeta)|\,|x - x^*|\leq k|x-x^*|<|x-x^*| \]

Since \(g(x^*) = x^*\), this means that \(g\) maps \([x^*-\delta, x^*+\delta]\) into itself. By the Fixed-Point Theorem, the sequence defined by Newton's method converges.

  • Secant Method: it may not be easy to find the derivative of \(f\), so we can use a difference quotient instead. That is, we store two adjacent points to compute the difference

\[ f'(x^{k}) \approx \frac{f(x^{k}) - f(x^{k-1})}{x^{k}-x^{k-1}} \]

then generate \(p_{k+1}\) using the above approximation and iterate.
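A sketch of the secant method in Python (naming is mine); each step reuses the previous function value, so only one new evaluation of \(f\) is needed per iteration:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant method: Newton's step with f' replaced by a difference quotient."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("difference quotient is zero")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0 = x1, f1              # keep only the two most recent points
        x1, f1 = x2, f(x2)
    return x1

print(secant(lambda x: x**2 - 2, 1.0, 2.0))  # sqrt(2)
```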

收敛阶数 | Order of Convergence

So how do we identify the speed of convergence? The following definition gives a glimpse.

Suppose \(\{p_n\}_{n=1}^{\infty}\) is a sequence that converges to \(p\), with \(p_n \neq p (\forall n)\). If positive constants \(\lambda\) and \(\alpha\) exist with

\[ \lim_{n\rightarrow \infty}{\frac{|p_{n+1}-p|}{|p_n-p|^{\alpha}}}=\lambda \]

then \(\{p_n\}_{n=0}^{\infty}\) converges to \(p\) of order \(\alpha\), with asymptotic error constant \(\lambda\).

  • (i) If \(\alpha=1 (\lambda<1)\), the sequence is linearly convergent.
  • (ii) If \(\alpha=2\), the sequence is quadratically convergent.
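The order \(\alpha\) can be estimated numerically from three consecutive errors, since \(\alpha \approx \ln(e_{n+1}/e_n)/\ln(e_n/e_{n-1})\). A sketch (the helper function is mine), using Newton iterates for \(f(x)=x^2-2\) as an example:

```python
import math

def estimate_order(ps, p):
    """Estimate alpha from the last three errors of a sequence converging to p."""
    e = [abs(pn - p) for pn in ps]
    return math.log(e[-1] / e[-2]) / math.log(e[-2] / e[-3])

# Newton iterates for f(x) = x^2 - 2, starting from x = 3
ps, x = [], 3.0
for _ in range(5):
    x = x - (x**2 - 2) / (2 * x)
    ps.append(x)

print(estimate_order(ps, math.sqrt(2)))  # close to 2: quadratic convergence
```

Only a few iterates are used, because once the error reaches machine precision the ratios become meaningless.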

The following theorem gives a sufficient condition for linear convergence.

线性收敛的充分条件 | Sufficient Condition for Linear Convergence

Let \(g \in C[a, b]\) be such that \(g(x) \in [a, b], \forall x \in [a, b]\). Suppose, in addition, that \(g'\) is continuous on \((a, b)\) and a positive constant \(k < 1\) exists with

\[ |g'(x)|\leq k \quad \forall x \in (a, b) \]

If \(g'(p) \neq 0\), then for any number \(p_0 \neq p\) in \([a, b]\), the sequence

\[ p_n=g(p_{n-1}) \quad \forall n \geq 1 \]

converges only linearly to the unique fixed point \(p\) in \([a, b]\).

Prove it.

Hint: show that linear convergence holds, i.e., that \(\exists \lambda > 0\) such that \(\lim\limits_{n\rightarrow \infty}\frac{|p_{n+1}-p|}{|p_{n}-p|} = \lambda\) (so \(\alpha = 1\)).

  • Multiple roots: the speed of convergence degrades at multiple roots; there, Newton's method converges only linearly.

To restore faster convergence, Newton's method is modified by applying it to \(\mu(x)=f(x)/f'(x)\), which has a simple root wherever \(f\) has a multiple root; this gives

\[ g(x)= x - \frac{f(x)f'(x)}{f'^2(x)-f(x)f''(x)} \]
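A sketch of this modified iteration in Python (names mine); for a double root, where plain Newton's method slows down, it recovers fast convergence:

```python
def modified_newton(f, df, ddf, x0, tol=1e-10, max_iter=100):
    """Newton's method applied to mu(x) = f(x)/f'(x):
    x <- x - f*f' / (f'^2 - f*f'')."""
    x = x0
    for _ in range(max_iter):
        fx, dfx, ddfx = f(x), df(x), ddf(x)
        denom = dfx * dfx - fx * ddfx
        if denom == 0:               # stalled, or converged exactly
            break
        x_new = x - fx * dfx / denom
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# f(x) = (x - 1)^2 has a double root at x = 1
print(modified_newton(lambda x: (x - 1)**2,
                      lambda x: 2 * (x - 1),
                      lambda x: 2.0, 2.0))   # 1.0
```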

加速收敛 | Accelerating Convergence

  • Aitken's \(\Delta^2\) Method

Suppose \(\{p_n\}_{n=0}^{\infty}\) is a linearly convergent sequence with limit \(p\). To motivate the construction of a sequence \(\{\hat{p}_n\}_{n=1}^{\infty}\) that converges more rapidly to \(p\) than does \(\{p_n\}_{n=0}^{\infty}\), let us first assume that the signs of \(p_n-p\), \(p_{n+1}-p\) and \(p_{n+2}-p\) agree and that \(n\) is sufficiently large that

\[ \frac{p_{n+1}-p}{p_n-p} \approx \frac{p_{n+2}-p}{p_{n+1}-p} \]

Then solving for \(p\) gives

\[ p \approx \frac{p_{n+2}p_n-p_{n+1}^2}{p_{n+2}-2p_{n+1}+p_n} \]

Rewriting this in terms of \(p_n\) gives

\[ \begin{align*} p &\approx p_n - \frac{(p_{n+1}-p_n)^2}{p_{n+2} - 2p_{n+1} + p_n}\\ \Rightarrow \hat{p}_n &= p_n - \frac{(\Delta p_n)^2}{\Delta p_{n+1}-\Delta p_{n}} \quad \text{(denote $\Delta p_n = p_{n+1} - p_n$)}\\ &= p_n - \frac{(\Delta p_n)^2}{\Delta^2 p_{n}} \end{align*} \]
  • Steffensen's Method

The idea is that the constructed sequence \(\{\hat{p}_n\}\) is a better approximation to the true fixed point \(p\), so we use it to restart the original iteration: after generating a new \(\hat{p}\), we set \(p_0 \leftarrow \hat{p}\) and begin the fixed-point iteration again.
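Both ideas fit in a few lines of Python (a sketch with my own names, using \(g=\cos\) as an assumed example): `aitken` accelerates an existing sequence, while `steffensen` restarts the iteration from each accelerated value.

```python
import math

def aitken(ps):
    """Aitken's Delta^2: accelerate a linearly convergent sequence (>= 3 terms)."""
    return [p0 - (p1 - p0) ** 2 / (p2 - 2 * p1 + p0)
            for p0, p1, p2 in zip(ps, ps[1:], ps[2:])]

def steffensen(g, p0, tol=1e-12, max_iter=50):
    """Steffensen's method: restart the fixed-point iteration from p_hat."""
    p = p0
    for _ in range(max_iter):
        p1 = g(p)
        p2 = g(p1)
        denom = p2 - 2 * p1 + p
        if denom == 0:               # differences vanished: already converged
            return p2
        p_hat = p - (p1 - p) ** 2 / denom
        if abs(p_hat - p) < tol:
            return p_hat
        p = p_hat                    # p0 <- p_hat, then iterate again
    return p

print(aitken([3.0, 2.5, 2.25])[0])   # exactly 2.0 for a geometric sequence
p = steffensen(math.cos, 0.5)
print(abs(math.cos(p) - p))          # residual is tiny
```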