Solutions of Equations in One Variable
基本想法 | Basic Ideas for Solving Equations
To find the solution of an equation, we hope to have an iteration method that takes good advantage of computer resources, such as

\[ \pmb{x}^{k} = g(\pmb{x}^{k-1}) \]

for \(k = 1, 2, \cdots, n\). We use \(\pmb{x}\) instead of \(x\) because the above iteration method also applies to solving linear systems.
Hopefully, if the above iteration converges, that is, for \(k \rightarrow \infty\), \(\pmb{x}^{k-1}\rightarrow \pmb{x}^*\) and \(\pmb{x}^{k}\rightarrow \pmb{x}^*\), the equation becomes

\[ \pmb{x}^{*} = g(\pmb{x}^{*}), \]

where \(\pmb{x}^*\) is the solution of the equation to be solved.
If the above idea works out, we can then consider the convergence speed of the iterative process, which matters greatly in practical applications. That is, under a given definition of distance \(d\), we want

\[ d(\pmb{x}^{k}, \pmb{x}^{*}) \]

to be as small as possible for each \(k\).
二分法 | The Bisection Method
This method is quite intuitive. Choosing two endpoints \(a, b\) with \(f(a)f(b) < 0\), we get another point (the midpoint here)

\[ p = \frac{a+b}{2}. \]

Then we update \(a, b\) according to the sign of \(f(p)\) to narrow down the interval.
What is interesting is the stopping procedure. Interested readers can consider the following question.

When we calculate the new point \(p\) from \(a, b\), we need to judge whether \(p\) is an acceptable answer. Apart from \(f(p)=0\), which condition do you think is the best?
- \((b-a)/{|\min{(a, b)}|}<\epsilon\)
- \(|p-p_{prev}|=(b-a)/2 < \epsilon\)
- \(|f(p)|<\epsilon\)
Choose option 1, which is close to a relative-error test and is currently the best.

Option 2: consider \(p_n=\sum\limits_{k=1}^{n}\frac{1}{k}\); then \(|p_{n+1}-p_n| \rightarrow 0\) even though the sequence diverges, so a small step does not guarantee closeness to a limit.

Option 3: it is easy to see that \(|f(p)|\) can be small while \(p\) is still far from a root, e.g. when \(f\) is very flat near its root.
The convergence speed can be described as follows: since the interval is halved at each step,

\[ |p_n - p| \leq \frac{b-a}{2^n}, \quad n \geq 1. \]
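As a concrete illustration, here is a minimal bisection sketch in Python; the function name `bisection`, the tolerances, and the sample equation \(x^3 + 4x^2 - 10 = 0\) are our own illustrative choices, not from the notes.

```python
def bisection(f, a, b, tol=1e-8, max_iter=100):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        p = (a + b) / 2
        # Stopping rule: option 1 above (relative-error style); assumes the
        # interval stays away from 0 so the denominator is nonzero.
        if f(p) == 0 or (b - a) / abs(min(a, b)) < tol:
            return p
        if f(a) * f(p) > 0:
            a = p  # the sign change, hence the root, lies in [p, b]
        else:
            b = p  # the sign change, hence the root, lies in [a, p]
    return (a + b) / 2

# Example: solve x^3 + 4x^2 - 10 = 0 on [1, 2].
print(bisection(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0))
```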
不动点法 | Fixed-Point Iteration
As we said previously in Basic Ideas for Solving Equations, we hope to find an iterative relation such that the convergence point \(x^*\) is exactly what we want, which in this case means that

\[ x^* = g(x^*) \quad \text{with} \quad f(x^*) = 0. \]
So, intuitively, we ask: can we derive a relation from

\[ f(x) = 0 \]

to

\[ x = g(x) \]

for an iterative method?
The answer is, of course, YES!
One of the easiest ways to transform is to add \(x\) to both sides of \(f(x) = 0\), giving \(x = f(x) + x\); but in most cases this does not work, because we rely on \(f(x)\) itself for the convergence!
Thus, it is necessary to find conditions under which the iteration on \(x = g(x)\) converges. The following theorem gives a sufficient condition.

不动点存在定理 | Fixed-Point Theorem
Let \(g \in C[a, b]\) be such that \(g(x) \in [a, b]\) for all \(x\) in \([a, b]\). Suppose, in addition, that \(g'\) exists on \((a, b)\) and that a constant \(0 < k < 1\) exists with

\[ |g'(x)| \leq k, \quad \text{for all } x \in (a, b). \]

Then, for any initial number \(p_0 \in [a, b]\), the sequence \(\{p_n\}_{n=0}^{\infty}\) defined by

\[ p_n = g(p_{n-1}), \quad n \geq 1, \]
converges to the unique fixed point \(p \in [a, b]\).
The proof is easy:

- Use the Mean Value Theorem.
For all \(n \geq 1\), there exists \(\zeta_n\) between \(p_{n-1}\) and \(p\) (hence in \((a, b)\)) such that

\[ |p_n - p| = |g(p_{n-1}) - g(p)| = |g'(\zeta_n)|\,|p_{n-1} - p| \leq k\,|p_{n-1} - p|; \]

by induction, we have

\[ |p_n - p| \leq k^n\,|p_0 - p|. \]
Letting \(n \rightarrow \infty\) and using \(0 < k < 1\), we get \(|p_n-p| \rightarrow 0\); that is, \(p_n\) converges to \(p\).
The bound used in the proof will also help us identify the speed of the convergence process.
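A minimal fixed-point iteration sketch (the function name `fixed_point` and the tolerances are our own choices):

```python
import math

def fixed_point(g, p0, tol=1e-10, max_iter=100):
    """Iterate p_n = g(p_{n-1}) until successive values are within tol."""
    p = p0
    for _ in range(max_iter):
        p_next = g(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")

# Example: g(x) = cos(x) maps [0, 1] into itself and |g'(x)| <= sin(1) < 1,
# so the Fixed-Point Theorem guarantees convergence.
print(fixed_point(math.cos, 0.5))  # ~0.7390851, the fixed point of cos
```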
牛顿法 | Newton's Method
This method is also a fixed-point method. There are two perspectives from which to draw inspiration:
- Version 1: shrink the derivative of the iteration function \(g(x)\).
- Version 2: use Taylor's expansion.
We know that, given an arbitrary function \(f(x)\), the iteration \(x^{k} = f(x^{k-1}) + x^{k-1}\) may not converge to a point \(x^*\) on a given interval. So the question is: can we construct a function \(g(x)\) such that \(x^{k} = g(x^{k-1})\) converges?
The answer is, again, YES!
The following content tells us we can take \(g(x)= x - f(x)/f'(x)\) so that \(|g'(x)| < 1\) on a suitable interval. Readers can easily verify that

\[ g'(x) = \frac{f(x)\,f''(x)}{[f'(x)]^2}, \]

so if we add some restrictions, it is easy to make \(|g'(x)|<1\).
Here we make use of the Taylor expansion (or the derivatives) of the target function.
Suppose that \(f \in C^2[a, b]\). Let \(x_0 \in [a, b]\) be an approximation to \(x^*\) such that \(f'(x_0) \neq 0\) and \(|x_0-x^*|\) is "small". Consider the first Taylor polynomial for \(f(x)\) expanded about \(x_0\):

\[ f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2}f''(\xi(x)), \]

where \(\xi(x)\) lies between \(x\) and \(x_0\). If we let \(x = x^*\), then since \(f(x^*)=0\) we get

\[ 0 = f(x_0) + (x^* - x_0)f'(x_0) + \frac{(x^* - x_0)^2}{2}f''(\xi(x^*)). \]

Neglecting the squared term, we get

\[ 0 \approx f(x_0) + (x^* - x_0)f'(x_0); \]

solving for \(x^*\), we get

\[ x^* \approx x_0 - \frac{f(x_0)}{f'(x_0)}. \]

Then we can define the iterative relation as

\[ x_k = x_{k-1} - \frac{f(x_{k-1})}{f'(x_{k-1})}, \quad k \geq 1. \]
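A minimal Newton's method sketch (the names and the stopping test are ours; the sample equation is the same illustrative cubic as before):

```python
def newton(f, fprime, p0, tol=1e-12, max_iter=50):
    """Newton's method: p_k = p_{k-1} - f(p_{k-1}) / f'(p_{k-1})."""
    p = p0
    for _ in range(max_iter):
        p_next = p - f(p) / fprime(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Newton's method did not converge")

# Example: root of x^3 + 4x^2 - 10 starting from p0 = 1.5.
print(newton(lambda x: x**3 + 4*x**2 - 10,
             lambda x: 3*x**2 + 8*x,
             1.5))
```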
The following statement guarantees the convergence of the above iterative method.
牛顿法收敛条件 | Conditions for Convergence of Newton's Method
Let \(f \in C^2[a, b]\). If \(p \in (a, b)\) is such that \(f (p) = 0\) and \(f'( p) \neq 0\), then there exists a \(\delta > 0\) such that Newton’s method generates a sequence \(\{p_n\}_{n=1}^{\infty}\) converging to \(p\) for any initial approximation \(p_0 \in [p − \delta, p + \delta]\).
Proof.

- Make use of the conditions \(f(p) = 0\) and \(f'(p)\neq 0\) (below we write \(x^*\) for the root \(p\)).
The proof views Newton's method as a fixed-point iteration with \(g(x) = x - f(x)/f'(x)\). For \(x \in (a, b)\), we aim to find a narrower interval \([x^*-\delta, x^*+\delta]\) that \(g\) maps into itself, that is,

\[ g(x) \in [x^*-\delta, x^*+\delta], \quad \text{for all } x \in [x^*-\delta, x^*+\delta]. \]

First, \(f'(x^*)\neq 0\) and the continuity of \(f'\) imply that there exists \(\delta_1 >0\) such that \(f'(x)\neq 0\) for all \(x \in [x^* - \delta_1, x^*+\delta_1]\subset [a, b]\).
Thus,

\[ g(x) = x - \frac{f(x)}{f'(x)} \]

is well defined on \([x^*-\delta_1, x^*+\delta_1]\), since we only ever divide by nonzero numbers.
Since \(f\in C^2[a,b]\), we have \(g \in C^1[x^*-\delta_1, x^*+\delta_1]\), with

\[ g'(x) = \frac{f(x)\,f''(x)}{[f'(x)]^2}. \]

For the exact solution \(x^*\) we have \(f(x^*)=0\), so

\[ g'(x^*) = 0, \]

which, by the continuity of \(g'\), implies that there exists \(0<\delta < \delta_1\) such that

\[ |g'(x)| \leq k < 1, \quad \text{for all } x \in [x^*-\delta, x^*+\delta]. \]
By the Mean Value Theorem, for \(x \in [x^*-\delta, x^*+\delta]\) there exists \(\zeta\) between \(x\) and \(x^*\) such that

\[ |g(x) - x^*| = |g(x) - g(x^*)| = |g'(\zeta)|\,|x - x^*| \leq k\,|x - x^*| < \delta, \]

which means that \(g\) maps \([x^*-\delta, x^*+\delta]\) into itself. By the Fixed-Point Theorem, the sequence defined by Newton's method converges.
- Secant Method

It may not be easy to find the derivative of the function \(f\), so we can use a difference quotient instead. That is, we store two adjacent points and calculate the difference

\[ f'(p_k) \approx \frac{f(p_k) - f(p_{k-1})}{p_k - p_{k-1}}, \]

generate

\[ p_{k+1} = p_k - \frac{f(p_k)\,(p_k - p_{k-1})}{f(p_k) - f(p_{k-1})} \]

using the above approximation, and iterate; see the sketch below.
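A minimal secant sketch (the names are ours; the zero-denominator guard is a practical safeguard, not part of the textbook method):

```python
def secant(f, p0, p1, tol=1e-12, max_iter=100):
    """Secant method: Newton's method with f' replaced by a difference quotient."""
    for _ in range(max_iter):
        denom = f(p1) - f(p0)
        if denom == 0:  # difference quotient undefined; stop with current iterate
            return p1
        p2 = p1 - f(p1) * (p1 - p0) / denom
        if abs(p2 - p1) < tol:
            return p2
        p0, p1 = p1, p2  # keep only the two most recent points
    raise RuntimeError("secant method did not converge")

# Same illustrative cubic as before; no derivative needed.
print(secant(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0))
```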
收敛阶数 | Order of Convergence
So how do we quantify the speed of convergence? The following definition gives the answer.
Suppose \(\{p_n\}_{n=0}^{\infty}\) is a sequence that converges to \(p\), with \(p_n \neq p\) for all \(n\). If positive constants \(\lambda\) and \(\alpha\) exist with

\[ \lim_{n \rightarrow \infty} \frac{|p_{n+1}-p|}{|p_n-p|^{\alpha}} = \lambda, \]
then \(\{p_n\}_{n=0}^{\infty}\) converges to \(p\) of order \(\alpha\), with asymptotic error constant \(\lambda\).
- (i) If \(\alpha=1 (\lambda<1)\), the sequence is linearly convergent.
- (ii) If \(\alpha=2\), the sequence is quadratically convergent.
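To see the definition in action, the following sketch (our own illustration) estimates \(\alpha\) from three consecutive errors via \(\alpha \approx \ln(e_{n+1}/e_n)/\ln(e_n/e_{n-1})\), which follows from taking logarithms in \(e_{n+1} \approx \lambda e_n^{\alpha}\):

```python
import math

def estimate_order(errors):
    """Estimate alpha from consecutive errors e_{n-1}, e_n, e_{n+1}."""
    return [math.log(errors[n + 1] / errors[n]) / math.log(errors[n] / errors[n - 1])
            for n in range(1, len(errors) - 1)]

# Newton iterates for f(x) = x^2 - 2 approaching sqrt(2): expect alpha ~ 2.
root, p = math.sqrt(2), 1.0
errors = []
for _ in range(5):
    errors.append(abs(p - root))
    p = p - (p * p - 2) / (2 * p)
print(estimate_order(errors))  # values approach 2, i.e. quadratic convergence
```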
The following theorem gives a sufficient condition for linear convergence.
线性收敛的充分条件 | Sufficient Condition for Linear Convergence
Let \(g \in C[a, b]\) be such that \(g(x) \in [a, b]\) for all \(x \in [a, b]\). Suppose, in addition, that \(g'\) is continuous on \((a, b)\) and a positive constant \(k < 1\) exists with

\[ |g'(x)| \leq k, \quad \text{for all } x \in (a, b). \]

If \(g'(p) \neq 0\), then for any number \(p_0 \neq p\) in \([a, b]\), the sequence

\[ p_n = g(p_{n-1}), \quad n \geq 1, \]
converges only linearly to the unique fixed point \(p\) in \([a, b]\).
Proof idea: by the Mean Value Theorem, \(p_{n+1}-p = g(p_n)-g(p) = g'(\zeta_n)(p_n-p)\) with \(\zeta_n\) between \(p_n\) and \(p\). Since \(p_n \rightarrow p\) forces \(\zeta_n \rightarrow p\),

\[ \lim_{n\rightarrow \infty}\frac{|p_{n+1}-p|}{|p_{n}-p|} = |g'(p)| \neq 0, \]

so \(\alpha=1\) and \(\lambda=|g'(p)|\), which is exactly linear convergence.
- Multiple Roots

We see that the speed of convergence of Newton's method is limited by multiple roots, since \(f'(p)=0\) there. Here we have the modified Newton's method: apply Newton's method to \(\mu(x) = f(x)/f'(x)\), which has a simple root at \(p\), giving

\[ g(x) = x - \frac{f(x)\,f'(x)}{[f'(x)]^2 - f(x)\,f''(x)}; \]

a code sketch follows below.
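A sketch of the modified iteration (the names are ours; it assumes both \(f'\) and \(f''\) are available):

```python
def modified_newton(f, df, d2f, p0, tol=1e-12, max_iter=50):
    """Newton's method applied to mu(x) = f(x)/f'(x); this regains fast
    convergence at multiple roots, where plain Newton is only linear."""
    p = p0
    for _ in range(max_iter):
        if f(p) == 0:  # already at a root; the update below would be 0/0
            return p
        den = df(p) ** 2 - f(p) * d2f(p)
        p_next = p - f(p) * df(p) / den
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("modified Newton did not converge")

# Example: f(x) = (x - 1)^2 has a double root at 1.
print(modified_newton(lambda x: (x - 1) ** 2,
                      lambda x: 2 * (x - 1),
                      lambda x: 2.0,
                      0.0))
```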
加速收敛 | Accelerating Convergence
- Aitken's \(\Delta^2\) Method
Suppose \(\{p_n\}_{n=0}^{\infty}\) is a linearly convergent sequence with limit \(p\). To motivate the construction of a sequence \(\{\hat{p}_n\}_{n=0}^{\infty}\) that converges more rapidly to \(p\) than does \(\{p_n\}_{n=0}^{\infty}\), let us first assume that the signs of \(p_n-p\), \(p_{n+1}-p\), and \(p_{n+2}-p\) agree and that \(n\) is sufficiently large that

\[ \frac{p_{n+1}-p}{p_n-p} \approx \frac{p_{n+2}-p}{p_{n+1}-p}. \]

Then solving for \(p\) gives

\[ p \approx \frac{p_{n+2}\,p_n - p_{n+1}^2}{p_{n+2} - 2p_{n+1} + p_n}, \]

and extracting \(p_n\) gives

\[ \hat{p}_n = p_n - \frac{(p_{n+1}-p_n)^2}{p_{n+2} - 2p_{n+1} + p_n}. \]
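A small sketch of Aitken's \(\Delta^2\) formula applied to stored iterates (the helper name `aitken` is ours):

```python
import math

def aitken(p):
    """Accelerate a sequence: p_hat_n = p_n - (p_{n+1}-p_n)^2 / (p_{n+2}-2p_{n+1}+p_n)."""
    return [p[n] - (p[n + 1] - p[n]) ** 2 / (p[n + 2] - 2 * p[n + 1] + p[n])
            for n in range(len(p) - 2)]

# Linearly convergent iterates p_n = cos(p_{n-1}); the accelerated values
# are noticeably closer to the limit ~0.7390851 than the originals.
seq, p = [], 0.5
for _ in range(8):
    seq.append(p)
    p = math.cos(p)
print(aitken(seq))
```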
- Steffensen's Method
The idea is that the constructed sequence \(\{\hat{p}_n\}\) is a better approximation to the true \(p^*\) than the original \(\{p_n\}\), so we use it to restart the fixed-point iteration. That is, after generating two fixed-point steps and a new \(\hat{p}\), we update \(p_0 \leftarrow \hat{p}\) and iterate again.
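A minimal Steffensen sketch combining the two ideas (the names and guards are ours):

```python
import math

def steffensen(g, p0, tol=1e-12, max_iter=50):
    """Each round: two fixed-point steps, one Aitken update, then restart."""
    for _ in range(max_iter):
        p1 = g(p0)
        p2 = g(p1)
        denom = p2 - 2 * p1 + p0
        if denom == 0:  # Aitken update undefined; p0 is (nearly) a fixed point
            return p0
        p_hat = p0 - (p1 - p0) ** 2 / denom
        if abs(p_hat - p0) < tol:
            return p_hat
        p0 = p_hat  # restart from the accelerated value
    raise RuntimeError("Steffensen's method did not converge")

print(steffensen(math.cos, 0.5))  # reaches ~0.7390851 in just a few rounds
```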