Approximation Theory
最佳平方逼近 | Best Square Approximation
引入 | Lead-in
We call \(X\) a normed linear space if there exists a function \(\|\cdot\|\) on \(X\) satisfying the three norm axioms (positivity, absolute homogeneity, and the triangle inequality).
So here we introduce \(y^*\in Y\) satisfying
\[\|x-y^*\|_2=\inf_{y\in Y}\|x-y\|_2\]
to be the best square approximation of element \(x\), where \(X\) is a linear space equipped with the norm \(\|\cdot\|_2\), \(x\in X\), and \(Y\) is a subspace of \(X\).
We can also induce a norm from an inner product. In Euclidean space, \((\cdot,\cdot)\) is defined as
\[(\pmb{x},\pmb{y})=\sum_{i=1}^n x_iy_i.\]
It is easy to see that \(\|x\|_2=\sqrt{(x,x)}\). So we can state a more specific problem in an inner product space. Assume \(\varphi_i\) (\(i=1,2,\cdots,n\)) are \(n\) linearly independent elements in an inner product space \(X\) and \(f\in X\); then the subset
\[\varPhi_n=\mathrm{span}\{\varphi_1,\varphi_2,\cdots,\varphi_n\}=\Big\{\varphi=\sum_{i=1}^n\alpha_i\varphi_i\ \Big|\ \alpha_i\in\mathbb{R}\Big\}\]
contains a best approximation of \(f\), which is defined by
\[\|f-\varphi^*\|_2=\inf_{\varphi\in\varPhi_n}\|f-\varphi\|_2;\]
we call the \(\varphi^*\) attaining the above infimum the best square approximation element.
最佳平方逼近元 | Properties of the Best Square Approximation Element
Sufficient and necessary condition for the best square approximation element

Assume \(X\) is an inner product space and \(f\in X\). Then \(\varphi^*\in \varPhi_n\) is the best square approximation element if and only if
\[(f-\varphi^*,\varphi_i)=0,\quad i=1,2,\cdots,n.\]
"\(\Leftarrow\)".
"\(\Rightarrow\)".
The above theorem gives a general method to solve for the best square approximation element. That is, we define
\[G=\begin{pmatrix}(\varphi_1,\varphi_1)&(\varphi_2,\varphi_1)&\cdots&(\varphi_n,\varphi_1)\\(\varphi_1,\varphi_2)&(\varphi_2,\varphi_2)&\cdots&(\varphi_n,\varphi_2)\\\vdots&\vdots&&\vdots\\(\varphi_1,\varphi_n)&(\varphi_2,\varphi_n)&\cdots&(\varphi_n,\varphi_n)\end{pmatrix},\qquad
\pmb{\beta}=\begin{pmatrix}(f,\varphi_1)\\(f,\varphi_2)\\\vdots\\(f,\varphi_n)\end{pmatrix};\]
then we can solve \(G\pmb{\alpha}^*=\pmb{\beta}\) for the best square approximation element
\[\varphi^*=\sum_{i=1}^n\alpha_i^*\varphi_i.\]
The above theorem also tells us that if \(\{\varphi_i\}_{i=1}^n\) is a group of linearly independent elements, then \(G\) has rank \(n\) and the parameters \(\{\alpha_i^*\}_{i=1}^n\) are unique. It is also easy to give the error estimate
\[\|f-\varphi^*\|_2^2=(f,f)-(\varphi^*,f)=(f,f)-\sum_{i=1}^n\alpha_i^*(\varphi_i,f).\]
The geometric meaning is also clear: if we define \(\Delta=f-\varphi^*\), then
\[(\Delta,\varphi)=0,\quad \forall \varphi\in\varPhi_n,\]
which means \(\varphi^*\) is the orthogonal projection of \(f\) on \(\varPhi_n\).
But in practice it is not so easy to solve the above linear system accurately, so we consider another approach in specific settings.
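As a concrete illustration of the general method, the following sketch sets up and solves the Gram system \(G\pmb{\alpha}^*=\pmb{\beta}\) numerically; the choice \(f(x)=e^x\) on \([0,1]\) with basis \(\{1,x\}\) is ours, not from the text.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Sketch: best square approximation of f(x) = exp(x) on [0, 1] in
# span{1, x}, by assembling and solving the Gram system G a = beta.
f = np.exp
nodes, weights = leggauss(20)            # Gauss-Legendre rule on [-1, 1]
x = 0.5 * (nodes + 1.0)                  # map nodes to [0, 1]
w = 0.5 * weights

def inner(u, v):
    """Inner product (u, v) = int_0^1 u v dx, by quadrature."""
    return np.sum(w * u(x) * v(x))

basis = [lambda t: np.ones_like(t), lambda t: t]
G = np.array([[inner(p, q) for q in basis] for p in basis])
beta = np.array([inner(f, p) for p in basis])
a = np.linalg.solve(G, beta)             # phi* = a[0] + a[1] * x

# Orthogonality check: (f - phi*, phi_i) = 0 for each basis element.
resid = lambda t: f(t) - (a[0] + a[1] * t)
print(a, [inner(resid, p) for p in basis])
```

The residual's inner product with every basis element vanishes, which is exactly the sufficient-and-necessary condition stated above.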
有限函数空间上的最佳平方逼近 | Best Square Approximation on \(L_\rho^2[a,b]\)
Weight Function
Assume \(\rho(x)\) is Lebesgue integrable on \([a,b]\) and vanishes at most on a set of measure zero; then we call \(\rho(x)\) a weight function.
If \(\rho(x)\) is a weight function on \([a,b]\), we denote by \(L^2_\rho[a,b]\) the set of all measurable functions \(f\) such that \(\rho(x)f^2(x)\) is Lebesgue integrable on \([a,b]\); it is a linear space. The inner product is defined by
\[(f,g)=\int_a^b\rho(x)f(x)g(x)\,\mathrm{d}x,\]
which satisfies the 4 properties of the inner product definition. So we can introduce the natural norm
\[\|f\|_2=\sqrt{(f,f)}=\Big(\int_a^b\rho(x)f^2(x)\,\mathrm{d}x\Big)^{1/2};\]
then \(L^2_\rho[a,b]\) is a normed linear space, so we can consider the best approximation problem on it.
Because \(L^2_\rho[a,b]\) is a special inner product space \(X\), the former discussion about the best square approximation element applies directly. That is, if \(\{\varphi_i\}_{i=1}^n\) is a group of linearly independent functions in \(L^2_\rho[a,b]\), then the subset
\[\varPhi_n=\mathrm{span}\{\varphi_1,\varphi_2,\cdots,\varphi_n\}\]
is called the generalized polynomial space, whose elements are called generalized polynomials, or polynomials for short. The best square approximation of \(f\) on \(\varPhi_n\) is defined by
\[\|f-\varphi^*\|_2=\inf_{\varphi\in\varPhi_n}\|f-\varphi\|_2,\]
and accordingly the best square approximation element \(\varphi^*\) is called the best square approximation polynomial (BSAP).
There is another way to get the BSAP.
If we choose \(\varphi_{i+1}(x)=x^{i}\) (\(i=0,1,\cdots,n\)) and write the squared error as
\[E=\int_a^b\rho(x)\Big(f(x)-\sum_{i=0}^n a_ix^i\Big)^2\,\mathrm{d}x,\]
we can easily solve for the approximation polynomial \(\sum\limits_{i=0}^n a_ix^i\) by letting \(\frac{\partial E}{\partial a_k}=0\), and get
\[\sum_{i=0}^n a_i\int_a^b \rho(x)x^{i+k}\,\mathrm{d}x=\int_a^b\rho(x)x^kf(x)\,\mathrm{d}x,\quad k=0,1,\cdots,n.\]
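With \(\rho\equiv 1\) on \([0,1]\) and the power basis, the coefficient matrix of the equations above has entries \(\int_0^1 x^{i+k}\mathrm{d}x = 1/(i+k+1)\): the Hilbert matrix. A quick sketch (the sizes tried are arbitrary) of how badly conditioned it becomes:

```python
import numpy as np

# Gram matrix of the power basis on [0, 1] with rho = 1: the Hilbert
# matrix H[i, k] = 1 / (i + k + 1).  Its condition number grows
# explosively with n, which is why the raw power basis is a poor choice.
def hilbert(n):
    return np.array([[1.0 / (i + k + 1) for k in range(n)] for i in range(n)])

for n in (3, 5, 8, 12):
    print(n, np.linalg.cond(hilbert(n)))
```

This ill-conditioning is precisely what motivates the orthogonal polynomials of the next subsection.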
正交多项式 | Orthogonal Polynomials
If we restrict the basis functions of the subspace, we can simplify the matrix \(G\) and simplify the solving process. In effect, we are reducing the condition number of \(G\) by using orthogonal polynomials. Readers can carry over exactly the same statements from the abstract inner product space and its corresponding statements about orthogonal elements.
Orthogonal Polynomials
If \(\{\varphi_i\}_{i=1}^n \subset L^2_\rho[a,b]\) satisfy
\[(\varphi_i,\varphi_j)=\int_a^b\rho(x)\varphi_i(x)\varphi_j(x)\,\mathrm{d}x=\begin{cases}0,& i\neq j,\\ \sigma_i,& i=j,\end{cases}\]
where each \(\sigma_i\) is a non-zero number, then we say \(\{\varphi_i\}_{i=1}^n\) are orthogonal on \([a,b]\) with respect to the weight \(\rho(x)\). Furthermore, if we require
\[\sigma_i=1,\quad i=1,2,\cdots,n,\]
then we call \(\{\varphi_i\}_{i=1}^n\) an orthonormal system.
So from the above definition we can get the BSAP more easily with orthogonal functions \(\{\varphi_i\}_{i=1}^n\):
\[\varphi^*=\sum_{i=1}^n\alpha_i^*\varphi_i,\qquad \alpha_i^*=\frac{(f,\varphi_i)}{(\varphi_i,\varphi_i)},\]
and its corresponding error
\[\|f-\varphi^*\|_2^2=(f,f)-\sum_{i=1}^n\frac{(f,\varphi_i)^2}{(\varphi_i,\varphi_i)}.\]
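A minimal sketch of this formula in action, assuming Legendre polynomials as the orthogonal family on \([-1,1]\) (for which \((\varphi_i,\varphi_i)=2/(2i+1)\) is known) and \(f(x)=|x|\) as an arbitrary example:

```python
import numpy as np
from numpy.polynomial import legendre as leg

# Each coefficient of the best square approximation is a single
# quotient (f, phi_i) / (phi_i, phi_i) -- no linear system to solve.
f = np.abs
nodes, weights = leg.leggauss(200)               # quadrature on [-1, 1]

coeffs = []
for i in range(5):
    ci = np.zeros(i + 1)
    ci[i] = 1.0                                  # selects Legendre P_i
    Pi = leg.legval(nodes, ci)
    num = np.sum(weights * f(nodes) * Pi)        # (f, phi_i) by quadrature
    den = 2.0 / (2 * i + 1)                      # (P_i, P_i), known exactly
    coeffs.append(num / den)
print(coeffs)   # known expansion: |x| ~ 1/2 P0 + 5/8 P2 - 3/16 P4 + ...
```

The diagonal Gram matrix is what makes the coefficients decouple; compare this with solving the full system \(G\pmb{\alpha}^*=\pmb{\beta}\).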
Q1. Use \(y=2^{ax+b}\) to approximate the following 3 points.
| \(x_i\) | 0 | 1 | 4 |
|---|---|---|---|
| \(f(x_i)\) | 1 | 2 | 8 |
| \(w\) | 1 | 1 | 1 |
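A sketch of one standard way to attack Q1 (the linearization is our reading of the exercise, not a stated solution): taking \(\log_2\) turns \(y=2^{ax+b}\) into the linear model \(\log_2 y = ax+b\), an ordinary least-squares fit with unit weights.

```python
import numpy as np

# Q1 via linearization: y = 2^{a x + b}  =>  log2(y) = a x + b,
# a linear least-squares problem in (a, b) with unit weights.
x = np.array([0.0, 1.0, 4.0])
y = np.array([1.0, 2.0, 8.0])

A = np.column_stack([x, np.ones_like(x)])      # design matrix for a*x + b
a, b = np.linalg.lstsq(A, np.log2(y), rcond=None)[0]
print(a, b)    # a = 19/26, b = 3/26
```

Note that minimizing the error in \(\log_2 y\) is not the same as minimizing it in \(y\) itself; the linearized fit is the usual classroom treatment of such models.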
离散最小二乘法 | Discrete Least Squares Approximation
This problem can be described as the best square approximation in Euclidean space. That is, if \(\pmb{y}\in \mathbb{R}^n\) and \(\{\pmb{x}_i\}_{i=1}^m\subset \mathbb{R}^n\) is a group of linearly independent vectors, then
\[\varPhi_m=\mathrm{span}\{\pmb{x}_1,\pmb{x}_2,\cdots,\pmb{x}_m\}\]
is a subspace of \(\mathbb{R}^n\).
So we can consider the best square approximation problem
\[\min_{\pmb{\alpha}\in\mathbb{R}^m}\Big\|\pmb{y}-\sum_{i=1}^m\alpha_i\pmb{x}_i\Big\|_2,\]
where \(\|\cdot\|_2\) is the Euclidean norm, i.e. \(\|\pmb{x}\|_2=\sqrt{\pmb{x}^T\pmb{x}}\), \(\pmb{x}\in \mathbb{R}^n\).
That is, we have to solve
\[\frac{\partial}{\partial\alpha_k}\Big\|\pmb{y}-\sum_{i=1}^m\alpha_i\pmb{x}_i\Big\|_2^2=0,\quad k=1,2,\cdots,m.\]
If we denote \(\pmb{x}_i=(x_{1i},x_{2i},\cdots,x_{ni})^T\) and
\[A=(\pmb{x}_1,\pmb{x}_2,\cdots,\pmb{x}_m)\in\mathbb{R}^{n\times m},\]
then it is not hard to see that the above equations can be rewritten as
\[A^TA\pmb{\alpha}=A^T\pmb{y}.\]
Since the columns of \(A\) are linearly independent, \(A^TA\) is nonsingular, and we get \(\pmb{\alpha}^* =(A^TA)^{-1}A^T\pmb{y}\).
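A minimal numerical sketch of the normal equations on made-up data (the matrix sizes and random data are arbitrary):

```python
import numpy as np

# Sketch: solve the normal equations A^T A alpha = A^T y directly,
# then verify the residual is orthogonal to the column space of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))          # columns play the role of x_1, x_2
y = rng.standard_normal(5)

alpha = np.linalg.solve(A.T @ A, A.T @ y)

# The residual y - A alpha* must be orthogonal to every column of A.
print(A.T @ (y - A @ alpha))
```

In floating point one would normally prefer a QR-based solver (e.g. `numpy.linalg.lstsq`) over forming \(A^TA\), since squaring the matrix squares its condition number; the direct form above mirrors the derivation in the text.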
最佳一致逼近 | Best Uniform Approximation
引入 | Lead-in
All continuous functions defined on \([a,b]\) compose an infinite-dimensional linear space, denoted by \(C_{[a,b]}\). To simplify our description, we introduce the Chebyshev norm \(\|\cdot\|_\infty\):
\[\|f\|_\infty=\max_{x\in[a,b]}|f(x)|.\]
With the following theorem due to Weierstrass, we can find an algebraic polynomial that approximates any function \(f\in C_{[a,b]}\) arbitrarily well.
Weierstrass First Approximation Theorem
For any given function \(f(x)\in C_{[a,b]}\) and \(\forall \varepsilon>0\), \(\exists\) an algebraic polynomial \(p(x)\), s.t.
\[\|f-p\|_\infty<\varepsilon.\]
Prove it by Bernstein Polynomials.
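The Bernstein-polynomial proof is constructive, and the construction is easy to try numerically. A sketch (the test function \(f(x)=\sin\pi x\) is our choice):

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial
    B_n(f; x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k) for x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n + 1):
        total += f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
    return total

f = lambda t: np.sin(np.pi * t)
xs = np.linspace(0.0, 1.0, 201)
errs = {}
for n in (5, 20, 80):
    errs[n] = np.max(np.abs(bernstein(f, n, xs) - f(xs)))
    print(n, errs[n])
```

The uniform error shrinks as \(n\) grows, as Weierstrass's theorem promises, though the \(O(1/n)\) rate of Bernstein polynomials is slow compared with the Chebyshev constructions later in this section.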
But what we care more about is whether we can find a polynomial of degree no more than \(n\) that approximates the function \(f(x)\) best.
The answer to the above question results in Chebyshev Approximation.
Consider the set
\[P_n=\Big\{p(x)=\sum_{i=0}^n a_ix^i\ \Big|\ a_i\in\mathbb{R}\Big\}.\]
It is not hard to see that \(P_n\) is an \((n+1)\)-dimensional subspace of \(C_{[a,b]}\). And the goal is to find \(p^*\in P_n\) with
\[\|f-p^*\|_\infty=\inf_{p\in P_n}\|f-p\|_\infty=:E_n(f),\]
which is called the best uniform approximation of \(f(x)\).
最佳一致逼近的特征 | Characteristics of Best Uniform Approximation
We can represent the best uniform approximation with the introduction of deviation points.
Deviation Point Set

For \(f\in C_{[a,b]}\), we define
\[E^+(f)=\{x\in[a,b]\mid f(x)=\|f\|_\infty\},\qquad E^-(f)=\{x\in[a,b]\mid f(x)=-\|f\|_\infty\}\]
to be the positive and negative deviation point sets, and define \(E(f)=E^+(f)\cup E^-(f)\) to be the deviation point set.
Recall that \(f(x)\in C_{[a,b]}\), so \(E^+(f)\) and \(E^-(f)\) are both bounded closed sets.
Alternating Point Set
If a set \(\{x_1,x_2,\cdots,x_k\}\subset E(f)\) with \(x_1<x_2<\cdots<x_k\) satisfies
\[f(x_{i+1})=-f(x_i),\quad i=1,2,\cdots,k-1,\]
then we call it an alternating point set. We call it a maximal alternating point set if there does not exist an alternating point set of size larger than \(k\).
We can construct maximal alternating point set by the following way.
Construction of Maximal Alternating Point Set
Let \(S_1=E^+(f)\cup E^-(f)\) and construct \(x_k\) recursively by
\[x_k=\inf S_k,\quad k=1,2,\cdots,\]
where
\[S_{k+1}=\{x\in E(f)\mid x>x_k,\ f(x)=-f(x_k)\}.\]
We are assured that \(x_k\in S_k\) when we let \(x_k=\inf S_k\), because \(S_k\) is a bounded closed set. The recursion ensures that \(x_{k+1}\) is the smallest deviation point with sign opposite to that of \(x_k\), which of course ensures \(\{x_i\}\) is monotonically increasing. One can prove that, if \(f\neq 0\), the set constructed above is maximal and finite (left to the reader).
Property of best uniform approximation on \(C_{[a,b]}\)
Assume \(f\in C_{[a,b]}\) and \(f\notin P_n\). If \(p(x)\) is the best uniform approximation polynomial of \(f(x)\) on \([a,b]\), then \(f-p\) has at least \(n+2\) alternating points.
By contradiction.
de la Vallée Poussin Theorem

Assume \(f\in C_{[a,b]}\). If there exists a polynomial \(p\in P_n\) such that \(f-p\) alternates in sign on at least \(n+2\) points \(x_1<x_2<\cdots<x_{n+2}\) in \([a,b]\), then
\[E_n(f)\geq\min_{1\leq i\leq n+2}|f(x_i)-p(x_i)|.\]
By contradiction.
Chebyshev Theorem
For all \(f\in C_{[a,b]}\) with \(f\notin P_n\), \(p\in P_n\) is the best uniform approximation of \(f\) if and only if \(f-p\) has at least \(n+2\) alternating points on \([a,b]\).
By Sequence Theorem.
Uniqueness of best uniform approximation
If \(f\in C_{[a,b]}\), then there exists a unique polynomial \(p\in P_n\) that is the best uniform approximation of \(f\).
By proving that two best uniform approximation polynomials \(p_1,p_2\) are the same.
Prove Uniqueness.
Assume \(p_1,p_2\in P_n\) are both best uniform approximation polynomials of \(f\), i.e.
\[\|f-p_1\|_\infty=\|f-p_2\|_\infty=E_n(f),\]
then define \(p_0=(p_1+p_2)/2\); then
\[\|f-p_0\|_\infty\leq \frac{1}{2}\|f-p_1\|_\infty+\frac{1}{2}\|f-p_2\|_\infty=E_n(f),\]
which means \(p_0\) is also a best uniform approximation polynomial of \(f\). By the Chebyshev Theorem, \(f-p_0\) has \(n+2\) alternating points \(x_0,x_1,\cdots,x_{n+1}\).
Note that at each alternating point \(x_k\),
\[E_n(f)=|f(x_k)-p_0(x_k)|\leq\frac{1}{2}|f(x_k)-p_1(x_k)|+\frac{1}{2}|f(x_k)-p_2(x_k)|\leq E_n(f),\]
which means "=" must hold throughout, i.e.
\[|f(x_k)-p_1(x_k)|=|f(x_k)-p_2(x_k)|=E_n(f),\]
and \(f(x_k)-p_1(x_k)\), \(f(x_k)-p_2(x_k)\) must have the same sign (otherwise the middle term would be strictly smaller than \(E_n(f)\)),
which means \(p_1(x_k)=p_2(x_k)\); that is, \(p_1-p_2\in P_n\) has \(n+2\) roots, so \(p_1=p_2\).
第一类Chebyshev多项式 | Chebyshev Polynomials of the First Kind

Consider the \(n+1\) extreme points of \(\cos n\theta\) on \([0,\pi]\).
We define \(x = \cos\theta\); then
\[T_n(x)=\cos(n\arccos x),\quad x\in[-1,1],\]
is a polynomial of degree \(n\), i.e. \(T_n(x)\in P_n\). It is easy to see the recursive definition
\[T_0(x)=1,\quad T_1(x)=x,\quad T_{n+1}(x)=2xT_n(x)-T_{n-1}(x).\]
By definition, we have the following properties of Chebyshev polynomials of the first kind.
Properties of Chebyshev Polynomials of the First Kind
(1) Prove the recursive form of definition.
(2) the leading coefficient of \(T_n(x)\) is \(2^{n-1}\) (for \(n\geq 1\)).
(3) \(|T_n(x)|\leq 1\), \(\forall |x|\leq 1\).
(4) \(T_n(x)\) has \(n\) distinct real roots
\[x_k=\cos\frac{(2k-1)\pi}{2n},\quad k=1,2,\cdots,n,\]
all lying in \((-1,1)\).
(5) \(\{\cos(k\pi/n)\), \(k=0,1,\cdots,n\}\) are a maximal alternating point set of \(T_n(x)\) on \([-1,1]\).
(6) \(T_n(x)=(-1)^nT_n(-x)\).
(7)
\[\int_{-1}^1\frac{T_m(x)T_n(x)}{\sqrt{1-x^2}}\,\mathrm{d}x=\begin{cases}0,& m\neq n,\\ \pi/2,& m=n\neq 0,\\ \pi,& m=n=0.\end{cases}\]
(1) by trigonometric identities.
(7) by the orthogonality of trigonometric functions.
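Properties (1) and (4) are easy to sanity-check numerically; a sketch:

```python
import numpy as np

# Check the closed form T_n(x) = cos(n arccos x) against the
# recurrence T_{n+1} = 2 x T_n - T_{n-1}, then evaluate T_n at the
# claimed roots cos((2k - 1) pi / (2n)).
xs = np.linspace(-1.0, 1.0, 101)
T = [np.ones_like(xs), xs.copy()]          # T_0, T_1
for n in range(1, 6):
    T.append(2 * xs * T[n] - T[n - 1])

match = all(np.allclose(T[n], np.cos(n * np.arccos(xs))) for n in range(7))

n = 5
roots = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
root_residual = np.max(np.abs(np.cos(n * np.arccos(roots))))
print(match, root_residual)
```

The recurrence agrees with the closed form to machine precision, and \(T_5\) vanishes at all five claimed roots.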
With the above property, we can see the following theorem.
\(\widetilde{T}_n(x)\) is the monic polynomial deviating least from 0

The monic polynomial \(\widetilde{T}_n(x)=2^{1-n}T_n(x)\) is the best uniform approximation of the function \(f\equiv 0\) among monic polynomials; that is, for every monic polynomial \(p\in P_n\),
\[\max_{x\in[-1,1]}|p(x)|\geq\max_{x\in[-1,1]}|\widetilde{T}_n(x)|=\frac{1}{2^{n-1}}.\]
Using a degree-\(n\) interpolating polynomial \(P_n(x)\) at the \(n+1\) roots of \(T_{n+1}\) to approximate a function \(f\) on \([-1,1]\), the remainder satisfies
\[\max_{x\in[-1,1]}|f(x)-P_n(x)|\leq\frac{1}{2^n(n+1)!}\max_{x\in[-1,1]}|f^{(n+1)}(x)|.\]
- Minimizing Approximation Error on Arbitrary Intervals

The technique of choosing points to minimize the interpolation error extends to a general closed interval \([a, b]\) by the change of variables
\[\tilde{x}=\frac{1}{2}\big[(b-a)x+a+b\big],\quad x\in[-1,1].\]

Q. Find the best approximating polynomial of \(f(x) = e^x\) on \([0, 1]\) such that the absolute error is no larger than \(0.5\times 10^{-4}\).

\(a=0\), \(b=1\), so
\[\tilde{x}=\frac{x+1}{2},\quad x\in[-1,1].\]
So the actual function to be approximated is
\[g(x)=f\Big(\frac{x+1}{2}\Big)=e^{(x+1)/2},\quad x\in[-1,1].\]
So \(\max\limits_{x\in[-1,1]}|g^{(n+1)}(x)|=e/2^{n+1}\), and interpolating at the roots of \(T_{n+1}\) gives
\[\max_{x\in[-1,1]}|g(x)-P_n(x)|\leq\frac{1}{2^n(n+1)!}\cdot\frac{e}{2^{n+1}}=\frac{e}{2^{2n+1}(n+1)!}.\]
Taking \(n=4\) gives \(e/(2^9\cdot 5!)\approx 4.4\times 10^{-5}<0.5\times 10^{-4}\).
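The bound can be checked directly: interpolate \(e^x\) at the five mapped Chebyshev points and measure the actual maximum error (a sketch; the `numpy.polynomial` fitting routine is our tooling choice):

```python
import numpy as np

# Interpolate f(x) = exp(x) on [0, 1] at the 5 Chebyshev points
# (degree n = 4), mapped from [-1, 1] via x~ = ((b-a) t + a + b) / 2,
# then measure the maximum error on a fine grid.
f = np.exp
a, b = 0.0, 1.0
n = 4

t = np.cos((2 * np.arange(1, n + 2) - 1) * np.pi / (2 * (n + 1)))  # roots of T_5
x_nodes = 0.5 * ((b - a) * t + a + b)
p = np.polynomial.polynomial.polyfit(x_nodes, f(x_nodes), n)       # interpolant

xs = np.linspace(a, b, 2001)
err = np.max(np.abs(np.polynomial.polynomial.polyval(xs, p) - f(xs)))
print(err)
```

The measured error sits below the theoretical bound \(e/(2^9\cdot 5!)\approx 4.4\times10^{-5}\), confirming the degree choice \(n=4\).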
幂级数的降维 | Economization of Power Series

This part is also called Reducing the Degree of Approximating Polynomials.

Consider approximating an arbitrary \(n\)th-degree polynomial
\[P_n(x)=\sum_{i=0}^n a_ix^i,\quad a_n\neq 0,\]
on \([-1,1]\) with a polynomial of degree at most \(n-1\).

Making \(\max\limits_{x\in [-1,1]}|P_n(x)-P_{n-1}(x)|\) minimal is equivalent to letting
\[P_{n-1}(x)=P_n(x)-a_n\widetilde{T}_n(x),\]
since \(P_n-P_{n-1}\) is then \(a_n\) times the monic polynomial deviating least from zero, and the incurred uniform error is exactly \(|a_n|/2^{n-1}\).
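A tiny sketch for \(P_4(x)=x^4\) (the example polynomial is ours): subtracting \(a_4\widetilde{T}_4\), with \(\widetilde{T}_4(x)=T_4(x)/2^3=x^4-x^2+\frac{1}{8}\), drops the degree at a uniform cost of exactly \(1/2^{3}=1/8\).

```python
import numpy as np

# Economization sketch: replace P_4(x) = x^4 on [-1, 1] by the
# lower-degree polynomial P_4 - a_4 * Ttilde_4, where
# Ttilde_4 = x^4 - x^2 + 1/8 is the monic Chebyshev polynomial.
xs = np.linspace(-1.0, 1.0, 2001)
P4 = xs**4
Ttilde4 = xs**4 - xs**2 + 0.125
P3 = P4 - Ttilde4                 # = x^2 - 1/8, degree <= 3

err = np.max(np.abs(P4 - P3))     # uniform cost of dropping the degree
print(err)                        # 0.125 = 1/2^(4-1)
```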
附录: 有限区间上的正交多项式 | Appendix: Orthogonal Polynomials on \(L^2_\rho[a,b]\)
性质 | Properties
Properties
If \(\omega_0(x), \omega_1(x),\cdots\) are orthogonal polynomials on the space \(L^2_\rho[a,b]\) obtained by orthogonalizing the power basis \(\{1,x,x^2,\cdots\}\), then it must follow that
(i) \(\omega_n(x)\) is an \(n\)th-degree algebraic polynomial.
(ii) \(\forall p \in P_k\), \(k\leq n\), \(p\) can be represented as
\[p(x)=\sum_{i=0}^k c_i\omega_i(x).\]
(iii) \(\omega_n(x)\) is orthogonal to all polynomials of degree less than \(n\), that is,
\[(\omega_n,p)=0,\quad \forall p\in P_{n-1}.\]
首一正交多项式 | Construction of Monic Orthogonal Polynomials
Construction of Monic Orthogonal Polynomials
Assume \(\{\overline{\omega}_i(x)\}_{i=0}^\infty\) are monic orthogonal polynomials; then they satisfy the following recurrence relation
\[\overline{\omega}_{n+1}(x)=(x-c_n)\overline{\omega}_n(x)-c_{n-1}\overline{\omega}_{n-1}(x),\quad n\geq 1,\]
where
\[c_n=\frac{(x\overline{\omega}_n,\overline{\omega}_n)}{(\overline{\omega}_n,\overline{\omega}_n)},\qquad c_{n-1}=\frac{(\overline{\omega}_n,\overline{\omega}_n)}{(\overline{\omega}_{n-1},\overline{\omega}_{n-1})}.\]
By using the property of orthogonal polynomials.
To simplify the notation, we temporarily use \(\omega_n(x)\) to replace \(\overline{\omega}_n(x)\).
We focus on \(x\omega_n(x)\), which is a monic polynomial of degree \(n+1\), so it can be represented in terms of \(\omega_0,\omega_1,\cdots,\omega_{n+1}\), i.e.
\[x\omega_n(x)=\omega_{n+1}(x)+\sum_{i=0}^{n}c_i\omega_i(x),\tag{1}\label{eq1}\]
where \(c_i\) are parameters to be determined.
Now we notice that \((\omega_n,\omega_s)=0\) for \(s\leq n-1\), so we first take the inner product of both sides of \(\ref{eq1}\) with \(\omega_s\) (\(s=0,1,\cdots,n-2\)):
\[(x\omega_n,\omega_s)=c_s(\omega_s,\omega_s).\tag{2}\label{eq2}\]
Because our inner product has an explicit integral form, we have \((x\omega_n, \omega_s)=(\omega_n, x\omega_s)\). Since \(x\omega_s\in P_{s+1}\) with \(s+1\leq n-1\), property (iii) gives, for \(s=0,1,\cdots,n-2\),
\[(\omega_n, x\omega_s)=0,\]
and hence \((x\omega_n,\omega_s)=0\). So equation \(\ref{eq2}\) becomes
\[c_s(\omega_s,\omega_s)=0,\]
which means \(c_s=0\), \(s=0,1,\cdots,n-2\). Then we rewrite equation \(\ref{eq1}\) as
\[x\omega_n(x)=\omega_{n+1}(x)+c_n\omega_n(x)+c_{n-1}\omega_{n-1}(x).\tag{3}\label{eq3}\]
Similarly, we take the inner product of both sides of equation \(\ref{eq3}\) with \(\omega_{n-1}\):
\[(x\omega_n,\omega_{n-1})=c_{n-1}(\omega_{n-1},\omega_{n-1}),\]
which gives \(c_{n-1}=(x\omega_n, \omega_{n-1})/(\omega_{n-1},\omega_{n-1})\). Notice that \(x\omega_{n-1}\) is monic of degree \(n\), so
\[(x\omega_n,\omega_{n-1})=(\omega_n,x\omega_{n-1})=(\omega_n,\omega_n+q)=(\omega_n,\omega_n),\quad q\in P_{n-1},\]
so \(c_{n-1}=(\omega_n, \omega_{n})/(\omega_{n-1},\omega_{n-1})\).
Similarly, taking the inner product of both sides of equation \(\ref{eq3}\) with \(\omega_{n}\):
\[(x\omega_n,\omega_n)=c_n(\omega_n,\omega_n),\]
which gives \(c_{n}=(x\omega_n, \omega_{n})/(\omega_{n},\omega_{n})\).
Rewriting equation \(\ref{eq3}\) as \(\overline{\omega}_{n+1}(x)=(x-c_n)\overline{\omega}_n(x)-c_{n-1}\overline{\omega}_{n-1}(x)\) proves the theorem.
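The recurrence is easy to run. The sketch below builds the monic orthogonal polynomials for \(\rho\equiv 1\) on \([-1,1]\) (monic Legendre; the weight choice is ours), evaluating the inner products by Gauss quadrature.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Build monic orthogonal polynomials for rho = 1 on [-1, 1] via
#   w_{n+1} = (x - c_n) w_n - c_{n-1} w_{n-1},
# with c_n = (x w_n, w_n)/(w_n, w_n), c_{n-1} = (w_n, w_n)/(w_{n-1}, w_{n-1}).
nodes, weights = np.polynomial.legendre.leggauss(40)

def inner(u, v):
    """(u, v) = int_{-1}^{1} u v dx, exact here by Gauss quadrature."""
    return np.sum(weights * P.polyval(nodes, u) * P.polyval(nodes, v))

w = [np.array([1.0])]                      # w_0 = 1
xw0 = P.polymulx(w[0])
w.append(P.polysub(xw0, [inner(xw0, w[0]) / inner(w[0], w[0])]))  # w_1 = x

for n in range(1, 4):
    xwn = P.polymulx(w[n])                 # multiply by x
    cn = inner(xwn, w[n]) / inner(w[n], w[n])
    cn1 = inner(w[n], w[n]) / inner(w[n - 1], w[n - 1])
    w.append(P.polysub(P.polysub(xwn, cn * w[n]), cn1 * w[n - 1]))

print(w[2], w[3])   # expect x^2 - 1/3 and x^3 - (3/5) x
```

Coefficients are stored lowest degree first, so `w[2]` reading `[-1/3, 0, 1]` is the monic Legendre polynomial \(x^2-\frac13\).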
零点分布 | Roots of Orthogonal Polynomials
Roots of Orthogonal Polynomials
The \(n\)th orthogonal polynomial \(\omega_n(x)\) has \(n\) distinct real roots, all lying in \((a,b)\).

By contradiction. First show that \(\omega_n(x)\) must have a root in \((a,b)\); then show that no root is a multiple root; finally show that the number of roots equals \(n\). All steps can be done using the properties above.
常见的正交多项式 | Common Orthogonal Polynomials
- Legendre Polynomial (on \(L^2[-1,1]\), \(\rho=1\))
- Chebyshev Polynomial of the first kind (on \(L^2_\rho[-1,1]\) with \(\rho=1/\sqrt{1-x^2}\))
- Chebyshev Polynomial of the second kind (on \(L^2_\rho[-1,1]\) with \(\rho=\sqrt{1-x^2}\))
- Laguerre Polynomial (on \(L^2_\rho(0,\infty)\) with \(\rho=e^{-x}\))
- Hermite Polynomial (on \(L^2_\rho(-\infty,\infty)\) with \(\rho=e^{-x^2}\))
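Two of the listed orthogonality relations, checked numerically with Gauss quadrature rules that absorb the weight (a sketch; the particular degrees tested are arbitrary):

```python
import numpy as np

# Chebyshev (first kind): int_{-1}^{1} T_m T_n / sqrt(1 - x^2) dx,
# via Gauss-Chebyshev nodes/weights (the weight is built into w).
x, w = np.polynomial.chebyshev.chebgauss(32)
T = lambda n, t: np.cos(n * np.arccos(t))
cheb_cross = np.sum(w * T(2, x) * T(3, x))      # m != n        -> 0
cheb_diag = np.sum(w * T(3, x) * T(3, x))       # m = n != 0    -> pi/2

# Hermite: int_{-inf}^{inf} H_1 H_2 e^{-x^2} dx, via Gauss-Hermite
# (the weight e^{-x^2} is built into wh).
xh, wh = np.polynomial.hermite.hermgauss(32)
herm_cross = np.sum(wh * (2 * xh) * (4 * xh**2 - 2))   # H_1, H_2 -> 0
print(cheb_cross, cheb_diag, herm_cross)
```

Both cross terms vanish to machine precision, and the Chebyshev diagonal term reproduces the \(\pi/2\) from property (7) above.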