Markov Process & Chain¶
Definition of Markov Process
Assume \(\{X(t): t\in T\}\) is a stochastic process. If for all \(n\), all \(t_i\in T\), \(i=1,2,\cdots,n\), with \(t_1<\cdots<t_n\), and all states \(x_1,\cdots,x_n\in S\), the conditional distribution of \(X(t_n)\) satisfies
\[
P\big(X(t_n)\leq x_n \mid X(t_1)=x_1,\cdots,X(t_{n-1})=x_{n-1}\big)=P\big(X(t_n)\leq x_n \mid X(t_{n-1})=x_{n-1}\big),
\]
then \(X(t)\) is called a Markov process.
When the state space \(S\) is discrete, we call the above process a Markov chain.
When both the state space \(S\) and the time space \(T\) are discrete, the above process is called a discrete Markov chain.
Transition Probability Distribution
We call the conditional distribution
\[
F(y,s;\,x,t)=P\big(X(t)\leq x \mid X(s)=y\big),\qquad s<t,
\]
the transition probability distribution.
For a discrete Markov chain, \(\forall m,n \in T\) with \(m<n\) and \(i,j \in S\), we define the transition probability
\[
p_{ij}(m,n)=P(X_n=j\mid X_m=i),
\]
which is the probability of moving from state \(i\) at time \(m\) to state \(j\) at time \(n\). Apparently, we have \(p_{ij}(m,n)\geq 0\) and
\[
\sum_{j\in S}p_{ij}(m,n)=1.
\]
Define the transition matrix from time \(m\) to time \(n\)
\[
\pmb{P}(m,n)=\big(p_{ij}(m,n)\big)_{i,j\in S}.
\]
Specifically, we focus on the one-step transition matrix. If \(\pmb{P}(m,m+1)\) does not depend on \(m\), we call the Markov chain time-homogeneous, and we denote
\[
\pmb{P}=\pmb{P}(m,m+1)=(p_{ij}),\qquad p_{ij}=P(X_{m+1}=j\mid X_m=i).
\]
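To make this concrete, here is a minimal numpy sketch (my own illustration, not part of the original notes): a hypothetical one-step transition matrix for a 3-state time-homogeneous chain, with the stochastic-matrix constraints checked explicitly.

```python
import numpy as np

# Hypothetical one-step transition matrix P for a 3-state chain
# (states 0, 1, 2): rows index the current state i, columns the next
# state j, so P[i, j] = p_ij.
P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
])

# Each row must be a probability distribution over the next state:
# p_ij >= 0 and sum_j p_ij = 1.
assert (P >= 0).all()
assert np.allclose(P.sum(axis=1), 1.0)
```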
C-K Equation¶
Chapman-Kolmogorov Equation
Assume \(\{X_n\}\) is a discrete Markov chain with state space \(S\). Then \(\forall h,l\in \mathbb{N}^+\) and \(\forall m\in \mathbb{N}\),
\[
p_{ij}(m,m+h+l)=\sum_{k\in S}p_{ik}(m,m+h)\,p_{kj}(m+h,m+h+l),
\]
which is called the Chapman-Kolmogorov equation.
By definition,
\[
\begin{aligned}
p_{ij}(m,m+h+l)&=P(X_{m+h+l}=j\mid X_m=i)\\
&=\sum_{k\in S}P(X_{m+h}=k\mid X_m=i)\,P(X_{m+h+l}=j\mid X_{m+h}=k,X_m=i)\\
&=\sum_{k\in S}p_{ik}(m,m+h)\,p_{kj}(m+h,m+h+l),
\end{aligned}
\]
where the last step uses the Markov property.
Corollary: Multi-step Transition Probability
The one-step transition probabilities of a discrete Markov chain determine all multi-step transition probabilities.
By induction. For \(h=2\), the C-K equation gives
\[
p_{ij}(m,m+2)=\sum_{k\in S}p_{ik}(m,m+1)\,p_{kj}(m+1,m+2).
\]
Assume the claim holds for \(h=k\). Then for \(h=k+1\), we have
\[
p_{ij}(m,m+k+1)=\sum_{l\in S}p_{il}(m,m+k)\,p_{lj}(m+k,m+k+1).
\]
In matrix form,
\[
\pmb{P}(m,m+h)=\pmb{P}(m,m+1)\,\pmb{P}(m+1,m+2)\cdots\pmb{P}(m+h-1,m+h).
\]
Note that the matrices here are multiplied in time order.
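As a quick numerical sanity check (a sketch using the hypothetical matrix from above, not a result from the notes), the C-K equation in matrix form says that multi-step transition matrices multiply, which for a time-homogeneous chain reduces to matrix powers:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# C-K equation in matrix form: the (h + l)-step transition matrix is
# the product of the h-step and l-step matrices, i.e. P^(h+l) = P^h P^l.
h, l = 3, 4
lhs = np.linalg.matrix_power(P, h + l)
rhs = np.linalg.matrix_power(P, h) @ np.linalg.matrix_power(P, l)
assert np.allclose(lhs, rhs)
```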
Determination of the distribution of discrete Markov Chain
Assume \(\{X_n\}\) is a discrete Markov chain with state space \(S\). The finite-dimensional distributions of \(\{X_n\}\) are determined once all the one-step transition probabilities \(p_{ij}(m,m+1)\) and the initial distribution \(p_i=P(X_0=i)\) are determined.
\(\forall k\in \mathbb{N}^+\), \(\forall\, 0<m_1<\cdots<m_k\in T\), and \(i_1,\cdots,i_k\in S\), we have
\[
P(X_{m_1}=i_1,\cdots,X_{m_k}=i_k)=\sum_{i_0\in S}p_{i_0}\,p_{i_0i_1}(0,m_1)\,p_{i_1i_2}(m_1,m_2)\cdots p_{i_{k-1}i_k}(m_{k-1},m_k),
\]
where each factor decomposes further into one-step probabilities by the corollary above.
State Space¶
Now we focus on time-homogeneous discrete Markov chains.
Invariance of Time
Assume \(\{X_n\}\) is a time-homogeneous discrete Markov chain. Then its \(h\)-step transition probability
\[
p_{ij}(h):=p_{ij}(m,m+h)
\]
does not depend on the time \(m\).
By the C-K equation,
\[
\pmb{P}(m,m+h)=\pmb{P}(m,m+1)\,\pmb{P}(m+1,m+2)\cdots\pmb{P}(m+h-1,m+h)=\pmb{P}^h,
\]
which depends only on the number of steps \(h\). This is similar to the stationary-increment property.
Accessibility & Communicating Class¶
Accessibility of state
If there exists \(N>0\) such that \(p_{ij}(N)>0\), then we say state \(i\) can access state \(j\), denoted by
\[
i\rightarrow j.
\]
If \(i\rightarrow j\) and \(j\rightarrow i\), then we say \(i\) and \(j\) communicate, denoted by
\[
i\leftrightarrow j.
\]
Properties of accessibility
(i) If \(i\rightarrow j\) and \(j\rightarrow k\), then \(i\rightarrow k\).
(ii) If \(i\leftrightarrow j\), then \(j\leftrightarrow i\).
(iii) If \(i\leftrightarrow j\), then \(i\leftrightarrow i\) (and likewise \(j\leftrightarrow j\)).
(i) There exist \(N_1,N_2>0\) such that \(p_{ij}(N_1)>0\) and \(p_{jk}(N_2)>0\); then by the C-K equation,
\[
p_{ik}(N_1+N_2)=\sum_{l\in S}p_{il}(N_1)\,p_{lk}(N_2)\geq p_{ij}(N_1)\,p_{jk}(N_2)>0.
\]
Definition of Communicating class
We denote by \(C(j)\) the communicating class of state \(j\), namely the collection of all states that communicate with \(j\):
\[
C(j)=\{k\in S: k\leftrightarrow j\}.
\]
If \(C(j)\neq \varnothing\) (equivalently, \(j\leftrightarrow j\), so \(j\in C(j)\)), then we call \(j\) a return state.
If \(C(j)=\varnothing\), then we call \(j\) a non-return state.
In further study, we will see that a non-return state must be a transient state, while return states need to be decomposed more carefully.
Property of communicating class
If \(i\leftrightarrow j\), then \(C(i)=C(j)\).
Easy to see. \(\forall k\in C(i)\), we have \(k\leftrightarrow i\). Since \(i\leftrightarrow j\), we have \(k\leftrightarrow j\), so \(k\in C(j)\). The other direction is the same.
Closed state class
We call a set of states \(C\) a closed state set if for every \(i\in C\) and \(j\not\in C\), \(i\not\rightarrow j\), i.e. the chain cannot leave \(C\).
If a closed state set has only one element \(j\), then we call \(j\) an absorbing state.
If any two states of a state space \(S\) communicate, then we call the Markov chain irreducible.
Theorem for communicating class
If two state set \(C_1\), \(C_2\) are communicating classes, then either \(C_1=C_2\) or \(C_1\cap C_2=\varnothing\).
By the definition of communicating class, there exist \(i\), \(j\) such that
\[
C_1=C(i),\qquad C_2=C(j).
\]
If \(C_1\cap C_2\neq \varnothing\), then choose \(g\in C_1\cap C_2\), which satisfies \(g\leftrightarrow i\) and \(g\leftrightarrow j\). So
\[
i\leftrightarrow j,
\]
which means \(C_1=C(i)=C(j)=C_2\) by Property of communicating class.
Decomposition of Markov Chain
The state space \(S\) of a Markov chain can be decomposed into at most denumerably many mutually disjoint state sets, i.e.
\[
S=C_1\cup C_2\cup\cdots,
\]
where \(C_i\cap C_j=\varnothing\) whenever \(i\neq j\).
Iterate over the elements of \(S\). Arbitrarily choose a state \(j_1\in S\); if \(j_1\) is a return state, let \(C_1=C(j_1)\), otherwise (\(j_1\) a non-return state) let \(C_1=\{j_1\}\). In the next round, arbitrarily choose a state \(j_2\in S-C_1\); if \(j_2\) is a return state, let \(C_2=C(j_2)\), otherwise let \(C_2=\{j_2\}\).
Continue until \(S-C_1-C_2-\cdots=\varnothing\).
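This decomposition can be carried out mechanically. Below is a minimal sketch (my own illustration; the 4-state matrix is hypothetical): it computes the accessibility relation by transitive closure and then groups states exactly as in the procedure above, putting each non-return state in a singleton set.

```python
import numpy as np

def communicating_classes(P):
    """Decompose the state space by the relation i <-> j."""
    n = len(P)
    A = (P > 0).astype(int)            # one-step accessibility
    reach = A.copy()
    for _ in range(n):                 # transitive closure: reach[i, j] = 1 iff i -> j
        reach = ((reach + reach @ A) > 0).astype(int)
    comm = reach & reach.T             # comm[i, j] = 1 iff i <-> j
    classes, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        # C(i): states communicating with i; a non-return state
        # (C(i) empty) forms its own singleton set, as in the proof above.
        cls = {j for j in range(n) if comm[i, j]} or {i}
        classes.append(sorted(cls))
        assigned |= cls
    return classes

# Hypothetical 4-state chain: {0, 1} is a closed communicating class,
# while 2 and 3 are non-return states that drain into it.
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.4, 0.6, 0.0, 0.0],
              [0.3, 0.3, 0.0, 0.4],
              [0.2, 0.8, 0.0, 0.0]])
print(communicating_classes(P))   # [[0, 1], [2], [3]]
```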
Period of State¶
We now focus on return states rather than absorbing or transient states, since neither of the latter exhibits periodic behavior.
Definition of period of state
We denote by \(d(j)\) the period of a return state \(j\), defined by
\[
d(j)=\gcd\{n\geq 1: p_{jj}(n)>0\}.
\]
If \(d(j)=1\), we call \(j\) a non-periodic state; this includes any absorbing state \(k\), for which \(d(k)=1\).
By definition, if state \(i\) has \(d(i)=d\), then for all \(n\not\equiv 0\ (\mathrm{mod}\ d)\) we have \(p_{ii}(n)=0\). Conversely, for \(n\equiv 0\ (\mathrm{mod}\ d)\), \(p_{ii}(n)>0\) does not necessarily hold.
To be precise, we have the following theorem.
Theorem for period
If state \(i\) has period \(d\), then there exists \(M\in\mathbb{N}^+\) such that for all \(n>M\),
\[
p_{ii}(nd)>0.
\]
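A small sketch of how the period can be computed in practice (my own illustration, estimating the gcd from matrix powers up to a cutoff \(n_{\max}\); the deterministic 3-cycle is a hypothetical example):

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, j, n_max=50):
    """Estimate d(j) = gcd{ n >= 1 : p_jj(n) > 0 } using powers up to n_max."""
    returns = []
    Q = np.eye(len(P))
    for n in range(1, n_max + 1):
        Q = Q @ P                      # Q = P^n
        if Q[j, j] > 0:
            returns.append(n)
    return reduce(gcd, returns) if returns else 0

# Hypothetical deterministic 3-cycle: every state has period 3.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
print(period(P, 0))   # 3
```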
Transitivity of period
If \(i\leftrightarrow j\), then \(d(i)=d(j)\).
Use a special loop. Since \(i\leftrightarrow j\), there exist \(n,m>0\) such that \(p_{ij}(n)>0\) and \(p_{ji}(m)>0\), so
\[
p_{ii}(n+m)\geq p_{ij}(n)\,p_{ji}(m)>0.
\]
So \(d(i)\mid n+m\).
By the definition \(d(j)=\gcd\{n_1,n_2,\cdots,n_k,\cdots\}\), we have \(p_{jj}(n_k)>0\) for all \(k=1,2,\cdots\). So for each \(k\),
\[
p_{ii}(n+n_k+m)\geq p_{ij}(n)\,p_{jj}(n_k)\,p_{ji}(m)>0.
\]
So \(d(i)\mid n+m+n_k\). Combined with the above two, we have \(d(i)\mid n_k\), which means \(d(i)\mid d(j)\).
Similarly we have \(d(j)\mid d(i)\), hence \(d(i)=d(j)\).
Ergodicity of State¶
Here we dig deeper into the return probabilities themselves. Notice that some return states may fail to return with positive probability, while others return with certainty.
Definition of passage probability of State
Define the first-passage probability from state \(i\) to \(j\) in \(n\) steps:
\[
f_{ij}(n)=P(X_n=j,\ X_m\neq j \text{ for } 1\leq m\leq n-1 \mid X_0=i).
\]
Define the ultimate passage probability from state \(i\) to \(j\):
\[
f_{ij}=\sum_{n=1}^\infty f_{ij}(n).
\]
Note that
\[
f_{ij}=P\big(\exists\, n\geq 1:\ X_n=j \mid X_0=i\big),
\]
which is the core of our analysis.
We define
\[
T_j=\min\{n\geq 1: X_n=j\}\qquad(\min\varnothing:=\infty)
\]
to be the first hitting time of \(j\).
Then
\[
f_{ij}(n)=P(T_j=n\mid X_0=i),
\]
and
\[
f_{ij}=P(T_j<\infty\mid X_0=i).
\]
Notice that the "\(<\infty\)" above means the chain can reach state \(j\) from \(i\) in finitely many steps.
Theorem for relationship between first-passage probability and passage probability
For any states \(i,j\in S\) and \(\forall n\geq 1\),
\[
p_{ij}(n)=\sum_{m=1}^{n}f_{ij}(m)\,p_{jj}(n-m),
\]
with the convention \(p_{jj}(0)=1\). (Partition according to the time \(m\) of the first visit to \(j\).)
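Rearranging this decomposition gives a recursion \(f_{ij}(n)=p_{ij}(n)-\sum_{m=1}^{n-1}f_{ij}(m)\,p_{jj}(n-m)\), so first-passage probabilities can be computed from transition probabilities. A minimal sketch (my own illustration, reusing the hypothetical 3-state matrix from earlier):

```python
import numpy as np

def first_passage(P, i, j, n_max):
    """Compute f_ij(1..n_max) from the decomposition
    p_ij(n) = sum_{m=1}^{n} f_ij(m) p_jj(n - m),  p_jj(0) = 1."""
    powers = [np.eye(len(P))]
    for _ in range(n_max):
        powers.append(powers[-1] @ P)          # powers[n] = P^n
    f = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        f[n] = powers[n][i, j] - sum(f[m] * powers[n - m][j, j]
                                     for m in range(1, n))
    return f[1:]

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
f = first_passage(P, 0, 1, 60)
print(f.sum())    # f_01: close to 1, so state 1 is reached a.s. from 0
```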
Definition of Recurrent State, self-reachable
If \(f_{ii}=1\), i.e. \(P(T_i<\infty\mid X_0=i)=1\), then state \(i\) is called a recurrent state.
If \(f_{ii}<1\), i.e. \(P(T_i<\infty\mid X_0=i)<1\), then state \(i\) is called a transient state.
For a recurrent state \(i\), we have \(\sum_{n=1}^\infty f_{ii}(n)=1\), so \(\{f_{ii}(n)\}_{n\geq 1}\) is a discrete probability distribution.
Now we can discuss the numerical characteristics of this distribution and their practical meaning. Let \(T_i\) be the number of steps for the chain to return to \(i\) for the first time, which is exactly the first hitting time defined above. Then
\[
P(T_i=n\mid X_0=i)=f_{ii}(n).
\]
Definition of Mean Recurrence Time
The mean recurrence time is defined as the mathematical expectation of \(T_i\):
\[
\mu_i=E(T_i\mid X_0=i)=\sum_{n=1}^\infty n\,f_{ii}(n).
\]
For recurrent state \(i\), if \(\mu_i<+\infty\), then \(i\) is called Positive recurrent state. If \(\mu_i=+\infty\), then \(i\) is called Null recurrent state.
A non-periodic positive recurrent state is also called Ergodic state.
Define \(r_0=1\), \(r_k=P(T_i>k\mid X_0=i)\), \(k\geq 1\); then
\[
\mu_i=\sum_{k=0}^\infty r_k.
\]
To see this, use the definition of expectation. Since
\[
P(T_i=n\mid X_0=i)=r_{n-1}-r_n,
\]
we have
\[
\mu_i=\sum_{n=1}^\infty n\,(r_{n-1}-r_n)=\sum_{k=0}^\infty r_k,
\]
by rearranging the sum (the tail-sum formula for nonnegative integer-valued random variables).
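As a Monte Carlo illustration (my own sketch, with the hypothetical 3-state matrix from earlier), the mean recurrence time can also be estimated by simulating excursions from a state:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state chain: estimate the mean recurrence time
# mu_0 = E(T_0 | X_0 = 0) by simulating first-return excursions.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

def recurrence_time(start):
    state, steps = start, 0
    while True:
        state = rng.choice(len(P), p=P[state])
        steps += 1
        if state == start:
            return steps

samples = [recurrence_time(0) for _ in range(20000)]
print(np.mean(samples))   # estimate of mu_0
```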
In practice, it is hard to calculate \(f_{ii}(n)\) for complex Markov chains, so we give another characterization of recurrent states.
n-visit Probability¶
n-visit Probability
Define the n-visit probability
\[
Q_{ij}(n)=P(\text{the chain visits } j \text{ at least } n \text{ times}\mid X_0=i),
\]
which is the probability that the chain, started from \(i\), visits \(j\) at least \(n\) times. Then we have
\[
Q_{ij}(1)=f_{ij},
\]
and define \(Q_{ij}:=\lim\limits_{n\rightarrow \infty}Q_{ij}(n)\), the probability of visiting \(j\) infinitely often.
Properties of the n-visit probability
(i) \(Q_{ij}(n)\) is monotonically decreasing in \(n\).
(ii) If \(Q_{ij}=1\), then starting from state \(i\) the chain reaches state \(j\) infinitely many times almost surely; so by property (i), \(Q_{ij}(n)=1\) for all \(n\).
(iii) If \(Q_{ii}=1\), then \(Q_{ii}(1)=f_{ii}=1\), so state \(i\) is a recurrent state.
Relationship between \(f_{ij}\) and \(Q_{ij}\)
For any states \(i,j\in S\) and any \(m\geq 1\), we have
(i) \(Q_{ij}(m+1)=f_{ij} Q_{jj}(m)\);
(ii) \(Q_{ij}=f_{ij}Q_{jj}\);
(iii) \(Q_{jj}=\lim\limits_{n\rightarrow \infty}(f_{jj})^n\).
(i) Partition according to the first visit to \(j\): to visit \(j\) at least \(m+1\) times from \(i\), the chain must first reach \(j\) (probability \(f_{ij}\)) and then revisit \(j\) at least \(m\) more times (probability \(Q_{jj}(m)\)).
(ii) By (i), taking the limit \(m\rightarrow\infty\).
(iii) Apply (i) with \(i=j\) to \(Q_{jj}(m)\) iteratively: \(Q_{jj}(m+1)=f_{jj}\,Q_{jj}(m)=(f_{jj})^m Q_{jj}(1)=(f_{jj})^{m+1}\).
Recurrent state description with passage probability
(i) State \(j\) is a recurrent state, iff \(\sum\limits_{n=1}^\infty p_{jj}(n)=+\infty\).
(ii) State \(j\) is a transient state, iff \(\sum\limits_{n=1}^\infty p_{jj}(n)<+\infty\).
The proof leverages the Theorem for relationship between first-passage probability and passage probability.
- Sufficiency for (i) and necessity for (ii).
We have
\[
\sum_{n=1}^N p_{jj}(n)=\sum_{n=1}^N\sum_{m=1}^n f_{jj}(m)\,p_{jj}(n-m)\leq \sum_{m=1}^N f_{jj}(m)\sum_{k=0}^{N}p_{jj}(k)\leq f_{jj}\Big(1+\sum_{n=1}^N p_{jj}(n)\Big),
\]
so we have
\[
f_{jj}\geq \frac{\sum_{n=1}^N p_{jj}(n)}{1+\sum_{n=1}^N p_{jj}(n)}.
\]
Let \(N\rightarrow\infty\); if \(\sum_{n=1}^\infty p_{jj}(n)=+\infty\), the right-hand side tends to \(1\),
which means \(f_{jj}=1\), so \(j\) is a recurrent state. Contrapositively, if state \(j\) is transient, then \(\sum\limits_{n=1}^\infty p_{jj}(n)<+\infty\).
- Necessity for (i) and sufficiency for (ii).
Still using the same formula, choose \(M\leq N\); keeping only the terms with \(m\leq M\),
\[
\sum_{n=1}^N p_{jj}(n)\geq \sum_{m=1}^M f_{jj}(m)\sum_{k=0}^{N-M}p_{jj}(k).
\]
So
\[
\sum_{m=1}^M f_{jj}(m)\leq \frac{\sum_{n=1}^N p_{jj}(n)}{1+\sum_{k=1}^{N-M}p_{jj}(k)}.
\]
Let \(N\rightarrow \infty\); if \(\sum_{n=1}^{\infty} p_{jj}(n)=S<\infty\), then the right-hand side tends to a limit strictly less than \(1\), i.e.
\[
\sum_{m=1}^M f_{jj}(m)\leq \frac{S}{1+S}<1.
\]
Now let \(M\rightarrow \infty\): \(f_{jj}\leq \frac{S}{1+S}<1\), which means state \(j\) is a transient state. So a recurrent state must have \(\sum_{n=1}^\infty p_{jj}(n)=+\infty\).
We can take another perspective. Notice that for events \(A_1,A_2,\cdots\) with indicators \(Y_n=\mathbb{1}_{A_n}\),
\[
E\Big(\sum_{n=1}^\infty Y_n\Big)=\sum_{n=1}^\infty P(A_n).
\]
So the summation inside the expectation is the total number of the events \(A_1,\cdots,A_n,\cdots\) that occur. In our Markov chain, take \(A_n=\{X_n=j\}\) and define
\[
\eta_j=\sum_{n=1}^\infty Y_n=\sum_{n=1}^\infty \mathbb{1}_{\{X_n=j\}}
\]
to denote the number of times the chain visits state \(j\) (over steps \(n=1\) to \(\infty\)); \(Y_n\) indicates whether the chain is at \(j\) at step \(n\).
Note that the \(A_n\) are not necessarily mutually disjoint.
So
\[
E(\eta_j)=\sum_{n=1}^\infty P(A_n).
\]
If we write the initial state explicitly, then
\[
E(\eta_j\mid X_0=j)=\sum_{n=1}^\infty P(X_n=j\mid X_0=j)=\sum_{n=1}^\infty p_{jj}(n),
\]
so state \(j\) is recurrent iff the expected number of returns to \(j\) is infinite.
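A numerical illustration of the criterion (my own sketch): for simple random walk on \(\mathbb{Z}\), \(p_{00}(2n)=\binom{2n}{n}p^n(1-p)^n\), whose partial sums diverge in the symmetric case \(p=\frac{1}{2}\) (recurrent) and converge in a biased case (transient). The values \(p=0.5\) and \(p=0.6\) below are chosen purely for demonstration:

```python
# Recurrence criterion for simple random walk on Z.
# The term C(2n, n) x^n (x = p(1-p)) is built iteratively in floats to
# avoid overflowing huge binomial coefficients.
def partial_sum(p, N):
    x = p * (1 - p)
    term, total = 1.0, 0.0
    for n in range(1, N + 1):
        term *= (2 * n) * (2 * n - 1) / (n * n) * x   # term = C(2n, n) x^n
        total += term
    return total

for N in (10, 100, 1000):
    print(N, partial_sum(0.5, N), partial_sum(0.6, N))
# The p = 0.5 column keeps growing (like sqrt(N)); the p = 0.6 column
# stabilizes, matching criterion (i) versus (ii).
```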
Corollary: Transitivity of Recurrent state
If state \(i\leftrightarrow j\), and \(j\) is a recurrent state, then \(i\) is also a recurrent state.
Since \(i\leftrightarrow j\), \(\exists\, m,n\geq 1\) such that \(p_{ij}(m)>0\) and \(p_{ji}(n)>0\). Then for every \(k\geq 1\),
\[
p_{ii}(m+k+n)\geq p_{ij}(m)\,p_{jj}(k)\,p_{ji}(n),
\]
so
\[
\sum_{k=1}^\infty p_{ii}(m+k+n)\geq p_{ij}(m)\,p_{ji}(n)\sum_{k=1}^\infty p_{jj}(k)=+\infty,
\]
and we are done.
\(\square\)
Limit Theorem¶
We now study the asymptotic behavior of recurrent, transient and ergodic states: the limit of \(p_{jj}(n)\).
The following theorem takes the influence of the period into account. If a positive recurrent state has a period, \(p_{jj}(n)\) might not have a limit, only a convergent subsequence. A more general statement covering (i) and (ii) below is the Cesàro limit
\[
\lim_{n\rightarrow\infty}\frac{1}{n}\sum_{m=1}^n p_{jj}(m)=\frac{1}{\mu_j}
\]
for a positive recurrent state, regardless of its period.
Theorem for limit of passage probability of three states
(i) If state \(j\) is a non-periodic positive recurrent state (ergodic), then
\[
\lim_{n\rightarrow\infty}p_{jj}(n)=\frac{1}{\mu_j}.
\]
(ii) If state \(j\) is a positive recurrent state with period \(d\), then
\[
\lim_{n\rightarrow\infty}p_{jj}(nd)=\frac{d}{\mu_j}.
\]
(iii) If state \(j\) is a null recurrent state or a transient state, then
\[
\lim_{n\rightarrow\infty}p_{jj}(n)=0.
\]
We first define some variables.
Define \(N_j(n)\) to be the number of visits to \(j\) among \(\{X_1,\cdots,X_n\}\), also called the sojourn time.
Define \(T_1\) to be the number of steps to reach \(j\) for the first time, and \(T_k\) the number of steps between the \((k-1)\)-th and \(k\)-th visits to \(j\). By time-homogeneity, \(T_1,\cdots,T_k,\cdots\) are mutually independent and identically distributed as the first hitting time \(T_j\), that is,
\[
P(T_k=n)=f_{jj}(n),
\]
and \(E(T_k)=\mu_j\).
Define the accumulated time \(S_k=\sum_{i=1}^k T_i\). Then we have the relationship
\[
\{N_j(n)\geq k\}=\{S_k\leq n\},
\]
or
\[
S_{N_j(n)}\leq n< S_{N_j(n)+1}.
\]
Note that the same relationship also holds in the Poisson process.
- Prove first \(\lim\limits_{n\rightarrow \infty} \frac{N_j(n)}{n}=\frac{1}{\mu_j}\)
Notice the relationship
\[
S_{N_j(n)}\leq n< S_{N_j(n)+1},
\]
so
\[
\frac{S_{N_j(n)}}{N_j(n)}\leq \frac{n}{N_j(n)}< \frac{S_{N_j(n)+1}}{N_j(n)+1}\cdot\frac{N_j(n)+1}{N_j(n)}.
\]
By the strong law of large numbers, \(\frac{S_k}{k}\rightarrow \mu_j\ a.s.\), and since \(N_j(n)\rightarrow \infty\) as \(n\rightarrow \infty\) (recurrence), both bounds converge to \(\mu_j\), i.e.
\[
\frac{n}{N_j(n)}\rightarrow\mu_j\quad a.s.,
\]
and we have
\[
\lim_{n\rightarrow\infty}\frac{N_j(n)}{n}=\frac{1}{\mu_j}\quad a.s.
\]
- Prove second \(\lim\limits_{n\rightarrow \infty} \frac{1}{n}\sum_{m=1}^n p_{jj}(m)=\frac{1}{\mu_j}\).
By the definition of the sojourn time, we have
\[
N_j(n)=\sum_{m=1}^n Y_m,\qquad Y_m=\mathbb{1}_{\{X_m=j\}}.
\]
Here \(Y_m\) indicates whether the chain is at \(j\) at step \(m\); this matches the alternative perspective in the description of recurrent states, i.e.
\[
E(N_j(n)\mid X_0=j)=\sum_{m=1}^n p_{jj}(m).
\]
So
\[
\frac{1}{n}\sum_{m=1}^n p_{jj}(m)=E\Big(\frac{N_j(n)}{n}\,\Big|\,X_0=j\Big);
\]
since \(0\leq \frac{N_j(n)}{n}\leq 1\) and \(\frac{N_j(n)}{n}\rightarrow\frac{1}{\mu_j}\ a.s.\), taking the limit (bounded convergence) gives the result.
- Prove proposition (i).
Granting that the limit of \(p_{jj}(n)\) exists in the non-periodic case, write \(\lim_{n\rightarrow \infty}p_{jj}(n)=\frac{1}{\mu_j}+\delta\); substituting into the Cesàro limit above forces \(\delta=0\).
\(\square\)
For (ii), let \(n=dn'\) and consider the chain \(Y_{n'}=X_{dn'}\), for which \(j\) is a non-periodic state with one-step transition probability \(p_{jj}(d)\). So by (i),
\[
\lim_{n'\rightarrow\infty}p_{jj}(n'd)=\frac{1}{\mu_j'},
\]
where \(\mu_j'\) is the mean recurrence time of \(j\) for the new chain. Notice that the recurrence times of the original chain are multiples of \(d\), so
\[
\mu_j'=\frac{\mu_j}{d},
\]
and we are done.
\(\square\)
- For a null recurrent state, by (ii) with \(\mu_j=\infty\), we have \(\lim\limits_{n\rightarrow \infty}p_{jj}(nd)=0\). Since \(p_{jj}(n)=0\) for \(n\not\equiv 0\ (\mathrm{mod}\ d)\), by the Heine theorem,
\[
\lim_{n\rightarrow\infty}p_{jj}(n)=0.
\]
- For a transient state, the convergence of \(\sum_{n=1}^\infty p_{jj}(n)\) forces each term to vanish: \(\lim_{n\rightarrow \infty }p_{jj}(n)=0\).
\(\square\)
Corollary: Transitivity of ergodic state
(i) If state \(i\leftrightarrow j\), and \(j\) is a positive recurrent state, then \(i\) is also a positive recurrent state.
(ii) If state \(i\leftrightarrow j\), and \(j\) is a null recurrent state, then \(i\) is also a null recurrent state.
By the transitivity of recurrence, \(i\) is recurrent. Since \(i\leftrightarrow j\), \(\exists\, m,n>0\) such that \(p_{ij}(m)>0\) and \(p_{ji}(n)>0\), so with \(d\) the period of \(j\),
\[
p_{ii}(m+kd+n)\geq p_{ij}(m)\,p_{jj}(kd)\,p_{ji}(n);
\]
let \(k\rightarrow \infty\); we have
\[
\liminf_{k\rightarrow\infty} p_{ii}(m+kd+n)\geq p_{ij}(m)\,\frac{d}{\mu_j}\,p_{ji}(n).
\]
If \(j\) is a positive recurrent state, the right-hand side is positive, so by the limit theorem \(i\) cannot be null recurrent, i.e. \(i\) is positive recurrent, and we are done. Statement (ii) follows by exchanging the roles of \(i\) and \(j\).
\(\square\)
The following theorem gives the limiting behavior of \(p_{ij}(n)\) for different initial states.
Theorem for limits from different initial states
(i) If \(j\) is a non-periodic positive recurrent state, then for every state \(i\),
\[
\lim_{n\rightarrow\infty}p_{ij}(n)=\frac{f_{ij}}{\mu_j}.
\]
(ii) If \(j\) is a null recurrent state or a transient state, then for every state \(i\),
\[
\lim_{n\rightarrow\infty}p_{ij}(n)=0.
\]
There are many proof methods, including Lebesgue's dominated convergence theorem. Here we give a squeeze argument.
By the Theorem for relationship between first-passage probability and passage probability, we have
\[
p_{ij}(n)=\sum_{m=1}^{n}f_{ij}(m)\,p_{jj}(n-m).
\]
Choose \(M<n\); we have
\[
p_{ij}(n)\leq \sum_{m=1}^{M}f_{ij}(m)\,p_{jj}(n-m)+\sum_{m=M+1}^{\infty}f_{ij}(m).
\]
Let \(n\rightarrow \infty\) and then \(M\rightarrow \infty\); we get
\[
\limsup_{n\rightarrow\infty}p_{ij}(n)\leq \sum_{m=1}^{\infty}f_{ij}(m)\,\frac{1}{\mu_j}=\frac{f_{ij}}{\mu_j}.
\]
For the other direction, we simply drop the summands with \(m>M\):
\[
p_{ij}(n)\geq \sum_{m=1}^{M}f_{ij}(m)\,p_{jj}(n-m),
\]
and letting \(n\rightarrow\infty\) and then \(M\rightarrow \infty\) gives
\[
\liminf_{n\rightarrow\infty}p_{ij}(n)\geq \frac{f_{ij}}{\mu_j}.
\]
So we are done.
\(\square\)
- For a null recurrent state,
\[
p_{ij}(n)\leq \sum_{m=1}^{M}f_{ij}(m)\,p_{jj}(n-m)+\sum_{m=M+1}^{\infty}f_{ij}(m)\rightarrow 0
\]
as \(n\rightarrow\infty\) and then \(M\rightarrow\infty\), since \(p_{jj}(n)\rightarrow 0\), and we are done.
- For a transient state: by the description of recurrent states with passage probability, since \(j\) is transient, we have
\[
\sum_{n=1}^\infty p_{jj}(n)<+\infty.
\]
So by the Theorem for relationship between first-passage probability and passage probability,
\[
\sum_{n=1}^\infty p_{ij}(n)=\sum_{n=1}^\infty\sum_{m=1}^{n}f_{ij}(m)\,p_{jj}(n-m)=f_{ij}\Big(1+\sum_{k=1}^\infty p_{jj}(k)\Big)<+\infty,
\]
so \(\lim\limits_{n\rightarrow\infty}p_{ij}(n)=0\).
Limit distribution¶
We hope to show that no matter what the initial state is, after enough steps the state has a convergent distribution. That is, does
\[
\lim_{n\rightarrow\infty}p_j(n)=\lim_{n\rightarrow\infty}P(X_n=j)
\]
exist?
We have the following sufficient condition.
Existence for limit distribution
If for every \(i\in S\) there exists a distribution \(\nu_{i}=\{\nu_{ij}\}_{j\in S}\) such that \(\lim\limits_{n\rightarrow \infty}p_{ij}(n)=\nu_{ij}\), then the limit distribution exists.
By the total probability formula,
\[
p_j(n)=P(X_n=j)=\sum_{i\in S}P(X_0=i)\,p_{ij}(n).
\]
So let \(n\rightarrow \infty\); by LDCT we have
\[
\lim_{n\rightarrow\infty}p_j(n)=\sum_{i\in S}P(X_0=i)\,\nu_{ij}.
\]
So define \(\nu_j=\lim\limits_{n\rightarrow \infty} p_j(n)=\sum\limits_{i\in S} P(X_0=i)\nu_{ij}\), which is a distribution, since
\[
\sum_{j\in S}\nu_j=\sum_{j\in S}\sum_{i\in S}P(X_0=i)\,\nu_{ij}=\sum_{i\in S}P(X_0=i)\sum_{j\in S}\nu_{ij}=1,
\]
exchanging the order of summation for nonnegative terms.
So we need to know whether the transition probabilities converge as the number of steps goes to infinity. In practice, this means computing \(\lim\limits_{n\rightarrow \infty}\pmb{P}^{(n)}=\lim\limits_{n\rightarrow \infty}\pmb{P}^n\).
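Numerically, the convergence of \(\pmb{P}^n\) is easy to observe (a sketch with the hypothetical matrix from earlier): all rows of a high power become nearly identical, so the limit does not depend on the initial state.

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# Iterate P^n: the rows converge to a common limit row.
Pn = np.linalg.matrix_power(P, 50)
print(Pn)   # all rows (nearly) identical: the limit ignores the initial state
```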
The limit distribution is closely connected to the following stationary distribution.
Definition of Stationary distribution
Assume there exists a distribution \(\{\pi_k\}_{k\geq 0}\) such that for all states \(k\in S\) we have
\[
\pi_k=\sum_{j\in S}\pi_j\,p_{jk},\quad\text{i.e.}\quad \pi=\pi\pmb{P},
\]
where \(\pi=[\pi_0,\pi_1,\cdots]\) is a row vector; then \(\{\pi_k\}\) is called a stationary distribution.
The above definition implies
\[
\pi=\pi\pmb{P}^n,\qquad \forall n\geq 1,
\]
which also has a practical meaning:
\[
\pi_k\sum_{j\neq k}p_{kj}=\sum_{j\neq k}\pi_j\,p_{jk}
\]
(subtract \(\pi_k p_{kk}\) from both sides of the definition), which means the probabilities of leaving \(k\) and entering \(k\) in one step are the same.
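In practice, the stationary distribution can be found by solving the linear system \(\pi\pmb{P}=\pi\), \(\sum_k\pi_k=1\). A minimal sketch (my own illustration with the hypothetical 3-state matrix, replacing one redundant equation by the normalization):

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
n = len(P)

# pi (P - I) = 0 transposed gives (P - I)^T pi^T = 0; the system has
# rank n - 1 for an irreducible chain, so replace one equation with
# the normalization sum(pi) = 1.
A = (P - np.eye(n)).T
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi, pi @ P)   # pi and pi P agree: pi is stationary
```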
A Markov chain with a stationary initial distribution is a strictly stationary process
A Markov chain whose initial distribution is a stationary distribution is a strictly stationary process. That is, if
\[
p_k(0)=\pi_k,\qquad \forall k\in S,
\]
where \(p_k(0)=P(X_0=k)\) and \(\{\pi_k\}\) is a stationary distribution of the Markov chain, then for all \(n\geq 1\) we have
\[
p_k(n)=P(X_n=k)=\pi_k.
\]
Since \(\pi_k=\sum_{j=0}^\infty \pi_j p_{jk}\), we get
\[
p_k(1)=\sum_{j=0}^\infty p_j(0)\,p_{jk}=\sum_{j=0}^\infty \pi_j\,p_{jk}=\pi_k.
\]
By induction, together with the Markov property (which lifts the one-dimensional marginals to shift-invariant finite-dimensional distributions), we are done.
\(\square\)
Now we could talk about the relationship between limit distribution and stationary distribution.
Lemma: First-passage probability with transitivity
If state \(i\rightarrow j\), \(i\) is a recurrent state, then \(f_{ij}=1\) and \(j\rightarrow i\).
\(j\rightarrow i\) is apparent: since \(i\) is recurrent, after reaching \(j\) the chain must still return to \(i\), so \(\exists n'>0\) such that \(p_{ji}(n')>0\). It is a little harder to prove \(f_{ij}=1\), i.e. that the chain reaches \(j\) in finitely many steps almost surely.
Several proofs could be given here; we give the one that, in my opinion, is the simplest.
Assume \(X_0=i\); since \(i\rightarrow j\), \(\exists n>0\) such that \(p_{ij}(n)>0\). Now we make use of the recurrence of state \(i\) and count the number of failures in reaching \(j\).
If \(X_n\neq j\), then let \(T_1=\min\{k\geq n: X_k=i\}\) (we fail to reach \(j\) at step \(n\), but afterwards we still return to \(i\)). Check whether \(X_{T_1+n}\neq j\); if so, let \(T_2=\min\{k\geq T_1+n: X_k=i\}\). Following this routine, we define
\[
T_{l+1}=\min\{k\geq T_l+n: X_k=i\},
\]
with \(T_0=0\) by default. Define \(\zeta=\#\{l\geq 0: X_{T_l+n}\neq j\}\), the number of failures to reach state \(j\) before the first success (if the chain reaches \(j\) at step \(T_l+n\), we stop the routine and return \(l\)). Each attempt starts afresh from \(i\) and succeeds with probability \(\alpha:=p_{ij}(n)>0\), so
\[
P(\zeta=k)=(1-\alpha)^k\,\alpha
\]
(fail \(k\) times and succeed at the end). This random variable is well defined because \(i\) is recurrent, so all the \(T_l\) are finite. Thus \(\zeta\) follows a geometric distribution and
\[
E(\zeta)=\frac{1-\alpha}{\alpha}<\infty,
\]
which means on average we fail only finitely many times, and \(\zeta<\infty\ a.s.\) (i.e. \(P(\zeta=\infty)=0\)). So \(\exists\, T<\infty\) such that \(X_{T}=j\), which means \(f_{ij}=1\).
\(\square\)
Theorem: Relationship between stationary and limit distribution
Assume \(\{X_n\}\) is an irreducible Markov chain.
(i) It has a unique stationary distribution iff all of its states are positive recurrent, and in that case
\[
\pi_j=\frac{1}{\mu_j};
\]
(ii) if \(X_n\) is ergodic, then its stationary distribution equals the limit distribution, i.e. for all states \(i,j\),
\[
\lim_{n\rightarrow\infty}p_{ij}(n)=\pi_j=\frac{1}{\mu_j},
\]
which is in particular independent of the initial state.
- Necessary.
Assume it has a unique stationary distribution \(\pi\). Then for every \(n\),
\[
\pi_j=\sum_{i\in S}\pi_i\,p_{ij}(n).
\]
Let \(n\rightarrow \infty\) and exchange the order of limit and summation: if the states were null recurrent or transient, then
\[
\pi_j=\sum_{i\in S}\pi_i\lim_{n\rightarrow\infty}p_{ij}(n)=0\quad\text{for all } j,
\]
which contradicts \(\sum_j\pi_j=1\). Since the chain is irreducible, all states are of the same type, so all states are positive recurrent.
- Sufficient. We use the limit of \(p_{ij}(n)\) to construct the stationary distribution.
Since \(i\leftrightarrow j\), the Lemma: First-passage probability with transitivity gives \(f_{ij}=1\). Since all states are positive recurrent, the theorem for limits from different initial states gives
\[
\lim_{n\rightarrow\infty}p_{ij}(n)=\frac{f_{ij}}{\mu_j}=\frac{1}{\mu_j}.
\]
We want to prove that \(\{\frac{1}{\mu_j}, j\geq 0\}\) is the stationary distribution.
Firstly, since \(\sum\limits_{j=0}^\infty p_{ij}(n)=1\), taking the limit in \(n\) and applying the Fatou lemma,
\[
\sum_{j=0}^\infty \frac{1}{\mu_j}\leq \liminf_{n\rightarrow\infty}\sum_{j=0}^\infty p_{ij}(n)=1.
\]
Secondly, since
\[
p_{ij}(n+1)=\sum_{k=0}^\infty p_{ik}(n)\,p_{kj},
\]
still by the Fatou lemma we have
\[
\frac{1}{\mu_j}\geq \sum_{k=0}^\infty \frac{1}{\mu_k}\,p_{kj}.
\]
Actually, for a stationary distribution we should prove \(\frac{1}{\mu_j}=\sum_{k=0}^\infty \frac{1}{\mu_k} p_{kj}\). We show this by contradiction. If \(\exists j_0\) such that
\[
\frac{1}{\mu_{j_0}}> \sum_{k=0}^\infty \frac{1}{\mu_k}\,p_{kj_0},
\]
then summing over \(j\),
\[
\sum_{j=0}^\infty\frac{1}{\mu_j}> \sum_{j=0}^\infty\sum_{k=0}^\infty \frac{1}{\mu_k}\,p_{kj}=\sum_{k=0}^\infty \frac{1}{\mu_k},
\]
which is a contradiction!
Lastly, we prove that \(\{\frac{1}{\mu_j},j\geq 0\}\) is a distribution, i.e. \(\sum_{j=0}^\infty \frac{1}{\mu_j}=1\). Iterating the same identity (left to the reader), we have \(\frac{1}{\mu_j}=\sum_{k=0}^\infty \frac{1}{\mu_k}\, p_{kj}(m)\). So, letting \(m\rightarrow\infty\) (by dominated convergence, using \(\sum_k\frac{1}{\mu_k}\leq 1\)),
\[
\frac{1}{\mu_j}=\sum_{k=0}^\infty \frac{1}{\mu_k}\lim_{m\rightarrow\infty}p_{kj}(m)=\Big(\sum_{k=0}^\infty \frac{1}{\mu_k}\Big)\frac{1}{\mu_j},
\]
hence \(\sum_{k=0}^\infty \frac{1}{\mu_k}=1\). Uniqueness follows since any stationary distribution must satisfy \(\pi_j=\lim\limits_{n\rightarrow\infty}\sum_i\pi_ip_{ij}(n)=\frac{1}{\mu_j}\).
\(\square\)
Theorem for mean recurrence time
Assume \(X_n\) is a non-periodic irreducible Markov chain. Then for all states \(i,j\),
\[
\lim_{n\rightarrow\infty}p_{ij}(n)=\frac{1}{\tau_j},
\]
where \(\tau_j=E(T_j\mid X_0=j)\) is the mean recurrence time of state \(j\) (with \(\frac{1}{\tau_j}:=0\) when \(\tau_j=\infty\)).
Inverse Markov Chain¶
Check whether the Markov property still behaves well when the chain is run backwards in time, i.e. whether
\[
P(X_n=j\mid X_{n+1}=i)
\]
acts like a time-homogeneous transition probability. By the Bayes formula,
\[
P(X_n=j\mid X_{n+1}=i)=\frac{P(X_n=j)\,p_{ji}}{P(X_{n+1}=i)}=\frac{p_j(n)\,p_{ji}}{p_i(n+1)},
\]
which in general depends on the time \(n\). So we need some special condition.
The first is to make use of the stationary distribution. That is, if the initial distribution is \(\pi\), then \(p_j(n)=\pi_j\), \(p_i(n+1)=\pi_i\), and
\[
q_{ij}:=P(X_n=j\mid X_{n+1}=i)=\frac{\pi_j\,p_{ji}}{\pi_i},
\]
which is independent of time.
This leads naturally to a special notion: the reversible Markov chain.
Definition of reversibility
If
\[
q_{ij}=p_{ij},\qquad \forall i,j\in S,
\]
i.e.
\[
\pi_i\,p_{ij}=\pi_j\,p_{ji},\qquad \forall i,j\in S
\]
(the detailed balance condition), then the Markov chain is called reversible.
Note that having a stationary distribution is a necessary condition for a Markov chain to be reversible.
Kolmogorov Criterion
Assume \(X_n\) is an irreducible stationary Markov chain. Then it is reversible iff for every closed path \(j_0,j_1,\cdots,j_n,j_0\),
\[
p_{j_0j_1}\,p_{j_1j_2}\cdots p_{j_nj_0}=p_{j_0j_n}\,p_{j_nj_{n-1}}\cdots p_{j_1j_0},
\]
i.e. the product of transition probabilities around any loop equals the product along the reversed loop.
- Necessary.
If \(X\) is reversible, then \(\forall i,j\in S\), \(\pi_ip_{ij}=\pi_jp_{ji}\). Applying this around the loop,
\[
p_{j_0j_1}\,p_{j_1j_2}\cdots p_{j_nj_0}=\frac{\pi_{j_1}p_{j_1j_0}}{\pi_{j_0}}\cdot\frac{\pi_{j_2}p_{j_2j_1}}{\pi_{j_1}}\cdots\frac{\pi_{j_0}p_{j_0j_n}}{\pi_{j_n}}=p_{j_1j_0}\,p_{j_2j_1}\cdots p_{j_0j_n},
\]
since all the \(\pi\)'s cancel.
- Sufficient. This is similar to constructing a potential via a path (curve) integral.
Fix a state \(j_0\). By irreducibility, for any state \(j\) there is a path \(j_0,j_1,\cdots,j_n,j\) such that
\[
p_{j_0j_1}\,p_{j_1j_2}\cdots p_{j_nj}>0\quad\text{and}\quad p_{jj_n}\cdots p_{j_2j_1}\,p_{j_1j_0}>0,
\]
and then we can define
\[
\pi_j=\frac{1}{Z}\cdot\frac{p_{j_0j_1}\,p_{j_1j_2}\cdots p_{j_nj}}{p_{jj_n}\cdots p_{j_2j_1}\,p_{j_1j_0}},
\]
where \(Z\) normalizes so that \(\sum_{j}\pi_j=1\). The cycle condition ensures this is well defined, i.e. independent of the chosen path. To show \(\pi\) gives a reversible chain, we must show for all \(j,k \in S\),
\[
\pi_j\,p_{jk}=\pi_k\,p_{kj}.
\]
If \(p_{jk}=p_{kj}=0\), this holds trivially. If \(p_{jk}\neq 0\), then extending the path to \(k\) through \(j\) gives
\[
\pi_k=\frac{1}{Z}\cdot\frac{p_{j_0j_1}\cdots p_{j_nj}\,p_{jk}}{p_{kj}\,p_{jj_n}\cdots p_{j_1j_0}}=\pi_j\,\frac{p_{jk}}{p_{kj}},
\]
and detailed balance holds.
\(\square\)
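As a closing illustration (my own sketch, not part of the notes): a birth-death chain always satisfies detailed balance, so both the detailed balance condition and the Kolmogorov cycle criterion can be verified numerically on a hypothetical 3-state example.

```python
import numpy as np
from itertools import permutations

# Hypothetical reversible 3-state chain (a birth-death chain).
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])

# Stationary distribution via the linear system from the sketch above.
n = len(P)
A = (P - np.eye(n)).T
A[-1, :] = 1.0
pi = np.linalg.solve(A, np.eye(n)[-1])

# Detailed balance: pi_i p_ij = pi_j p_ji for all i, j.
flux = pi[:, None] * P
assert np.allclose(flux, flux.T)

# Cycle condition: products along every closed path and its reverse agree.
for cycle in permutations(range(n), 3):
    path = list(cycle) + [cycle[0]]
    fwd = np.prod([P[a, b] for a, b in zip(path, path[1:])])
    rev = np.prod([P[b, a] for a, b in zip(path, path[1:])])
    assert np.isclose(fwd, rev)
print("reversible, pi =", pi)
```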