An alternate view of the state of the channel

In addition to defining the 'state' based on the symbol input X (the set of all x symbols), we can also define the 'state' of the channel with respect to Y (the set of all y symbols). Hence,

where g is the total possible number of states and h is the total possible number of y symbols.

Thus for the above example

Referring to the state diagram

we see that, given the state S0, the probability that y = y0 = −2 is p(y0 | S0) = 0.5. Similarly, p(y1 | S0) = 0.5, but p(y2 | S0) = 0. Thus

Let,

Note that Π0 + Π1 + Π2 = 1. We can therefore write the equation for the output probabilities in terms of the state probability vector and the matrix F of state probabilities, whose element fij is the transition probability fij = p(yj | Si). Given a particular state, the probabilities of all outputs from that state (the probabilities in a particular column of F) must sum to 1, so f00 + f01 + f02 = f10 + f11 + f12 = 1.

Hence for the above example

As with P, the elements in each column of F add to 1, but the elements in each row may not.
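As a minimal sketch of how the output probabilities follow from F and the state probabilities, the Python below applies p(yj) = Σi p(yj | Si) p(Si). The function name `output_probabilities` and the example values are illustrative assumptions: only the S0 column (0.5, 0.5, 0) is taken from the text, while the S1 column and the equal state probabilities are assumed for a two-state channel.

```python
def output_probabilities(F, state_probs, tol=1e-9):
    """Return [p(y_0), p(y_1), ...] with p(y_j) = sum_i p(y_j | S_i) * p(S_i).

    F[j][i] holds p(y_j | S_i): one row per output symbol, one column per
    state, so every column of F must sum to 1.
    """
    num_outputs = len(F)
    num_states = len(state_probs)
    # Each column is a conditional distribution over outputs for one state.
    for i in range(num_states):
        col_sum = sum(F[j][i] for j in range(num_outputs))
        if abs(col_sum - 1.0) > tol:
            raise ValueError(f"column {i} of F sums to {col_sum}, not 1")
    if abs(sum(state_probs) - 1.0) > tol:
        raise ValueError("state probabilities must sum to 1")
    return [
        sum(F[j][i] * state_probs[i] for i in range(num_states))
        for j in range(num_outputs)
    ]


# Column S0 comes from the text; column S1 and the state probabilities
# are ASSUMED here purely for illustration.
F_EXAMPLE = [
    [0.5, 0.0],  # p(y0 = -2 | S0), p(y0 = -2 | S1)
    [0.5, 0.5],  # p(y1 =  0 | S0), p(y1 =  0 | S1)
    [0.0, 0.5],  # p(y2 = +2 | S0), p(y2 = +2 | S1)
]
STATE_PROBS_EXAMPLE = [0.5, 0.5]

p_y = output_probabilities(F_EXAMPLE, STATE_PROBS_EXAMPLE)
print(p_y)
```

The column check mirrors the constraint stated above: a matrix whose columns do not sum to 1 is rejected before any output probability is computed.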

The output probability at steady-state (ss) is derived from the limit

Note that the steady-state state probabilities were computed at the start of the analysis.
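One way to approximate such a limit numerically is to apply the state transition matrix repeatedly until the state probabilities stop changing, then map the result through F. The sketch below assumes a column-stochastic P (columns sum to 1, as stated for P and F). The helper names `mat_vec` and `steady_state`, the matrix `P_ASSUMED`, and the S1 column of `F_ASSUMED` are hypothetical and are not taken from the text.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by the column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in M]


def steady_state(P, start, tol=1e-12, max_iter=10_000):
    """Iterate v <- P v until successive vectors differ by less than tol.

    P is assumed column-stochastic: P[r][c] = p(next state r | current state c).
    """
    v = list(start)
    for _ in range(max_iter):
        nxt = mat_vec(P, v)
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    raise RuntimeError("no convergence; the chain may be periodic")


# ASSUMED two-state model with equiprobable transitions (not from the text).
P_ASSUMED = [
    [0.5, 0.5],
    [0.5, 0.5],
]
# Column S0 is from the text; column S1 is an assumption.
F_ASSUMED = [
    [0.5, 0.0],
    [0.5, 0.5],
    [0.0, 0.5],
]

state_ss = steady_state(P_ASSUMED, [1.0, 0.0])  # start in S0 with certainty
output_ss = mat_vec(F_ASSUMED, state_ss)        # steady-state output probabilities
print(state_ss, output_ss)
```

Plain iteration only converges for chains that settle to a single distribution, which is why the sketch raises an error instead of returning a guess when the iteration limit is reached.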

Note that the conditional probabilities shown in the state diagram are all equivalent:

In our example Y = {y0 = −2, y1 = 0, y2 = +2}.

Why is the system called a Hidden Markov Process?

Since

If H'(Y) is the entropy rate, then because of the above inequality we should suspect that the entropy rate is bounded by the entropy. Note that the entropy H(Y) of the output sequence grows as the output sequence grows, whereas the entropy rate is, by definition, the limiting entropy per output symbol, H'(Y) = lim n→∞ (1/n) H(y0, y1, …, yn−1). Thus it is fair to assume that the entropy rate is less than the entropy.
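The relation between the entropy of a growing output block and the per-symbol rate can be checked by brute force on a small model. The sketch below is an illustration under stated assumptions, not the analysis used in the text: it assumes a channel yk = xk + xk−1 with independent, equiprobable inputs x ∈ {−1, +1}, which is consistent with the outputs {−2, 0, +2} and with p(y0 | S0) = p(y1 | S0) = 0.5 but is not stated explicitly here. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy_bits(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


def block_entropy(n):
    """Entropy in bits of the output block (y_0, ..., y_{n-1}).

    ASSUMED channel: y_k = x_k + x_{k-1}, inputs independent and equiprobable.
    """
    counts = Counter()
    # Enumerate x_{-1}, x_0, ..., x_{n-1}; every input sequence is equally likely.
    for xs in product((-1, 1), repeat=n + 1):
        ys = tuple(xs[k] + xs[k + 1] for k in range(n))
        counts[ys] += 1
    total = 2 ** (n + 1)
    return entropy_bits(c / total for c in counts.values())


for n in (1, 2, 4, 8):
    h = block_entropy(n)
    # Block entropy grows with n while the per-symbol value shrinks.
    print(n, round(h, 4), round(h / n, 4))
```

Under these assumptions the block entropy keeps growing with n, while the entropy per symbol starts at the single-symbol entropy of 1.5 bits for n = 1 and decreases from there.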

Recall that for an information source with memory,

The Markov process was developed to compute entropy rates.

However, for channels with memory, the output sequence (y0, y1, …, yn−1) is not a Markov process, although the state model is. Thus

and the channel is a function for which knowledge of yn−1 does not provide knowledge of the state of the channel. Therefore the channel states are called hidden, and the system is called a Hidden Markov process. The fact that the channel is a function is what makes channels with memory difficult to analyze. One way to analyze them is to use Gallagher's diagram.
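The claim that the output sequence is not a Markov process can be illustrated by enumeration on a small model. The sketch below assumes a channel yk = xk + xk−1 with independent, equiprobable inputs x ∈ {−1, +1}; this model and the function names are assumptions made for illustration, chosen to match the output alphabet {−2, 0, +2} of the example. It compares the distribution of the next output given only the previous output with the distribution given the two previous outputs.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def output_sequences(n):
    """Yield the output block for every equally likely input sequence.

    ASSUMED channel: y_k = x_k + x_{k-1}, with x in {-1, +1}.
    """
    for xs in product((-1, 1), repeat=n + 1):
        yield tuple(xs[k] + xs[k + 1] for k in range(n))


def conditional_distribution(history):
    """Distribution of the next output given the most recent outputs.

    `history` lists the observed outputs, oldest first.
    """
    history = tuple(history)
    m = len(history)
    # All input sequences are equally likely, so counting is enough.
    counts = Counter(
        ys[m] for ys in output_sequences(m + 1) if ys[:m] == history
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("this output history has probability zero")
    return {y: Fraction(c, total) for y, c in sorted(counts.items())}


given_last = conditional_distribution((0,))        # condition on y_{n-1} = 0
given_last_two = conditional_distribution((2, 0))  # also know y_{n-2} = +2
print(given_last)
print(given_last_two)
```

In this assumed model, observing only yn−1 = 0 leaves all three outputs possible for yn, whereas also knowing yn−2 = +2 pins down the state and rules out yn = +2. The next output therefore depends on more than the previous output, which is the sense in which the state is hidden.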