Uploaded by lavonpage on 22.06.2012

Transcript:

This video treats a topic called absorbing Markov chains. An absorbing Markov chain is

a chain where’s there’s at least one state you get stuck if you get there.

A state like that is called an absorbing state. It’s a state that has the property that

once you get into the state it’s impossible to leave it.

Here’s an example of an absorbing Markov chain where state number one is an absorbing

state. If you look at the transition probabilities corresponding to the first row you see that

when you are in state one the probability is one that you’ll remain is state one and

zero that you’ll go to either one of the other two states. In this example state number

one is an absorbing state.

Let’s think about tracking students through a university. We have the four standard grade

classifications (freshman, sophomore, junior, and senior). When a student successfully completes

school they enter the graduated state and, of course, some students drop out. We’ll

have the four states corresponding to the four class years, we’ll have a graduated

state, and we’ll have a dropped out state. The data that we’re going to use in our

example is what’s shown right here in the transition diagram. Let’s look at the transition

diagram a bit just to make sure we’re interpreting the numbers correctly. For someone who enters

as a freshman, during their first year there’s a ten percent chance that they will fail their

year and remain classified as a freshman the following year. There’s an eighty percent

chance they’ll successfully complete their freshman year during that year’s time and

move on to become a sophomore. Or there’s a ten percent chance that they’ll drop out.

Notice that we don’t have an arrow going from freshman to the graduated state because

you can’t graduate in one year. Nobody that’s a freshman is graduated at the end of one

year. Read through the rest of the transition diagram and make sure that you’re clear

on all the transition probabilities that are shown here. With seniors, for example, ninety

percent of seniors graduate, five percent drop out, and five percent of them have to

repeat their senior year. Notice also that nobody ever moves backward. Ninety percent

of juniors become seniors, but of course once you’re a senior you never go back to being

a junior again. Now what about the graduated and dropped out states? We’re going to treat

both of those as absorbing states. If you look at the graduated state we have a one

on the arrow pointing from G back to G (which says that if somebody has graduated then at

the end of the next year they remain graduated). We’re going to treat dropouts the same way.

If a student drops out then they’re still dropped out at the end of the next year. In

other words, they don’t get readmitted to school. That’s a bit unusual. Most schools

of course don’t really have a policy like that, but for purposes of our example we’re

going to treat both graduated and dropped out as absorbing states.

That leads to a transition matrix that has the pattern shown here. I’m listing the

two absorbing states (graduated and dropped out) as states one and two, and then the other

states appear after that. Notice that the matrix has a particular pattern to it. If

we list our absorbing states first, then we get a little identity matrix up in the top

left hand corner. The size of that piece will be determined by how many absorbing states

we have, and then this upper right hand corner has nothing but zeros in it. Then we have

these other two pieces down here. The transition matrix splits into those four blocks. That

will always be the case if you have absorbing states and list your absorbing states first.

We’re going to be working with those four blocks, so we in fact are going to be giving

them names. Since this upper left hand corner is an identity matrix of some size determined

by how many absorbing states we have, we’ll called it I. This top right hand corner is

just in this instance it’s two rows and four columns for this little matrix up in

the corner. It’s a zero matrix of some size depending on how many absorbing and non-absorbing

states we have. This bottom left hand piece, which in this instance has four rows and two

columns, is matrix we’ll call R. Then the bottom right hand chump, which will always

be a square matrix (in this instance four by four) we’ll call Q. Now what we’re looking at on this slide

is the transition matrix raised to the third power (approximately). I’ve done some rounding

off here just because some of the decimal expressions were getting too long. This is

approximately T-cubed. Let’s think about the meaning interpretation of some of these

numbers that appear in the matrix T-cubed. For example, the point six one two here, that’s

the probability. Remember we learned back in section six point two that for T cubed

the numbers in T-cubed can be interpreted as the three step transition probabilities.

This point six one two is the three step transition probability from the freshman state to the

senior state. Which to put in more natural language would simply be saying it’s the

probability that someone who’s a freshman will become a senior in three years. That

would be where a freshman would be in three years if they stay on track, right? Because

the first year they would go from freshman to sophomore, the second year from sophomore

to junior, and the third year from junior to senior. This point six one two is the probability

that someone who starts as a freshman does make it to being classified as a senior within

the three years time frame, which they would make it to that if they were operating on

schedule. This little left hand bottom piece is of special interest because it indicates

the probabilities of hitting each of the two absorbing states (graduated and dropped out).

For example, for someone who starts as a freshman, we’re looking at three step probabilities

so this would say someone who starts as a freshman can’t possibly graduate in three

years, but there will be a point one nine three probability that someone who starts

as a freshman will drop out within their first three years. Someone who’s classified as

a sophomore within three years has almost a seventy percent chance of graduating (about

a fourteen percent chance of dropping out). Someone who’s a junior has a ninety percent

chance of graduating within the following three years and about a ten percent chance

of dropping out (and so forth). If you’re interested in which state a student winds

up in, this portion of T-cubed is of particular interest if you’re tracking the student

for three years and want to know what has become of them by the end of that time.

An interesting question and probably the most important question that can be asked is if

you’re looking at the distribution of data that we have in the transition matrix, what

kind of techniques can you use to determine the long term probabilities for entering each

of the absorbing states? It’s obvious that you would want to know that, isn’t it? Someone

who’s thinking about going to that school would be very interested in the question “If

I go to that school what are the chance I’ll eventually graduate?” There’s hardly any

information that would be more relevant than the answer to that question.

Fortunately there is a recipe for doing this and we’re going to track through that recipe

which involves some matrix information. Remember we slice the matrix up into the four pieces

shown here. Step one of the recipe is we’re going to take this bottom right hand chunk

Q, which in our example is four by four, and subtract it from the identity matrix of the

same size. In this matrix expression, I minus Q, we’re not really talking about this minus

this because their different sizes. This identity matrix being identified in this picture is

a two by two identity matrix. Understand we’re talking about I minus Q, we’re talking about

this four by four matrix Q and we’re talking about subtracting Q from the four by four

identity matrix because you have to have two matrices the same size if you’re going to

subtract them. The first step is a simple matrix subtraction. Four by four identity

matrix minus this four by four identity matrix Q. The next step is to find the inverse of

that. Having done the subtraction to find the inverse of that, and that’s what’s

called N. The matrix N is the inverse of that matrix we computed in step one. The final

step in the recipe is to multiply the product N times this matrix R, and that product is

called B. Let’s think about the sizes to make sure everything makes sense here. Q is

four by four, I minus Q is four by four, so the inverse of that will also be four by four.

N is a four by four matrix. In this example R is a four by two matrix, so we’ll be multiplying

a four by four matrix times a four by two matrix (which of course is possible), and

we’ll get when you multiply a four by four times a four by a two you get a four by two

matrix. The matrix B that we’re calculating in step three will in this example be a four

by two matrix (four rows and two columns).

In this example when you carry out the recipe for I minus Q we get the matrix shown here.

R we don’t have to do any calculation to get that, we just copy it down from the original

transition matrix. We know what R is. Remember we have to find the inverse of this guy. You

can find the inverse of that using the matrix tool and you come out with a matrix, which

rounded off a bit is shown at the top. The final multiplication (N times R), this four

by four times the four by two matrix R that we saw in the previous slide, and we come

out with the product matrix B that looks something like what’s shown here. This is the matrix

that contains a lot of valuable information.

To interpret that information what we do is we label the rows of B with the names of the

non-absorbing states in the same order that we had listed them in the original transition

matrix (freshman, sophomore, junior, and senior). These were states three, four, five, and six

in the original transition matrix. We label the rows with those same states in the same

order and we label the columns with the names of the two absorbing states. Again taken from

the original transition matrix keeping the same order. We originally had these as states

one and two in the original transition matrix. They’re are our column labels. With this

labeling, the numbers in this matrix B can be interpreted as the long term probabilities

from going from any of the non-absorbing states into any of the absorbing states. For example,

the point seven five three says that there’s about a seventy-five percent chance that someone

who enters as a freshman will eventually graduate. About a twenty-five percent chance they’ll

drop out instead. The point eight nine eight in it’s position is saying that for someone

who is currently a junior, there’s almost a ninety-percent chance that they will eventually

graduate and only about a ten percent chance that they will drop out. Obviously the long

term prospects of a junior are better than the long term prospects of a freshman simply

because they’re already much closer to graduation. The important point is that the numbers in

the matrix B show the probability of eventually arriving in each of the absorbing states depending

on which column you’re looking at from any of the non-absorbing states depending on which

row you’re looking at. The point one zero two is the probability that someone who is

classified as a junior will eventually drop out rather than graduate.

Now to move to an entirely different example. Sally and Becky are playing tennis. Deuce

is a score in tennis which says that in the game that’s being played, the two players

are tied with the same number of points and then at that point someone has to get two

points ahead in order to win the game. Now the assumption we’re making in this example

is that when the game is tied at deuce, Sally has two-thirds probability of winning the

next point and Becky has one-third probability of winning the point. On the other hand when

Sally has advantage, when she is a point ahead, then he has probability three-fourths of winning

the next point. When Becky has advantage, that is when Becky’s a point ahead, she

has probability one-half of winning the next point. The probabilities of either of them

winning the next point depends on what state they’re in relative to the score. We’re

going to set this up as a Markov chain with the absorbing states being Sally wins the

game and the game is over or Becky wins the game and the game is over or Sally’s advantage

(where Sally is a point ahead) or Becky’s advantage (where Becky’s a point ahead)

or deuce (where the score is tied). The play can go on for a long period of time. It has

to go on until somebody gets two points ahead (that’s the nature of the game of tennis).

One other question we’ll look at is if the game is at deuce, find how long the game is

expected to last and the probability that Becky wins. The second question we’ll look

at is if Sally has advantage (if Sally’s a point ahead) what’s the probability that

she eventually wins the game?

We set up names for the five states. The game can be at deuce, it can be Becky’s advantage

(Becky a point ahead), Sally’s advantage (meaning Sally’s a point ahead), or Becky

wins (which is an absorbing state because at that point the same is over), or Sally

wins (which is the other absorbing state because, again, the game is over is someone has won).

The transition diagram is as shown here based on the probabilities that we saw in the original

setup. If the game is tied at deuce there’s a two-third probability that Sally wins so

the score goes to Sally’s advantage. There’s a one-third probability that Becky wins the

point so it goes to Becky’s advantage. If it’s at Sally’s advantage there’s a

three-fourth probability that Sally wins the point (so Sally wins the game). There’s

a one-fourth probability that Sally wins the point, so it goes back to deuce. If it’s

Becky’s advantage then there’s a one-half probability that Becky wins the point (so

Becky wins the game). There’s a one-half probability that Sally wins the game so the

score goes back to deuce. Check all the number here to make sure that you agree that the

transition diagram we’ve drawn is consistent with the information we had to start out with.

Putting those numbers into a transition matrix (the transition matrix is the one that’s

shown) the top left corner is two by two identity because we again have two absorbing states

(all zeros in the top right piece). The lower left block R has three rows and two columns,

and the matrix Q in this example is a three by three matrix.

Following the recipe, we start with Q, subtract Q from the three by three identity matrix,

which takes us to here. R is just the lower left block we’ve copied over, and then we

find the inverse of R (a three by three matrix), which is the matrix shown right here.

The multiplication that’s done is to multiply that inverse matrix N times the block R (which

was taken from the lower left corner of the original transition matrix) and the result

of that matrix multiplication is the three by two matrix shown here. The number of rows

is the same as the number of non-absorbing states that we had in the original transition

matrix and the number of columns is the same as the number of absorbing states.

We label this matrix B with the row labels coming from the non-absorbing states in the

same order as in the original transition matrix and the columns being the absorbing states.

N itself is a square matrix and we label the rows and columns of N with the names of the

non-absorbing states. All the information we’re interested in, in problems that involve

absorbing Markov chains comes from one of these two matrices. The long-term probabilities

come from the matrix B. The meaning of the one-fourth is if the game is tied at deuce

(which is one of the non-absorbing states), then the long-term probability that Becky

wins the game is one-fourth. The long-term probability that Sally wins the game is three-fourths.

Another way of saying that in a natural fashion is to say that if the game is at deuce, Becky

has a one-fourth probability of eventually winning the game whereas Sally’s probability

of eventually winning the game is three-fourths. If it’s Becky’s advantage, then the chances

that Becky eventually wins the game is five-eights and Sally’s, three-eighths (and so forth).

This is where we go to get the long-term probabilities based on whatever assumption we want to make

on whichever non-absorbing state we happen to be in. We’re also in this example going

to illustrate the use of the matrix N. The numbers that appear in N represent long-term

expected values. For example, the three-fourths says that if the game is at Becky’s advantage

(Becky’s a point ahead) then the expected value for the number of times the game will

be at deuce prior to somebody winning the game is three-fourths. The five-fourths in

the bottom right hand corner says if the game is at Sally’s advantage (so if Sally is

a point ahead) then the expected number of times for that to be the score (in order words

the expected number of times that the game will be at Sally’s advantage prior to somebody

winning) is five-fourths. Notice that all of these numbers are bigger than one. They

have to be at least big as one because, for example, if you’re at Becky’s advantage

then it’s already one time that you’re at the score (so you’ll be at that score

at least that one time before the game ends). In fact this number is one and a quarter (one

point twenty-five), so the expected value for the number of times you’ll be at that

score before the game ends is a little bit bigger than one because it could go from Becky’s

advantage back to deuce and then back to Becky’s advantage again. The other interesting thing

about this matrix is that you can add up the numbers in any row to get the long-term number

of steps for the game to end. Looking at row-two, we’re looking at the scenario where it’s

tied at deuce. On average the score will be Becky’s advantage point five times before

the game ends. It’ll be deuce one point five times before the game ends and it will

be Sally’s advantage one time before the game ends. If you add up all of those numbers

you have the expected value for the total number of points played prior to the game

ending. In this case you add up the numbers in the second row and you get three. That’s

saying that if the score is tied at deuce, then on average they’re going to play three

point more in order for the game to end. The game could end in two points. Sally could

win the next two points and it would be over, but it might go back and forth and back and

forth and back and forth. They might play a large number of points before the game ends.

It averages out to three. If it’s at deuce the number of points played for the game to

end averages out to three. You could similarly add up the first row or the third row to find

the average number of points to be played if you’re starting from Becky’s advantage

or from Sally’s advantage.

If you go back to the example of tracking students and think about that same kind of

question, in that example the matrix N turned out to be what’s shown on this slide. If

we go back and label N with the labels form the non-absorbing states (freshman, sophomore,

junior, and senior), then what you have is the average number of years spent at each

of the classification levels (prior to either dropping out or graduating). Let’s look

at some of these numbers and interpret them. Let’s look at a freshman. What does this

number say? It says on average a freshman will spend one point eleven years classified

as a freshman prior to either dropping out or graduating. We know there’s one year

because they’re already a freshman right now. That year will contribute one year, but

there’s a possibility of more than one because they may fail their freshman year and have

to repeat it again. On average freshman spent one point eleven years classified as freshman.

What about this point eight eight four? The freshman might drop out before ever making

it to the junior year so they may not transition through the junior year (which is why that

number’s less than one). There’s no guarantee that they’ll be ever classified as a junior.

Of course they may be a junior more than one year. They may be a junior and fail that year

and have to repeat as a junior the next year, so it’s possible that a freshman could spend

more than one year as a junior (but it averages out to a little less than one year because

some freshman never make it that far), and even less for a senior. Why is there a zero

down in this spot? Well if somebody’s already a junior, then the expected number for the

expected number of additional years they’ll spend as sophomore is zero. They’ll never

go backwards to being a sophomore. If someone is a senior, why is this number a little bit

bigger than one? Well, they’re already guaranteed their one year as a senior since they’re

already a senior and if they fail their year they’ll have to repeat as a senior the next

year also. That has a small probability of happening, which means the average will be

a little bit more than one year that a senior will be in the school. Let’s think about

the row sums. If you add up the numbers in the first row you get about three point eight.

What is that saying? It’s saying that for somebody who’s a freshman, if you add up

the expected value for the number of years they’ll spend as a freshman, sophomore,

junior, and senior, you get about three-point eight as the expected value for the number

of years they’ll spend in school. Of course it will take them four years at least to graduate.

For someone who’s graduating it will be one year as a freshman, one as a sophomore,

one as a junior, and one as a senior guaranteed. Possibly additional years if they have to

repeat any grades. But on average how can this average out to be less than the four

years required for graduation? Well it averages out of that simply because some students drop

out before graduating (so they’re in school for a much shorter time). Remember these are

averages (expected values). For somebody who’s a sophomore, of course if you look at the

number of years to graduate, if they’re going to graduate then they’ll spend the

sophomore, junior, and senior years doing it. That’s at least three years. Not all

of them will graduate but for somebody who has made it successfully to the sophomore

year, the expected value now turns out to be a little more than three because the chances

of somebody repeating slightly more than offsets the possibility of dropping out. Some of the

sophomores will spend less than three years because they drop out. Some will spend more

because they have to repeat. It averages out to a little bit more than three years, and

similarly the junior row sum indicates that on average somebody who’s a junior will

spend a little more than two years in the school and for seniors, one point O point

three.

a chain where’s there’s at least one state you get stuck if you get there.

A state like that is called an absorbing state. It’s a state that has the property that

once you get into the state it’s impossible to leave it.

Here’s an example of an absorbing Markov chain where state number one is an absorbing

state. If you look at the transition probabilities corresponding to the first row you see that

when you are in state one the probability is one that you’ll remain is state one and

zero that you’ll go to either one of the other two states. In this example state number

one is an absorbing state.

Let’s think about tracking students through a university. We have the four standard grade

classifications (freshman, sophomore, junior, and senior). When a student successfully completes

school they enter the graduated state and, of course, some students drop out. We’ll

have the four states corresponding to the four class years, we’ll have a graduated

state, and we’ll have a dropped out state. The data that we’re going to use in our

example is what’s shown right here in the transition diagram. Let’s look at the transition

diagram a bit just to make sure we’re interpreting the numbers correctly. For someone who enters

as a freshman, during their first year there’s a ten percent chance that they will fail their

year and remain classified as a freshman the following year. There’s an eighty percent

chance they’ll successfully complete their freshman year during that year’s time and

move on to become a sophomore. Or there’s a ten percent chance that they’ll drop out.

Notice that we don’t have an arrow going from freshman to the graduated state because

you can’t graduate in one year. Nobody that’s a freshman is graduated at the end of one

year. Read through the rest of the transition diagram and make sure that you’re clear

on all the transition probabilities that are shown here. With seniors, for example, ninety

percent of seniors graduate, five percent drop out, and five percent of them have to

repeat their senior year. Notice also that nobody ever moves backward. Ninety percent

of juniors become seniors, but of course once you’re a senior you never go back to being

a junior again. Now what about the graduated and dropped out states? We’re going to treat

both of those as absorbing states. If you look at the graduated state we have a one

on the arrow pointing from G back to G (which says that if somebody has graduated then at

the end of the next year they remain graduated). We’re going to treat dropouts the same way.

If a student drops out then they’re still dropped out at the end of the next year. In

other words, they don’t get readmitted to school. That’s a bit unusual. Most schools

of course don’t really have a policy like that, but for purposes of our example we’re

going to treat both graduated and dropped out as absorbing states.

That leads to a transition matrix that has the pattern shown here. I’m listing the

two absorbing states (graduated and dropped out) as states one and two, and then the other

states appear after that. Notice that the matrix has a particular pattern to it. If

we list our absorbing states first, then we get a little identity matrix up in the top

left hand corner. The size of that piece will be determined by how many absorbing states

we have, and then this upper right hand corner has nothing but zeros in it. Then we have

these other two pieces down here. The transition matrix splits into those four blocks. That

will always be the case if you have absorbing states and list your absorbing states first.

We’re going to be working with those four blocks, so we in fact are going to be giving

them names. Since this upper left hand corner is an identity matrix of some size determined

by how many absorbing states we have, we’ll called it I. This top right hand corner is

just in this instance it’s two rows and four columns for this little matrix up in

the corner. It’s a zero matrix of some size depending on how many absorbing and non-absorbing

states we have. This bottom left hand piece, which in this instance has four rows and two

columns, is matrix we’ll call R. Then the bottom right hand chump, which will always

be a square matrix (in this instance four by four) we’ll call Q. Now what we’re looking at on this slide

is the transition matrix raised to the third power (approximately). I’ve done some rounding

off here just because some of the decimal expressions were getting too long. This is

approximately T-cubed. Let’s think about the meaning interpretation of some of these

numbers that appear in the matrix T-cubed. For example, the point six one two here, that’s

the probability. Remember we learned back in section six point two that for T cubed

the numbers in T-cubed can be interpreted as the three step transition probabilities.

This point six one two is the three step transition probability from the freshman state to the

senior state. Which to put in more natural language would simply be saying it’s the

probability that someone who’s a freshman will become a senior in three years. That

would be where a freshman would be in three years if they stay on track, right? Because

the first year they would go from freshman to sophomore, the second year from sophomore

to junior, and the third year from junior to senior. This point six one two is the probability

that someone who starts as a freshman does make it to being classified as a senior within

the three years time frame, which they would make it to that if they were operating on

schedule. This little left hand bottom piece is of special interest because it indicates

the probabilities of hitting each of the two absorbing states (graduated and dropped out).

For example, for someone who starts as a freshman, we’re looking at three step probabilities

so this would say someone who starts as a freshman can’t possibly graduate in three

years, but there will be a point one nine three probability that someone who starts

as a freshman will drop out within their first three years. Someone who’s classified as

a sophomore within three years has almost a seventy percent chance of graduating (about

a fourteen percent chance of dropping out). Someone who’s a junior has a ninety percent

chance of graduating within the following three years and about a ten percent chance

of dropping out (and so forth). If you’re interested in which state a student winds

up in, this portion of T-cubed is of particular interest if you’re tracking the student

for three years and want to know what has become of them by the end of that time.

An interesting question and probably the most important question that can be asked is if

you’re looking at the distribution of data that we have in the transition matrix, what

kind of techniques can you use to determine the long term probabilities for entering each

of the absorbing states? It’s obvious that you would want to know that, isn’t it? Someone

who’s thinking about going to that school would be very interested in the question “If

I go to that school what are the chance I’ll eventually graduate?” There’s hardly any

information that would be more relevant than the answer to that question.

Fortunately there is a recipe for doing this and we’re going to track through that recipe

which involves some matrix information. Remember we slice the matrix up into the four pieces

shown here. Step one of the recipe is we’re going to take this bottom right hand chunk

Q, which in our example is four by four, and subtract it from the identity matrix of the

same size. In this matrix expression, I minus Q, we’re not really talking about this minus

this because their different sizes. This identity matrix being identified in this picture is

a two by two identity matrix. Understand we’re talking about I minus Q, we’re talking about

this four by four matrix Q and we’re talking about subtracting Q from the four by four

identity matrix because you have to have two matrices the same size if you’re going to

subtract them. The first step is a simple matrix subtraction. Four by four identity

matrix minus this four by four identity matrix Q. The next step is to find the inverse of

that. Having done the subtraction to find the inverse of that, and that’s what’s

called N. The matrix N is the inverse of that matrix we computed in step one. The final

step in the recipe is to multiply the product N times this matrix R, and that product is

called B. Let’s think about the sizes to make sure everything makes sense here. Q is

four by four, I minus Q is four by four, so the inverse of that will also be four by four.

N is a four by four matrix. In this example R is a four by two matrix, so we’ll be multiplying

a four by four matrix times a four by two matrix (which of course is possible), and

we’ll get when you multiply a four by four times a four by a two you get a four by two

matrix. The matrix B that we’re calculating in step three will in this example be a four

by two matrix (four rows and two columns).

In this example when you carry out the recipe for I minus Q we get the matrix shown here.

R we don’t have to do any calculation to get that, we just copy it down from the original

transition matrix. We know what R is. Remember we have to find the inverse of this guy. You

can find the inverse of that using the matrix tool and you come out with a matrix, which

rounded off a bit is shown at the top. The final multiplication (N times R), this four

by four times the four by two matrix R that we saw in the previous slide, and we come

out with the product matrix B that looks something like what’s shown here. This is the matrix

that contains a lot of valuable information.

To interpret that information what we do is we label the rows of B with the names of the

non-absorbing states in the same order that we had listed them in the original transition

matrix (freshman, sophomore, junior, and senior). These were states three, four, five, and six

in the original transition matrix. We label the rows with those same states in the same

order and we label the columns with the names of the two absorbing states. Again taken from

the original transition matrix keeping the same order. We originally had these as states

one and two in the original transition matrix. They’re are our column labels. With this

labeling, the numbers in this matrix B can be interpreted as the long term probabilities

from going from any of the non-absorbing states into any of the absorbing states. For example,

the point seven five three says that there’s about a seventy-five percent chance that someone

who enters as a freshman will eventually graduate. About a twenty-five percent chance they’ll

drop out instead. The point eight nine eight in it’s position is saying that for someone

who is currently a junior, there’s almost a ninety-percent chance that they will eventually

graduate and only about a ten percent chance that they will drop out. Obviously the long

term prospects of a junior are better than the long term prospects of a freshman simply

because they’re already much closer to graduation. The important point is that the numbers in

the matrix B show the probability of eventually arriving in each of the absorbing states depending

on which column you’re looking at from any of the non-absorbing states depending on which

row you’re looking at. The point one zero two is the probability that someone who is

classified as a junior will eventually drop out rather than graduate.

Now to move to an entirely different example. Sally and Becky are playing tennis. Deuce

is a score in tennis which says that in the game that’s being played, the two players

are tied with the same number of points and then at that point someone has to get two

points ahead in order to win the game. Now the assumption we’re making in this example

is that when the game is tied at deuce, Sally has two-thirds probability of winning the

next point and Becky has one-third probability of winning the point. On the other hand when

Sally has advantage, when she is a point ahead, then he has probability three-fourths of winning

the next point. When Becky has advantage, that is when Becky’s a point ahead, she

has probability one-half of winning the next point. The probabilities of either of them

winning the next point depends on what state they’re in relative to the score. We’re

going to set this up as a Markov chain with the absorbing states being Sally wins the

game and the game is over or Becky wins the game and the game is over or Sally’s advantage

(where Sally is a point ahead) or Becky’s advantage (where Becky’s a point ahead)

or deuce (where the score is tied). The play can go on for a long period of time. It has

to go on until somebody gets two points ahead (that’s the nature of the game of tennis).

One other question we’ll look at is if the game is at deuce, find how long the game is

expected to last and the probability that Becky wins. The second question we’ll look

at is if Sally has advantage (if Sally’s a point ahead) what’s the probability that

she eventually wins the game?

We set up names for the five states. The game can be at deuce, it can be Becky’s advantage

(Becky a point ahead), Sally’s advantage (meaning Sally’s a point ahead), or Becky

wins (which is an absorbing state because at that point the same is over), or Sally

wins (which is the other absorbing state because, again, the game is over is someone has won).

The transition diagram is as shown here based on the probabilities that we saw in the original

setup. If the game is tied at deuce there’s a two-third probability that Sally wins so

the score goes to Sally’s advantage. There’s a one-third probability that Becky wins the

point so it goes to Becky’s advantage. If it’s at Sally’s advantage there’s a

three-fourth probability that Sally wins the point (so Sally wins the game). There’s

a one-fourth probability that Sally wins the point, so it goes back to deuce. If it’s

Becky’s advantage then there’s a one-half probability that Becky wins the point (so

Becky wins the game). There’s a one-half probability that Sally wins the game so the

score goes back to deuce. Check all the number here to make sure that you agree that the

transition diagram we’ve drawn is consistent with the information we had to start out with.

Putting those numbers into a transition matrix (the transition matrix is the one that’s

shown) the top left corner is two by two identity because we again have two absorbing states

(all zeros in the top right piece). The lower left block R has three rows and two columns,

and the matrix Q in this example is a three by three matrix.

Following the recipe, we start with Q, subtract Q from the three by three identity matrix,

which takes us to here. R is just the lower left block we’ve copied over, and then we

find the inverse of R (a three by three matrix), which is the matrix shown right here.

The multiplication that’s done is to multiply that inverse matrix N times the block R (which

was taken from the lower left corner of the original transition matrix) and the result

of that matrix multiplication is the three by two matrix shown here. The number of rows

is the same as the number of non-absorbing states that we had in the original transition

matrix and the number of columns is the same as the number of absorbing states.

We label this matrix B with the row labels coming from the non-absorbing states in the

same order as in the original transition matrix and the columns being the absorbing states.

N itself is a square matrix and we label the rows and columns of N with the names of the

non-absorbing states. All the information we’re interested in, in problems that involve

absorbing Markov chains comes from one of these two matrices. The long-term probabilities

come from the matrix B. The meaning of the one-fourth is if the game is tied at deuce

(which is one of the non-absorbing states), then the long-term probability that Becky

wins the game is one-fourth. The long-term probability that Sally wins the game is three-fourths.

Another way of saying that in a natural fashion is to say that if the game is at deuce, Becky

has a one-fourth probability of eventually winning the game whereas Sally’s probability

of eventually winning the game is three-fourths. If it’s Becky’s advantage, then the chances

that Becky eventually wins the game is five-eights and Sally’s, three-eighths (and so forth).

This is where we go to get the long-term probabilities based on whatever assumption we want to make

on whichever non-absorbing state we happen to be in. We’re also in this example going

to illustrate the use of the matrix N. The numbers that appear in N represent long-term

expected values. For example, the three-fourths says that if the game is at Becky’s advantage

(Becky’s a point ahead) then the expected value for the number of times the game will

be at deuce prior to somebody winning the game is three-fourths. The five-fourths in

the bottom right hand corner says if the game is at Sally’s advantage (so if Sally is

a point ahead) then the expected number of times for that to be the score (in order words

the expected number of times that the game will be at Sally’s advantage prior to somebody

winning) is five-fourths. Notice that all of these numbers are bigger than one. They

have to be at least big as one because, for example, if you’re at Becky’s advantage

then it’s already one time that you’re at the score (so you’ll be at that score

at least that one time before the game ends). In fact this number is one and a quarter (one

point twenty-five), so the expected value for the number of times you’ll be at that

score before the game ends is a little bit bigger than one because it could go from Becky’s

advantage back to deuce and then back to Becky’s advantage again. The other interesting thing

about this matrix is that you can add up the numbers in any row to get the long-term number

of steps for the game to end. Looking at row-two, we’re looking at the scenario where it’s

tied at deuce. On average the score will be Becky’s advantage point five times before

the game ends. It’ll be deuce one point five times before the game ends and it will

be Sally’s advantage one time before the game ends. If you add up all of those numbers

you have the expected value for the total number of points played prior to the game

ending. In this case you add up the numbers in the second row and you get three. That’s

saying that if the score is tied at deuce, then on average they’re going to play three

point more in order for the game to end. The game could end in two points. Sally could

win the next two points and it would be over, but it might go back and forth and back and

forth and back and forth. They might play a large number of points before the game ends.

It averages out to three. If it’s at deuce the number of points played for the game to

end averages out to three. You could similarly add up the first row or the third row to find

the average number of points to be played if you’re starting from Becky’s advantage

or from Sally’s advantage.

If you go back to the example of tracking students and think about that same kind of

question, in that example the matrix N turned out to be what’s shown on this slide. If

we go back and label N with the labels form the non-absorbing states (freshman, sophomore,

junior, and senior), then what you have is the average number of years spent at each

of the classification levels (prior to either dropping out or graduating). Let’s look

at some of these numbers and interpret them. Let’s look at a freshman. What does this

number say? It says on average a freshman will spend one point eleven years classified

as a freshman prior to either dropping out or graduating. We know there’s one year

because they’re already a freshman right now. That year will contribute one year, but

there’s a possibility of more than one because they may fail their freshman year and have

to repeat it again. On average freshman spent one point eleven years classified as freshman.

What about this point eight eight four? The freshman might drop out before ever making

it to the junior year so they may not transition through the junior year (which is why that

number’s less than one). There’s no guarantee that they’ll be ever classified as a junior.

Of course they may be a junior more than one year. They may be a junior and fail that year

and have to repeat as a junior the next year, so it’s possible that a freshman could spend

more than one year as a junior (but it averages out to a little less than one year because

some freshman never make it that far), and even less for a senior. Why is there a zero

down in this spot? Well if somebody’s already a junior, then the expected number for the

expected number of additional years they’ll spend as sophomore is zero. They’ll never

go backwards to being a sophomore. If someone is a senior, why is this number a little bit

bigger than one? Well, they’re already guaranteed their one year as a senior since they’re

already a senior and if they fail their year they’ll have to repeat as a senior the next

year also. That has a small probability of happening, which means the average will be

a little bit more than one year that a senior will be in the school. Let’s think about

the row sums. If you add up the numbers in the first row you get about three point eight.

What is that saying? It’s saying that for somebody who’s a freshman, if you add up

the expected value for the number of years they’ll spend as a freshman, sophomore,

junior, and senior, you get about three-point eight as the expected value for the number

of years they’ll spend in school. Of course it will take them four years at least to graduate.

For someone who’s graduating it will be one year as a freshman, one as a sophomore,

one as a junior, and one as a senior guaranteed. Possibly additional years if they have to

repeat any grades. But on average how can this average out to be less than the four

years required for graduation? Well it averages out of that simply because some students drop

out before graduating (so they’re in school for a much shorter time). Remember these are

averages (expected values). For somebody who’s a sophomore, of course if you look at the

number of years to graduate, if they’re going to graduate then they’ll spend the

sophomore, junior, and senior years doing it. That’s at least three years. Not all

of them will graduate but for somebody who has made it successfully to the sophomore

year, the expected value now turns out to be a little more than three because the chances

of somebody repeating slightly more than offsets the possibility of dropping out. Some of the

sophomores will spend less than three years because they drop out. Some will spend more

because they have to repeat. It averages out to a little bit more than three years, and

similarly the junior row sum indicates that on average somebody who’s a junior will

spend a little more than two years in the school and for seniors, one point O point

three.