North Side Baseball
Posted

1.568 Bos

1.419 NYY

.814 Cle

.769 LAA

.764 Det

.668 Tor

.408 Col

.245 Atl

.243 SD

.224 Min

.195 Sea

.152 Oak

.151 Phi

.104 Tex

.095 NYM

-.054 ChC

-.107 LAD

-.150 KC

-.244 Mil

-.262 Bal

-.295 Ari

-.328 SF

-.525 TB

-.545 CWS

-.740 Cin

-.834 Fla

-.841 Was

-.879 Hou

-.955 StL

-1.061 Pit

 

Methodology:

It's basically run dominance over teams, with an adjustment for strength of schedule (SOS). I'm going to assume a four-team league instead of the 30-team system to explain the method more simply. We have four teams; let's be bland and call them A, B, C and D. They play some games:

 

A 5, B 10

A 12, D 0

B 10, C 7

C 3, D 10

 

As you can see, they each played two games. B went 2-0, A and D each went 1-1, and C went 0-2. In the standings we would rank them B, then A and D tied, then C, but this doesn't give an accurate representation of the teams' ability. We have to use dominance. In this method I'm defining dominance over another team as runs scored minus runs allowed (Rs - Ra), which is how we get positive and negative numbers. 0, of course, is average.
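As a quick sanity check (a minimal sketch, not part of the original post), the score vector can be computed straight from the game log. The A-D game is taken here as 12-0 so that the run differentials agree with the score vector S = (7, 8, -10, -5) used later in the derivation.

```python
# Each game: (team1, runs1, team2, runs2). The A-D score is assumed to be
# 12-0 so the differentials match the score vector used later in the post.
games = [("A", 5, "B", 10), ("A", 12, "D", 0), ("B", 10, "C", 7), ("C", 3, "D", 10)]

diff = {t: 0 for t in "ABCD"}
for t1, r1, t2, r2 in games:
    diff[t1] += r1 - r2  # runs scored minus runs allowed for team 1
    diff[t2] += r2 - r1  # and the mirror image for team 2

print(diff)  # {'A': 7, 'B': 8, 'C': -10, 'D': -5}
```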

 

To adjust for SOS we need something called an incidence matrix, which is nothing more than a matrix recording the number of games played against each other team. So it's like this:

 

__A B C D

A 0 1 0 1

B 1 0 1 0

C 0 1 0 1

D 1 0 1 0

 

We actually have to have each team playing itself once (the diagonal is changed to ones). This is needed for later.

 

Incidence Matrix =

(Let's call it M)

__A B C D

A 1 1 0 1

B 1 1 1 0

C 0 1 1 1

D 1 0 1 1

 

If we take M.M (that is, M^2) we get

__A B C D

A 3 2 2 2

B 2 3 2 2

C 2 2 3 2

D 2 2 2 3
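The squaring step can be checked in a few lines of Python (a sketch using plain nested lists, in the team order A, B, C, D):

```python
# Incidence matrix with ones on the diagonal (each team "plays itself" once).
M = [[1, 1, 0, 1],
     [1, 1, 1, 0],
     [0, 1, 1, 1],
     [1, 0, 1, 1]]

def matmul(X, Y):
    """Multiply two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M2 = matmul(M, M)
print(M2)  # [[3, 2, 2, 2], [2, 3, 2, 2], [2, 2, 3, 2], [2, 2, 2, 3]]
```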

 

Entry (i, j) of M^2 basically counts how many times team i's opponents played team j. Each power of M is called a generation, with M^n giving the nth generation. To get nth-order dominance we need some calculations, but first we need our score vector S, whose entries are each team's run differential, Rs - Ra:

 

A 7

B 8

C -10

D -5

 

The first generation of dominance is given by:

 

1/3 (M^0).S=

 

A 2.33

B 2.67

C -3.33

D -1.67

 

The second generation (the running total through two generations) is

(1/3^2) * ( 3*(M^0).S + (M^1).S )

 

The 1/3 comes from the number of games each team played. We need it to convert our matrix into a Markov matrix, which lets us find the average dominance. Since we want an average, we have to divide by the number of games at every step. Think it through: dividing once accounts for the games a team itself played, but since we're also using the games its opponents played, we have to divide those in as well. A Markov matrix is a matrix whose rows each add up to one with no negative entries; it's used a lot in probability, for the obvious reasons. For the first generation each team played itself and two other teams, so 3 games total. For the second generation we have to divide by nine (the 3 teams they played, and the three teams each of their opponents played). To illustrate the second generation, let's look at A.

 

A played B and D (plus itself). So how many times did its opponents play team A? Three: A played itself, B did, and so did D. How many times did they play B? Two: A did, B did (itself), D didn't. How many times did they play C? Two: A didn't, B did, D did. How many times did they play D? Two: A did, B didn't, D did (itself). So 3 + 2 + 2 + 2 = 9, and we divide by 9*3 because each of those opponents played 3 games. The next generation asks how many times those teams played those teams, and so on, so the dividing factor goes up by a factor of 3 each generation (3^n overall).

 

In general, the nth generation is (1/3) * (M/3)^(n-1) . S
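The running total through n generations can be sketched as follows (the helper `generations` is my own name for it, not something from the original post; M and S are from the example above):

```python
# Running total of dominance through n generations:
#   D_n = (1/3) * sum_{j=0}^{n-1} (M/3)^j . S
M = [[1, 1, 0, 1], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]
S = [7, 8, -10, -5]

def matvec(X, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def generations(n):
    total = [0.0] * 4
    power = [float(x) for x in S]    # holds (M/3)^j . S, starting at j = 0
    for _ in range(n):
        total = [t + p / 3 for t, p in zip(total, power)]
        power = [x / 3 for x in matvec(M, power)]  # advance to (M/3)^(j+1) . S
    return total

print([round(x, 2) for x in generations(1)])  # [2.33, 2.67, -3.33, -1.67]
```

The first generation reproduces the S/3 numbers above, and the second generation works out to (1/9)(3S + MS), matching the formula in the text.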

 

M/3 is our Markov matrix, and this sequence of its powers is known as a Markov chain. For all intents and purposes, a Markov chain will converge if, for some k, M^k has all nonzero entries and M^h has all nonzero entries for every h > k.

 

In our case we need two things for this to happen.

1. The teams all have to be linked eventually (it can be 100 generations down the road), but you can't have two distinct systems and compare them (e.g., the AL and NL in pre-interleague days).

2. The teams must play themselves. If we have 0s on the diagonal, M^h won't have all nonzero entries for every h > k.

 

To illustrate consider a 2 x 2 matrix, B, like so

[0 b]

[c 0]

 

B^2 =

[bc 0]

[0 bc]

 

B^3 =

[0 b^2c]

[bc^2 0]

 

And so on.
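The periodic behavior above is easy to see numerically. A minimal sketch with b = c = 1: every power of B keeps a zero entry somewhere, alternating between the diagonal and the off-diagonal, so the powers never settle down.

```python
# With zeros on the diagonal the chain is periodic: the zero entries of B^n
# bounce between the diagonal and the off-diagonal forever.
B = [[0, 1], [1, 0]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = B
for n in range(1, 7):
    has_zero = any(P[i][j] == 0 for i in range(2) for j in range(2))
    print(n, P, has_zero)  # a zero entry appears at every power
    P = matmul(P, B)
```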

 

The actual ratings of the teams are going to be the limit of the Markov chain. One property of a Markov matrix (M/3 is what we are looking at) is that it has one dominant eigenvalue equal to 1, and the entries of its eigenvector are all the same constant. Furthermore, all other eigenvalues are less than 1 in absolute value. Finally, since M is symmetric we can find four linearly independent orthonormal eigenvectors, one for each of our four eigenvalues. They are:

 

v[1] =

[1/2]

[1/2]

[1/2]

[1/2]

 

v[2] =

[-1/sqrt2]

[0]

[1/sqrt2]

[0]

 

v[3] =

[0]

[-1/sqrt2]

[0]

[1/sqrt2]

 

v[4] =

[-1/2]

[1/2]

[-1/2]

[1/2]

 

Each v corresponds to the eigenvalues 1, 1/3, 1/3 and -1/3, respectively. Since the eigenvectors form a basis, we can write S as a linear combination of them. The coefficients can be found by solving a system of four equations, or, the easy way, by taking the scalar product of S with each eigenvector (they're orthonormal). In this case we're left with:

 

S = 0*v[1] - 17/sqrt2*v[2] - 13/sqrt2*v[3] +3*v[4]
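The dot-product shortcut for the coefficients looks like this (a sketch, with the four orthonormal eigenvectors written out as floats):

```python
# Because the eigenvectors are orthonormal, the coefficient on each one is
# just its dot product with S.
from math import sqrt

S = [7, 8, -10, -5]
r2 = sqrt(2)
v = [[0.5, 0.5, 0.5, 0.5],
     [-1 / r2, 0.0, 1 / r2, 0.0],
     [0.0, -1 / r2, 0.0, 1 / r2],
     [-0.5, 0.5, -0.5, 0.5]]

coeffs = [sum(si * vi for si, vi in zip(S, vk)) for vk in v]
print(coeffs)  # c1 = 0, c2 = -17/sqrt(2), c3 = -13/sqrt(2), c4 = 3
```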

 

This leaves us with four vectors, which are still eigenvectors of M/3 with eigenvalues 1, 1/3, 1/3 and -1/3. Call them s[1], s[2], s[3], and s[4]. Note that s[1] is zero, so just ignore it from now on. We can rewrite our sum as

 

1/3 * sum[ (M/3)^j . (s[2] + s[3] + s[4])]

 

Note: the exponent went from j-1 to j because I'm starting j at 0 as opposed to 1 now.

 

This can be broken up into three sums, and since we know the respective eigenvalues, the sums simplify as well (each (M/3)^j becomes the respective eigenvalue to the jth power):

1/3 * [ sum[ (1/3)^j . s[2] ] + sum[ (1/3)^j . s[3] ] + sum[ (-1/3)^j . s[4] ] ]

 

Now, some of you may know that the geometric series sum of (1/x)^j from j = 0 to infinity equals 1/(1 - 1/x), provided |1/x| < 1. We want to know the limit as the generations go to infinity, so we take the limit as j goes to infinity.

 

So it becomes

 

1/3 * ( (3/2)s[2] + (3/2)s[3] + (3/4)s[4] )

 

Which is our final ranking:

A = 3.875

B = 3.625

C = -4.625

D = -2.875
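As a check (a sketch, not part of the original post), iterating the chain numerically reproduces the closed-form ratings; the shrinking 1/3-powers die off within a few dozen generations. Team D comes out to -2.875, which also makes the four ratings sum to zero.

```python
# Sum the series (1/3) * sum_{j=0}^{n-1} (M/3)^j . S until it converges.
M = [[1, 1, 0, 1], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]
S = [7, 8, -10, -5]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

total = [0.0] * 4
power = [float(x) for x in S]        # holds (M/3)^j . S, starting at j = 0
for _ in range(60):                  # 60 generations is far past convergence
    total = [t + p / 3 for t, p in zip(total, power)]
    power = [x / 3 for x in matvec(M, power)]

print([round(x, 3) for x in total])  # [3.875, 3.625, -4.625, -2.875]
```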

 

Note: in general a Markov matrix is a matrix where the columns add up to 1, not the rows, but since this matrix is symmetric it's both. Using rows made it easier to illustrate.

 

I wrote this so someone who has taken just linear algebra can understand it.

Recommended Posts


Posted

http://www.feebleminds-gifs.com/exploding-head.gif

Posted
exact proof that stats can be used to say whatever you want them to, and that is why the interpretation of stats is just about as factual as scouts watching a player
Posted
exact proof that stats can be used to say whatever you want them to, and that is why the interpretation of stats is just about as factual as scouts watching a player

 

Really? This is exact proof of that how? Go ahead and explain it.

Posted
exact proof that stats can be used to say whatever you want them to, and that is why the interpretation of stats is just about as factual as scouts watching a player

 

pie, meet face.

Posted
Lord, that was fascinating how you used your incredible math skills to come up with this power rating. One question: does this formula include pitching? If it did, I would be curious how the Yankees (with a below average pitching staff) would wind up number two, ahead of Cleveland (who has a very good pitching staff).
Posted
Lord, that was fascinating how you used your incredible math skills to come up with this power rating. One question: does this formula include pitching? If it did, I would be curious how the Yankees (with a below average pitching staff) would wind up number two, ahead of Cleveland (who has a very good pitching staff).

 

Weird question.

Posted

SOS = Strength Of Schedule

 

Lord, that was fascinating how you used your incredible math skills to come up with this power rating. One question: does this formula include pitching? If it did, I would be curious how the Yankees (with a below average pitching staff) would wind up number two, ahead of Cleveland (who has a very good pitching staff).

 

It uses Rs and Ra. I probably should have scaled Rs and Ra, but I did not. The whole idea that pitching wins championships and games is really overblown.

Posted
What is SOS? Forgive my ignorance. Much of that looked like binary.

 

Strength of schedule. Adjusting for the fact that teams like the Cubs got to play STL, HOU and PIT a whole bunch whereas Boston had to play the Yankees and Toronto in their division.

Posted
Interesting with the 'power' but it has a rather 'Rude Goldberg' sense to it.

 

To be fair, I hear Goldberg was a lot politer out of the ring.

Posted
Lord, that was fascinating how you used your incredible math skills to come up with this power rating. One question: does this formula include pitching? If it did, I would be curious how the Yankees (with a below average pitching staff) would wind up number two, ahead of Cleveland (who has a very good pitching staff).

Much like the gas station attendant in Tommy Boy, I sense your sarcasm.

Posted

Looks like someone needs a robot wife.

Posted
Some people really don't like math, eh?
