The Olympics are finally here, and I for one am really pleased and excited by the whole thing. I've always loved the Olympics from a young boy right through to adulthood. I'll be watching a ton of it on the tellybox and I will be cheering the Great Britain team on and hoping each participant can excel in their individual sports.
So why do I have this post title then? Well, unfortunately I was massively underwhelmed by the opening ceremony, and felt that huge chunks of it simply didn't work. It lacked cohesion and, for anyone other than a British national, whole sections would have left the world's onlookers utterly bemused.
I'm sorry, but you cannot try to encapsulate "Britishness" in a two or three hour ceremony. Who's the smartarse that thought we should even try? Putting on a spectacular display is the only thing that's expected and, more importantly, the only thing that is wanted.
Mary Poppins flying down from the sky? Puhlease!
J.K.Rowling and then the NHS? Do me a favour.
The industrial revolution and prats walking around in top hats performing some demented mime act? Just fuck off.
And what was the Teletubbies landscape all about? It all left me cold, I'm afraid. The highlight for me (and to be fair there were some excellent parts of the ceremony) was the lighting of the flame, which I thought was quite superb, original and ingenious. But even that was messed up by giving the task of lighting the flame to a bunch of hopefuls who may go on to achieve absolutely nothing in their sports careers. Surely we should have given that honour to someone who's already put in the hours and reached the pinnacle of their sport; someone who's already achieved the highest honours on the world stage; someone who's actually fought and toiled and worked all through their sporting career, and earned the huge privilege of lighting the flame... Erm, no, let's give it to a group of spotty teenagers who we may never see again.
Okay, I'll leave it there. I have to say, I've really enjoyed my "moanfest", and feel much better for having got it off my chest.
Phew!
Yes, yet another self-obsessed moron waffling on subjects of which he has no real knowledge or understanding. Enjoy.
Saturday, 28 July 2012
Wednesday, 25 July 2012
Compiling Match Odds (Part IV)
Yawn!
I'm starting to get a bit bored of this series now, as I'm sure you are, so this may well be my last post on this well-trodden subject.
In case you hadn't worked it out for yourself, the last post on match odds (which in my humble opinion is the most worthwhile entry amongst the four that I've now posted) can also be used for generating match odds on any other rating mechanism you may employ that also arrives at an abstract number, rather than a percentage probability. Namely, these would be Game Form, Score Ability, Power Ratings and Penetration Plus. These are probably the most well-known, but if you're not familiar with them, then may I suggest you hunt out Paul Steele's most excellent book on ratings.
Okay, today we'll have a look at a quick and dirty method for acquiring the match odds. Once again, I have to point out that there is nothing new here. Everything that I am writing is stolen from other sources over the course of time. I should also point out that I use some of the methods I've outlined (and am about to outline) and some I don't.
Quick Match Odds (QMO)
This method is very well-known, but it doesn't have an official name (that I'm aware of), so we'll just call it Quick Match Odds. This is a nice, snappy name which I intend to copyright and use to make my millions.
I suppose giving this method a cool name also lends it a small degree of credibility, but whether it actually deserves any credibility will be up to you to decide. What I will say is that this method is probably best suited to, erm, the rather lazy amongst you.
Right, let's look at last season's match when the Arse were playing Citeh. The given odds on Betfair were:
At the time, Arsenal had played at home on 15 occasions, and they had won 11, managed two draws and lost two. Man City had played away 15 times, and they had won 7, drawn four and lost four. If we use this simple formula:
Home Win Odds = ((Arse home_wins + Citeh away_losses) X 100) / (total_games)
Then we get Arsenal = ((11 + 4) X 100) / (30) = 50%
A 50% chance of winning equates to match odds of 2.00 (no overround applied).
Let's do the same with Man City using the inverse formula:
Away Win Odds = ((Citeh away_wins + Arse home_losses) X 100) / (total_games)
So for Man City = ((7 + 2) X 100) / (30) = 30%
Giving a 30% chance of winning, which equates to match odds of 3.33 (again, no overround). If we use the remaining part of the book for the draw, this gives us:
Arsenal = 2.00
Man City = 3.33
Draw = 5.00
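For anyone who would rather not do this by hand, here is a minimal Python sketch of the same sums. The function names (`quick_match_odds`, `to_decimal_odds`) are my own inventions for illustration, and the sketch assumes the draw simply takes whatever is left of a 100% book, exactly as described above.

```python
def quick_match_odds(home_wins, home_losses, home_games,
                     away_wins, away_losses, away_games):
    """Return (home, draw, away) probabilities using the Quick Match Odds idea."""
    total_games = home_games + away_games
    # Home win chance: home team's home wins plus away team's away losses
    p_home = (home_wins + away_losses) / total_games
    # Away win chance: away team's away wins plus home team's home losses
    p_away = (away_wins + home_losses) / total_games
    # Assumption: the draw takes the remainder of a 100% book
    p_draw = 1.0 - p_home - p_away
    return p_home, p_draw, p_away


def to_decimal_odds(probability):
    """Convert a probability (0 to 1) into decimal odds with no overround."""
    return 1.0 / probability


# Arsenal at home: won 11, lost 2 from 15. Man City away: won 7, lost 4 from 15.
p_home, p_draw, p_away = quick_match_odds(11, 2, 15, 7, 4, 15)
for label, p in (("Arsenal", p_home), ("Man City", p_away), ("Draw", p_draw)):
    print(f"{label} = {to_decimal_odds(p):.2f}")
```

Note that nothing here stops the home and away figures adding up to more than 100% for lopsided records, in which case the "remainder" for the draw would be nonsense, so treat it with the same caution as the method itself.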
As you can see, this is a rather coarse, low-granularity method for deriving the match odds, but as a quick rough-and-ready guide it may have some merit. Then again, it may not.
In the words of the Blind Date voiceover guy, "You decide!"
Thursday, 19 July 2012
Compiling Match Odds (Part III)
Thought it was about time that I pulled my finger out of my fat arse and carried on with this (occasional) series of posts on compiling match odds. I promise that I have indeed washed said finger before addressing the keyboard.
I suppose the bad weather has helped me to move this series on a little, as I would normally be out barbecuing a low-grade sausage to within an inch of its life at this time of year, but the incessant rain has driven me indoors instead.
For those who are not familiar with my previous posts on compiling match odds, I have also written these:
- Compiling Match Odds (Part I): Here I discussed using Goal Supremacy to create match odds.
- Compiling Match Odds (Part II): Here I discussed using Diads and Triads to create match odds.
I have effectively also shown how to generate match odds using Poisson in two other posts.
If you haven't read any of these, then don't worry, as this post today can be read in isolation, although it does follow on from the discussion about ratings in general in my previous post, which you may want to view first before diving into the stuff below.
Right, last week, during my "How To Rate the Ratings" post, I discussed a difficulty with ratings methods which tend to plop a number out at the end of the algorithm that is then often interpreted to be a home win, away win or a draw, but without any reference to the actual match odds.
So, for example, if a rating method arrives at a figure for a match of +250, then we could say it is going to be a home win, -300 is going to be an away win, whilst anything in between will probably be a draw. At least that's the theory.
This may or may not be correct, but either way, we really need to be sure (or as sure as we can be), don't we? And we also need to know how to convert such rating values into match odds. In this way we can compare our rating with the odds currently on offer and decide whether we have found a value bet or not.
So today, we'll try to move such ratings away from their original abstract numbers like +250 or -300 and towards more concrete and usable percentage probabilities. From there, of course, we can then derive some odds.
Rate Form (Elo):
In the last post, I stated (rather piously) that I wasn't going to bother detailing exactly how the Rate Form method was implemented; but I've changed my mind now as, if I'm going to show various methods for compiling match odds, then it might be a good idea to take Rate Form from the beginning right through to the end. It gives a better, overall view.
[Image caption: "Sorry, that's the wrong Elo"]
Before I do so, however, I should point out that the version I'm detailing here is perhaps the most basic version you can get. There is no consideration in this example for the status of the match, how many goals are scored, league points carried over from previous seasons or previous leagues or anything like that. Some people also add in things like shots on goal and corners. This version has no additional weightings or filtering at all; it's just the bare-bones method to demonstrate how to generate match odds from it. If you want to go on and enhance this version further with some of the things I've mentioned, or some of your own ideas, then do go right ahead.
To recap, the original, eponymously-named Elo system was developed for rating chess players, but this was turned into the Rate Form system for football by Tony Drapkin and Richard Forsyth. Here it is:
- At the beginning of the season, assign each team in the league 1,000 points.
- For any given match, both teams give a percentage of their points towards a shared pot. The home team gives 7%, the away team gives 5%.
- The winner gets the whole pot, which they then add to their overall points tally.
- Should the match end in a draw, the teams share the pot, often meaning the home team will lose a little bit and the away team will gain a little bit.
That's essentially it. The smart thing about this rating method is that big teams like Man Utd and Chelsea will accrue more points than the smaller teams, and so when these smaller teams do get to play the bigger teams, they get a chance to win a bigger pot than normal. In this way, the quality of the team is accounted for.
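The steps above can be sketched in a few lines of Python. This is only my reading of the bare-bones method: the function name `update_rate_form` is made up for illustration, and I have assumed that "share the pot" on a draw means an equal 50/50 split.

```python
HOME_STAKE = 0.07  # home team puts 7% of its points into the pot
AWAY_STAKE = 0.05  # away team puts 5% of its points into the pot


def update_rate_form(home_points, away_points, result):
    """Return the new (home_points, away_points) after one match.

    result is "1" (home win), "X" (draw) or "2" (away win).
    """
    home_stake = home_points * HOME_STAKE
    away_stake = away_points * AWAY_STAKE
    pot = home_stake + away_stake

    # Both teams pay into the pot first
    home_points -= home_stake
    away_points -= away_stake

    if result == "1":
        home_points += pot
    elif result == "2":
        away_points += pot
    elif result == "X":
        # Assumption: a draw splits the pot equally between the two teams
        home_points += pot / 2
        away_points += pot / 2
    else:
        raise ValueError(f"unknown result: {result!r}")
    return home_points, away_points


# Two teams on the starting 1,000 points play out a draw
print(update_rate_form(1000.0, 1000.0, "X"))
```

With two teams level on 1,000 points, the pot is 120 points, so a draw costs the home team 10 points and gains the away team 10, which is the "lose a little bit, gain a little bit" effect mentioned above. The total number of points in the league never changes.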
Once these ratings have matured (and again, I'll leave the definition of "matured" up to you), then by subtracting the away team's rating from the home team's (which will have a home advantage value added to it), a final Rate Form value will be arrived at, which can be used to determine the likely final outcome.
Some people say that +100 means a home win and -200 means an away win; but I've also heard other people say that +250 or even +500 is a home win, although I suspect that with such a high score, only teams like Man Utd, City and Chelsea will pop out of the ratings.
My view, however, is that what "some people say" is completely irrelevant. I don't know about you, but I don't really care what "some people say"; I'd be more interested in what people know. I'd be more interested in concrete facts such as "This probability is greater than that probability".
Okay, as you can see I've started waffling and moaning now, so I'll stop that and continue on. The real question is how do we turn a Rate Form rating figure, such as +858 points, into match odds? Well, as usual, there are many answers to this question, some of which are more involved and complex than others. But if any of you have read this blog before, then you’ll know that I tend to favour the line of least resistance, so happily we don't need to be brain surgeons to sort this out. No, we only need access to some historical data from which we can run the Rate Form system, and then we can compare the ratings with actual results. This way, we can see exactly how often a particular Rate Form value results in a home win, a draw or an away win.
For example, if we ran our Rate Form system on the historical data and found 1,000 matches where we had a rating value of +180, then we might end up with something like this:
Rate Form | Total Matches | 1   | X   | 2
+180      | 1,000         | 655 | 152 | 193
Do note that this is a fake example, but it shows us that, when a Rate Form rating of +180 was calculated on 1,000 matches, roughly 65.5% ended in a home win, 15.2% in a draw and 19.3% in an away win. If we did this for all valid and likely Rate Form values with a reasonably large data set, then we could start to imply a set of odds for each rating value.
Rate Form  | Total Matches | 1    | X    | 2
+180       | 1,000         | 655  | 152  | 193
Match Odds |               | 1.53 | 6.58 | 5.18
Obviously there is no overround included in these figures.
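Turning a row of 1X2 counts into odds is just a matter of dividing the total by each count. Here is a small Python sketch of that step; `odds_from_counts` is a name I've invented for illustration, and returning infinity for an outcome that never occurred is my own assumption about how to handle an empty cell.

```python
def odds_from_counts(home_wins, draws, away_wins):
    """Turn 1X2 result counts into (probabilities, decimal odds), no overround."""
    counts = (home_wins, draws, away_wins)
    total = sum(counts)
    probabilities = tuple(n / total for n in counts)
    # Assumption: an outcome that never occurred gets infinite odds
    odds = tuple(total / n if n else float("inf") for n in counts)
    return probabilities, odds


# The fake +180 example: 655 home wins, 152 draws, 193 away wins
probabilities, odds = odds_from_counts(655, 152, 193)
for label, p, o in zip(("1", "X", "2"), probabilities, odds):
    print(f"{label}: {p:.1%} -> {o:.2f}")
```

Because the three probabilities come from the same set of matches, they always add up to 100%, which is why there is no overround in the resulting book.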
Hmm, that's all well and good, but there are obvious problems. First off, when using Rate Form, the system can produce these kinds of ratings:
Arsenal (1919.11) v Liverpool (1537.26)
This means the calculation to arrive at a Rate Form figure for the match is:
(Arsenal (1919.11) + home advantage (100)) - Liverpool (1537.26) = 481.85
These are distinct values indeed, and even with 10,000 matches our results are going to be spread far too thinly to make any sense of them. Taking the rating above, how many matches are we going to find that have resulted in a rating of exactly 481.85 so that we can populate our home win/draw/away win percentages? Not too many, I suspect. One or two at most.
So what should we do about this? Well, I suppose we could just lop off the fractional part and see where we get but, even then, the spread of potential rating values is most likely still going to be too wide.
You'll have to make your own decisions here, but one other option you might consider is to group some rating numbers together, perhaps in clumps of 25 so that we get fewer of them. Therefore instead of +100, +101, +102, etc, we would have +100, +125, +150 etc. If a match is rated with an example value of +115, then we'd place it in the +100 group (less than +125 but greater than or equal to +100), and so on.
Frankly, I've not done myself any favours here, as the Rate Form system is harder to create match odds from than, say, Goal Supremacy or Game Form, where the rating methods don't create fractional numbers and, more importantly, don't have such a large spread of rating values. Anyway, for the moment, we'll persist with our grouping idea and see how we get on.
It's at this point that we should dispense with the fake, made-up data and start dealing with real historical data. I have run the Rate Form system on 7,481 matches and, after grouping Rate Form by each 25 points, the headline results are:
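The grouping idea can be sketched in Python as below. The names (`rate_form_value`, `bucket`, `tally`) are mine, purely for illustration, and I have assumed that negative ratings are grouped the same way as positive ones, i.e. always rounded down to the lower edge of the group, so -115 lands in the -125 group.

```python
import math
from collections import Counter, defaultdict

BUCKET_SIZE = 25      # group ratings in clumps of 25 points
HOME_ADVANTAGE = 100  # flat home advantage used in the examples


def rate_form_value(home_rating, away_rating, home_advantage=HOME_ADVANTAGE):
    """Rate Form figure for a match: home rating plus advantage, minus away rating."""
    return home_rating + home_advantage - away_rating


def bucket(value, size=BUCKET_SIZE):
    """Round a rating down to the lower edge of its group, e.g. +115 -> +100."""
    return math.floor(value / size) * size


def tally(matches):
    """Count 1/X/2 results per group.

    matches is an iterable of (rate_form_value, result) pairs,
    where result is "1", "X" or "2".
    """
    table = defaultdict(Counter)
    for value, result in matches:
        table[bucket(value)][result] += 1
    return table


value = rate_form_value(1919.11, 1537.26)  # Arsenal v Liverpool from above
print(round(value, 2), "goes into group", bucket(value))
```

Feeding `tally` every historical match then gives one row of 1X2 counts per group, which is the raw material for the percentages discussed next.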
RF Values | Total Matches | 1      | X      | 2
161       | 7,481         | 3,416  | 2,001  | 2,064
          |               | 45.66% | 26.75% | 27.59%
Okay, this isn't looking too bad, and the home win/draw/away win percentages are not a million miles away from the long-term averages, but what about the breakdown of actual points?
I don't know how to display a large table of Excel data in Blogger, so I'm going to leave what I've done as a download which you can find HERE. If you look at the "Summary" tab in this downloadable workbook, you can see each individual rating value (161 of them), the number of times that value was rated in a match and the 1X2 spread. And here we can still see some anomalous results.
For example, if you look at the +200 rating, this shows a 51.46% probability of a home win, and yet a stronger rating of +275 only shows a 47.49% probability of a home win. That cannot be right. Why would that be?
Well, this is due to insufficient data, which is effectively causing some “noise”, or kinks in the data. No need to fret though, as we can overcome this problem and smoothen out the results by running some basic regression analysis on our results.
Regression Analysis:
Okay, don't run away scared by this rather technical-sounding title. This doesn't involve a psychiatrist's couch, you'll be pleased to know. Regression analysis is nothing more than a fairly boring statistical method. Personally, I took the time and trouble to learn the basic maths behind regression analysis for myself, but I won't trouble you with them. Instead, we can relax and let Microsoft Excel take all the strain. It's really very simple indeed this way.
If you look at the "Regression" tab of the workbook, you'll see I have grouped all the Rate Form values from +1475 all the way down to -1400 along with the corresponding home win percentage for each, acquired from running the system against the 7,481 matches (columns B and C in the worksheet). I've also done exactly the same thing for the away wins (columns H and I).
Then I highlighted the data in the B & C columns (B4:C119) and then I went to the "Insert" ribbon within Excel and selected the little arrow underneath "Scatter". I then selected the first picture (top left). This plops a diagram onto the currently open worksheet, showing all the rating values against the actual percentages attained. As you can see, there seems to be an upward line in the data (which is what we're looking for). If I right-click on the mass of dots within the diagram, a context menu appears, from which we can select "Add Trendline...".
Within the subsequent box that opens, I can select the option marked "Display equation on chart" and also the one marked "Display R-squared value on chart". I then closed that box. This now gives me a nice trendline on the chart itself, but more importantly I also have two other items on the diagram. The first one is a ready-made equation that I can use to create my probabilities for the home win, and the second one (the R² value) is a measurement of how closely this data matches the ideal trend line. An absolutely perfect relationship would be a value of 1, although anything above 0.60 is probably okay. If the R² value is around the 0.50 mark, then you should perhaps go back to the drawing board.
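If you don't have Excel to hand, a linear trendline is just an ordinary least-squares fit, and it can be sketched in plain Python as below. The function name `fit_line` is my own, and the five data points are numbers I have made up purely to show the mechanics; they are not taken from the workbook.

```python
def fit_line(xs, ys):
    """Least-squares straight line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: 1 minus (unexplained variation / total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical grouped ratings and home win rates, invented for illustration
ratings = [-200, -100, 0, 100, 200]
home_win_rate = [0.41, 0.44, 0.45, 0.47, 0.50]

slope, intercept, r_squared = fit_line(ratings, home_win_rate)
print(f"P = ({slope:.5f} * x) + {intercept:.4f}, R squared = {r_squared:.3f}")
```

One thing this plain fit does not do is weight each group by the number of matches in it, so a group with three matches counts as much as one with three hundred. That is also how the basic Excel trendline behaves, as far as I'm aware.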
Using the given equation, we simply need to substitute the x value with our Rate Form value, so if we have a +150 rating value, then the equation becomes:
P = (0.0002 * 150) + 0.4521.
This gives us a probability of 0.4821 (or 48.21%). Therefore a +150 rating is equivalent to home odds of 2.07. If we truly believe that this rating is accurate and the bookies are offering odds of 2.20, then perhaps we have a value bet on our hands.
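As a rough Python sketch of that last step, using the slope and intercept from the equation above. The function names are invented for illustration, and the clamping of the probability is my own addition, there only to stop extreme ratings producing silly probabilities or a division by zero.

```python
SLOPE = 0.0002      # from the trendline equation above
INTERCEPT = 0.4521  # from the trendline equation above


def home_win_probability(rate_form, slope=SLOPE, intercept=INTERCEPT):
    """Apply the linear trendline to a Rate Form value."""
    p = slope * rate_form + intercept
    # Assumption: clamp so extreme ratings cannot give p <= 0 or p >= 1
    return min(max(p, 0.01), 0.99)


def fair_odds(probability):
    """Decimal odds implied by a probability, with no overround."""
    return 1.0 / probability


def is_value_bet(offered_odds, probability):
    """A bet is value when offered odds times our probability exceeds 1."""
    return offered_odds * probability > 1.0


p = home_win_probability(150)
print(f"{p:.4f} -> fair odds {fair_odds(p):.2f}")
print("Value at 2.20?", is_value_bet(2.20, p))
print("Value at 2.00?", is_value_bet(2.00, p))
```

The value test is simply another way of saying that the odds on offer are bigger than our own fair odds for the same outcome.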
Right, so we do exactly the same for the away data as we do for the home data. Select it all, and create a scatter plot based on that data. Do note that when creating a trendline, you also have the option of adding in a different type of trendline. Linear looks to be the correct type for the data I have shown, but this may not be so for all cases. Do experiment. Remember, we are looking for the highest R² value we can get.
Once we have the away odds created, we can then either do the same for the draw values, or alternatively we could just subtract the home and away probabilities for each rating value from 1, giving the remaining amount for the draw. It might be good, however, to actually create the scatter plot for the draw as we can then see how efficient the trendline is. Don't be surprised if it's not very efficient at all! Anyway, I'll leave that part up to you (as an exercise perhaps).
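The "subtract from 1" shortcut for the draw looks like this in Python. The function names are my own, the home probability is the 0.4821 worked out earlier, and the away probability of 0.25 is a purely hypothetical figure, since I haven't quoted the away equation in this post.

```python
def draw_probability(p_home, p_away):
    """Whatever is left of the book after the home and away probabilities."""
    p_draw = 1.0 - p_home - p_away
    if p_draw <= 0.0:
        raise ValueError("home and away probabilities leave nothing for the draw")
    return p_draw


def full_book(p_home, p_away):
    """Decimal odds for 1, X and 2 with no overround."""
    p_draw = draw_probability(p_home, p_away)
    return {"1": 1.0 / p_home, "X": 1.0 / p_draw, "2": 1.0 / p_away}


P_HOME = 0.4821  # from the home trendline at a +150 rating
P_AWAY = 0.25    # hypothetical value, for illustration only

book = full_book(P_HOME, P_AWAY)
for outcome, odds in book.items():
    print(f"{outcome}: {odds:.2f}")
```

The check inside `draw_probability` matters because two separately fitted trendlines are not guaranteed to leave a sensible remainder at the extreme ends of the rating range.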
Improvements:
Okay, so what can we do to improve the efficiency of what we have done here today? Well, there are a number of things you may want to look at:
- Decide whether the basic Rate Form method that I've outlined is really the best one to use. As mentioned above, it's possible to look into other ways of producing Rate Form ratings by introducing other variables and by refining the approach. This is well worth pursuing.
- Look again at whether the 25 point grouping is the best approach for shrinking the data set down.
- Cut the outer ranges that we've included in our ratings. Presently I have shown extreme ranges in the data such as +1475 and -1400. These will not occur that often and will be skewing the results we're getting. If we just concentrated on the more common values, not only would we be shrinking our data set down (which is still too big) but we should also increase the efficiency of our regression.
All of these should help to improve things and help to bring your R² value up further towards 1. Do keep an eye on that value.
Okay, well, I'm going to leave it here. Hopefully I haven't confused too many of you, either through not explaining all this properly or by making it too complex. Apologies if I have done either of those things.
However, hopefully I have shown at least a few of you who didn't know before just how you can create match odds from seemingly unrelated rating values. From there you should now have a fighting chance of being able to find some value bets for yourself out there.
Good luck.
Saturday, 14 July 2012
How To Rate the Ratings
I've frequently discussed various ratings methods on this blog, and ratings in general have been talked about, picked over, criticised and praised at different times and at different places on a multitude of betting-related websites over the years.
As you probably know, I am a fan of ratings and statistical analysis; in my view they are a very useful aid, a handy addition to a bettor's armoury in helping to increase profitability, even if they are not the definitive answer. Even on their own, a decent set of ratings will dramatically increase someone's hit rate above random guessing... But there is an issue with a lot of rating methods that often goes unnoticed, or is not discussed. In the worst cases, the problem is even deliberately ignored.
So both for advocates of rating methods and also for those of you that feel they are, ahem, overrated, today I'm going to discuss one of their shortcomings, or perhaps I should say one of their potential shortcomings when they are used in a particular way.
I suppose there's no more classic example of a ratings system than Elo, which is perhaps the best-known of them all, and perhaps one of the most heavily employed. As many (most) [all] of you will know, this was a system first developed to rate chess players, but which has subsequently also been heavily used to rate two football teams playing against each other. This converted Elo method is often known as Rate Form.
It's not my intention here to review exactly how the Rate Form method works. Most people already know how to implement it, and if they don't then it's easily found on the web. But we'll use Rate Form here to demonstrate a common obstacle that many rating methods face when using it for football betting. It's certainly a surmountable problem, but I suspect that it's not surmounted by everyone out there.
Rate Form might throw out figures like this:
Man Utd (RF rating 1,745) v Aston Villa (RF rating 987)
Please note, I have just dreamed up these Rate Form figures out of thin air just to demonstrate the point. They are not real values. The idea with these Rate Form ratings is to subtract one from the other, and perhaps add in a home advantage value to arrive at a final figure.
[Man Utd (1745) + home advantage (100)] - Aston Villa (987) = 858
For the moment, we'll ignore the rather crude way in which a straight 100 points is popped on for the home team's advantage as that's not my focus here. Disregarding that commonly-used "tweak", we have nonetheless arrived at a ratings total of +858 points.
"Aha!", people will say, "that's a home win. A final figure of +858 points is significant, so we can plonk some money onto Man Utd because they're going to win."
Erm, no I don't think so. It's right here that the problem I'm talking about appears. Just what ARE 858 points? What is the exact unit of measurement that we're discussing here? Is it 858 sausages? Is it 858 elephants? They may as well be elephants because as things stand, unless we can successfully convert these points into match odds and compare them against the odds on offer at the various bookies, then we have absolutely no idea if there is any value here or not. We have to find the correlation or the rating is utterly useless.
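To make the "value" point concrete, here is a minimal sketch of what converting points into odds could look like. It is only an illustration under loud assumptions: it borrows the standard chess Elo expected-score curve with a 400-point scale (which may well be the wrong mapping for Rate Form points), it ignores the draw entirely, and the bookmaker price of 1.20 is invented. The function names are my own.

```python
def elo_expected_score(rating_diff, scale=400.0):
    """Chess-style Elo expected score for the side that is rating_diff points ahead.

    Assumption: the 400-point logistic scale from chess; a football rating
    would need its own scale fitted to real results.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


def fair_decimal_odds(probability):
    """Decimal odds at which a bet with this probability exactly breaks even."""
    return 1.0 / probability


def has_value(probability, bookmaker_odds):
    """True when the expected return per unit staked is above 1."""
    return probability * bookmaker_odds > 1.0


# The made-up figures from the example: home rating plus home advantage, minus away rating.
diff = (1745 + 100) - 987
p_home = elo_expected_score(diff)

print("points difference:", diff)
print("implied probability:", round(p_home, 4))
print("fair decimal odds:", round(fair_decimal_odds(p_home), 3))
print("value at a hypothetical price of 1.20?", has_value(p_home, 1.20))
```

The point is not the particular curve, but that only once the points have become a probability is there anything to compare against the price on offer.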
I do hope you can see what I'm banging on about here, because I've had so many discussions with people about ratings, and I'm always somewhat astonished by how many people arrive at a points value in the various methods they employ, and then go blindly ahead and strike bets based on that implied value. To my mind, they have done very well to build up all their ratings in the first place, but then have fallen at the final hurdle because they haven't used those ratings to generate match odds.
Rate Form, goal supremacy, Game Form, Power Ratings, Score Ability, shots on target, corners, number of farts per match: I don't care what ratings systems you employ, they can probably all be used to generate match odds from their individual ratings. But if they can't be transformed, then they should only be used as a general guide to what may or may not happen in the match, and certainly not as vital information on which to place bets.
Okay, Mr smarty-arse, high-and-mighty, condescending tosspot, how exactly do we convert all our lovely ratings into match odds?
Well, all will be revealed shortly. It's my earnest intention to continue with my intermittent series called "Compiling Match Odds" in my very next post, and when I do I will endeavour to show exactly how ratings, such as Rate Form, can indeed be used to generate match odds. There is no real mystery behind it and certainly no magic, but hopefully it will allow some of you to convert your ratings into something that you can compare with the real live odds on offer... and after all, that's what it's all about, isn't it?
Tuesday, 10 July 2012
Aaah, Beautiful Ludmilla!
The number of TV programmes about the Olympics is ramping up nicely now, and tonight I watched one on (I think) BBC2, with interviews from Olga Korbut and Nadia Comaneci amongst others.
This led me to reminisce about past Olympics and I suddenly remembered my schoolboy crush. Yes, the beautiful Ludmilla Tourischeva, the Russian gymnast. Unlike Korbut, Comaneci and the like, Ludmilla actually had some curves and a healthy pair of jugs. I was not even a teenager when I fancied her, but her breasts were a highlight back then, and for me Ludmilla had it all. Of course being a gymnast, Ludmilla always obliged delightfully by pulling and pushing herself into all manner of erotic and outrageously flexible poses that could send a spotty pubescent English boy wild with delight. Yes, our lovely Ludmilla was perfect material for a growing boy, if you know what I mean.
The other Olympian that sticks in my memory, but for completely different reasons, was also Russian. He was the super-heavyweight weightlifter from the same period and his name was Vasily Alekseyev. Now sadly dead, he always caught my attention, not just because he used to destroy all the other contestants and completely dominated his weight class, and not simply because he broke about 17,000 world records throughout his career... No, I loved Alekseyev because he used to eat 26 eggs as part of his morning breakfast. Now that's what you call training.
Monday, 9 July 2012
Calculating Goal Expectancy
Way back in January, I did a post on how to understand Poisson (see entry HERE) and how it could be used to estimate various score lines between two teams. This was just a basic get-you-started guide, but it was generally well-received and generated a fair few comments.
The post, however, while being all well and good, did rather leave everyone hanging, because one of the cornerstones of being able to calculate Poisson to any degree of accuracy comes from having each team's attack and defence parameters to hand. When you have the attack and defence parameters, you can then calculate each team's scoring ability; in other words, you will know the goal expectancy for each team.
Now that the Euros have come to an end, we have a natural, albeit brief, pause before all the European leagues kick off again in August and September, so this seems like a good point to pop in some more analytical posts, rather than my normal "ooh, this is what I won/lost" kind of entries.
Recap:
If you haven't read my post on Poisson and/or you haven't got a clue what I'm talking about, then do have a read of that post first. Briefly, however, given each team's scoring ability (or goal expectancy), we can use Poisson to generate probabilities for any scoreline. This is good for betting/trading in the correct score market, but can also be used for match odds, over/unders and Asian handicap bets.
But, as mentioned, the scoring ability bit was omitted from that post, so let's rectify that here. If you remember, in the horrible Poisson calculation we had this little symbol, "µ". This is from the Greek alphabet and is pronounced "mu". In the Poisson calculation, this was used as a placeholder for the scoring rate of a particular team. So if we wanted to find out the actual scoring rate for Man City when playing at home last season, we would need to work out the actual value of µ. This is achieved thus:
µ = A1 x D2 x H
The value of µ is calculated as the attack rating of the home side (A1) times the defence rating of the away side (D2) times the home advantage value (H).
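As a quick sketch of that multiplication in code, assuming nothing more than the formula as written: the function name is my own, and the three input values are invented purely for illustration, not fitted to any real team.

```python
def home_scoring_rate(attack_home, defence_away, home_advantage):
    """Scoring rate (mu) for the home side: A1 x D2 x H."""
    return attack_home * defence_away * home_advantage


# Hypothetical parameters: a strong home attack, a slightly leaky
# away defence and a modest home advantage.
mu = home_scoring_rate(attack_home=1.6, defence_away=1.1, home_advantage=1.3)
print("mu =", round(mu, 3))
```

A defence rating above 1 means the away side concedes more than average, which pushes the home side's expected goals up; a value below 1 would pull it down.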
Oh dear, so we've now got to work out three different values to get to our scoring rate? Well, we should do really, yes... but we don't absolutely have to do so.
It's at this point that I have a decision to make because there are different methods for calculating attack and defence parameters, and some are easier than others. The best method (in my view) is also the most complicated to explain and the most difficult to implement, and that's a method called Maximum Likelihood Estimation (or MLE). MLE basically uses best estimates for each attack and defence parameter, in addition to the home advantage parameter, by settling on a set of values that produce probabilities which actually reflect the real scorelines when looking at historical data. In other words, MLE is a "best fit" technique.
The problem is that calculating MLE would involve some detailed maths and some medium to advanced Excel, which personally I feel may be at odds with the rather simplistic approach I took when explaining Poisson in the first place. After all, what's the point in a dummies' guide if the completion of that guide then moves on to advanced calculations?
You may strongly disagree with me here; you may think this is a cop-out or that I'm being outrageously patronising, but for now I'm going to leave MLE behind and show you a more "accessible" method for generating these values.
Before I do so, you may remember that I have already provided one method for calculating these values. I did it in my Fag Packet Calculations post, so have a look at that if you want a really rough-and-ready method.
Goal Expectancy Method:
Okay, let's get down to brass tacks:
CityOverallRate = Man City scored 93 goals in 38 Premier League matches last season. If we divide one figure by the other (93 ÷ 38) we get 2.45.
WiganOverallRate = Wigan scored 42 goals in 38 Premier League matches last season. If we divide one figure by the other (42 ÷ 38) we get 1.11.
CityHomeRate = Man City scored 55 goals at home and 38 goals when away. Again, if we divide one figure by the other (home goals by away goals, 55 ÷ 38), then we get a multiplication factor to show us how much better City are at home compared to when they are away. The result is 1.45.
WiganAwayRate = Wigan scored 22 goals at home and 20 goals when away. This time we reverse the division (away divided by home) because we want to know Wigan's multiplication factor when playing away. This is (20 ÷ 22) = 0.91.
Man City Scoring Rate = 2.45 X 1.45 = 3.55
Wigan Scoring Rate = 1.11 X 0.91 = 1.01
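For anyone who would rather not do this by hand, the same steps can be sketched in a few lines of Python. The function names are my own invention; the goal tallies are the ones quoted above. Note that the code carries full precision through the multiplication, so it comes out at roughly 3.54 and 1.00 rather than the hand-rounded figures.

```python
def overall_rate(goals_scored, matches_played):
    """Average goals scored per match over the season."""
    return goals_scored / matches_played


def venue_factor(goals_at_this_venue, goals_at_other_venue):
    """Multiplier showing how much better (or worse) a team scores at this venue."""
    return goals_at_this_venue / goals_at_other_venue


# Man City at home: 93 goals in 38 matches, 55 scored at home and 38 away.
city_scoring_rate = overall_rate(93, 38) * venue_factor(55, 38)

# Wigan away: 42 goals in 38 matches, 20 scored away and 22 at home.
wigan_scoring_rate = overall_rate(42, 38) * venue_factor(20, 22)

print("Man City scoring rate:", round(city_scoring_rate, 2))
print("Wigan scoring rate:", round(wigan_scoring_rate, 2))
```

The small differences against the hand calculation are purely down to rounding each step to two decimal places before multiplying.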
So the goal count looks high. These scoring rate values can be plugged into Poisson now and used to calculate each individual scoreline. In Excel, Microsoft have kindly provided a Poisson function that we can use, so even though I explained the details behind how Poisson works, the truth is that you don't really need to know. Instead you can just plug the appropriate values into the function.
The POISSON() function in Excel takes three arguments: "X", "Mean" and "Cumulative".
If you did revisit my Poisson post, then you'll recognise the "X" value, as that is the number of goals we're interested in. For example, if you want to know the likelihood of Man City scoring 2 goals, then "X" becomes 2.
The "Mean" figure is simply our scoring rate, which we calculated above, so for Man City that would be 3.55.
The "Cumulative" parameter is either TRUE or FALSE. If set to TRUE, the function returns the cumulative probability of anything from zero up to and including "X" goals, whereas FALSE returns the probability of exactly "X" goals. For our purpose, we should set it to FALSE. The POISSON function for finding out how likely Man City are to score exactly two goals is then:
=POISSON( 2, 3.55, FALSE)
The answer is 0.181001136, or 18%.
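If you don't have Excel to hand, the same calculation is easy to sketch in Python using only the standard library. The function names here are mine; poisson_pmf is meant to mirror POISSON(X, Mean, FALSE) and poisson_cdf to mirror POISSON(X, Mean, TRUE).

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals, given a scoring rate of `mean`."""
    return math.exp(-mean) * mean ** goals / math.factorial(goals)


def poisson_cdf(goals, mean):
    """Probability of anything from zero up to and including `goals` goals."""
    return sum(poisson_pmf(k, mean) for k in range(goals + 1))


# Man City scoring exactly two goals with a scoring rate of 3.55.
p_two_goals = poisson_pmf(2, 3.55)
print("P(exactly 2 goals):", round(p_two_goals, 4))

# Man City scoring two goals or fewer.
print("P(2 goals or fewer):", round(poisson_cdf(2, 3.55), 4))
```

The first figure should agree with the Excel result above to within rounding.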
Ideally you should complete a full set of POISSON calculations for all scores from 0 to 10 goals for both the home and away sides. You can then do things like add up all the home win scorelines, draw scorelines and away win scorelines to produce match odds. The same can be done for overs and unders.
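Here is a rough sketch of that adding-up exercise. It assumes the home and away goal counts are independent of each other, which is the simple textbook approach and not a claim about how the bookies do it, and it applies no weighting for draws. The away rate of 1.0 is just a round figure close to Wigan's; the function names are my own.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals, given a scoring rate of `mean`."""
    return math.exp(-mean) * mean ** goals / math.factorial(goals)


def match_probabilities(home_rate, away_rate, max_goals=10):
    """Sum every scoreline from 0-0 up to max_goals-max_goals into home/draw/away."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            # Assumption: home and away goals are independent.
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


def over_probability(home_rate, away_rate, line=2.5, max_goals=10):
    """Probability that the total goals in the match exceed the given line."""
    total = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            if h + a > line:
                total += poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
    return total


home, draw, away = match_probabilities(3.55, 1.0)
for label, p in (("Home", home), ("Draw", draw), ("Away", away)):
    print(label, "probability", round(p, 3), "fair odds", round(1.0 / p, 2))
print("Over 2.5 goals probability", round(over_probability(3.55, 1.0), 3))
```

Because the grid stops at 10 goals each, the three probabilities add up to a whisker under 1; for scoring rates like these the missing tail is tiny.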
Outstanding Issue:
There is still one other issue that I mentioned in my original Poisson post back in January, and that's the weighting that needs to be applied to Poisson to prevent it from under-forecasting score draws and low-score home wins. Judging from my normal output on this type of post, I'd say you only have six months or so to wait before I get round to talking about this. :)
Sunday, 8 July 2012
The Ropey League
The first qualifying round of the Europa League was played today... Yes, that's right. I do realise that it's early July but the Europa League has indeed started already.
Without doubt, this competition has got to be one of the most ludicrous, bloated, long-winded tournaments in history. Starting in July, there are three qualifying rounds, a play-off round, a group stage and five knockout rounds, before two teams finally drag their weary arses to the final in May, some ten months later. The competition is now so elongated that it's made a mockery of itself, and it's no wonder top teams are reluctant to be in it.
This wasn’t too bad a tournament back in the days when it was called the UEFA Cup, but it’s now so large and unwieldy that even my local pub, The Kings Arms, are in the opening qualifying rounds.
If a team manages to battle its way through the qualifying rounds (two matches in each round), they then face the group stage, which is 12 groups of four teams. That's right, forty-fucking-eight teams fighting to get through to the knockout phase of the tournament.
The Group stages finally finish in December, but even then, if a team makes it through, they haven’t managed to whittle their opposition down to 23 other teams. No, because a huge bunch of Champions League losers are parachuted in, bulking the numbers back up to 32 teams again. To win the competition from the group stages, a team has to get through 15 matches in addition to their other league and cup commitments. What a slog!
The Europa League really should be looked at afresh, as it’s been horribly devalued by too much meddling. Now it's almost comical. Of course, as a trader of football, the Europa League is actually quite good for me. If there are additional matches on the gogglebox for me to trade, that’s all well and good.
__________________________________________
Anonymous, 8 July 2012 22:23
Hi, I'm really interested in football trading, but I'm afraid to start with real money. Is there a way to "practice" just like play money in online poker?
There are some clever bits of software out there that do allow you to run in practice mode, but they tend to cost money and so you may as well start off actually using some money. You can, however, practise with £2.00 stakes and, if trading, then you'll probably only be risking a fraction of that £2.00 on each trade, therefore the risk and your exposure are very small indeed.
It's also been said before and I think it's true, that trading with real money is different to trading with fake money. You will react differently if it's real money, and that's valuable learning indeed; and if you trade large sums, the money you put into the market will go on to affect the actual behaviour of that market. Fake money can never do this.
So start small and start slowly. Don't be too eager to find an opportunity, but rather let an opportunity present itself to you. That way you'll do fine.
Oh and my last piece of advice is don't listen to people who plop blogs out onto the internet and waffle on about trading as if they actually know what the hell they're talking about. I hate those kind of people, don't you?
Tuesday, 3 July 2012
Only a month to go...
That is, only a month or so to go before the new season starts, which is just fine by me. That allows me a bit of time to recharge my trading batteries, so to speak, and to work on some new stats and ideas that I have. I suspect that I'm not going to finish these before the season does indeed start, but I can continue working on them even then.
So, what about those European Championships, huh? Pretty, pretty, pretty good, as Larry David might say. In my estimation there were only one or two boring games, with most of them being thoroughly entertaining. This may be due to the truncated nature of the tournament compared to the World Cup, but most of the matches did seem to buck the trend (and indeed most bookies' estimations) of how the games would pan out.
Before I reflect on my own trading performance throughout the tournament, I think it's only right and proper that we hand out some awards:
Personally, I'd give the best player award to Andres Iniesta, who is an absolutely fabulous player. Incidentally, he also wins the award for the player with the whitest, smoothest and shiniest head in the Euros. He looks like Casper. Then we have Mario Balotelli, who surely must win sweatiest player of the tournament. He must have shed 20 pounds per match.
Special mention, however, has to go to our new England manager, "Woy" Hodgson. Without doubt, Hodgson wins the Gutless Fucker Of the Year award, a man only interested in not losing, rather than having the balls to go out and try to win. No one really expected us to get beyond the quarter finals, but to go out so meekly without really giving it a go was absolutely infuriating. Leaving Rooney on the field when he was doing nothing; leaving Ashley Young on the field when he couldn't put a foot right (did he complete more than three or four passes without losing the ball?); bringing on the waste of oxygen, Jordan Henderson, a man so far away from international class that it's beyond belief.
Ah, well, I could go on but where's it going to get me? I'll let you fill in the blanks of my rantings for yourself.
_______________________________
Okay, so how did I fare in the Euros? Nothing too amazing either way. I lost money on the final but finished just over £830 in profit across the whole tournament, which I realise is not world-beating but is also not too bad considering I'm just sitting there enjoying watching my favourite sport. There's not much that can better that, is there? Earning money doing something you enjoy. Maybe if someone gave me a tenner each time I had a wank, then I'd relegate my trading to second place, but right now I haven't found anyone around who is willing to put forward such an offer.
Football: -£24.02  Tennis: £42.84  Tote:  Total P&L: £18.82
Football, showing 1 - 8 of 8 markets
Market | Start time | Settled date | Profit/loss (£)
Football / Spain v Italy : Winner 2012 | 01-Jul-12 19:45 | 01-Jul-12 21:42 | 1.36
Football / Spain v Italy : Match Odds | 01-Jul-12 19:45 | 01-Jul-12 21:37 | 11.23
Football / Spain v Italy : Over/Under 3.5 Goals | 01-Jul-12 19:45 | 01-Jul-12 21:33 | 1.17
Football / Spain v Italy : Correct Score | 01-Jul-12 19:45 | 01-Jul-12 21:32 | -178.49
Football / Germany v Italy : Correct Score | 28-Jun-12 19:45 | 28-Jun-12 21:45 | 20.94
Football / Germany v Italy : Over/Under 2.5 goals | 28-Jun-12 19:45 | 28-Jun-12 21:37 | 15.28
Football / Portugal v Spain : Correct Score | 27-Jun-12 19:45 | 27-Jun-12 21:36 | 105.86
Football / Portugal v Spain : Match Odds | 27-Jun-12 19:45 | 27-Jun-12 21:35 | -1.37
_______________________________
With the Euros now finished, I suppose I should turn my attention to trading the tennis. However, even though I've had some small dabblings, I can't really drive up my interest to give it a proper go. I'm looking forward to having a break from trading, and the thought of continuing on all through the summer (what little summer we're getting) doesn't really appeal to me. I suppose this is another reason I will never make a full-time trader. I need some wind-down time.
I'll endeavour to pop one or two interesting posts up between now and the new footy season, but do expect them to be a little sporadic now. Instead I'll be barbecuing in my garden... even if it's 10 degrees and in horizontal rain. It's summer after all!