The new Arena system explained

Blizzard poster Aratil dropped by the PvP forums earlier to explain -- in layman's terms -- the new Arena system. The new system is supposed to match players according to skill, rather than gear. Exactly how the system determines skill wasn't made clear, although Kalgan explained that the system uses a Gaussian Density Filter. New to the system is a 'hidden rating' that's different from either the personal or team rating, and is unique to the player regardless of how many Arena teams he or she plays with.
What's clear from Aratil's post is that the change was intended to "promote the enjoyment of Arenas". Under the new system, highly skilled players will be matched up against other teams that provide a challenge, while newer players just starting Arenas won't feel shut out. In a way, this can be likened to the low barrier of entry for raiding in Wrath of the Lich King. They're still tweaking the system, especially as far as ratings losses and gains are concerned, but the overall goal is to make Arenas more fun. I think we can all agree that more fun is always good. As long as it's working as intended, that is.

Reader Comments (Page 1 of 3)
Shulkman Jan 28th 2009 1:18AM
If a warlock is included in your arena team, you will be matched up against a team of murlocs from Elwynn Forest, as that is the only way the warlock will survive longer than 3 seconds.
Jagoex Jan 28th 2009 1:23AM
This is incredible. And I don't mean that in a good way. I simply cannot believe Blizzard is still throwing the word "skill" around when PvP is as crazy as it is these days.
The only way "skill" can be measured in the current PvP environment is if Blizzard limits the comparison pool. For example, if Warriors were only compared to other Warriors, Warlocks against other Warlocks, etc, they would have a much better representation of how players were performing. Also, they can even get more specific and bring spec and team comp into the equation of comparison.
Without doing so, you are left with an unbalanced system that compares a Paladin to a Shaman, or a Rogue to a Hunter, etc. Apples and oranges, essentially.
Wake up, Blizzard. This shouldn't be as difficult as you are making it.
Burix Jan 28th 2009 1:57AM
It would seem like that in theory, matching up skill limited to only players of your class, but it just isn't feasible material to be put to practical use anyways unless you were only playing people of your same class all the time.
Since you're going to be playing other classes anyways, the pool of people you're going to be pulling from will be diverse [enough].
I see the system as a straight number of how your ratings have gone up on any team you've been on. I think additive will be the only real way to determine "skill" by numbers, because their current system is built on a curve (you have 100 teams at the lowest level, which you beat to get to the next, 50 at that, 25 at the next, 12 at the next, and so on until you're the top team with nowhere to go but down).
Also, I do enjoy most people's posts on these since they usually follow a formula: Wild hypothesis, pseudoproof, and then the finisher: "Wake up, Blizzard. This shouldn't be as difficult as you are making it." saying it in a way that tries to make you look so awesome after you've come up with this amazing proof.
Jagoex Jan 28th 2009 3:22AM
Burix, I think we are misunderstood.
When I say to compare players of the same class, I am not talking about Warlock vs Warlock. I am talking about two different Warlocks and their respective performance in the Arena. Given the large sample size, this would gauge performance against all other classes but limit comparison to within one class, limiting the class imbalance issue greatly.
Also, before you attack someone's post (in a wildly hypocritical way, might I add), it might be worthwhile to make sure you understand it first. ;)
Johnzim Jan 28th 2009 7:21AM
Amen.
It takes a lot of "Skill" for a Feral Druid/Rogue/DK/Paladin to drop a 10k crit through 800 Resilience.
PvP is something of a joke at the moment. I play because I love PvP but it's not much fun being a Disc priest in this environment. Apparently I need 1000 Resilience before I have a chance of lasting more than a few seconds.
Max Jan 28th 2009 2:18AM
All that equal skill jazz is fine, but the points system is not.
My team went 24-5 last week, and only went up around 80 rating.
wtf.
Max Jan 28th 2009 2:18AM
5 points per game is not fun.
telanium Jan 28th 2009 3:01AM
Yep, been winning consistently on all 3 of my teams and our ratings have barely inched upwards... Soooo at this rate if I do 25 games a day, winning let's say 2/3 of those, my personal rating MIGHT reach 1900 sometime in March?
Mr Magoo Jan 28th 2009 3:29PM
I am interested. Would you be equally as pissed if you were ganked by a bug exploit OP combo of chars and your rating was massively mutilated because of it?
If it goes up fast, it must come down fast to compensate. It is all relative. From the comments in the thread it sounds like reaching your potential quickly is one of their goals also - not sure what mechanism causes this to happen. They also want to avoid the reverse.
Perhaps if you defeat 10 in a row they start matching you against higher players until you lose, at some point you get more per win? Not sure.
Mr Magoo Jan 28th 2009 3:34PM
PS: Just read about the Gaussian filtering.
I am not 100% sure, but it may simply be that you have not had enough fights to accelerate the curve yet, assuming you are still keeping up a 90% win rate. 5 fights as a stats sample is nothing.
spf Jan 28th 2009 3:44AM
Dunno how they can match skill via some numbers.. it just can't be through dmg done, healing done or dispels cast, it's too individual.. me, as a shadow priest, I can do a lot in arena just by kiting or dispelling, and I'm afraid that these things will not be counted in this system :|
as far as I understand :D
Muse Jan 28th 2009 3:58AM
Sooo.... any "Gaussian density filtering for dummies" guides around? Because that Wikipedia article looked like Greek to me.
Eisengel Jan 28th 2009 6:27AM
Explaining the math from base principles kind of requires at least the beginning of a decent statistics course... so the best I can do is describe what it does...
To start off, a Gaussian distribution is also known as the Bell curve... that nice, regular, kinda upside-down-parabolic-shaped distribution. Suffice to say, a Gaussian distribution is really nice for lots of reasons.
Let's say you have some unknown variable, X. You know that it will change values every time you measure it, and you can't measure it exactly, but you also know that in general all the values it creates come from a single mean, or average. For instance, if you measure X 10 times, and you get the values:
1, 2, 1, 1, 2, 1, 1, 4, 1, 5
this is called a 'distribution'... it's just the group of numbers you measured... it shows how values of X are 'distributed'.
You can compute the average:
1+2+1+1+2+1+1+4+1+5 = 19
19 / 10 = 1.9 (10 being the number of samples)
So X has a mean of 1.9. If you know this, you can guess that any value you measure from X will be a little less than 2... and if you measure a bunch of values from X again, they should average out to about 1.9. Now you can see this doesn't really match reality. X normally produces low values, except for a few high ones.
Imagine X is a WoW player, and you want to determine their skill, and the value you measure is, let's say, the number of points they win or lose. You can then determine, on average, if the player is gaining points (rating) or losing it. You'd want to say X was doing poorly and had a couple good wins, but should probably remain in the 1 and 1.5 area, but if you use only the mean, you'd say that X is a 1.9-level player and should be fighting 2.5s and 3s.
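To make the point about the mean concrete, here is a small Python sketch using the ten sample values from above. The comparison with the median is an added illustration, not something the rating system is known to use; it just shows how a couple of high readings pull the mean away from where most values sit.

```python
import statistics

# The ten measurements of X quoted above.
samples = [1, 2, 1, 1, 2, 1, 1, 4, 1, 5]

# Plain average: sum of the values divided by how many there are.
mean_value = sum(samples) / len(samples)

# Median (middle value), shown only for contrast; this is an
# illustrative assumption, not part of the Arena system.
median_value = statistics.median(samples)

# How many readings actually fall below the mean.
below_mean = sum(1 for s in samples if s < mean_value)

print(f"mean={mean_value}, median={median_value}, below mean: {below_mean}/{len(samples)}")
```

Most of the readings sit below the mean, which is why the mean alone would overrate this player.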
Gaussian filtering is kind of a modification of this averaging process. You can't measure just one player, but everyone involved, since they are all affecting each other, plus you expect the player to change over time. Pretty much all players should get better. What you want to know is how much each player is improving, and what level of opponents they are likely to do well against next.
From the Wikipedia article:
"Gaussian filters are designed to give no overshoot to a step function input while minimizing the rise and fall time (which leads to the steepest possible slope). This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay."
This makes sense to EE & CE people who do things like radio communication and signal processing. What it means is that what you want to see in your data is a nice curve... you want to see a player slowly getting better or getting worse, you don't want to see a bunch of jumps back and forth... because it's very hard to determine how good that player is. You also don't want to over-predict or under-predict. You don't want to put a 1400 team having trouble up against a 2200 rated team just because they won against a 1600 team once, and you don't want to drop a 2300 team down to fight a 1300 team because they lost to a 2100 team... and you can't measure just the entire team, but all the members themselves.
By pushing your readings through a Gaussian filter, you basically force them to relate in terms of the nice, smooth Gaussian distribution. The fastest things can go up, or go down, is based on the Gaussian function. The Gaussian filtering can also relate multiple Gaussian distributions at once, so you can slide everyone's personal rating up or down in a match based on a smooth increase or decrease that is related to the ratings of everyone on their team. You can then try to match the rate at which players are improving rather than just their overall rating.
For instance, let's say someone from a 2200 team disbanded the team and formed a new team. They would still have a very high hidden rating, and even though their team is 1500 rated, they would still be matched against 2200-level teams. If they won, their hidden rating increase would be small, as if they were still 2200-rated and won against a 2200-rated team. If they lost, their decrease would also be small. The more they lost or the more they won, the faster the climb should get, but it would take some time to change. It would probably go down faster than up, so it would take less time for them to bring their hidden rating down, but if it took them a season to get from 1500 to 2200, it would likely take at least one half to 3/4 of a season of consistently sitting on the ground and losing to drop their hidden rating back down.
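As a rough illustration of the smoothing idea described above, here is a minimal Python sketch of a Gaussian-weighted moving average applied to a jumpy series of ratings. The function names, the rating numbers, and the `sigma` and `radius` values are all made up for the example; nothing here is known to match what Blizzard actually runs.

```python
import math

def gaussian_kernel(sigma, radius):
    """Weights shaped like a bell curve, normalized to sum to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_smooth(values, sigma=1.5, radius=4):
    """Replace each value with a Gaussian-weighted average of its neighbors."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp the index so the ends of the series reuse the edge value.
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed

def largest_jump(values):
    """Biggest change between two consecutive readings."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

# Hypothetical ratings that bounce up and down while trending upward.
raw = [1500, 1520, 1490, 1530, 1500, 1545, 1515, 1560, 1530, 1575]
smooth = gaussian_smooth(raw)

# A sudden step from 0 to 1, to show there is no overshoot.
step = [0.0] * 10 + [1.0] * 10
step_smooth = gaussian_smooth(step)

print("largest raw jump:", largest_jump(raw))
print("largest smoothed jump:", round(largest_jump(smooth), 2))
```

The smoothed series keeps the upward trend but loses the back-and-forth jumps, and the step input rises steadily without ever going past its final value, which is the "no overshoot" property from the quoted passage.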
Arashikou Jan 28th 2009 4:43PM
That was a more understandable explanation than I got from my own professor. Nice job! Does WoW Insider have a tenure process? 'Cause if so, you should be on staff.
Eisengel Jan 29th 2009 4:22PM
@Arashikou
Thanks. I'm not sure if they do... but first I have to graduate, then I get to worry about a different tenure process.. then hopefully I'll be able to bring my ability to explain WoW statistical mechanics to my own students.. ;^)
Let me clarify what I said above a little. The hidden rating can actually vary pretty widely, but you can 'filter' it with the likelihood of it varying that much. Let's say you won a game against a much better-ranked team, the hidden rating would go up a lot, but your team/personal rating may not, since the likelihood of you winning against so highly ranked a team would be pretty small. The system may match you then against a better team (since your hidden rating is now higher), but if you lose, you shouldn't lose a lot, since the likelihood of you losing against a better team would be higher.
As you play, you basically 'prove' to the rating system that you should be fighting better teams by defying its expectations. If you are able to beat a +100 team 20% more often than the system expects, and are able to beat a +200 team 5% more often than the system expects, it should revise its expectations. That's what the multiple ratings and the Gaussian filtering give you... a way to measure not only current ability, but a way to determine what you expect will happen, and a way to measure how unlikely or likely an outcome is/was.
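The "defying expectations" idea can be sketched with an Elo-style update. To be clear, this swaps in the standard Elo logistic formula as a stand-in; the discussion here does not say what formula the Arena system really uses, and the `scale` and `k` values are conventional Elo defaults chosen as assumptions.

```python
def expected_win(own, opp, scale=400.0):
    """Elo-style chance that a team rated `own` beats a team rated `opp`."""
    return 1.0 / (1.0 + 10 ** ((opp - own) / scale))

def update(own, opp, won, k=32.0):
    """Move the rating by how far the result was from the expectation."""
    actual = 1.0 if won else 0.0
    return own + k * (actual - expected_win(own, opp))

# Hypothetical 1500 team against an equal team and against a +200 team.
even_win_gain = update(1500, 1500, True) - 1500
upset_win_gain = update(1500, 1700, True) - 1500
even_loss = 1500 - update(1500, 1500, False)
expected_loss = 1500 - update(1500, 1700, False)

print(f"win vs equal: +{even_win_gain:.1f}, win vs +200: +{upset_win_gain:.1f}")
print(f"loss vs equal: -{even_loss:.1f}, loss vs +200: -{expected_loss:.1f}")
```

Under these assumptions an unexpected win moves the rating more than an expected one, and losing to a team the system already thought was better costs less than losing to an equal.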
Ozma Jan 28th 2009 4:10AM
I did some arena yesterday. I admit to being highly fail so I was losing a lot and I expected to. But WTF is with the mad stacking system they have going on with the AP loss?? I lost 4 games in a row and I think I lost something like -18, -20, -24, -30 and gained about +6 for a win... If their goal was to make new starters not feel shut out from arena they are failing.
Moofius Jan 28th 2009 6:23AM
I'm new to Arena - and the system isn't working - it's so broken.
Last week I won 7 lost 3 and gained about 20 points. Since they changed the matching system this week - I won 8 lost 2. Every loss was -20 points with a win only giving 7.
The only reason we won 1 of those games was because they only had 1 player. It's not matching players on skill at the moment at all. If anything, the further our rating dropped, the harder the fights got and the more points we lost. We were seeing more and more rogues as we approached 1300 rating, and no way should rogues have a rating that low with how broken they are in PvP.
Blizzard need to reverse the arena changes as they are no fun to anyone. Even the teams beating us were only getting 5-10 points. You would need to win every game just to go up a small amount of rating.
Azhariel Jan 28th 2009 4:44AM
And did you notice how many AP the other team lost/won? It must be bugged, yesterday I won 5AP in a match where our opponents lost around 16. 5AP is what I expected if they were at least 50AP below us, but if they were, why the heck did they lose 16?
Mindfsck.
Hobbes Jan 28th 2009 1:33PM
I had one 2v2 game on Monday night in which we lost 22 points, and the other team lost 8 points. I can't figure out how that happened. If they were lower-rated than us, they should have won a lot of points for winning the game. If they were higher-rated than us, then we shouldn't have lost 22 points. In either case, they should have won some points for winning.
My partner and I have been in the 1400-1500 range all season, and neither of us have ever paired with a much higher or lower-rated partner. So no shenanigans there.
Slaytanic Jan 28th 2009 3:26PM
Same shit here. Did 3s last night on the best team I've been on this season.....BM hunter/unholy DK/holy pally (me). But at the end of the night, we were 16-4.....and our rating was less than where we started.
We threw the team away.
Winning 3 points for a game and losing 20 (while still in the 1500 bracket) is the quickest way for me to permanently hang up my arena hat once and for all.
We had a winning record and a losing rating. No thank you.