New Points Costs – A Statistical Model (Updated for S2 2019)

By ClassicalMoser, in X-Wing

There's been a whole lot of talk going around here lately about points costs and the coming adjustment. I've done my fair share of speculating and campaigning for my favorite adjustments as well, but yesterday I got curious about what the actual data says by itself. I constructed a mathematical model to interpret all the data from Meta Wing that's been collected since Wave III dropped. It was a tad ambitious and it's taken me a lot more time than expected, but I've finally come up with some results that I'm happy with, though a few particulars surprise me.

TL;DR: I applied an algorithm to ListFortress data to propose new points for every pilot and upgrade in the game. They can be found in this Google Sheet:

https://docs.google.com/spreadsheets/d/1fZqj7rroGGcAPio285FJ7qOnluaUtixxs508FJq1pwQ/edit?usp=sharing

The basic idea for generating a mathematical formula would be that:

a) Cost adjustment based on performance is a function of mean percentile

b) Sample size sets the absolute maximum by which a pilot's cost can be adjusted based on performance

c) Pilots with a very small sample size get a cost decrease to bring them more into the meta

Basically this yields a formula of:

[New Cost] = [Old Cost] + { [Old Cost] x [Max Correction] x [Adjustment] } + { [Old Cost] x [Scarcity Bonus] }
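
A quick worked example of how the three pieces combine. This is only a sketch: the function name and the sample numbers are made up for illustration, and it assumes Max Correction and Scarcity Bonus are percentages (so they get divided by 100) while Adjustment is a fraction between -1 and +1, with the scarcity term read as a percentage of the old cost.

```python
def new_cost(old_cost, max_correction_pct, adjustment, scarcity_bonus_pct):
    """Combine the performance term and the scarcity term into a new cost.

    max_correction_pct and scarcity_bonus_pct are percentages (assumption);
    adjustment is a signed fraction of the maximum correction.
    """
    performance_term = old_cost * (max_correction_pct / 100.0) * adjustment
    scarcity_term = old_cost * (scarcity_bonus_pct / 100.0)
    return old_cost + performance_term + scarcity_term


# Hypothetical pilot: 50 points, 10% maximum correction,
# adjustment of +0.5, scarcity bonus of -1%.
# Works out to roughly 50 + 2.5 - 0.5 = 52 points.
print(round(new_cost(50, 10.0, 0.5, -1.0), 2))
```

Whether the result gets rounded to a whole point, and how, isn't stated here, so the sketch leaves it unrounded.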

Of course, I couldn't allow the maximum correction to scale linearly with sample size, or that could lead to some extreme overreactions. On the other hand, I'm not really a statistician and I can't plot crazy bell curves and do tons of stuff with standard deviations; the purpose of this exercise is to be fairly approximate anyway. I did want to use a horizontally asymptotic function to set a hard limit on maximum correction size. The simplest horizontally asymptotic function I could think of was a basic harmonic function (1/x), inverted and offset so that the absolute maximum performance-based adjustment possible (assuming an infinitely ubiquitous ship averaged the 99th percentile) is 20%, though of course none of the adjustments in my model approach that level of change.

This is the graph of the Maximum Correction function, where Y is maximum correction percent and X is sample size:

- 10 / (0.01x + 0.5) + 20

[Image: graph of the Maximum Correction function]
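
For anyone who wants to poke at the curve, here is a minimal Python sketch of the Maximum Correction function. The function name is illustrative, and it assumes X is the raw number of recorded uses and Y is a percentage.

```python
def max_correction(sample_size):
    """Maximum performance-based correction, in percent.

    Starts at 0% with no data and approaches (but never reaches) 20%
    as the sample size grows.
    """
    return 20.0 - 10.0 / (0.01 * sample_size + 0.5)


for uses in (0, 50, 150, 950):
    print(uses, round(max_correction(uses), 2))
# 0 -> 0.0, 50 -> 10.0, 150 -> 15.0, 950 -> 19.0
```

So a pilot needs about 50 recorded uses before the cap reaches 10%, and close to a thousand before it gets within a point of the 20% ceiling.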

I then took the data from ListFortress and found the mean pilot's percentile at 26.41. To bring all pilots in line with that performance level, I took the difference between their mean percentile and 26.41, then divided by the mean to give a performance ratio between -1 and +1. I could have used this linearly, but it hardly led to any changes except to the most abusive ships, so once again I used a harmonic function (had to use absolute values and sign correction) to make sure that those even closer to the power curve were brought in line.

This is the graph of the Adjustment function, where Y is the correction to be made (as a fraction of Maximum Correction) and X is the performance ratio:

± 1 / (5x + 1) ± 1

(The signs were corrected in the spreadsheet with the SIGN function)

[Image: graph of the Adjustment function]
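
A rough Python reading of the Adjustment step, for illustration only. It assumes the "±" signs resolve to sign(x) * (1 - 1 / (5|x| + 1)), which is one interpretation of the absolute-value-and-SIGN trick and may not match the spreadsheet exactly; the function names are hypothetical.

```python
import math

MEAN_PERCENTILE = 26.41  # overall mean pilot percentile quoted in the post


def performance_ratio(mean_percentile, baseline=MEAN_PERCENTILE):
    """Relative distance of a pilot's mean percentile from the overall mean."""
    return (mean_percentile - baseline) / baseline


def adjustment(ratio):
    """Signed fraction of the maximum correction to apply (assumed form)."""
    if ratio == 0:
        return 0.0
    magnitude = 1.0 - 1.0 / (5.0 * abs(ratio) + 1.0)
    # Reattach the sign of the ratio, like SIGN() in a spreadsheet.
    return math.copysign(magnitude, ratio)


for r in (-0.2, 0.0, 0.2, 1.0):
    print(r, round(adjustment(r), 3))
# -0.2 -> -0.5, 0.0 -> 0.0, 0.2 -> 0.5, 1.0 -> 0.833
```

Under this reading, a pilot only 20% above the mean already gets half of the maximum correction, which is the "bring ships near the curve in line too" behaviour described above.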

So now we have corrections made based on performance level with a maximum change as a function of sample size. Great! Except it doesn't do anything for the poor ships like Leebo and Rebel Fenn Rau that didn't show up in the data at all, and it will hardly do anything to help the pilots that have only been used 1-25 times in OP since Wave III.

For these, I use another harmonic function starting at -5% and approaching 0 as the ship becomes more common. It has less than a one percent effect on ships with 100 or more uses but should drop prices on unused ships by an amount too small to wreck the meta but hopefully enough to incentivize bringing them more often. Time will tell.

This is the graph of the Scarcity Bonus function, where Y is the correction percent and X is sample size:

- 5 / (0.05x + 1)

[Image: graph of the Scarcity Bonus function]
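
The Scarcity Bonus curve can be sketched the same way. Again the function name is illustrative, and it assumes X is the number of recorded uses and Y is a percentage of the old cost.

```python
def scarcity_bonus(sample_size):
    """Scarcity discount, in percent: -5% with no data, shrinking toward 0."""
    return -5.0 / (0.05 * sample_size + 1.0)


for uses in (0, 20, 180):
    print(uses, round(scarcity_bonus(uses), 2))
# 0 -> -5.0, 20 -> -2.5, 180 -> -0.5
```

The discount halves after the first 20 uses, so it mostly touches pilots that barely appear in the data.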

It was a pretty interesting exercise and the results are... kind of surprising. No matter how I change the coefficients or even use constant functions, the data insists that Guri's price needs to go down and that Kylo needs a substantial increase (both of which I wasn't expecting and don't expect from the devs, but hey, statistics!). Rebel Han is poised as one of the biggest subjects for a nerf, while Dash gets the biggest buff. Now possible are:

• 4x naked TIE Silencers

• 6x Special Forces TIE

• 5x naked Cavern Angels Zealot

• 6x Planetary Sentinel

• 6x Alpha Squadron Pilot

• 5x Crack Sabers

But why keep on talking? Here's the new data so you can have your fun speculating on how grateful we are that this isn't the way the Devs do this:

https://docs.google.com/spreadsheets/d/1fZqj7rroGGcAPio285FJ7qOnluaUtixxs508FJq1pwQ/edit?usp=sharing

EDIT:

Added sheet for upgrades. Method was similar but data is much more sensitive since point values are significantly lower and changes up to 100% are quite common at those values. Further information downthread.

EDIT 2: UPDATED for Season 2 2019. All data since July 31 goes into the S2 projections. The final projection comes from averaging Season 1 and Season 2 projections, giving double weight to the more recent one. I expect to continue with this algorithm in the future.
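
In code, the double-weighted average might look like the sketch below. The function name, parameter names, and the sample costs are hypothetical; the post doesn't say whether or how the result is rounded, so it is left unrounded here.

```python
def combined_projection(season1_cost, season2_cost, recent_weight=2.0):
    """Weighted average of two projections, favouring the more recent season."""
    return (season1_cost + recent_weight * season2_cost) / (1.0 + recent_weight)


# Hypothetical pilot projected at 60 points from S1 data and 66 from S2 data.
print(combined_projection(60, 66))  # (60 + 2 * 66) / 3 = 64.0
```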

Edited by ClassicalMoser
23 minutes ago, Wiredin said:

👻👻👻

JOUST ME

O

U

S

T

M

E

41 minutes ago, ClassicalMoser said:

all the data from Meta Wing that's been collected since Wave III dropped.

So using merged Extended and Hyperspace data... That's going to skew things a fair bit.

5 minutes ago, Kieransi said:

👻👻👻

JOUST ME

O

U

S

T

M

E

there are still points left over for Leia!

Imagine... 3 Ghosts with Leia..... I will eff you up!

29 minutes ago, Hiemfire said:

So using merged Extended and Hyperspace data... That's going to skew things a fair bit.

True, but as I said it’s a pretty rough exercise. That could explain why Kylo is doing so well

1 minute ago, ClassicalMoser said:

True, but as I said it’s a pretty rough exercise. That could explain why Kylo is doing so well

The far funkier quirk is using all metawing data, including random 7 person kit tournaments in Alaska won by Kevin.

Running it on System Opens or Hyperspace Trials only probably generates a more interesting output.

Did Poe get kicked out of the Resistance for insubordination or something?

6 minutes ago, Caduceus01 said:

Did Poe get kicked out of the Resistance for insubordination or something?

Missed him, but added.

New cost: 68

Where most I6s are getting a good-sized hike, he could really be good.

(Exceptions: Dengar, Rebel Fenn, Non-Rebel Hans)

Edited by ClassicalMoser

First, minor overhead: I think Sol Syxa has a mistaken price.

//

It's a really interesting exercise. Great work. One thing which struck me is that your process preserved the small price gap between the Resistance and Rebel generic Falcons. Both Outer Rim Smuggler and Resistance Sympathizer are similarly unplayed, and receive similar reductions.

10 minutes ago, theBitterFig said:

First, minor overhead: I think Sol Syxa has a mistaken price.

//

It's a really interesting exercise. Great work. One thing which struck me is that your process preserved the small price gap between the Resistance and Rebel generic Falcons. Both Outer Rim Smuggler and Resistance Sympathizer are similarly unplayed, and receive similar reductions.

Also fixed.

Yeah, not going to lie, Resistance Sympathizers would be really appealing at 65 points. You could fly three of them. I mean, it may still not be strong but I can imagine a lot of arc-dodgers that would hate that. Plus you have 5 points left to give one the title or one key crew piece. Three naked Outer Rim Smugglers might even be stronger, and tbh is almost gross.

Also notice that triple Defenders is only 2 points away from possible.

Edited by ClassicalMoser

When you figure out what this means:

[image]

Me:

[image: "I'm a scientist"]

1 hour ago, svelok said:

The far funkier quirk is using all metawing data, including random 7 person kit tournaments in Alaska won by Kevin.

Running it on System Opens or Hyperspace Trials only probably generates a more interesting output.

It would undoubtedly be better quality data, but I really need the quantity that was provided by the way I did it. So many ships only have 0-12 uses even including everything. It would be a pity to just "auto-bump" all of these down by 5% when common sense says some may not need it as much though the evidence isn't there to prove it.

So yeah, it's a little skewed but probably better than it would be if I'd been more choosy with my data 😕

3 hours ago, ClassicalMoser said:

Three naked Outer Rim Smugglers [...] might [...] be [...] gross.

FTFY

In all seriousness though, this is a really cool exercise. Thanks for doing it!

I'd be curious to see 3 Resistance sympathizers, since their actions and dial and hit points are worse than other 3-red turret ships. It's like the 3x Knaves with Torpedoes list, a good test of the worst-case scenario for stuff. If it's too strong, I'd certainly be one for advocating a 67 point Resistance Sympathizer. Otherwise? Might be cool. 3x Resistance Sympathizers (6.191 attacks to die, from 3-red with focus) has approximately the toughness of a list of 4x B-Wings, with less total firepower, but with more arc coverage. It'd be interesting to see if it'd work, if it'd be too strong, etc.

Three Outer Rim Smugglers (7.254 attacks to die), however, has total list toughness closer to 4 U-Wings or 5 X-Wings. That might be a step beyond what's acceptable.

//

Another quirk. Init 2 Blue B-Wings go up a point, Init 3 Daggers don't. There's a decent number of similar things, where some generics get adjustments and others don't. A purely frequency-based approach to pricing adjustment is going to create little things like that, which a human wouldn't. Or at least, I don't think a game balance team would. It's interesting to think about the ways that an algorithm will make decisions which are a bit different.

Look, any data which says that Sigmas go up while leaving Echo alone (even going down one!) is ok in my book.

Also some of the frequency stats are skewed for small factions. Gold Torrents got a bump of one, but that’s probably just a frequency effect of the faction having limited options. Drops to the other Torrents, plus the deploy of the N1, should see its frequency drop.

But overall an interesting and well reasoned method, even if a few manual ‘we’re not doing that’ looks are needed, particularly at break points.

Would 5 Scarif Base Pilots be bad? I’d have to try and find five to find out!

15 hours ago, theBitterFig said:

Another quirk. Init 2 Blue B-Wings go up a point, Init 3 Daggers don't. There's a decent number of similar things, where some generics get adjustments and others don't. A purely frequency-based approach to pricing adjustment is going to create little things like that, which a human wouldn't. Or at least, I don't think a game balance team would. It's interesting to think about the ways that an algorithm will make decisions which are a bit different. 

This is one of my favorite things to see when training a new AI algorithm. Doesn't always make sense to a human, but for the AI it was correct at some point. One of the most fun is a clustering algorithm, and seeing the clusters that are created by doing so.

Double-checked meta wing and it seems Kylo is doing exactly as well if you do give more weight to higher levels of play as he is if you treat all equally, so my adjustment isn't as far off as I thought.

He is far better in Hyperspace than in Extended though. (Hyperspace percentile: 34.04 vs Extended percentile: 30.25. Note that both are far better than the overall average of 25.52). Maybe his current weakness is he just doesn't have flexible enough support options or other solid aces to run with?

Anything that suggests increasing the cost of Sigmas is alright by me.

On 6/12/2019 at 7:18 PM, ClassicalMoser said:

• 4x naked TIE Silencers

Well, that already is possible in quick builds, and whilst nice, they're less terrifying than you might imagine; you've got low enough initiative and enough ships on the board that arc dodging with everyone becomes difficult, but don't have the firepower of a 5-ship squad or the warm bodies to block with a true swarm.

Overall, a very interesting experiment. Thanks!

This is really interesting data.

Obviously there's a bunch of problems with sampling this sort of data - Hyperspace vs Extended, equal weight given to all tournaments regardless of size etc. - but you know about them, and if you don't others are already pointing them out. Another thing to point out would be that this is looking solely at ships, and ignoring the idea that certain combos of ships and upgrades are worth more than the sum of their parts. Any such model should, if we're being truly fair, be applied equally to upgrades and analysed based on combinations. But obviously that's orders of magnitude more work.

I think the big catch is that no matter how much you fine tune your data, a mathematical model for points costing should never be the only driver. There will always need to be some 'common sense' adjustments that might deviate from the model, or are imposed by factors that are less easy to quantify. And of those, the big one is definitely thresholds, as a few other people have mentioned. Lothal Rebel is obviously the big one. A -3 point adjustment might make sense for the ship itself, but I doubt anyone seriously wants 3x Lothal Rebels to suddenly become a possibility.

I also think it's important to acknowledge that some of these results are simply due to favourable list combinations. Soontir is in a lot of successful lists, for example, and I can see why the model would suggest a 3 point increase. But does anyone actually think that's warranted? I honestly think Soontir is one of the best costed pilots in the game at the moment. Sure, he's good. But you also need to be a good player to make the most of him. I'm not good enough, and the few times I've used him he's done very little for me. I'm also really not sure about the points increases on Howl and Iden. Yes, they're strong but only because you can currently fit a decent swarm around them. Increase both and the synergy falls apart, and they become useless.

What's interesting, though, is how many of these adjustments do make sense. I really hate to admit it, because I love that my favourite pilot is popular and doing well, but Wedge probably does need a 3 point increase. That would make him the equivalent of half a point more expensive than 1e Wedge with Renegade Refit, IA and S-foils. Which is probably fair enough given that the 2e X-Wing gets the benefits of all the 1e s-foils dial upgrades all the time, and gets the open side barrel roll all the time, while the extra hull is probably better, all told, than IA (doesn't require you to spend points on an astro, doesn't mean you lose the astro's ability when you hit that given damage threshold).

The tweaks to generics across the board look good. It's fascinating that all Defenders, even Rexler, get cheaper.

4 hours ago, GuacCousteau said:

I also think it's important to acknowledge that some of these results are simply due to favourable list combinations. Soontir is in a lot of successful lists, for example, and I can see why the model would suggest a 3 point increase. But does anyone actually think that's warranted? I honestly think Soontir is one of the best costed pilots in the game at the moment. Sure, he's good. But you also need to be a good player to make the most of him. I'm not good enough, and the few times I've used him he's done very little for me. I'm also really not sure about the points increases on Howl and Iden. Yes, they're strong but only because you can currently fit a decent swarm around them. Increase both and the synergy falls apart, and they become useless. 

Honestly I think Howl should go up by MORE so Iden and the generics can come down by a bit. I actually felt like the Academy was the one getting screwed over here.

I’d also like to see Iden in a list without Howl for once. Or any TIE swarm without Howl. If something is auto include that means it’s too cheap. Maybe Howl + Iden should be 80, but they should split it 45-35 instead of 40-40, and then you can reduce the generics and leave most of the named TIEs alone.

Regarding Soontir it’s hard to say. 3 points wouldn’t kill him but again, it might make him less of a default go-to and more of a careful choice.

Edited by ClassicalMoser
52 minutes ago, ClassicalMoser said:

If something is auto include that means it’s too cheap.

Not necessarily, it can just mean that this is the only ship that does this particular thing, and no other similar ability exists.

For example, lots of FO lists are flying Kylo. That doesn't mean he's too cheap, it's just that he's the only high repo ace in the entire faction, so those people who wanna play FO and like ace playstyle will naturally go toward Kylo.

Same for Empire. Empire has very poor support crew and abilities, usually overcosted or hard to use. That's why every swarm list uses Howlrunner: she's the easiest support to use and fits tightly in the TIE swarm. It's not that she's too cheap, there's just no other available choice to choose from!

4 minutes ago, DarthSempai said:

For example, lots of FO lists are flying Kylo. That doesn't mean he's too cheap, it's just that he's the only high repo ace in the entire faction, so those people who wanna play FO and like ace playstyle will naturally go toward Kylo.

Same for Empire. Empire has very poor support crew and abilities, usually overcosted or hard to use. That's why every swarm list uses Howlrunner: she's the easiest support to use and fits tightly in the TIE swarm. It's not that she's too cheap, there's just no other available choice to choose from!

It does mean that they're too cheap relative to the other things available. Of course, in those cases it means the other items need to come down in price. I think the Howlswarm is correctly priced, but I think the price should be rearranged between the existing elements to make more variation viable. I like TIE swarms but I dislike having to fly them in a block.

Of course, my model does have Kylo's cost going up a little and all the Reapers and Lambdas and almost everything FO going down too. My model is skewed more towards making everything playable than making everything competitive. Maybe Kylo won't be as competitive anymore, but at least the rest of FO will be playable. Same for Vader, Soontir, and Howl; not as competitive, but opening up lots more options for list-building by pushing them a little off-center.

Honestly I'm really hoping the devs don't screw up this points adjustment. It's annoying that my own model will probably make me disappointed in the rulings they do make. :(

Very interesting analysis. Unlike MajorJuggler's purely mathematical jousting models, you are simply letting the tournament data "speak" for itself, which is in some ways the best approach. Thank you for this. I do have a few constructive comments:

1. I think you should rename the thread title to "New Idea for Point Costs - A Data- and Percentile-Driven Approach"

2. I noticed that practically all pilots in your spreadsheet have a reduced cost. That strikes me as odd: if every pilot gets a price reduction then nothing is achieved and it simply leads to inflation. I am wondering if the data is driven by your "scarcity boost"? Would you kindly re-upload the spreadsheet, but this time mark whether each price adjustment comes from the performance adjustment or from the scarcity bonus?

3. This has been mentioned, but data should also be weighted by tournament size: being top dog in a 64-man tournament is a totally different beast than winning your 8-person laugh-it-up-fuzz-ball tournament kabooze.
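
One way to sketch the weighting suggested in point 3, assuming each result is weighted by its tournament's participant count. That weighting choice, the function name, and the sample numbers are all assumptions for illustration, not something taken from the spreadsheet.

```python
def weighted_mean_percentile(results):
    """Mean percentile weighted by tournament size.

    results: iterable of (percentile, participants) pairs (hypothetical shape).
    """
    results = list(results)
    total_weight = sum(participants for _, participants in results)
    if total_weight == 0:
        raise ValueError("no tournament results to average")
    weighted_sum = sum(pct * participants for pct, participants in results)
    return weighted_sum / total_weight


# Made-up pilot: 90th percentile at an 8-person event,
# 40th percentile at a 64-person event.
sample = [(90.0, 8), (40.0, 64)]
print(round(weighted_mean_percentile(sample), 2))  # 45.56, vs. 65.0 unweighted
```

The big event dominates the average, which is the behaviour the comment is asking for.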

Edited by Sciencius