Note: As of December 5, 2014, this thread is obsolete. Please see the new thread here for new developments:
If you think that hokey equations are no match for good squad tactics, because no mystical dice gods control your destiny...
... otherwise, please do keep reading.
Warning: very long MathWing post ahead!
Introduction
This post has been a long time coming. It is long, so I will try to keep it as organized as possible. I added spoiler sections so you can click on whichever section you would like, rather than getting hit with a wall of text. If quoting when replying, please leave only the portion that you are replying to rather than the whole post.
The underlying methodology will all be self-contained within this post, so the process is as open to review as possible.
Limitations
- This process only considers the baseline ships at their equivalent PS1 cost. Higher PS pilot abilities (like Howlrunner) are not considered. Elite Pilot skills are not considered.
- The process doesn't work very well with the HWK, which is currently the only 1-attack ship. This is also because the HWK is typically best used with named pilots, and pilot abilities are not considered in this analysis.
- Most of the large ships (as of wave 1-4) have unique traits that make it difficult to directly balance using this method, since there is no comparative baseline. As we get more large base ships in wave 5 and we see similar functionality across more ships, this should get more refined.
- A ship does not have to be exactly at its "100% fair value" to be competitive. Everything is normalized to the PS1 TIE Fighter, which is the most point efficient ship through wave 3. Therefore, using the PS1 TIE Fighter as the reference, we should expect most of the ships to appear slightly overcosted. What we are looking for instead, are ships that are clear outliers.
- The game has an element of paper-rock-scissors, so the "fair value" does not necessarily predict that one ship is universally better than another. For example, more maneuverability inherently increases a ship's fair value, but it's nearly useless against a turret list.
- If a ship is loaded up with lots of potential upgrades, then you're inherently paying something for that privilege, even if it goes unused. Conversely, ship upgrade slots contribute a relatively small amount to the "fair value" since the upgrades themselves are generally self-balancing by their point costs, but there are still many amazing combinations and unique squad ideas. This is, of course, part of what makes the game so great. :-)
- The fair point value does NOT consider the opportunity cost associated with target priority, which in this game is generally determined by the attacker. Therefore, mixing glass cannons and tanks together in the same squad generally does not work very well. For example, spending a ton of points on a glass cannon that will likely get killed first (Alpha TIE Interceptors + Targeting Computer), or conversely, spending extra points on very durable units that your opponent simply waits to kill last (A-wing + Stealth) are both generally bad ways to spend points. Again, these factors are not considered in the fair points value, so you still need to consider tactics and specific squad composition to really determine a ship's situational usefulness.
Background
There are at least four ways to predict a ship's "fair", or "balanced" cost. If you can think of more, please chime in and I will add them to the list.
- Extensive play testing
- Comparing similar ships and differentially adding or subtracting points for different capabilities
- Converting attack/defense/dial/upgrade/etc parameters into a form that can be used as an input to Lanchester's Square Law
- Combat Salvo Model numerical simulations
Notice that I did not include the linear regression formulas that have previously been developed. Linear regression formulas that look backwards at existing ship costs and try to figure out a fixed price for each kind of upgrade/dice/etc are fundamentally flawed and were doomed to failure from the start.
A few thoughts on each category:
- Play testing: This is obviously the best method, and technically doesn't belong in this list since it doesn't predict balanced costs, it actually proves what the balanced values are. The main downside is that it requires a large sample size, and therefore cannot be used for discussing upcoming waves (even fully revealed ones), or custom stat ships that people have put together but haven't had much (or any) time to test yet.
- Differential point costing: This is the easiest of the three "theory crafting" methods, and it works OK for comparing ships that are very similar. As the differences between the two ships become greater, the margin for error becomes much larger. It also assumes that you can accurately assign point values for the different capabilities, which isn't trivial, but sometimes can be gleaned from looking at the point structure and capabilities of existing ships.
- Lanchester's Square Law: this is the method that I will be discussing here. The difficulty is in figuring out how to accurately quantify all of the various game mechanics to obtain a numerical "combat effectiveness" for each ship. A ship's cost is then proportional to the square root of its combat effectiveness. This approach has scaling problems when the squad point total is low, such as in X-wing, since the assumptions of the continuous time differential equations upon which it is based break down. Thankfully, this is relatively easy to compensate for.
- Salvo Model: this is by far the most difficult method, and takes a fair amount of expertise to generate a reasonable model. Even if there was an ideal simulation, there is still not necessarily a direct link between the results and a points prediction, since a ship's balance is not just based on a single matchup, but rather an aggregate of the entire metagame. I will undoubtedly get around to building such a model eventually, but more for analyzing matchups and less for predicting overall balance.
Motivation
Those of you that have followed any of my MathWing type posts might know that I have written several Matlab scripts to make statistical analysis of dice rolls possible for a variety of circumstances. One of the side goals of this project was to use it as inputs to solve the bigger problem of mathematically predicting approximately what a ship's "fair" cost should be. This approach has two main goals:
- Characterize each ship's "jousting" value. The jousting values are based only on attack and defense dice, shields and hull. Although there are many other factors that determine a ship's total balanced cost, the jousting value is extremely informative, and very well defined. There is a non-trivial correlation between a ship's jousting value and the frequency of its successful use in competitive tournaments. I have referred to this in previous posts as a "baseline cost", but I'll use "jousting" going forward, since it is more intuitively descriptive.
- Characterize each ship's "fair" value, considering all of its capabilities, across the entire meta game. This is much more difficult, and requires assigning some value to all possible actions, dial mechanics, upgrades, and any other unique factor on a ship.
In general, FFG has done an excellent job of balancing the point costs for the ships, although there are a few outliers. Furthermore, extensive play testing is the ultimate way to determine balance between ships, NOT blindly using formulas and equations such as these. So why go through this much trouble? Here's a few reasons:
- Because I can. :-)
- The results all seem to fall within about a point (or less) of the community's consensus for each ship. It is extremely difficult, if not impossible, to argue that the results here are statistically insignificant.
- It makes a great starting point for pricing custom ships, if you're into that.
- Knowing a ship's "jousting" value can be helpful when thinking about tactics to get the most out of your squads.
- It could be useful to the community, especially with more feedback and refinement. People occasionally ask for a formula that can be used
- It has reasonably good predictive power at theorizing how effective the partially revealed new ships will be, even without having to know all their details. This was true for waves 2 and 3, so wave 4 will likely be no different.
Making Lanchester's work with the X-wing model
Now, onto the gory details. It takes a reasonable understanding of math to get to the results here, but they can be recreated by anyone knowledgeable and determined enough.
First off, we start with Lanchester's Square Law. It states that if you have a large number of ranged combatants that can all fire at each other, the force strength of each side as a function of time can be given as a differential equation, with the result that a force's overall combat strength is:
F = N^2*E
Where F is the force's total combat strength, N is the number of units, and E is the combat effectiveness of each individual unit. Running a Lanchester's simulation requires only 2 inputs:
- ratio of the starting strength (number of combatants) of each army
- ratio of the damage-rate per unit
The first ratio is straightforward: it is the ratio of the number of ships, mathematically represented as A(0)/B(0). The combat effectiveness E determines the damage-rate per unit ratio. I.e. if the combat effectiveness of squad A is twice that of squad B, then the ratio of the damage-rate per unit is 2:1.
Here is a graphical example from Wikipedia:
What we want to do here instead, is have two different squads with equal force values F but at different ship costs, which yields the equation:
N1^2*E1 = N2^2*E2
If both squads spend the same number of points P, then we have:
N1 = P/C1
N2 = P/C2
where C1 and C2 are the costs of each individual ship in squads 1 and 2 respectively. If we solve for the cost C2, we have:
C2 = C1*(E2/E1)^(1/2)
So, the basic premise is that because a squad's power increases proportionally to the square of the number of ships present, an individual ship's cost should increase as the square root of its combat effectiveness.
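As a quick numerical check of the square-root relationship above, here is a minimal Python sketch. The function name `square_law_cost` is only an illustrative label; it evaluates C2 = C1*(E2/E1)^(1/2) and then confirms that two squads bought with the same points P end up with equal force N^2*E.

```python
import math

def square_law_cost(c1, e1, e2):
    """Cost C2 that balances a ship of effectiveness e2 against a
    reference ship costing c1 with effectiveness e1 (pure Square Law)."""
    return c1 * math.sqrt(e2 / e1)

# A ship four times as effective as a 12 point reference ship
# should cost twice as much, not four times as much.
c2 = square_law_cost(12, 1.0, 4.0)
print(c2)  # 24.0

# With equal points P, both squads have the same force N^2 * E.
P = 96
n1, n2 = P / 12, P / c2
print(n1**2 * 1.0, n2**2 * 4.0)  # 64.0 64.0
```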
However, this assumes that the number of individual units present is large enough that the changes in F over time are small enough that the curve can be approximated as a continuous time equation. In X-wing, we typically field 3-8 ships per side in a 100 point game, so there are two factors that start to take effect:
- A single, powerful ship will continue to deal out damage regardless of whether it has 1 hull left, or 13. Since Lanchester's assumes that damage output goes down linearly as you receive damage, it will artificially undervalue expensive ships.
- Lanchester's assumes that there is no wasted damage on "overkilling" any individual unit. However, each time an individual ship gets killed in a discrete time system (round based), the average damage of the attacking squad is reduced for that one shot, unless the target's hit points are the maximum damage that you can deal out in one shot. I.e. if you roll 3 hits against a ship with 1 hull, you still only do 1 damage, so the average damage cannot be higher than 1. Lanchester's does not consider this factor, so in this case it artificially overvalues expensive ships.
The first point is important, so I am directly addressing the concern that "high hit point ships aren't treated fairly by Lanchester's Square Law (in low point games)!" First off, this is not just related to a ship's hit points, but rather to a ship's total combat effectiveness E. Thankfully, the effect is easy to determine numerically, and I have done so at a baseline of a 96 point squad. The process is relatively simple and can be easily implemented in Excel. You build a combat "simulator" that calculates two sides continually doing damage to each other, assuming ideal focus fire onto one ship at a time on each side. The damage done is the number of ships that a side has, times the damage output per ship. Once a ship on one side dies, that side's damage output decreases. Repeat until all ships are destroyed on one or both sides.
To determine how much of an effect expensive ships have at 96 points, I did the following:
- Set side 1 to have 8 ships (8 TIEs), that each have 3 hull, and 1 attack per unit time.
- Set side 2 to have [7-1] ships with 1 attack each.
- Adjust the hull points on the side 2 ships until both squads simultaneously kill each other.
- Repeat steps 2 and 3 for 1 through 7 ships on side 2.
For a 96 point squad game, I got a curve with the following data points:
Ship Cost Hull
12 3
13.7 3.855
16 5.142
19.2 7.2
24 10.8
32 18
48 36
96 108
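The procedure above can also be sketched in a few lines of Python rather than Excel. This is a reconstruction under the stated assumptions only (continuous damage, perfect focus fire, 1 attack per ship per unit time); the names `duel` and `break_even_hull` and the use of bisection to find the break-even hull are illustrative choices, not the original spreadsheet.

```python
def duel(n1, hull1, atk1, n2, hull2, atk2, eps=1e-12):
    """Continuous-damage fight with perfect focus fire on one ship per side.
    Returns side 1's remaining hit points minus side 2's when the fight ends."""
    hp1, hp2 = hull1, hull2  # hull left on the ship currently being shot at
    while n1 > 0 and n2 > 0:
        dps1, dps2 = n1 * atk1, n2 * atk2
        dt = min(hp1 / dps2, hp2 / dps1)  # time until the next ship dies
        hp1 -= dps2 * dt
        hp2 -= dps1 * dt
        if hp1 <= eps:  # side 1 loses a ship, so its damage output drops
            n1, hp1 = n1 - 1, hull1
        if hp2 <= eps:  # side 2 loses a ship
            n2, hp2 = n2 - 1, hull2
    left1 = (n1 - 1) * hull1 + hp1 if n1 > 0 else 0.0
    left2 = (n2 - 1) * hull2 + hp2 if n2 > 0 else 0.0
    return left1 - left2

def break_even_hull(n2, n1=8, hull1=3.0, lo=0.01, hi=1000.0):
    """Bisect on side 2's hull until both squads die at the same moment."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if duel(n1, hull1, 1.0, n2, mid, 1.0) > 0:
            lo = mid  # side 1 still won, so side 2 needs more hull
        else:
            hi = mid
    return (lo + hi) / 2

for n2 in range(8, 0, -1):
    print(round(96 / n2, 1), round(break_even_hull(n2), 3))
```

For 4, 2 and 1 ships on side 2 this gives 10.8, 36 and 108 hull, in line with the table.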
These data points are obviously slightly different than what would be predicted by Lanchester's Square Law. Since I wanted to account for both effects listed above, but wanted to weight it more heavily towards the first factor that helps expensive ships, I took the average of Square Law (Curve A), and the numbers in the curve above (Curve B) to create "Curve C". Using an exponent of 1.92 yields a curve that is within 1% of "Curve C" for ship costs up to 32. Even at a 48 point ship cost, the error was less than 2.5%: the E^(1/1.92) curve predicts 41.6 hull at 48 points, vs 43.8 hit points from Curve C.
So the formula, now "corrected" for expensive point cost ships is:
C = 12*E^(1/1.92)
where E is the ship's combat power normalized to a PS1 TIE Fighter.
FYI, 1/1.92 = 0.5208, so for easier calculator typing, you can also use E^0.52 rather than E^0.5.
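To see how much the corrected exponent matters, here is a small Python sketch that evaluates the formula with both the pure Square Law exponent (2) and the corrected one (1.92). The function name `fair_cost` and the sample effectiveness values are illustrative assumptions.

```python
def fair_cost(e, exponent=1.92, reference_cost=12.0):
    """Point cost for combat effectiveness e, normalized so that the
    PS1 TIE Fighter (e = 1) costs reference_cost points."""
    return reference_cost * e ** (1.0 / exponent)

# Columns: effectiveness, pure Square Law cost, corrected cost.
for e in (1.0, 2.0, 4.0, 8.0):
    print(e, round(fair_cost(e, exponent=2.0), 2), round(fair_cost(e), 2))
```

The two curves agree at E = 1 and drift apart as E grows, with the corrected curve assigning the higher cost to the more effective ship.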
Now, obviously there can still be some future improvement in this area. It's on my radar to eventually build a salvo combat model, using the exact probability density functions from the dice mechanics. The results could help to further refine the scaling here. But, for now, simply changing the exponent to 1.92 will have to do, and it appears to be a pretty reasonable approximation.
Combat Effectiveness Coefficients
The combat effectiveness E is a unit's damage output per unit time, multiplied by that unit's durability. To apply this model to X-wing, I have broken this down into the following categories:
- Attack
- Durability
- Dial
- Actions
- Firing arc
- Upgrades
Each of these categories is nominally 1, and then they are all multiplied together.
Since we are using the PS1 TIE Fighter as our baseline, it has a value of 1 in each category.
Attack: 2 attack dice vs. 3 attack dice vs. 4 attack dice
In order to calculate the average damage that 3 or 4 attack dice does relative to 2 dice, the following assumptions were used:
- attacker has focus 2/3 the time
- defender has focus 1/2 the time
- range bins probabilities are [15 23 9 4]/(15+23+9+4) for [R1 R2 R3 R3+asteroid]
- defender base defense dice is meta dependent, see below
Since we are looking to get an overall aggregate score, I'll treat each of these categories as independent, assign the weighted probability to each, and then calculate the aggregate totals. The base number of defense dice was evaluated in three different "meta" environments [1 dice%, 2 dice%, 3 dice%]:
- low defense dice meta: [45%, 25%, 30%]
- "standard" defense dice meta: [30%, 25%, 45%]
- high defense dice meta: [15%, 25%, 60%]
For a reference on what's currently popular in high level play, see this thread:
http://community.fantasyflightgames.com/index.php?/topic/105107-2014-regionals-results/
Another note while we are here: you don't always need to spend your focus for attack or for defense, so adding up the probability of having focus available for both can certainly be more than 100%. Since we only care about the overall statistical averages, and not conditional probabilities in a specific scenario, we can treat these as independent variables. That being said, please speak up if you have a better method of estimating how often the defender has focus available, since it does affect the result.
I used the ranges at the Worlds 2013 Finals game as a baseline: 15 @ range 1, 23 @ range 2, 9 @ range 3, and 4 @ range 3 through a rock.
This results in the following damage numbers, normalized to 2 attack dice:
                       defense meta
                       low defense   normal defense   high defense
1 dice:                0.4319        0.4189           0.4022
2 dice:                1             1                1
3 dice:                1.7137        1.7590           1.817
4 dice:                2.5129        2.6297           2.779
Also for reference:
2 dice + 1 reroll:     1.3436        1.355            1.3695
3 dice + 1 reroll:     2.2288        2.3125           2.4196
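For anyone who wants to experiment with these assumptions, here is a rough Python sketch of the kind of calculation involved. It is not the original Matlab script and is not guaranteed to reproduce the table to the last digit. It assumes the standard dice faces (attack die: 3 hit, 1 crit, 2 focus, 2 blank; defense die: 3 evade, 2 focus, 3 blank), +1 attack die at range 1, +1 defense die at range 3 and another for an asteroid, and that a focus token is always spent when available. All names are illustrative.

```python
from math import comb

def binom_pmf(n, p):
    """Probability of k successes on n dice, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def expected_damage(att_dice, def_dice, att_focus, def_focus):
    """Mean number of uncancelled hits for a single attack."""
    p_hit = 6 / 8 if att_focus else 4 / 8    # hit + crit faces, plus focus faces if focused
    p_evade = 5 / 8 if def_focus else 3 / 8  # evade faces, plus focus faces if focused
    hits, evades = binom_pmf(att_dice, p_hit), binom_pmf(def_dice, p_evade)
    return sum(ph * pe * max(h - e, 0)
               for h, ph in enumerate(hits) for e, pe in enumerate(evades))

# (probability, extra attack dice, extra defense dice) for R1, R2, R3, R3 + asteroid
RANGE_BINS = [(15 / 51, 1, 0), (23 / 51, 0, 0), (9 / 51, 0, 1), (4 / 51, 0, 2)]
DEFENSE_META = {1: 0.30, 2: 0.25, 3: 0.45}  # "standard" defense dice meta
P_ATT_FOCUS, P_DEF_FOCUS = 2 / 3, 1 / 2

def average_damage(base_attack):
    """Damage averaged over range bins, defense meta and focus availability."""
    total = 0.0
    for p_range, extra_att, extra_def in RANGE_BINS:
        for base_def, p_def in DEFENSE_META.items():
            for af, p_af in ((True, P_ATT_FOCUS), (False, 1 - P_ATT_FOCUS)):
                for df, p_df in ((True, P_DEF_FOCUS), (False, 1 - P_DEF_FOCUS)):
                    total += p_range * p_def * p_af * p_df * expected_damage(
                        base_attack + extra_att, base_def + extra_def, af, df)
    return total

baseline = average_damage(2)
for dice in (1, 2, 3, 4):
    print(dice, round(average_damage(dice) / baseline, 4))
```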
In order to calculate the average durability that 1 or 2 base defense dice have relative to 3 dice, the following assumptions were used:
- attacker action economy:
  - No action: 35%
  - Focus: 30%
  - Target Lock: 25%
  - Focus + Target Lock: 10%
- defender has focus 1/2 the time
- range bins probabilities are [15 23 9 4]/(15+23+9+4) for [R1 R2 R3 R3+asteroid]
- attacker base dice number is meta dependent, see below
- critical hit values specially weighted, see below.
I do not directly account for a percentage of shots that are affected by Howlrunner. However I wanted to capture some of Howlrunner's reroll ability in how it changes the hit / critical hit ratio, so I shifted some of the focus actions into Target Lock actions. I realize that normally, Target Lock will be taken far less often than focus. I also included some small percentage of shots as having both TL + F to simulate the occasional Rebel PtL, and to estimate Howlrunner's reroll effect.
The base number of attack dice was evaluated in three different "meta" environments. I used 5% as a baseline for 4 attack dice once wave 4 comes out, and assume the Phantom will see slightly below average table time. [2 dice%, 3 dice%, 4 dice%]:
- low attack dice meta: [67%, 32%, 1%]
- "standard" attack dice meta: [33%, 62%, 5%]
- high attack dice meta: [20%, 70%, 10%]
Critical hits are weighted specially. There are 7 Direct Hit cards and 2 Minor Explosion cards in the 33 card damage deck, and their extra damage can be directly computed. The remaining critical hits are treated as being worth an additional 1/3 of a regular hit. None of those other cards do direct damage in the strictest sense, but some of the effects can be very nasty, so I had to put some value on them. That makes the critical hit weighting:
Crit Weighting = 1 + 7/33 + (3/8)*2/33 + (1/3)*(33-7-2)/33 = 1.4773
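The weighting can be checked term by term with a few lines of Python. The constant names are illustrative; the 3/8 factor is taken directly from the formula and is read here as the chance that a Minor Explosion's die roll actually deals damage.

```python
DECK_SIZE = 33
DIRECT_HIT_CARDS = 7               # each counts as one extra damage
MINOR_EXPLOSION_CARDS = 2          # extra damage only on a 3/8 die roll
P_MINOR_EXPLOSION_DAMAGE = 3 / 8
OTHER_CRIT_VALUE = 1 / 3           # assumed worth of every other crit card

other_cards = DECK_SIZE - DIRECT_HIT_CARDS - MINOR_EXPLOSION_CARDS

crit_weight = (
    1                                                        # the damage itself
    + DIRECT_HIT_CARDS / DECK_SIZE                           # Direct Hit bonus
    + P_MINOR_EXPLOSION_DAMAGE * MINOR_EXPLOSION_CARDS / DECK_SIZE
    + OTHER_CRIT_VALUE * other_cards / DECK_SIZE             # everything else
)
print(round(crit_weight, 4))  # 1.4773
```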
The damage is then calculated two ways: once with critical hits weighted as above to simulate Hull durability, and again with critical hits weighted as 1 to simulate Shield durability. The results normalized to Hull Durability for 3 defense dice are:
           Hull Durability                              Shield Durability
           low attack   normal attack   high attack     low attack   normal attack   high attack
1 dice:    0.5113       0.5490          0.5647          0.5737       0.6161          0.6336
2 dice:    0.7071       0.7312          0.7411          0.8030       0.8313          0.8427
3 dice:    1            1               1               1.1458       1.1482          1.1490
4 dice:    1.4268       1.3876          1.3717          1.6458       1.6053          1.5886
These results indicate that Shields are worth about 12% to 15% more than Hull, which was far below my earlier estimate of 25%. The meta dependent durability coefficient is therefore:
( Shields*shield_coefficient + Hull*hull_coefficient ) / 3
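As a worked example of the durability formula, here is a short Python sketch using the "normal attack" columns of the table above. The stat lines (TIE Fighter: 3 defense dice, 3 hull, 0 shields; X-wing: 2 defense dice, 3 hull, 2 shields) come from the ship cards rather than from this post, and the function name `durability` is illustrative.

```python
# Per hit point durability in the "normal attack" meta, keyed by defense dice.
HULL_COEFF = {1: 0.5490, 2: 0.7312, 3: 1.0, 4: 1.3876}
SHIELD_COEFF = {1: 0.6161, 2: 0.8313, 3: 1.1482, 4: 1.6053}

def durability(defense_dice, hull, shields):
    """Durability coefficient relative to a TIE Fighter (3 hull, 3 dice)."""
    return (shields * SHIELD_COEFF[defense_dice]
            + hull * HULL_COEFF[defense_dice]) / 3

print(durability(3, hull=3, shields=0))            # TIE Fighter: 1.0
print(round(durability(2, hull=3, shields=2), 4))  # X-wing: 1.2854
```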
Dial
The dial has been broken down into various categories and each category has been given a value. The total dial coefficient is 1 + all the category scores, which are:
Tightest white turn
0: 1 turn
-0.03: red 1 turn, white 2 turn
-0.03: large base ship white 1 turn
-0.04: white 2 turn
-0.05: red 1 turn, red 2 turn, white 3 turn
-0.06: red 2 turn, white 3 turn
-0.06: Large ship base white 2 turn
-0.2: large base 3 red turn
slowest straight (+1 for large ship)
+0.025: 1 forward
0: 2 forward
fastest straight (+1 for large ship)
-0.025: 3 straight, red 4 straight
-0.01: 4 straight
0: 5 straight
K-turns
-0.3: no K-turn
-0.02: 1 red K-turn
0: 2 red K-turns
+0.3: 1 white K-turn
stress clear
-0.06: green on 2 straights
-0.05: green on 4 straights
0: green on 2 straights, 1 bank
+0.01: green on 3 straights, 1 bank
+0.05: green on 3 straights, 1 bank, 1 turn
+0.055: green on 4 straights, 1 bank, 1 turn
specialty
0: none
+0.05: red 0
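A dial coefficient is then just 1 plus the chosen score from each category. The Python sketch below uses a hypothetical ship whose category picks are made up for illustration; they are not the values assigned to any real ship in this post.

```python
def dial_coefficient(category_scores):
    """1 plus the sum of the per-category dial scores."""
    return 1.0 + sum(category_scores.values())

# Hypothetical ship: one pick from each category in the list above.
hypothetical_ship = {
    "tightest white turn: white 2 turn": -0.04,
    "slowest straight: 1 forward": +0.025,
    "fastest straight: 4 straight": -0.01,
    "K-turns: 1 red K-turn": -0.02,
    "stress clear: green on 2 straights, 1 bank": 0.0,
    "specialty: none": 0.0,
}
print(round(dial_coefficient(hypothetical_ship), 3))  # 0.955
```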
Resulting dial values
Actions
For actions, I sum together all the values listed as deltas relative to a TIE Fighter: positive points if the ship has the action but the TIE doesn't, and negative points if the ship does not have the action but the TIE does. The reason I add all these actions together first, rather than multiplying them together, is because there are diminishing returns on having many actions on your bar, since you can only perform one per round. This obviously excludes Push the Limit, which I am not analyzing.
I started by estimating how much additional damage Target Lock yields, since this is the only action that can be easily quantified in terms of dice rolls. I modified the damage calculator so that instead of the attacker having no action 1/3 of the time, and a focus 2/3 of the time, the attacker has no action 1/3 of the time, focus 5/9 of the time, and focus+target lock 1/9 of the time. The damage increase, for both 2 and 3 attack ships, is right around 5%. So I gave Target Lock a value of 0.05, and based everything off of that.
Target Lock 0.05
Evade 0.015
Barrel Roll 0.035
Boost = (attack/durability)/50
Cloak 0.15
I weighted boost as more useful for glass cannons than tanks, since glass cannons need to use boost to remain out of arc. I only used the "normal" attack/durability meta to calculate the boost coefficients.
So, for example, an X-wing is lacking Evade and Barrel Roll (-0.015 -0.035), but gains Target Lock (+0.05), so its net action value is 1. These values can certainly be fine-tuned, but they are an approximate starting point. Cloak is obviously a complete guess at this point; I'm just guessing that it will be very good. If we ever see (or dream up) a ship that has no focus, then we would need a weight for that as well, but so far all small and large base ships have focus. Again, these coefficients are certainly up for debate. I would love to hear people's thoughts.
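The action bookkeeping can be sketched in Python as follows. The action values are the ones listed above; the TIE Fighter bar (focus, evade, barrel roll) and the X-wing bar (focus, target lock) come from the ship cards, and reading "attack" and "durability" in the Boost formula as the ship's attack and durability coefficients is an assumption. Function names are illustrative.

```python
ACTION_VALUE = {"target lock": 0.05, "evade": 0.015,
                "barrel roll": 0.035, "cloak": 0.15}
TIE_FIGHTER_BAR = {"focus", "evade", "barrel roll"}

def action_coefficient(action_bar, attack_coeff=1.0, durability_coeff=1.0):
    """1 + (actions gained vs. the TIE Fighter) - (actions lost vs. the TIE)."""
    def value(action):
        if action == "boost":  # worth more to glass cannons than to tanks
            return (attack_coeff / durability_coeff) / 50
        return ACTION_VALUE.get(action, 0.0)  # focus is on every bar, so 0
    gained = sum(value(a) for a in action_bar - TIE_FIGHTER_BAR)
    lost = sum(value(a) for a in TIE_FIGHTER_BAR - action_bar)
    return 1.0 + gained - lost

# X-wing: gains Target Lock, loses Evade and Barrel Roll.
print(round(action_coefficient({"focus", "target lock"}), 3))  # 1.0
```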
Firing Arc
This only affects large base ships that have an innate primary firing arc that is not simply forward facing. Firing arcs help both offensively (closer shots, and more of them), and defensively (getting out of arcs while still being able to fire). I used the following coefficients:
normal arc: 1
360 degree arc: 1.75 (YT-1300)
forward + rear arc: 1.25 (Firespray)
It's obviously difficult to exactly quantify these numbers, with these firing arcs being unique to these ships.
Upgrades
All of the upgrade values are multiplied together, including having multiple crew, which currently only affects the YT-1300 (i.e. 2 crew is worth 1.075^2 not 2*1.075).
Turret: 1 attack ship 1.97 (Turret on 2 attack ship * Ion1 / 1 attack)
Turret: 2 attack ship 1.1
Cannon: 3 attack ship 1.01
System Upgrade 1.05
Crew 1.075
Droid 1.05
Ordnance
value description
1 No ordnance
1.01 1 missile / torpedo with 3 base attack
1.02 1 missile / torpedo with 2 base attack
1.03 1 missile + 1 bomb with 3 base attack
1.04 full loadout with 2 base attack
Cannons are very expensive, just like missiles and torpedoes, so I consider the cannon slot to be basically self-balancing, and only give it a coefficient of 1.01. I was curious about the HLC, so I calculated that the net increase in a ship's effectiveness is (2.4506/1.7401)^(1/1.92) = 1.195. That's pretty good, but the problem is that it costs a whopping 7 points, so your ship would have to cost 7/(1.195-1) = 35.9 points at PS1 to break even! When you factor in the increased durability from being able to stay at long range, it helps justify the cost, but in general I still think the high point cost of the cannons themselves makes them self-balancing.
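The HLC break-even arithmetic above can be reproduced in a few lines of Python. The two damage figures and the 7 point cost are the ones quoted in the paragraph; the variable names are illustrative.

```python
EXPONENT = 1.92
damage_with_hlc = 2.4506  # relative damage with the Heavy Laser Cannon
damage_without = 1.7401   # relative damage with the primary weapon
hlc_cost = 7              # points

# Fractional increase in the ship's fair cost from the extra damage.
gain = (damage_with_hlc / damage_without) ** (1 / EXPONENT)

# Ship cost at which that fractional increase is worth exactly 7 points.
break_even_cost = hlc_cost / (gain - 1)

print(round(gain, 3))            # 1.195
print(round(break_even_cost, 1)) # just under 36 points
```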
Recap: What's the "Jousting Cost" again?
Again, for clarification: the "Jousting Cost" is literally calculated as: 12*(A*D)^(1/1.92), where A is the attack coefficient and D is the durability coefficient.
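As a worked example, here is a minimal Python sketch of the jousting cost for an X-wing in the "standard" meta. It assumes the X-wing stat line from the ship card (3 attack, 2 defense dice, 3 hull, 2 shields) and takes the coefficients from the attack and durability tables earlier in the post; the function name `jousting_cost` is illustrative.

```python
def jousting_cost(attack_coeff, durability_coeff,
                  exponent=1.92, reference_cost=12.0):
    """12 * (A * D)^(1/1.92), normalized to the PS1 TIE Fighter."""
    return reference_cost * (attack_coeff * durability_coeff) ** (1 / exponent)

# TIE Fighter: both coefficients are 1 by definition.
print(jousting_cost(1.0, 1.0))  # 12.0

# X-wing, standard meta: 3 attack dice, 2 defense dice, 2 shields, 3 hull.
x_wing_attack = 1.7590
x_wing_durability = (2 * 0.8313 + 3 * 0.7312) / 3
print(round(jousting_cost(x_wing_attack, x_wing_durability), 2))  # about 18.35
```

Dividing a jousting cost by the ship's PS1 equivalent cost gives its jousting efficiency.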
Results
All costs and efficiency are based on their equivalent PS1 cost:
X-wing: 20
Y-wing: 17
A-wing: 17
A-wing + refit: 15
ORS: 27
Named YT-1300: 37
B-wing: 21
HWK-290: 15
Z-95: 11.5
E-wing: 27
TIE Fighter: 12
TIE Advanced: 20
TIE Interceptor: 18
Firespray: 31
TIE Bomber: 15
Lambda Shuttle: 20
TIE Defender: 30
TIE Phantom: 23
TIE Phantom + permanent cloaking: 27 (Advanced Cloaking Device, PS bid not included)
min, std. and max columns are to cover various meta games, which changes the ship's underlying jousting value. TIE Fighters are used as the 100% reference point for all meta.
Very High Degree of Certainty
Jousting Efficiency Total Efficiency
Ship min std. max min std max
High Degree of Certainty
TIE Bomber: requires ordnance to fill a useful role.
Jousting Efficiency Total Efficiency
Ship min std. max min std max
TIE Bomber 95.8% 97.5% 98.2% 96.2% 97.9% 98.6%
Medium Degree of Certainty
Y-wing: turret on a 2 attack ship.
YT-1300: 360 degree primary weapon
Firespray: rear arc
Jousting Efficiency Total Efficiency
Ship min std. max min std max
Low Degree of Certainty
HWK-290: turret on a 1 attack ship.
Lambda: No K-turns and no white turns
TIE Defender: white K-turn
TIE Phantom: cloak action
Jousting Efficiency Total Efficiency
Ship min std. max min std max
Changelog
- March 2, 2014: Updated Worlds 2013 meta with the 16th squad.
- March 20, 2014: Updated to include A-wing Chardaan Refit.
- March 22, 2014: Updated to include TIE Fighter with Howlrunner reroll, HWK-290 with blaster turret instead of Ion1 + Chewie, and permanently cloaked TIE Phantom.
- April 14, 2014: Added the Limitations section.
- June 6, 2014: several changes:
  - Added meta dependent attack and durability coefficients
  - critical hits explicitly being calculated now to more accurately value shields
  - maneuver dial broken down into major categories
  - results and ship breakdowns moved into the next post
- July 2, 2014: Added spoiler headers for easier reading.
- Coming soon. Major changes: (see new thread)
  - Ship durability is now normalized to the number of shots required to destroy each stat line, including the probability of double damage critical hits, rather than the average damage divided by ship hit points. The effect of critical hits that do not deal double damage is then weighted in after the direct durability calculation, and depends on the ship's shields vs hull. This has a significant effect on the calculations vs the previous method. The "average damage" method was less accurate, and artificially favored high hit point ships.
  - Changed the exponential curve fit to more accurately reflect high-value ship costs.
  - Added a more accurate PS1 equivalent cost adjustment.
  - Attack meta (used to calculate ship durability) now includes the possibility of a Heavy Laser Cannon in addition to the possibility of 2-4 base dice.
  - Updated the attack and defense metas (both attack and defense, and low / standard / high) to the projected wave 5 meta.
  - Now including a "required efficiency" value for each ship to break even based on its jousting value and absolute cost.
  - Added wave 5 and wave 6 ships.