FFG Forum Archive

Posted at 2014-03-02 05:57:14+00:00

Note: As of December 5, 2014, this thread is obsolete. Please see the new thread here for new developments:

http://community.fantasyflightgames.com/index.php?/topic/128417-mathwing-comprehensive-ship-jousting-values-and-more/

If you think that hokey equations are no match for good squad tactics, because no mystical dice gods control your destiny...

... otherwise, please do keep reading.

Warning: very long MathWing post ahead!

Introduction

This post has been a long time coming. It is long, so I will try to keep it as organized as possible. I added spoiler sections so you can click on whatever section you would like, rather than getting hit with a wall of text. If quoting when replying, please only leave the portion that you are replying to rather than the whole post.

The underlying methodology will all be self-contained within this post, so the process is as open to review as possible.

Limitations

The cost predictions here work best on ships that have no unique capabilities. Thankfully, 8 of the 16 ships through wave 4 should have a very high degree of confidence, so overall the approach should still have utility. The next post will break down the degrees of certainty for each ship.

Here are some specific limitations:

This process only considers the baseline ships at their equivalent PS1 cost. Higher PS pilot abilities (like Howlrunner) are not considered. Elite Pilot skills are not considered.
The process doesn't work very well with the HWK, which is currently the only 1-attack ship. This is also because the HWK is typically best used with named pilots, and pilot abilities are not considered in this analysis.
Most of the large ships (as of waves 1-4) have unique traits that make them difficult to balance directly using this method, since there is no comparative baseline. As we get more large base ships in wave 5 and we see similar functionality across more ships, this should get more refined.
A ship does not have to be exactly at its "100% fair value" to be competitive. Everything is normalized to the PS1 TIE Fighter, which is the most point efficient ship through wave 3. Therefore, using the PS1 TIE Fighter as the reference, we should expect most of the ships to appear slightly overcosted. What we are looking for instead, are ships that are clear outliers.
The game has an element of paper-rock-scissors, so the "fair value" does not necessarily predict that one ship is universally better than another. For example, more maneuverability inherently increases a ship's fair value, but it's nearly useless against a turret list.
If a ship is loaded up with lots of potential upgrades, then you're inherently paying something for that privilege, even if it goes unused. Conversely, ship upgrade slots contribute a relatively small amount to the "fair value" since the upgrades themselves are generally self-balancing by their point costs, but there are still many amazing combinations and unique squad ideas. This is, of course, part of what makes the game so great. :-)
The fair point value does NOT consider the opportunity cost associated with target priority, which in this game is generally determined by the attacker. Therefore, mixing glass cannons and tanks together in the same squad generally does not work very well. For example, spending a ton of points on a glass cannon that will likely get killed first (Alpha TIE Interceptors + Targeting Computer), or conversely, spending extra points on very durable units that your opponent simply waits to kill last (A-wing + Stealth) are both generally bad ways to spend points. Again, these factors are not considered in the fair points value, so you still need to consider tactics and specific squad composition to really determine a ship's situational usefulness.

Background

There are at least four ways to predict a ship's "fair", or "balanced" cost. If you can think of more, please chime in and I will add them to the list.

Extensive play testing
Comparing similar ships and differentially adding or subtracting points for different capabilities
Converting attack/defense/dial/upgrade/etc parameters into a form that can be used as an input to Lanchester's Square Law
Combat Salvo Model numerical simulations

Notice that I did not include the linear regression formulas that have previously been developed. Linear regression formulas that look backwards at existing ship costs and try to figure out a fixed price for each kind of upgrade/dice/etc are fundamentally flawed and were doomed to failure from the start.

A few thoughts on each category:

Play testing: This is obviously the best method, and technically doesn't belong in this list since it doesn't predict balanced costs, it actually proves what the balanced values are. The main downside is that it requires a large sample size, and therefore cannot be used for discussing upcoming waves (even fully revealed ones), or custom stat ships that people have put together but haven't had much (or any) time to test yet.
Differential point costing: This is the easiest of the three "theory crafting" methods, and it works OK for comparing ships that are very similar. As the differences between the two ships become greater, the margin for error becomes much larger. It also assumes that you can accurately assign point values for the different capabilities, which isn't trivial, but sometimes can be gleaned from looking at the point structure and capabilities of existing ships.
Lanchester's Square Law: this is the method that I will be discussing here. The difficulty is in figuring out how to accurately quantify all of the various game mechanics to obtain a numerical "combat effectiveness" for each ship. A ship's cost is then proportional to the square root of its combat effectiveness. This approach has scaling problems when the number of ships per squad is low, such as in X-wing, since the assumptions of the continuous time differential equations upon which it is based break down. Thankfully, this is relatively easy to compensate for.
Salvo Model: this is by far the most difficult method, and takes a fair amount of expertise to generate a reasonable model. Even if there were an ideal simulation, there is still not necessarily a direct link between the results and a points prediction, since a ship's balance is not just based on a single matchup, but rather an aggregate of the entire metagame. I will undoubtedly get around to building such a model eventually, but more for analyzing matchups and less for predicting overall balance.

Motivation

Those of you who have followed any of my MathWing type posts might know that I have written several Matlab scripts to make statistical analysis of dice rolls possible for a variety of circumstances. One of the side goals of this project was to use their output as input to the bigger problem of mathematically predicting approximately what a ship's "fair" cost should be. This approach has two main goals:

Characterize each ship's "jousting" value. The jousting values are based only on attack and defense dice, shields and hull. Although there are many other factors that determine a ship's total balanced cost, the jousting value is extremely informative, and very well defined. There is a non-trivial correlation between a ship's jousting value and the frequency of its successful use in competitive tournaments. I have referred to this in previous posts as a "baseline cost", but I'll use "jousting" going forward, since it is more intuitively descriptive.
Characterize each ship's "fair" value, considering all of its capabilities, across the entire metagame. This is much more difficult, and requires assigning some value to all possible actions, dial mechanics, upgrades, and any other unique factor on a ship.

In general, FFG has done an excellent job of balancing the point costs for the ships, although there are a few outliers. Furthermore, extensive play testing is the ultimate way to determine balance between ships, NOT blindly using formulas and equations such as these. So why go through this much trouble? Here are a few reasons:

Because I can. :-)
The results all seem to fall within about a point (or less) of the community's consensus for each ship. It is extremely difficult, if not impossible, to argue that the results here are statistically insignificant.
It makes a great starting point for pricing custom ships, if you're into that.
Knowing a ship's "jousting" value can be helpful when thinking about tactics to get the most out of your squads.
It could be useful to the community, especially with more feedback and refinement. People occasionally ask for a formula that can be used.
It has reasonably good predictive power at theorizing how effective the partially revealed new ships will be, even without having to know all their details. This was true for waves 2 and 3, so wave 4 will likely be no different.

Making Lanchester's work with the X-wing model

Now, onto the gory details. It takes a reasonable understanding of math to get to the results here, but they can be recreated by anyone knowledgeable and determined enough.

First off, we start with Lanchester's Square Law. It states that if you have a large number of ranged combatants that can all fire at each other, the force strength of each side as a function of time can be given as a differential equation, with the result that a force's overall combat strength is:

F = N^2*E

Where F is the force's total combat strength, N is the number of units, and E is the combat effectiveness of each individual unit. Running a Lanchester's simulation requires only 2 inputs:

ratio of the starting strength (number of combatants) of each army
ratio of the damage-rate per unit

The first ratio is straightforward: it is the ratio of the number of ships, mathematically represented as A(0)/B(0). The combat effectiveness E determines the damage-rate per unit ratio. I.e. if the combat effectiveness of squad A is twice that of squad B, then the ratio of the damage-rate per unit is 2:1.
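To make the square law concrete, here is a minimal Python sketch. It is not taken from any script used for this post: the function name and the example numbers are illustrative assumptions. It relies on the standard closed-form result for the Lanchester equations dA/dt = -e_b*B, dB/dt = -e_a*A, for which e_a*A^2 - e_b*B^2 stays constant during the fight.

```python
import math


def lanchester_survivors(n_a, n_b, e_a=1.0, e_b=1.0):
    """Return (survivors_a, survivors_b) under Lanchester's Square Law.

    n_a, n_b: starting number of units on each side.
    e_a, e_b: combat effectiveness (damage rate) per unit.
    e_a*A^2 - e_b*B^2 is conserved, so the side with the larger
    F = N^2*E wins, and its survivors follow from that invariant.
    """
    f_a = n_a ** 2 * e_a
    f_b = n_b ** 2 * e_b
    if f_a >= f_b:
        return math.sqrt((f_a - f_b) / e_a), 0.0
    return 0.0, math.sqrt((f_b - f_a) / e_b)


# 5 units against 3 identical units: the larger side ends with 4 left, not 2.
print(lanchester_survivors(5, 3))
# Units that are twice as effective still lose when outnumbered 2:1.
print(lanchester_survivors(4, 2, 1.0, 2.0))
```

The second call shows why numbers matter more than per-unit quality in this model: doubling E only offsets a factor of sqrt(2) in N.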

Here is a graphical example from Wikipedia:

What we want to do here instead, is have two different squads with equal force values F but at different ship costs, which yields the equation:

N1^2*E1 = N2^2*E2

If both squads spend the same number of points P, then we have:

N1 = P/C1

N2 = P/C2

where C1 and C2 are the costs of each individual ship in squads 1 and 2 respectively. Substituting into the force equation, P cancels and we are left with E1/C1^2 = E2/C2^2. If we solve for the cost C2, we have:

C2 = C1*(E2/E1)^(1/2)

So, the basic premise is that because a squad's power increases in proportion to the square of the number of ships present, an individual ship's cost should increase as the square root of its combat effectiveness.

However, this assumes that the number of individual units present is large enough that the changes in F over time are small enough that the curve can be approximated as a continuous time equation. In X-wing, we typically field 3-8 ships per side in a 100 point game, so there are two factors that start to take effect:

A single, powerful ship will continue to deal out damage regardless of whether it has 1 hull left, or 13. Since Lanchester's assumes that damage output goes down linearly as you receive damage, it will artificially undervalue expensive ships.
Lanchester's assumes that there is no wasted damage on "overkilling" any individual unit. However, each time an individual ship gets killed in a discrete time system (round based), the average damage of the attacking squad is reduced for that one shot, unless the target's hit points are the maximum damage that you can deal out in one shot. I.e. if you roll 3 hits against a ship with 1 hull, you still only do 1 damage, so the average damage cannot be higher than 1. Lanchester's does not consider this factor, so in this case it artificially overvalues expensive ships.

The first point is important, so I am directly addressing the concerns of "high hit point ships aren't treated fairly by Lanchester's Square Law (in low point games)!" First off, this is not just related to a ship's hit points, but rather to a ship's total combat effectiveness E. Thankfully, this is easy to determine numerically, and I have done so at a baseline of a 96 point squad. The process is relatively simple and can be easily implemented in Excel. You build a combat "simulator" that calculates two sides continually doing damage to the other side, assuming ideal focus fire onto one ship at a time on each side. The damage done is the number of ships that a side has, times the damage output per ship. Once a ship on one side dies, that side's damage output decreases. Repeat until all ships are destroyed on one or both sides.

To determine how much of an effect expensive ships have at 96 points, I did the following:

1. Set side 1 to have 8 ships (8 TIEs) that each have 3 hull and 1 attack per unit time.
2. Set side 2 to have N ships (N anywhere from 7 down to 1) with 1 attack each.
3. Adjust the hull points on the side 2 ships until both squads simultaneously kill each other.
4. Repeat steps 2 and 3 for 1 through 7 ships on side 2.

For a 96 point squad game, I got a curve with the following data points:

Ship Cost    Hull
12           3
13.7         3.855
16           5.142
19.2         7.2
24           10.8
32           18
48           36
96           108
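The procedure above can be sketched in a few lines of Python. This is a reconstruction under stated assumptions, not the original spreadsheet: time is treated as continuous, each side focus-fires one target at a time, every ship deals 1 damage per unit time, and the break-even hull is found by bisection. The names focus_fire and break_even_hull are illustrative.

```python
def focus_fire(n1, hull1, atk1, n2, hull2, atk2, eps=1e-9):
    """Continuous-time duel with ideal focus fire on one target per side.

    Returns the total hit points left on (side 1, side 2) when one or
    both sides have been wiped out.
    """
    hp1, hp2 = hull1, hull2                    # hit points of the current targets
    while n1 > 0 and n2 > 0:
        rate1, rate2 = n1 * atk1, n2 * atk2    # damage dealt per unit time
        dt = min(hp1 / rate2, hp2 / rate1)     # time until the next kill
        hp1 -= rate2 * dt
        hp2 -= rate1 * dt
        if hp1 <= eps:                         # side 1 loses a ship
            n1, hp1 = n1 - 1, hull1
        if hp2 <= eps:                         # side 2 loses a ship
            n2, hp2 = n2 - 1, hull2
    left1 = (n1 - 1) * hull1 + hp1 if n1 > 0 else 0.0
    left2 = (n2 - 1) * hull2 + hp2 if n2 > 0 else 0.0
    return left1, left2


def break_even_hull(n2, n1=8, hull1=3.0, lo=1.0, hi=500.0):
    """Bisect for the side-2 hull at which both squads die together."""
    for _ in range(100):
        mid = (lo + hi) / 2
        left1, left2 = focus_fire(n1, hull1, 1.0, n2, mid, 1.0)
        if left1 > left2:      # side 1 still wins: side 2 needs more hull
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for n2 in range(8, 0, -1):
    print(f"{96 / n2:5.1f} points per ship -> hull {break_even_hull(n2):.3f}")
```

Under these assumptions the single 96 point ship needs 108 hull because it absorbs 8+7+...+1 = 36 ship-kills worth of fire at 3 time units each, compared with the 192 hull that the pure Square Law would demand.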

These data points are obviously slightly different than what would be predicted by Lanchester's Square Law. Since I wanted to account for both effects listed above, but wanted to weight it more heavily towards the first factor that helps expensive ships, I took the average of Square Law (Curve A), and the numbers in the curve above (Curve B) to create "Curve C". Using an exponent of 1.92 yields a curve that is within 1% of "Curve C" for ship costs up to 32. Even at a 48 point ship cost, the error was less than 2.5%: the E^(1/1.92) curve predicts 41.6 hull at 48 points, vs 43.8 hit points from Curve C.

So the formula, now "corrected" for expensive point cost ships is:

C = 12*E^(1/1.92)

where E is the ship's combat power normalized to a PS1 TIE Fighter.

FYI, 1/1.92 = 0.5208, so for easier calculator typing, you can also use E^0.52 rather than E^0.5.
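As a quick illustration of the corrected formula, here is a minimal Python sketch. The function name fair_cost and its default arguments are illustrative assumptions; the 12 point baseline and the 1.92 exponent come from the text above, and passing an exponent of 2 recovers the uncorrected Square Law.

```python
def fair_cost(effectiveness, exponent=1.92, base_cost=12.0):
    """Cost of a ship whose combat effectiveness E is normalized to the
    PS1 TIE Fighter (E = 1 at 12 points).

    exponent=2.0 gives the pure Square Law; 1.92 is the corrected value.
    """
    return base_cost * effectiveness ** (1.0 / exponent)


# Compare the pure Square Law with the corrected curve.
for e in (1.0, 2.0, 4.0, 8.0):
    print(f"E = {e:3.0f}: square law {fair_cost(e, 2.0):6.2f}, "
          f"corrected {fair_cost(e):6.2f}")
```

The corrected curve always prices a ship with E > 1 slightly higher than the pure Square Law does, which is the intended nudge in favor of expensive ships.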

Now, obviously there can still be some future improvement in this area. It's on my radar to eventually build a salvo combat model, using the exact probability density functions from the dice mechanics. The results could help to further refine the scaling here. But, for now, simply changing the exponent to 1.92 will have to do, and it appears to be a pretty reasonable approximation.

Combat Effectiveness Coefficients

The combat effectiveness E is a unit's damage output per unit time, multiplied by that unit's durability. To apply this model to X-wing, I have broken this down into the following categories:

Attack
Durability
Dial
Actions
Firing arc
Upgrades

Each of these categories is nominally 1, and then they are all multiplied together.

Since we are using the PS1 TIE Fighter as our baseline, it has a value of 1 in each category.

Attack: 2 attack dice vs. 3 attack dice vs. 4 attack dice

In order to calculate the average damage that 3 or 4 attack dice do relative to 2 dice, the following assumptions were used:

attacker has focus 2/3 of the time
defender has focus 1/2 of the time
range bin probabilities are [15 23 9 4]/(15+23+9+4) for [R1 R2 R3 R3+asteroid]
defender base defense dice is meta dependent, see below

Since we are looking to get an overall aggregate score, I'll treat each of these categories as independent, assign the weighted probability to each, and then calculate the aggregate totals. The base number of defense dice was evaluated in three different "meta" environments [1 dice%, 2 dice%, 3 dice%]:

low defense dice meta: [45%, 25%, 30%]
"standard" defense dice meta: [30%, 25%, 45%]
high defense dice meta: [15%, 25%, 60%]

For a reference on what's currently popular in high level play, see this thread:

http://community.fantasyflightgames.com/index.php?/topic/105107-2014-regionals-results/

Another note while we are here: you don't always need to spend your focus for attack or for defense, so adding up the probability of having focus available for both can certainly be more than 100%. Since we only care about the overall statistical averages, and not conditional probabilities in a specific scenario, we can treat these as independent variables. That being said, please speak up if you have a better method of estimating how often the defender has focus available, since it does affect the result.

I used the ranges at the Worlds 2013 Finals game as a baseline: 15 @ range 1, 23 @ range 2, 9 @ range 3, and 4 @ range 3 through a rock.

This results in the following damage numbers, normalized to 2 attack dice:

defense meta

                          low defense    normal defense    high defense
1 dice:                   0.4319         0.4189            0.4022
2 dice:                   1              1                 1
3 dice:                   1.7137         1.7590            1.817
4 dice:                   2.5129         2.6297            2.779

Also for reference:

2 dice + 1 reroll:        1.3436         1.355             1.3695
3 dice + 1 reroll:        2.2288         2.3125            2.4196
HLC:                      2.3935         2.4923            2.6185
Blaster w/ 1 base:        0.2765         0.2670            0.2549
Ion turret w/ 1 base:     0.7407         0.7698            0.8070
Ion turret w/ 2 base:     0.8123         0.8247            0.8406
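For readers who want to experiment with the weighting scheme, here is a rough Python sketch of this kind of calculation. It is not the Matlab code behind the table and should not be expected to match it to the last digit. Assumptions: standard dice faces (attack die: 3 hit, 1 crit, 2 focus, 2 blank; defense die: 3 evade, 2 focus, 3 blank), crits counted as plain hits, a focus token always spent when available, +1 attack die at range 1, +1 defense die at range 3, and a further +1 defense die for the asteroid. All function and variable names are illustrative.

```python
from math import comb

P_ATK_FOCUS = 2 / 3
P_DEF_FOCUS = 1 / 2
# (shots observed, bonus attack dice, bonus defense dice): R1, R2, R3, R3+asteroid
RANGE_BINS = [(15, 1, 0), (23, 0, 0), (9, 0, 1), (4, 0, 2)]
DEFENSE_META = {1: 0.30, 2: 0.25, 3: 0.45}     # "standard" defense dice meta


def binomial(n, p):
    """Probability of k successes on n dice, for k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def expected_damage(atk_dice, def_dice, atk_focus, def_focus):
    """Mean number of uncancelled hits for a single shot."""
    p_hit = 6 / 8 if atk_focus else 4 / 8      # focus turns eyeballs into hits
    p_evade = 5 / 8 if def_focus else 3 / 8    # focus turns eyeballs into evades
    hits = binomial(atk_dice, p_hit)
    evades = binomial(def_dice, p_evade)
    return sum(ph * pe * max(0, h - e)
               for h, ph in enumerate(hits)
               for e, pe in enumerate(evades))


def average_damage(base_attack):
    """Weight expected_damage over range, defender agility and focus."""
    shots = sum(n for n, _, _ in RANGE_BINS)
    total = 0.0
    for n, atk_bonus, def_bonus in RANGE_BINS:
        for agility, p_agility in DEFENSE_META.items():
            for atk_focus, p_af in ((True, P_ATK_FOCUS), (False, 1 - P_ATK_FOCUS)):
                for def_focus, p_df in ((True, P_DEF_FOCUS), (False, 1 - P_DEF_FOCUS)):
                    weight = (n / shots) * p_agility * p_af * p_df
                    total += weight * expected_damage(
                        base_attack + atk_bonus, agility + def_bonus,
                        atk_focus, def_focus)
    return total


baseline = average_damage(2)
for dice in (1, 2, 3, 4):
    print(dice, "dice:", round(average_damage(dice) / baseline, 4))
```

Swapping DEFENSE_META for the low or high defense percentages shows how the relative value of extra attack dice shifts with the meta.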

Durability

In order to calculate the average durability that 1 or 2 base defense dice have relative to 3 dice, the following assumptions were used:

attacker action economy:
- No action: 35%
- Focus: 30%
- Target Lock: 25%
- Focus + Target Lock: 10%
defender has focus 1/2 of the time
range bin probabilities are [15 23 9 4]/(15+23+9+4) for [R1 R2 R3 R3+asteroid]
attacker base dice number is meta dependent, see below
critical hit values specially weighted, see below.

I do not directly account for a percentage of shots that are affected by Howlrunner. However I wanted to capture some of Howlrunner's reroll ability in how it changes the hit / critical hit ratio, so I shifted some of the focus actions into Target Lock actions. I realize that normally, Target Lock will be taken far less often than focus. I also included some small percentage of shots as having both TL + F to simulate the occasional Rebel PtL, and to estimate Howlrunner's reroll effect.

The base number of attack dice was evaluated in three different "meta" environments. I used 5% as a baseline for 4 attack dice once wave 4 comes out, and assume the Phantom will see slightly below average table time. [2 dice%, 3 dice%, 4 dice%]:

low attack dice meta: [67%, 32%, 1%]
"standard" attack dice meta: [33%, 62%, 5%]
high attack dice meta: [20%, 70%, 10%]

Critical hits are weighted specially. Of the 33 damage cards, there are 7 Direct Hit cards and 2 Minor Explosion cards, and the extra damage from those can be directly computed. The remaining critical hits are treated as being worth an additional 1/3 of a regular hit. None of those other cards do direct damage in the strictest sense, but some of the effects can be very nasty, so I had to put some value on them. That makes the critical hit weighting:

Crit Weighting = 1 + 7/33 + (3/8)*2/33 + (1/3)*(33-7-2)/33 = 1.4773
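The weighting can be checked with exact fractions. A minimal Python sketch follows; the constant names are illustrative, the 3/8 is the chance of a hit result on the attack die rolled for Minor Explosion, and the 1/3 is the estimate from the text above.

```python
from fractions import Fraction

DECK = 33                # damage cards in the deck
DIRECT_HIT = 7           # each counts as one extra damage
MINOR_EXPLOSION = 2      # one extra damage on a hit result (3 of 8 faces)
OTHER_CRIT_VALUE = Fraction(1, 3)   # estimated value of every other crit

other_crits = DECK - DIRECT_HIT - MINOR_EXPLOSION
crit_weight = (1
               + Fraction(DIRECT_HIT, DECK)
               + Fraction(3, 8) * Fraction(MINOR_EXPLOSION, DECK)
               + OTHER_CRIT_VALUE * Fraction(other_crits, DECK))

print(crit_weight, round(float(crit_weight), 4))
```

Changing OTHER_CRIT_VALUE is the quickest way to see how sensitive the hull vs shield comparison is to that one estimate.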

The damage is then calculated two ways: once with critical hits weighted as above to simulate Hull durability, and again with critical hits weighted as 1 to simulate Shield durability. The results normalized to Hull Durability for 3 defense dice are:

             Hull Durability                               Shield Durability
             low attack    normal attack    high attack    low attack    normal attack    high attack
1 dice:      0.5113        0.5490           0.5647         0.5737        0.6161           0.6336
2 dice:      0.7071        0.7312           0.7411         0.8030        0.8313           0.8427
3 dice:      1             1                1              1.1458        1.1482           1.1490
4 dice:      1.4268        1.3876           1.3717         1.6458        1.6053           1.5886

These results indicate that Shields are worth about 12% to 15% more than Hull, which was far below my earlier estimate of 25%. The meta dependent durability coefficient is therefore:

( Shields*shield_coefficient + Hull*hull_coefficient ) / 3

Dial

The dial has been broken down into various categories and each category has been given a value. The total dial coefficient is 1 + all the category scores, which are:

Tightest white turn

0: 1 turn

-0.03: red 1 turn, white 2 turn

-0.03: large base ship white 1 turn

-0.04: white 2 turn

-0.05: red 1 turn, red 2 turn, white 3 turn

-0.06: red 2 turn, white 3 turn

-0.06: Large ship base white 2 turn

-0.2: large base 3 red turn

slowest straight (+1 for large ship)

+0.025: 1 forward

0: 2 forward

fastest straight (+1 for large ship)

-0.025: 3 straight, red 4 straight

-0.01: 4 straight

0: 5 straight

K-turns

-0.3: no K-turn

-0.02: 1 red K-turn

0: 2 red K-turns

+0.3: 1 white K-turn

stress clear

-0.06: green on 2 straights

-0.05: green on 4 straights

0: green on 2 straights, 1 bank

+0.01: green on 3 straights, 1 bank

+0.05: green on 3 straights, 1 bank, 1 turn

+0.055: green on 4 straights, 1 bank, 1 turn

specialty

0: none

+0.05: red 0

Resulting dial values

TIE Fighter 1

TIE Advanced 0.94

Firespray 0.94

TIE Interceptor 1.05

TIE Bomber 0.945

Lambda 0.54

TIE Defender 1.19

TIE Phantom 0.99

X-wing 0.955

Y-wing 0.88

A-wing 1.055

YT-1300 0.97

B-wing 0.95

HWK-290 0.66

E-wing 0.995

Z-95 0.955

Actions

For actions, I sum together all the values listed as deltas relative to a TIE Fighter: positive points if the ship has the action but the TIE doesn't, and negative points if the ship does not have the action but the TIE does. The reason I add all these actions together first, rather than multiplying them together, is because there are diminishing returns on having many actions on your bar, since you can only perform one per round. This obviously excludes Push the Limit, which I am not analyzing.

I started by estimating how much additional damage Target Lock yields, since this is the only action that can be easily quantified in terms of dice rolls. I modified the damage calculator so that instead of the attacker having no action 1/3 of the time, and a focus 2/3 of the time, the attacker has no action 1/3 of the time, focus 5/9 of the time, and focus+target lock 1/9 of the time. The damage increase, for both 2 and 3 attack ships, is right around 5%. So I gave Target Lock a value of 0.05, and based everything off of that.

Target Lock 0.05

Evade 0.015

Barrel Roll 0.035

Boost = (attack/durability)/50

Cloak 0.15

I weighted boost as more useful for glass cannons than tanks, since glass cannons need to use boost to remain out of arc. I only used the "normal" attack/durability meta to calculate the boost coefficients.

So, for example, an X-wing is lacking Evade and Barrel Roll (-0.015 -0.035), but gains Target Lock (+0.05), so the deltas cancel and its net action value is 1. These values can certainly be fine-tuned, but they are an approximate starting point. Cloak is obviously a complete guess at this point, I'm just guessing that it will be very good. If we ever see (or dream up) a ship that has no focus, then we would need a weight for that as well, but so far all small and large base ships have focus. Again, these coefficients are certainly up for debate. I would love to hear people's thoughts.
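The bookkeeping for the action coefficient can be sketched as follows. This is an illustrative Python sketch, not the original calculation: the dictionary and function names are assumptions, and the action bars used in the examples (TIE Fighter: focus, evade, barrel roll; X-wing: focus, target lock) are the printed ones.

```python
ACTION_VALUE = {
    "target_lock": 0.05,
    "evade": 0.015,
    "barrel_roll": 0.035,
    "cloak": 0.15,
}
TIE_FIGHTER_ACTIONS = {"focus", "evade", "barrel_roll"}


def action_coefficient(actions, attack=1.0, durability=1.0):
    """1 + (actions gained vs. the TIE Fighter) - (actions lost).

    attack and durability are the normalized coefficients of the ship;
    they only matter for boost, which is valued at (attack/durability)/50.
    """
    actions = set(actions)

    def value(name):
        if name == "boost":
            return (attack / durability) / 50.0
        return ACTION_VALUE.get(name, 0.0)   # focus carries no delta

    gained = sum(value(a) for a in actions - TIE_FIGHTER_ACTIONS)
    lost = sum(value(a) for a in TIE_FIGHTER_ACTIONS - actions)
    return 1.0 + gained - lost


# X-wing: gains Target Lock, lacks Evade and Barrel Roll.
print(round(action_coefficient({"focus", "target_lock"}), 3))
```

Summing the deltas before adding them to 1, rather than multiplying per-action factors, is what encodes the diminishing returns of a crowded action bar.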

Firing Arc

This only affects large base ships that have an innate primary firing arc that is not simply forward facing. Firing arcs help both offensively (closer shots, and more of them), and defensively (getting out of arcs while still being able to fire). I used the following coefficients:

normal arc: 1

360 degree arc: 1.75 (YT-1300)

forward + rear arc: 1.25 (Firespray)

It's obviously difficult to exactly quantify these numbers, with these firing arcs being unique to these ships.

Upgrades

All of the upgrade values are multiplied together, including having multiple crew, which currently only affects the YT-1300 (i.e. 2 crew is worth 1.075^2 not 2*1.075).

Turret: 1 attack ship 1.97 (Turret on 2 attack ship * Ion1 / 1 attack)

Turret: 2 attack ship 1.1

Cannon: 3 attack ship 1.01

System Upgrade 1.05

Crew 1.075

Droid 1.05

Ordnance

value description

1 No ordnance

1.01 1 missile / torpedo with 3 base attack

1.02 1 missile / torpedo with 2 base attack

1.03 1 missile + 1 bomb with 3 base attack

1.04 full loadout with 2 base attack

Cannons are very expensive, just like missiles and torpedoes, so I consider the cannon slot to be basically self-balancing, and only give it a coefficient of 1.01. I was curious about the HLC, so I calculated that the net increase in a ship's effectiveness is (2.4506/1.7401)^(1/1.92) = 1.195. That's pretty good, but the problem is that it costs a whopping 7 points, so your ship would have to cost 7/(1.195-1) = 35.9 points at PS1 to break even! When you factor in the increased durability from being able to stay at long range then it helps justify the cost, but in general I still think the high point cost of the cannons themselves makes them self-balancing.
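The break-even arithmetic generalizes to any upgrade whose effect can be expressed as a ratio of combat effectiveness. A minimal Python sketch, with an illustrative function name and the HLC figures quoted in the paragraph above as inputs:

```python
def break_even_cost(effectiveness_gain, upgrade_cost, exponent=1.92):
    """PS1 ship cost at which an upgrade exactly pays for itself.

    effectiveness_gain: combat effectiveness with the upgrade divided by
    combat effectiveness without it.
    upgrade_cost: points paid for the upgrade.
    """
    cost_multiplier = effectiveness_gain ** (1.0 / exponent)
    return upgrade_cost / (cost_multiplier - 1.0)


hlc_gain = 2.4506 / 1.7401        # HLC damage over 3 base attack dice
print(round(hlc_gain ** (1 / 1.92), 3))
print(round(break_even_cost(hlc_gain, 7), 1))
```

Any ship cheaper than the printed break-even cost is, by this measure, paying more for the cannon than the extra damage is worth.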

Recap: What's the "Jousting Cost" again?

Again, for clarification: the "Jousting Cost" is literally calculated as: 12*(A*D)^(1/1.92),

where A is the normalized attack damage, and D is the normalized durability. This calculates the combat efficiency of a ship considering ONLY its [attack dice] / [defense dice] / [hull] / [shields]. It quantifies how effective a ship is at simply rolling dice on attack and defense. It assumes that all ships are on equal footing, that nobody can maneuver better than anyone else, but rather, everyone gets the same number and same kinds of shots. It literally calculates every single possible attack combination, and then weights the ranges, opponents, and action economy according to how often we observe it happening in the actual game.

The closest that we get to this statistical average in the actual game is probably the initial "jousting" pass, so I used that word to describe this concept. There is probably a much better word to describe it. If anyone thinks of one, please speak up.
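Putting the pieces together, the jousting calculation can be sketched in Python using the "normal" meta columns from the attack and durability tables. Assumptions to flag: the stat lines (attack / agility / hull / shields) are the printed card values rather than anything stated in this post, the efficiency is taken to be jousting value divided by PS1 cost, and all names are illustrative.

```python
EXPONENT = 1.92
# Normalized coefficients from the "normal" meta columns, keyed by dice count.
ATTACK = {1: 0.4189, 2: 1.0, 3: 1.7590, 4: 2.6297}
HULL = {1: 0.5490, 2: 0.7312, 3: 1.0, 4: 1.3876}
SHIELD = {1: 0.6161, 2: 0.8313, 3: 1.1482, 4: 1.6053}


def jousting_value(attack, agility, hull, shields):
    """12*(A*D)^(1/1.92), with D normalized to the TIE Fighter's 3 hull."""
    a = ATTACK[attack]
    d = (shields * SHIELD[agility] + hull * HULL[agility]) / 3.0
    return 12.0 * (a * d) ** (1.0 / EXPONENT)


def jousting_efficiency(cost, attack, agility, hull, shields):
    """Jousting value as a fraction of the ship's PS1 cost."""
    return jousting_value(attack, agility, hull, shields) / cost


SHIPS = {  # name: (PS1 cost, attack, agility, hull, shields)
    "TIE Fighter": (12, 2, 3, 3, 0),
    "TIE Interceptor": (18, 3, 3, 3, 0),
    "X-wing": (20, 3, 2, 3, 2),
    "B-wing": (21, 3, 1, 3, 5),
    "Z-95": (11.5, 2, 2, 2, 2),
}

for name, stats in SHIPS.items():
    print(f"{name:16s} {100 * jousting_efficiency(*stats):6.1f}%")
```

With these inputs the sketch lands on the "std." jousting efficiencies listed in the Results for these five ships, which suggests the reading of the formula is the intended one.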

Results

All costs and efficiency are based on their equivalent PS1 cost:

X-wing: 20

Y-wing: 17

A-wing: 17

A-wing + refit: 15

ORS: 27

Named YT-1300: 37

B-wing: 21

HWK-290: 15

Z-95: 11.5

E-wing: 27

TIE Fighter: 12

TIE Advanced: 20

TIE Interceptor: 18

Firespray: 31

TIE Bomber: 15

Lambda Shuttle: 20

TIE Defender: 30

TIE Phantom: 23

TIE Phantom + permanent cloaking: 27 (Advanced Cloaking Device, PS bid not included)

The min, std. and max columns cover the various metagames, which change each ship's underlying jousting value. TIE Fighters are used as the 100% reference point in every meta.

Very High Degree of Certainty

                          Jousting Efficiency           Total Efficiency
Ship                      min       std.      max       min       std.      max
TIE Fighter               100.0%    100.0%    100.0%    100.0%    100.0%    100.0%
TIE Fighter + Howl        116.6%    117.1%    117.8%    116.6%    117.1%    117.8%
TIE Advanced              80.6%     80.7%     80.7%     80.9%     81.0%     81.0%
TIE Interceptor           88.3%     89.5%     91.0%     94.5%     95.8%     97.4%
TIE Interceptor + Howl    101.2%    103.2%    105.6%    106.3%    108.3%    110.9%
X-wing                    88.9%     91.8%     94.0%     94.1%     97.1%     99.4%
A-wing                    85.1%     85.1%     85.1%     89.5%     89.5%     89.5%
B-wing                    92.4%     97.2%     100.3%    100.0%    105.2%    108.5%
E-wing                    80.2%     81.4%     82.8%     89.2%     90.5%     92.1%
Z-95                      104.7%    106.6%    107.3%    108.5%    110.5%    111.2%
A-wing + Refit            96.4%     96.5%     96.5%     98.8%     98.9%     98.9%

High Degree of Certainty

TIE Bomber: requires ordnance to fill a useful role.

                          Jousting Efficiency           Total Efficiency
Ship                      min       std.      max       min       std.      max
TIE Bomber                95.8%     97.5%     98.2%     96.2%     97.9%     98.6%

Medium Degree of Certainty

Y-wing: turret on a 2 attack ship.

YT-1300: 360 degree primary weapon

Firespray: rear arc

                          Jousting Efficiency           Total Efficiency
Ship                      min       std.      max       min       std.      max
Y-wing                    84.9%     88.1%     89.4%     86.5%     89.8%     91.1%
ORS                       60.1%     62.4%     63.3%     82.0%     85.1%     86.3%
Named YT-1300             66.5%     70.0%     72.3%     90.7%     95.4%     98.5%
Firespray                 82.3%     84.9%     87.0%     95.2%     98.2%     100.6%

Low Degree of Certainty

HWK-290: turret on a 1 attack ship.

Lambda: No K-turns and no white turns

TIE Defender: white K-turn

TIE Phantom: cloak action

                          Jousting Efficiency           Total Efficiency
Ship                      min       std.      max       min       std.      max
HWK-290                   55.0%     57.2%     58.5%     38.2%     39.7%     40.7%
Lambda                    108.1%    113.8%    117.4%    83.6%     87.9%     90.7%
TIE Defender              78.8%     79.9%     81.3%     88.8%     90.0%     91.6%
TIE Phantom               84.6%     88.2%     91.4%     96.4%     100.5%    104.2%
TIE Phantom + cloak       102.3%    105.4%    110.0%    116.6%    120.1%    125.3%

Changelog

March 2, 2014: Updated Worlds 2013 meta with the 16th squad.
March 20, 2014: Updated to include A-wing Chardaan Refit.
March 22, 2014: Updated to include TIE Fighter with Howlrunner reroll, HWK-290 with blaster turret instead of Ion1 + Chewie, and permanently cloaked TIE Phantom.
April 14, 2014: Added the Limitations section.
June 6, 2014: several changes:
- Added meta dependent attack and durability coefficients
- critical hits explicitly being calculated now to more accurately value shields
- maneuver dial broken down into major categories
- results and ship breakdowns moved into the next post
July 2, 2014: Added spoiler headers for easier reading.
Coming soon. Major changes: (see new thread)
- Ship durability is now normalized to the number of shots required to destroy each stat line, including the probability of double damage critical hits, rather than the average damage divided by ship hit points. The effect of critical hits that do not deal double damage is then weighted in after the direct durability calculation, and depends on the ship's shields vs hull. This has a significant effect on the calculations vs the previous method. The "average damage" method was less accurate, and artificially favored high hit point ships.
- Changed the exponential curve fit to more accurately reflect high-value ship costs.
- Added a more accurate PS1 equivalent cost adjustment.
- Attack meta (used to calculate ship durability) now includes the possibility of a Heavy Laser Cannon in addition to the possibility of 2-4 base dice.
- Updated the attack and defense metas (both attack and defense, and low / standard / high) to the projected wave 5 meta.
- Now including a "required efficiency" value for each ship to break even based on its jousting value and absolute cost.
- Add wave 5 and wave 6 ships.

Edited December 4, 2014 by MajorJuggler

Posted at 2014-03-02 05:58:00+00:00

[content moved / removed]

Edited December 4, 2014 by MajorJuggler

Posted at 2014-03-02 05:58:08+00:00

[reserved]

Edited December 4, 2014 by MajorJuggler

Posted at 2014-03-02 18:11:45+00:00

Well color me interested.

Things to consider:

Should ships be compared cross faction? Xwing as the Rebel {scum} tie fighter?

What would the B-wing's numbers look like with 1 less shield?

Hold off on wave 4. Z-95 can't be that good, phantom can't be that bad, hope the defender isn't the worst.

Posted at 2014-03-02 20:38:12+00:00

Well color me interested.

Things to consider:

Should ships be compared cross faction? Xwing as the Rebel {scum} tie fighter?

What would the B-wing's numbers look like with 1 less shield?

Hold off on wave 4. Z-95 can't be that good, phantom can't be that bad, hope the defender isn't the worst.

I don't see why you can't consider ships cross faction, especially for calculating the jousting values, which are only based on attack/defense/hull/shields. I think the numbers here point to the Rebels having more diversity in their viable options, whereas for Imperials you basically see a ton of TIE Fighters, and not much else. TIE Fighters are really good, but they don't have many 2nd place options.

B-wing with 4 shields instead of 5:

Durability: 1.38 (down from 1.59)

Jousting Efficiency: 90.1%

Cost: 19.2 (91.3%)

As for wave 4, we already know the jousting numbers. They are what they are. The only variable there is the Phantom's cost, which I assumed to be 25@PS3 = 23@PS1.

I'm pretty sure the Z-95 can be that good. Its advantage is basically 1 point of PS over the PS1 TIE Fighter, at the same cost. At +2PS, the TIE makes back the point deficit with a cheaper PS cost progression. This is why I wonder if we'll start seeing PS3 TIEs more often when the wave 4 meta matures.

On the flip side, as far as the jousting value is concerned, the Phantom is the same as the Z-95 but with 2.58 attack instead of 1. So its jousting cost goes as 2.58^0.52 = 1.637. That's not enough to get you to 23 points at PS1.

But we don't know what Cloak does. The best guess I have seen is that it swaps attack and defense values.

Edited March 2, 2014 by MajorJuggler

Posted at 2014-03-02 21:18:55+00:00

Target Lock 0.05

Evade 0.015

Barrel Roll 0.035

Boost 0.025

Cloak 0.1

So, for example, an X-wing is lacking Evade and Barrel Roll (-0.015 -0.035), but gains Target Lock (+0.05), so its net action value is 1. These values can certainly be fine-tuned, but they are an approximate starting point. Stealth is obviously a complete guess at this point, I'm just guessing that it will be very good. If we ever see (or dream up) a ship that has no focus, then we would need a weight for that as well, but so far all small and large base ships have focus. Again, these coefficients are certainly up for debate. I would love to hear people's thoughts.
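The bookkeeping in that example can be written out as a small sketch. The weights are the ones listed above; the function name, the dictionary layout, and the choice of Evade and Barrel Roll as the baseline action set (implied by the X-wing "lacking" them) are assumptions made for illustration only.

```python
# Action weights as listed in the post.
ACTION_WEIGHTS = {
    "target_lock": 0.05,
    "evade": 0.015,
    "barrel_roll": 0.035,
    "boost": 0.025,
    "cloak": 0.1,
}

# Assumption: the baseline ship has Evade and Barrel Roll (plus Focus,
# which every ship has and which therefore carries no weight here).
BASELINE_ACTIONS = {"evade", "barrel_roll"}


def net_action_value(actions):
    """Return 1 plus the weights gained, minus the weights of baseline actions missing."""
    actions = set(actions)
    value = 1.0
    for name, weight in ACTION_WEIGHTS.items():
        if name in actions and name not in BASELINE_ACTIONS:
            value += weight  # action the baseline ship does not have
        elif name not in actions and name in BASELINE_ACTIONS:
            value -= weight  # baseline action this ship lacks
    return value


# X-wing: has Target Lock, lacks Evade and Barrel Roll.
xwing_value = net_action_value({"target_lock"})
```

With these weights the X-wing's gain and losses cancel (+0.05 - 0.015 - 0.035), so its multiplier comes out at 1, up to floating-point rounding.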

Edited March 2, 2014 by Lappenlocker

Posted at 2014-03-02 21:50:31+00:00

Target Lock 0.05

Evade 0.015

Barrel Roll 0.035

Boost 0.025

Cloak 0.1

So, for example, an X-wing is lacking Evade and Barrel Roll (-0.015 -0.035), but gains Target Lock (+0.05), so its net action value is 1. These values can certainly be fine-tuned, but they are an approximate starting point. Stealth is obviously a complete guess at this point, I'm just guessing that it will be very good. If we ever see (or dream up) a ship that has no focus, then we would need a weight for that as well, but so far all small and large base ships have focus. Again, these coefficients are certainly up for debate. I would love to hear people's thoughts.

Very interesting read. Shouldn't you be referring to "cloak" in the paragraph? You also referred to "stealth" in one of your later posts and I think you meant "cloak" there too.

Ah yes, thank you. Good catch. I'm sure in a post that long there are some other typos. Fixing now.

Posted at 2014-03-02 22:11:31+00:00

A question about the Z-95. You say that the PS 1 version costs 11 pts. Is that just an inference based on the fact that the PS 2 version costs 12, and thus the PS 1 would be one point cheaper?

Posted at 2014-03-02 22:15:23+00:00

I think you've neglected one fairly important thing - What exactly is Jousting, for your model? That definition will have some impact - for example...

Jousting consists of moving as slowly as possible forward, to get multiple shots, until a ship dies... (and with a minimum of 4-5 forward, tie moves 2, x moves 1, or x moves 1 y moves 1, plus the bases of both ships - you may get 2-3 shots...)

Or is jousting: fire with an action, K-turn, fire without an action, move, repeat? (In this case the shuttle dial says it can't effectively joust.)

Or is jousting - ships remain stationary and blow each other up.

Number-wise, I don't have a problem; it's just the use of "jousting" in quotes, and I couldn't find exactly how you were modelling that. So some words about that would be appreciated.

Posted at 2014-03-02 22:31:18+00:00

Lanchester's laws apply to attrition warfare. While there is most definitely a component of attrition in an X-Wing match, there is a lot more to the game. A more apt description is maneuver warfare. Lanchester's laws don't apply directly since the attrition is not constant in maneuver warfare.

Your fundamental assumptions are flawed.

Posted at 2014-03-02 22:32:07+00:00

Ah yes, thank you. Good catch. I'm sure in a post that long there are some other typos. Fixing now.

In some ways I'd almost rather you took this time to work on the Combat Salvo Model given the underlying issues you brought up about Lanchester's Square Law. Note that I'm not stating that from any position of authority... I'm fascinated by mathematics but didn't get much chance to pursue that love in my younger days (alas). But from what you stated and what I've read on both of the models, the Combat Salvo Model seems more appropriate, albeit more difficult as you mentioned. Can you explain further why you think you would only apply the Combat Salvo Method to only specific matchup analysis? It seems that both methods are used to predict combat results which should provide the basis for ship cost analysis, correct?

Posted at 2014-03-02 22:57:44+00:00

Lanchester's laws apply to attrition warfare. While there is most definitely a component of attrition in an X-Wing match, there is a lot more to the game. A more apt description is maneuver warfare. Lanchester's laws don't apply directly since the attrition is not constant in maneuver warfare.

Your fundamental assumptions are flawed.

While I agree that the assumption is flawed, I think we can apply the concept to traded shots, as micro-cosmically an X-wing chasing a Firespray would be in an attritional situation (and given the dice probabilities, I think salvo would still be an improvement). So the existence of temporary attrition situations suggests that, when applicable, we have another guide to sacrificing position to trade shots. This in turn suggests which ships need to rely more on the positional game.

Of course, we tend to learn that through experience, so I'm not sure what else the analysis is good for.

Posted at 2014-03-02 22:58:25+00:00

Well color me interested.

Things to consider:

Should ships be compared cross faction? Xwing as the Rebel {scum} tie fighter?

What would the B-wing's numbers look like with 1 less shield?

Hold off on wave 4. Z-95 can't be that good, phantom can't be that bad, hope the defender isn't the worst.

I don't see why you can't consider ships cross faction, especially for calculating the jousting values, which are only based on attack/defense/hull/shields. I think the numbers here point to the Rebels having more diversity in their viable options, whereas for Imperials you basically see a ton of TIE Fighters, and not much else. TIE Fighters are really good, but they don't have many 2nd place options.

B-wing with 4 shields instead of 5:

Durability: 1.38 (down from 1.59)

Jousting Efficiency: 90.1%

Cost: 19.2 (91.3%)

Your numbers align pretty well with perception already and can tell you which ships you can't just point in the enemy's direction and let the dice decide it. Looks like the Imperials have to take plain TIEs or be prepared to weather some dice swings.

Edited March 2, 2014 by Rakky Wistol

Posted at 2014-03-02 23:28:36+00:00

Maybe you have an idea on how the general approach can be improved, or maybe you have an idea on some of the specifics - feedback is welcome! Criticism to this statistical approach (both constructive and destructive) usually falls into one of three categories:

"Your assumptions and/or methodology are all wrong!" (Not very helpful without offering specifics)

"Your process is wrong, because of XYZ." (More helpful)

"Your process is wrong because of XYZ, have you considered ABC?" (Best!)

I'm going to start with what I hope is a polite discussion of that first point, with an attempt to offer specific criticism.

Lanchester's Law

Lanchester's is an entirely theoretical approach dating to the 1910s. It is specifically intended to describe the damage inflicted by large ground forces engaging one another in continuous time, as you pointed out. It's popular in military theory circles, and it's included in a number of models developed and used by the US military.

But it's also worth considering that every attempt I can easily find to validate Lanchester's Law using actual combat data has demonstrated substantial mis-fit with the model. The only way to get reasonable fit is to fudge your methodology (i.e., "Look! After I estimate model parameters using a particular dataset, my model fits that dataset!"). There are lots of possible links; this one has the downside of being a bit old, but is fairly comprehensive and is available without a journal subscription.

And I think it's also worth pointing out that Lanchester's Square Law is, in the narrow technical sense, a system of differential equations relating the damage inflicted on Force B by Force A to the reverse--specifically in the context of a contest between A and B. It's not a large stretch to your "fighting strength = fighting effectiveness * n^2" approach, but strictly speaking Lanchester's doesn't permit that sort of decoupling of estimates--it's fundamentally a pairwise comparison.

(Of course there's also the fact that Lanchester's works in continuous time while X-wing works in discrete time, but you've already discussed this at length as a limitation.)

What Other Methods Are Available?

The heading is really just an excuse to respond to a couple of these:

Play testing: This is obviously the best method, and technically doesn't belong in this list since it doesn't predict balanced costs, it actually proves what the balanced values are. The main downside is that it requires a large sample size, and therefore cannot be used for discussing upcoming waves (even fully revealed ones), or custom stat ships that people have put together but haven't had much (or any) time to test yet.

FFG certainly does conduct playtesting. We can infer some things about the quality of that testing, and I'm certainly willing to concede that it's not infallible (which can be trivially disproved by the existence of the TIE Advanced, and to a lesser extent by the A-wing). But, as you've said, playtesting is a superior method to theoretical estimates-- which means that if you develop a model which disagrees substantially with FFG's costs, the burden is on you to explain why your model is a better reflection of the underlying reality than the playtesting that's already been conducted.

Differential point costing: This is the easiest of the three "theory crafting" methods, and it works OK for comparing ships that are very similar. As the differences between the two ships become greater, the margin for error becomes much larger. It also assumes that you can accurately assign point values for the different capabilities, which isn't trivial, but sometimes can be gleaned from looking at the point structure and capabilities of existing ships.

What You're Trying To Do

Characterize each ship's "jousting" value... There is a non trivial correlation with a ship's jousting value, and the frequency of its successful use in competitive tournaments. I have referred to this in previous posts as a "baseline cost", but I'll use "jousting" going forward, since it is more intuitively descriptive.

Characterize each ship's "fair" value, considering all of its capabilities, across the entire meta game...

But with respect to characterizing "fair value", I agree with you that it's very difficult; in fact I think it's so difficult that I wish you hadn't done it. I suspect this is an aspect of something we've discussed before, which is that my tolerance for error in a model appears to be substantially lower than yours. But your approach here has essentially developed four different, arbitrary weighting schemes, then multiplied them together with less arbitrary (but still problematic, due to issues relating to the action economy) attack and durability values in order to find E.

And that is, to me, an insufficient foundation on principle (that is, literally axiomatically--and I say that recognizing that we simply aren't operating with precisely the same set of axioms). So when I say that I wish you hadn't done it, I mean I wish you'd stopped at "jousting value": using unit weighting for dials, actions, etc. is clearly wrong, of course, because (e.g.) it calls the Lambda's dial the equivalent of the A-wing's.

But any other weighting scheme falls into Pauli's "not even wrong": you can demonstrate, step by step, how you got to the Lambda's E of 2.49 [= 1.74 (attack) * 1.94 (durability) * 1 (arc) * 1 (actions) * 1.23 (upgrades) * 0.6 (dial)]. But that 2.49 isn't meaningful, because many of the values that go into it aren't meaningful.
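The bracketed product is easy to reproduce, which also makes its dependence on each weight explicit. A minimal sketch: the six factor values are the ones quoted above, while the dictionary layout and function name are hypothetical.

```python
from math import prod

# Factor values for the Lambda as quoted in the post.
LAMBDA_FACTORS = {
    "attack": 1.74,
    "durability": 1.94,
    "arc": 1.0,
    "actions": 1.0,
    "upgrades": 1.23,
    "dial": 0.6,
}


def combat_effectiveness(factors):
    """Multiply all category factors together to get E."""
    return prod(factors.values())


lambda_e = combat_effectiveness(LAMBDA_FACTORS)  # rounds to 2.49

# Because E is a plain product, nudging one weight by 10% moves E by 10%.
bumped = dict(LAMBDA_FACTORS, dial=0.66)
relative_change = combat_effectiveness(bumped) / lambda_e
```

So any uncertainty in a single subjective weight, such as the dial tier, passes straight through to E in the same proportion.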

To draw a parallel, there's a "debate" in social science about whether you can take the average of ordinal data across multiple respondents--say, a three-point unipolar Likert-type scale of "not at all", "somewhat", and "very much". I say it's a "debate" because the pro camp's argument is basically convenience and "my calculator lets me do it"; the anti camp points out that in most cases there's no non-arbitrary numerical representation for the values in that scale, and accordingly, averaging is literally meaningless.

Similarly, I think we'd agree that the Lambda has a bottom-tier dial and an upper-tier set of upgrade slots. Where we part ways is the point where you assign values to those tiers. And the values you assign are arbitrary in a technical sense--they "feel" right, but there's no particular constraint or mathematical reasoning behind many of them (particularly the dials and upgrades). I think the kind of feedback you're looking for is (e.g.) that the HWK's dial should really be downweighted further, or that system upgrades should have a value of 1.075 instead of 1.05, but what I'm trying to say is that's like arguing over whether that three-point Likert scale should be coded as 0-2 or 1-3.

I (still) have some misgivings about the way you've taken the action economy into account for your attack numbers, but that's not what I'm talking about here (and not what I'm interested in discussing; for now I'll take your word that the range of variation is negligible). What I mean is that the expected value of an attack is quantifiable in a way that many other things about a ship simply aren't, and attempting to quantify them is at best an exercise that's fraught with potential failure points.

Clearly you disagree with a great deal of that, and I'll attempt in my next post to address your model on the basis you'd prefer, but I wanted to try to give a calm and clear description why I (continue to) object to the methodology.

(Up next: an empirical look at model fit.)

Posted at 2014-03-03 00:32:57+00:00

A question about the Z-95. You say that the PS 1 version costs 11 pts. Is that just an inference based on the fact that the PS 2 version costs 12, and thus the PS 1 would be one point cheaper?

Yes. Everything is compared at its equivalent PS1 cost to keep the comparison apples to apples. If PS levels are different, then you need to consider firing order. For ships that start at greater than PS1, you're essentially forced into some level of PS bid. See here:

This process only considers the baseline ships at their equivalent PS1 cost. Higher PS pilot abilities (like Howlrunner) are not considered. Elite Pilot skills are not considered.

I think you've neglected one fairly important thing - What exactly is Jousting, for your model? That definition will have some impact - for example...

Jousting consists of moving as slowly as possible forward, to get multiple shots, until a ship dies... (and with a minimum of 4-5 forward, tie moves 2, x moves 1, or x moves 1 y moves 1, plus the bases of both ships - you may get 2-3 shots...)

Or is jousting: fire with an action, K-turn, fire without an action, move, repeat? (In this case the shuttle dial says it can't effectively joust.)

Or is jousting - ships remain stationary and blow each other up.

Number-wise, I don't have a problem; it's just the use of "jousting" in quotes, and I couldn't find exactly how you were modelling that. So some words about that would be appreciated.

Good point. Neither answer is correct, so I'll update the OP to define it more clearly. Here is a more complete definition.

The "Jousting Cost" is literally calculated as: 12*(A*D)^(1/1.92),

where A is the normalized attack damage, and D is the normalized durability. This calculates the combat efficiency of a ship considering ONLY its [attack dice] / [defense dice] / [hull] / [shields]. It quantifies how effective a ship is at simply rolling dice on attack and defense. It assumes that all ships are on equal footing, that nobody can maneuver better than anyone else, but rather, everyone gets the same number and same kinds of shots. It literally calculates every single possible attack combination, and then weights the ranges, opponents, and action economy according to how often we observe it happening in the actual game.

The closest that we get to this statistical average in the actual game is probably the initial "jousting" pass, so I used that word to describe this concept. There is probably a much better word to describe it; if anyone thinks of one, please speak up.
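As a rough illustration of the formula above, here is a minimal Python sketch. The function name and default arguments are illustrative assumptions; only the constants 12 and 1.92 come from the post, and A and D are taken as already normalized so that the baseline ship has A = D = 1.

```python
def jousting_cost(attack, durability, baseline_cost=12.0, exponent=1.92):
    """Jousting cost 12*(A*D)^(1/1.92) from normalized attack A and durability D."""
    if attack <= 0 or durability <= 0:
        raise ValueError("normalized attack and durability must be positive")
    return baseline_cost * (attack * durability) ** (1.0 / exponent)


# A ship with A = D = 1 (the normalization baseline) costs exactly 12 points.
baseline = jousting_cost(1.0, 1.0)

# Scaling attack by 2.58 at fixed durability multiplies the cost by
# 2.58^(1/1.92), close to the 2.58^0.52 figure quoted earlier in the thread.
attack_scaling = jousting_cost(2.58, 1.0) / baseline
```

The exponent 1/1.92 is slightly more than one half, so cost grows a little faster than the square root of the product A*D.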

Lanchester's laws apply to attrition warfare. While there is most definitely a component of attrition in an X-Wing match, there is a lot more to the game. A more apt description is maneuver warfare. Lanchester's laws don't apply directly since the attrition is not constant in maneuver warfare.

Your fundamental assumptions are flawed.

I'll first continue by expanding upon your point, which has merit.

At a minimum, the variables in determining average damage (opponent's dice, range, action economy) should not be treated as independent. But the real problem is that Lanchester's assumes a linear time invariant model. If you had an Armageddon scenario with hundreds of ships simultaneously, then an LTI model using the average damage numbers would probably reflect reality very closely. But we have a 100 point limit, so things like Alpha Strikes dynamically change how much damage each side is doing compared to the average.

The maneuvering problem is very difficult to deal with even in the Salvo Combat Model. At some point, you're simulating player skill, not ship capabilities. That becomes a very large grey line to deal with. It is fundamentally impossible to build any model that completely represents how the game is played, because the game is played by humans. Therefore, ANY model, no matter how good, has some flawed assumptions.

I believe the pertinent question is therefore, how accurate does a given model need to be, in order to be considered useful?

Can you explain further why you think you would only apply the Combat Salvo Method to only specific matchup analysis? It seems that both methods are used to predict combat results which should provide the basis for ship cost analysis, correct?

The Combat Salvo Model, by definition, is one specific squad vs another specific squad. You can't calculate the salvos and resulting probabilities unless you first exactly define the squads. If you want to analyze one particular aspect of the game, then it is very helpful. But you can't get a high level view of it across the entire meta game, unless you brute force all possible game matchups and then somehow interpret the data.

Lanchester's can be used multiple ways. The most straightforward way is Blue Team vs Red Team, just like the Salvo Combat Model. If the damage is LTI, then the Combat Salvo Model essentially turns into Lanchester's.

What I did here, is solve for the point balance predicted by Lanchester's, and used the meta-game average numbers for damage, rather than a specific Red Team vs. Blue Team matchup.

Posted at 2014-03-03 01:20:27+00:00

Can you explain further why you think you would only apply the Combat Salvo Method to only specific matchup analysis? It seems that both methods are used to predict combat results which should provide the basis for ship cost analysis, correct?

The Combat Salvo Model, by definition, is one specific squad vs another specific squad. You can't calculate the salvos and resulting probabilities unless you first exactly define the squads. If you want to analyze one particular aspect of the game, then it is very helpful. But you can't get a high level view of it across the entire meta game, unless you brute force all possible game matchups and then somehow interpret the data.

Lanchester's can be used multiple ways. The most straightforward way is Blue Team vs Red Team, just like the Salvo Combat Model. If the damage is LTI, then the Combat Salvo Model essentially turns into Lanchester's.

What I did here, is solve for the point balance predicted by Lanchester's, and used the meta-game average numbers for damage, rather than a specific Red Team vs. Blue Team matchup.

Posted at 2014-03-03 04:45:04+00:00

But it's also worth considering that every attempt I can easily find to validate Lanchester's Law using actual combat data has demonstrated substantial mis-fit with the model.

I actually don't find that too surprising, given the increasingly asymmetrical nature of warfare in the recent centuries. I'm not sure there are any real world examples that directly compare to X-wing though, where you have perfect knowledge of the enemy position and capabilities.

And I think it's also worth pointing out that Lanchester's Square Law is, in the narrow technical sense, a system of differential equations relating the damage inflicted on Force B by Force A to the reverse--specifically in the context of a contest between A and B. It's not a large stretch to your "fighting strength = fighting effectiveness * n^2" approach, but strictly speaking Lanchester's doesn't permit that sort of decoupling of estimates--it's fundamentally a pairwise comparison.

Running a Lanchester's simulation requires only 2 inputs:

ratio of the starting strength (number of combatants) of each army
ratio of the damage-rate per unit

The first ratio is straightforward: it is the ratio of the number of ships, mathematically represented as A(0)/B(0). The combat effectiveness E determines the damage-rate per unit ratio. I.e. if the combat effectiveness of squad A is twice that of squad B, then the ratio of the damage-rate per unit is 2:1.
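Those two inputs are enough to sketch the square-law outcome in closed form. The snippet below is only a sketch, not the simulation behind any plot in this thread: the function name and return convention are hypothetical, and it assumes the idealized continuous-time square law, dA/dt = -k_B * B and dB/dt = -k_A * A, under which k_A * A^2 - k_B * B^2 stays constant throughout the fight.

```python
import math


def lanchester_survivors(a0, b0, effectiveness_ratio):
    """Return (winner, survivors) under Lanchester's Square Law.

    a0, b0: starting number of ships on sides A and B.
    effectiveness_ratio: damage-rate per unit of side A divided by side B's.
    """
    if a0 < 0 or b0 < 0 or effectiveness_ratio <= 0:
        raise ValueError("sizes must be non-negative and the ratio positive")
    strength_a = effectiveness_ratio * a0 ** 2  # fighting strength of A
    strength_b = b0 ** 2                        # fighting strength of B
    if strength_a > strength_b:
        return "A", math.sqrt(a0 ** 2 - b0 ** 2 / effectiveness_ratio)
    if strength_b > strength_a:
        return "B", math.sqrt(b0 ** 2 - effectiveness_ratio * a0 ** 2)
    return "draw", 0.0


# Five ships against four, equal per-ship effectiveness.
result = lanchester_survivors(5, 4, 1.0)
```

With equal per-ship effectiveness, five ships against four leave the larger side with sqrt(25 - 16) = 3 ships, which is the numbers-squared advantage. A side with half the ships needs four times the per-ship effectiveness just to draw.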

Graphical example (more for everyone else, I'm sure you're familiar with it):

But, as you've said, playtesting is a superior method to theoretical estimates-- which means that if you develop a model which disagrees substantially with FFG's costs, the burden is on you to explain why your model is a better reflection of the underlying reality than the playtesting that's already been conducted.

Correction: if the model disagrees substantially with the community's consensus on balanced cost, the burden is on me to explain the model. As you said, FFG has on occasion clearly gotten it wrong:

FFG certainly does conduct playtesting. We can infer some things about the quality of that testing, and I'm certainly willing to concede that it's not infallible (which can be trivially disproved by the existence of the TIE Advanced, and to a lesser extent by the A-wing).

"Differential point costing" is, frankly, rubbish in almost every case, for the same reasons that applying a linear regression model is doomed. (I actually think we mostly agree here, but I want to underline this point.)

Yeah, we pretty much agree here.

With respect to "jousting value", it's hard to disagree with "non trivial correlation" without a definition of what qualifies as trivial for you, but the degree of correlation is something we can actually assess; I'll get to that in a later post.

Ships represented in Worlds 2013 Top 16, ranked by percentage points:

Ship             | % points at Worlds | Jousting % | Total %
TIE Fighter      | 33.8               | 100        | 100
X-wing           | 23.4               | 91.9       | 93.3
YT-1300*         | 20                 | *          | *
B-wing           | 8.1                | 97.2       | 98.5
Y-wing           | 5.2                | 87.4       | 90.3
Firespray*       | 4.8                | *          | *
TIE bomber       | 4.8                | 96.1       | 96.4

*(large ship, don't trust the model)

Remaining ships (excluding Shuttle and HWK):

Ship             | Jousting % | Total %
TIE Interceptor  | 89.0       | 94.8
A-wing           | 87.2       | 92.5
TIE Advanced     | 82.3       | 83.2

The only small base ship below 90% jousting value that made it to Top 16 was a small showing of Y-wings, which was due to:

Rebel Convoy (awesome)
required to get 5 ships (5^2 > 4^2)

My impression, and this could be wrong, is that the reliance on high jousting ships has actually increased since then.

Similarly, I think we'd agree that the Lambda has a bottom-tier dial and an upper-tier set of upgrade slots. Where we part ways is the point where you assign values to those tiers. And the values you assign are arbitrary in a technical sense --they "feel" right, but there's no particular constraint or mathematical reasoning behind many of them (particularly the dials and upgrades).

That's only partially correct. The coefficient value in each category technically refers to a corresponding increase in attack damage and/or combination of increase in durability. This is fairly obvious for the attack and durability values, which can be computed based on dice probabilities. Actions are less straightforward, but you can use Target Lock as a starting point, since that's quantifiable. The remaining actions can then be given a value proportional to how often they are used.

The dial and upgrades, as you pointed out, are more difficult to quantify, although the goal is still to quantify how damage output (generally by getting into closer range or having less stress) or durability (i.e. evading arcs) is numerically related to each "tier" of dial. Technically there are as many "tiers" of dials as there are ships with different dials, and you're right, I haven't taken all of the categories into the realm of mathematical rigor as I have with the attack and durability.

It's a work in progress.

The specific values almost certainly have room for improvement. But I think the general approach is solid. As evidence, I point to the predicted "fair" points costs, which, for all of the small base ships wave 1-3 (HWK-290 notwithstanding), all fall within a point or less of the community's consensus. One full point is probably conservative; it might actually be as low as half a point. That's pretty good for a "dumb" formula that has no special fudge factors for any of the 8 small base ships that all use a common pool of values.

If the final results are all accurate, then it's either an incredible coincidence, or else the assumptions aren't as bad as the detractors are making them out to be.

Edited March 3, 2014 by MajorJuggler

Posted at 2014-03-03 06:19:39+00:00

I don't think it's possible to scale actions by proportion of use, as different ships don't use them proportionally the same. My interceptors live and die by boost, my a-wings don't. Likewise barrel roll is dependent on asteroid setup and enemy formation. I'm more willing to skip a barrel roll with 1 of 7 ties for an asteroid than I am with 1 of 2. Target lock doesn't even have the same weight on different ships, I'm less likely to need it on a hwk than a phantom.

Action values are not constant in time, or equivalent across ships, or even on the same ship. Then we complicate things by tossing 2nd actions in...

Posted at 2014-03-03 06:27:17+00:00

I have a very minor critique on the tiers for dials. You have the B-Wing in the 2.5 category and the Y-Wing in the 2 category. I'm unclear as to why you would do that. The Y-Wing's dial has 4 red moves where the B-Wing has 6 red moves. The B-Wing has 2 more green moves than the Y-Wing. I would group the B-Wing and the Y-Wing in the same group. Like I said, this is a very minor critique.

Posted at 2014-03-03 07:59:14+00:00

the red moves should be pretty meaningless on a jousting model, even with red banks and turns you will be able to point your ship toward the enemy and then go straight at him, getting rid of the stress on the way.

Posted at 2014-03-03 09:12:35+00:00

the red moves should be pretty meaningless on a jousting model, even with red banks and turns you will be able to point your ship toward the enemy and then go straight at him, getting rid of the stress on the way.

If you're mainly focusing on Straights then the Y-Wing and the B-Wing are the same. Both have a red 4 and both have green 1 and 2. So why would they be in a different category?

Edited March 3, 2014 by mrfroggies

Posted at 2014-03-03 10:21:51+00:00

Just on top of what has already been said about the use of Lanchester's Laws for a discrete time game.

You are using observed use of ships in top level gaming (albeit modified) to fit your attack/defence dice distribution model. This means you are falling into classic model calibration issues. Now we have no in-sample/out-of-sample distribution unfortunately. The closest you could come would be to look at distribution prior to a wave release and then use your model to predict for the new wave of ships. But given the paucity of data and lack of stable equilibrium in the metagame between waves this will be quite hard.

Posted at 2014-03-03 10:43:41+00:00

the red moves should be pretty meaningless on a jousting model, even with red banks and turns you will be able to point your ship toward the enemy and then go straight at him, getting rid of the stress on the way.

If you're mainly focusing on Straights then the Y-Wing and the B-Wing are the same. Both have a red 4 and both have green 1 and 2. So why would they be in a different category?

Why would the dial be considered at all in a jousting model? :-)

I don't know, need to wait for MJuggler to answer that.

Posted at 2014-03-03 14:22:49+00:00

Just on top of what has already been said about the use of Lanchester's Laws for a discrete time game.

You are using observed use of ships in top level gaming (albeit modified) to fit your attack/defence dice distribution model. This means you are falling into classic model calibration issues. Now we have no in-sample/out-of-sample distribution unfortunately. The closest you could come would be to look at distribution prior to a wave release and then use your model to predict for the new wave of ships. But given the paucity of data and lack of stable equilibrium in the metagame between waves this will be quite hard.

I seriously considered addressing this point, and pulled it out of my post at the last minute because I didn't want to climb out on this particular limb alone. But you're absolutely right.

The problem is that it doesn't even really make sense to talk about unsampled ships: the relative strength of a dial, in particular, can only really be evaluated when situated in the set of dials. And MajorJuggler deliberately adjusted his attack factor to fit a predicted distribution of Wave 1-4 ships in the metagame, and although it turned out to be a negligibly small adjustment, it gives us further evidence that model fit is narrowly tailored to the sample used to estimate parameters.

And this is really one of my remaining fundamental questions about the model. MajorJuggler and I have, er, previously discussed the wisdom of relying on the "jousting value" model, which requires a substantial amount of post-hoc adjustment to fit the community consensus; the "fair value" model uses a set of nominally a priori factors instead, but those factors are still arbitrary--and although we can adjust them so the model tells us what we already know about ships with which we're already familiar, it's very difficult to meaningfully use the model to extrapolate to ships with which we're not familiar.

Edited March 3, 2014 by Vorpal Sword

Posted at 2014-03-03 15:40:28+00:00

Indeed. You cannot fit a model to a set of observed facts and then say the model proves your views about those observed facts. It would be interesting to see from MajorJuggler what the sensitivity of his outputs is to changes in the different inputs, i.e. how stable the model is...

Using Lanchester's Square Law to predict ships' jousting values and fair point values (work in progress)