2nd edition: Point costs are in the app so everything will be balanced, or... Part II

By Sciencius, in X-Wing

@Sciencius Your point pricing aside (which I agree with), I think your example really shows the importance of player choice in game design. You want to make sure the player is presented with reasonable decision points during the game, and that the opposing side has the ability to respond reasonably or to prevent the situation with their own previous decisions. Your upgrade card certainly takes away all the agency of the defending player (which is bad), but it also effectively takes away the decision of the attacking player (it's very likely you will target the most powerful ship every single time).

So yes -- no amount of point balancing can really save poor game design. Unless of course you simply price it out of the game completely as a whoops-an-upgrade.

@Major Juggler, what you state makes sense. However, I do challenge the way everyone claims that balance is here or there and that the effect of luck is such and such.

Until someone can produce a study where games were played varying ship costs within predetermined ranges, with enough data points to accommodate all sources of variance (dice, player skill, squadron complexity, etc.), and with a test matrix of hundreds of data points (google "design of experiments"), it will remain an opinion based on some logic/theory.

Example:

Everyone always states that the X-wing and TIE Advanced in 1e were costed wrong. Say you were to do the following:

Squadron 1: Red Squadron Pilot x 4 = 92pts.

Squadron 2: Storm Squadron Pilot x4 = 92pts.

In theory both have the same points. If there is balance, and if you play hundreds of games, allowing dice variance to statistically even out (and the same for player skill and all other sources of variance, such as initiative, etc.), then both squadrons should have a comparable winning distribution. Only then, in this simple example, can you confidently say whether these two ships are balanced against one another or not.

If you have only played a few games with each ship, and furthermore let other factors get mixed into those games, then it becomes just an opinion about balance. We can't let anecdotal evidence become factual (statistical) evidence, especially when variance is such a big factor.
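As a rough illustration of the sample sizes this argument implies, here is a textbook two-proportion power calculation sketched in Python. The win rates are illustrative figures I picked myself, not anything from the thread:

```python
import math

def games_needed(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Textbook two-proportion sample-size approximation: games per
    matchup needed to tell win rate p1 apart from p2 at ~95%
    confidence with ~80% power."""
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p1 - p2) ** 2)
    return math.ceil(n)

# Telling a 55% win rate apart from an even 50% already takes on the
# order of 1,500 games for this one matchup alone.
print(games_needed(0.55, 0.50))
```

Note that halving the win-rate gap you want to detect roughly quadruples the games required, which is exactly the variance problem described above.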

Let's say it turns out that the TIE Advanced squadron wins consistently in the above study. Then you decide to reduce the X-wing's cost (say, to 22 pts each), which lets you add, perhaps, an R2 unit to bring each back to 23 pts, for a total of 92 pts. Then you test again. You can clearly see that if the above is done for every single ship/upgrade combination out there, you will need all eternity to test, let alone to decide how to attribute points. Which brings me back to my original thought: trying to achieve "mathematically correct" balance, where each ship/combination has the same chance of winning, is a futile exercise since:

1) It's impractical to test such an overwhelming number of data sets

2) Nobody will play so many games. That is, their experience will be purely "anecdotal" and thus more influenced by other factors (skill, luck, list-building).
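The impracticality in point 1) is easy to put a number on. With invented but roughly 1e-scale card counts (the totals here are mine, purely for illustration), the combinatorics explode immediately:

```python
from math import comb

# Invented, roughly 1e-scale numbers, purely for illustration:
pilots = 300     # distinct pilot cards
upgrades = 30    # upgrade cards that fit a typical pilot's slots
slots = 3        # upgrade slots on a typical pilot

# One pilot's loadouts: any subset of up to `slots` upgrades.
loadouts_per_pilot = sum(comb(upgrades, k) for k in range(slots + 1))
total_builds = pilots * loadouts_per_pilot
print(total_builds)  # 1,357,800 single-ship builds, before even forming squads
```

Pairing those builds into squads, and then playing enough games per pairing for variance to even out, is where "all eternity" comes from.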

To summarize, I'll state it again: variance from many sources plays a bigger role than deciding whether a ship should cost n or n+1 points. Whether n points allows you to field more ships/upgrades than n+1 points is another discussion altogether, since it is an artifact of the 200 pt limit.

I like your 0-20% examples. Reaching perhaps +/- 5% of the ideal cost should be a target, but trying to get to that magical 0% would be a waste of time. Having ships released more than 15% off shows a lack of play-testing.

Edited by OoALEJOoO
37 minutes ago, OoALEJOoO said:

@Major Juggler, what you state makes sense. However, I do challenge the way everyone claims that balance is here or there and that the effect of luck is such and such.

Until someone can produce a study where games were played varying ship costs within predetermined ranges, with enough data points to accommodate all sources of variance (dice, player skill, squadron complexity, etc.), and with a test matrix of hundreds of data points (google "design of experiments"), it will remain an opinion based on some logic/theory.

Predicting X-wing ship performance to within +/- 5% does not require the level of testing you cite, with hundreds of data points. Here is a counterexample: no bridge like the Brooklyn Bridge had been built before it was constructed, yet its design was calculated without hundreds of test bridges being built to determine whether it would fail or succeed. Proper engineering can predict the future. Some focused experimentation was likely used, and focused experimentation can improve mathematical predictions of X-wing ship performance, but the amount of experimentation that is useful is orders of magnitude less than what you describe. I hope my crude example was useful.

6 hours ago, Sciencius said:

like another potential bad combination of cards last seen fleeing the Death Star.

Okay, so it's apparent you don't like new Luke and I feel that's where this entire discussion has come from.

So here's what I have to say to you, jacka**: if it is impossible to balance points effectively because they have to be compared against your opponent's list, then how would you design the entire game of X-Wing to be in any way balanced?

The way I see it, you are offering a lot of criticism and very little in the way of viable solutions.

4 hours ago, OoALEJOoO said:

The whole balancing discussion is a bit mute.

I think you mean the discussion is moo: like a cow's opinion, it doesn't matter.

3 hours ago, OoALEJOoO said:

Until someone can produce a study where games were played varying ship costs within predetermined ranges, with enough data points to accommodate all sources of variance (dice, player skill, squadron complexity, etc.), and with a test matrix of hundreds of data points (google "design of experiments"), it will remain an opinion based on some logic/theory.

Between the historical List Juggler data (and before that the Regionals Results threads, where I was manually collecting data) and my scripts, which can calculate the related figures of merit for all the ships, I can unequivocally state that the theory and the empirical data strongly correlate. So, it exists.
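The claim that theory and tournament data correlate can be checked mechanically. A minimal sketch, with invented efficiency and win-rate figures standing in for the real List Juggler data and figures of merit:

```python
# Pearson correlation between a model's predicted ship efficiency and
# observed tournament win rates. All numbers below are invented
# placeholders, not actual List Juggler data.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

predicted_efficiency = [0.92, 1.05, 0.88, 1.10, 0.97]
tournament_win_rate = [0.44, 0.53, 0.41, 0.56, 0.49]
print(round(pearson(predicted_efficiency, tournament_win_rate), 3))  # close to 1.0
```

A coefficient near 1 for real data would be the "strongly correlate" claim in quantitative form; near 0 would refute it.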

3 hours ago, OoALEJOoO said:

Example:

Everyone always states that the X-wing and TIE Advanced in 1e were costed wrong. Say you were to do the following:

Squadron 1: Red Squadron Pilot x 4 = 92pts.

Squadron 2: Storm Squadron Pilot x4 = 92pts.

In theory both have the same points. If there is balance, and if you play hundreds of games, allowing dice variance to statistically even out (and the same for player skill and all other sources of variance, such as initiative, etc.), then both squadrons should have a comparable winning distribution. Only then, in this simple example, can you confidently say whether these two ships are balanced against one another or not.

If you have only played a few games with each ship, and furthermore let other factors get mixed into those games, then it becomes just an opinion about balance. We can't let anecdotal evidence become factual (statistical) evidence, especially when variance is such a big factor.

Let's say it turns out that the TIE Advanced squadron wins consistently in the above study. Then you decide to reduce the X-wing's cost (say, to 22 pts each), which lets you add, perhaps, an R2 unit to bring each back to 23 pts, for a total of 92 pts. Then you test again. You can clearly see that if the above is done for every single ship/upgrade combination out there, you will need all eternity to test, let alone to decide how to attribute points. Which brings me back to my original thought: trying to achieve "mathematically correct" balance, where each ship/combination has the same chance of winning, is a futile exercise since:

1) It's impractical to test such an overwhelming number of data sets

2) Nobody will play so many games. That is, their experience will be purely "anecdotal" and thus more influenced by other factors (skill, luck, list-building).

To summarize, I'll state it again: variance from many sources plays a bigger role than deciding whether a ship should cost n or n+1 points. Whether n points allows you to field more ships/upgrades than n+1 points is another discussion altogether, since it is an artifact of the 200 pt limit.

I like your 0-20% examples. Reaching perhaps +/- 5% of the ideal cost should be a target, but trying to get to that magical 0% would be a waste of time. Having ships released more than 15% off shows a lack of play-testing.

Interesting example; remember the historical background for this context, though. The TIE Advanced was obvious, but I was the first one to point out why they were both overcosted, and by how much. Notably, the tournament results eventually followed my predictions, which were based purely on analysis (goodbye X-wing, hello replacement B-wing), not the other way around. That does not mean there was ever a public consensus, despite clear tournament data: two years after I made these predictions, people were still arguing on the boardgamegeek forum that X-wings were fine. It has now been over 4 years since the original analysis, and FFG's design still hasn't caught up to my original predictions. And that was with an early analysis, which has been much improved on in the meantime.

@Dengar5 hit the nail on the head when he pointed out how engineering design actually works. You are trying to argue that the only way to design something and converge to the optimal solution is by iterating a lot of times. That is.... really really really REALLY REALLY wrong. Design is about understanding the first principles and the fundamentals of what's actually going on, so you can give yourself a massive head start getting closer to your target optimization range. There are some engineers that approach the design process using your "1000 iterative trial and error" approach, but it frequently results in someone else having to fix the mistakes, and in the process figuring out the fundamentals anyway.


X-wing balance is essentially the same thing.

Also, if you can change point costs later like FFG will be able to do, then it's not a waste of time, it's essentially "free". Just imagine if everything started in a +/- 5% window, and then FFG could just do small tweaks from there. Unfortunately there's almost no chance it will happen this quickly, but in theory they will eventually be able to get there, even without understanding the fundamentals.

2 hours ago, Dengar5 said:

Predicting X-wing ship performance to within +/- 5% does not require the level of testing you cite, with hundreds of data points. Here is a counterexample: no bridge like the Brooklyn Bridge had been built before it was constructed, yet its design was calculated without hundreds of test bridges being built to determine whether it would fail or succeed. Proper engineering can predict the future. Some focused experimentation was likely used, and focused experimentation can improve mathematical predictions of X-wing ship performance, but the amount of experimentation that is useful is orders of magnitude less than what you describe. I hope my crude example was useful.

To play devil's advocate: engineering involves a certain amount of over-design to safely meet requirements in all scenarios, but this portion of the analogy doesn't really translate to X-wing design. X-wing balancing is essentially an estimation problem. It does turn out that if you have made several similar ships/bridges before, then it's easier to estimate. As an example, the target efficiency of a turret ship vs. a jouster is pretty well defined through historical data alone, presuming you can calculate the requisite historical values. But when you're adding new functionality, the difficulty is in figuring out what sort of analytical playtesting you need in order to quantify "that new thing". Usually it degenerates into "how often does thing [X] happen", and you use playtesting to get the answer. But that case of breaking new ground with new mechanics is harder than just fitting something to a previously known curve.
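The "how often does thing [X] happen" question reduces to estimating a frequency from playtest logs, plus an error bar that tells you when you have played enough. A small sketch; the 18-of-40 tally is invented:

```python
import math

def trigger_rate(hits, games, z=1.96):
    """Observed per-game trigger rate with a rough 95% normal-
    approximation confidence interval."""
    p = hits / games
    half = z * math.sqrt(p * (1 - p) / games)
    return p, max(0.0, p - half), min(1.0, p + half)

# Hypothetical playtest log: the new ability fired in 18 of 40 games.
rate, low, high = trigger_rate(18, 40)
print(f"{rate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Once the interval is tight enough to price the ability, playtesting for that mechanic can stop; known-curve ships skip this step entirely.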

But fundamentally yes your point stands -- @OoALEJOoO is missing the point that some problems can be solved analytically, some problems can be solved empirically, and some problems, like X-wing balance, can be solved by both.

Edited by MajorJuggler
22 minutes ago, MajorJuggler said:

some problems can be solved analytically Rebelliously, some problems can be solved empirically Imperially, and some problems, like X-wing balance, can be solved by both.

FTFY

8 hours ago, Sciencius said:

Fluff never overrules good game design.

Negative ghost rider. This game is a Star Wars game. If it stops feeling and playing like a Star Wars game, then it will instead be a soulless mechanical competition that a lot of people will have no interest in playing. Making it feel like Star Wars is vastly more important than having a finely balanced game, but it needs to be said that this is not a zero sum game. You can have both.

However, in the rare case we run into a situation where the only way a component can be implemented is by either breaking the lore or by harming the game balance, then either that component should not be included in the game at all, or the game balance should suffer. The lore should never be the one to lose out.

And before everyone gets all nit-picky, because this is the internet and that's what people like to do, this post doesn't translate into a call for hard-and-fast, extreme application of lore capabilities and differences. The game just needs to FEEL like Star Wars. The ships and characters need to broadly operate and perform roughly as a Star Wars fan would expect them to.

The state of the game at the moment, prior to the 2.0 reboot, is about as un-Star Wars as you can get. Both X-wings and TIE fighters suck balls and no one uses them. The major characters like Luke, Han, Leia, Vader, etc., are never seen, because taking them is basically the same as conceding the game before you start. And quite apart from WHAT ships and characters we see on the tabletop, there is HOW those ships are being used. The most important thing is NOT flying well, getting onto your opponent's six and out-guessing them/double-bluffing their next move; it's card manipulation, timing sequences, boosting action economy, control of your opponent's pieces, card combinations...

Hopefully 2.0 takes the game back to the heady days of actually flying well again, where upgrades were cool but building combos and executing them perfectly wasn't the only path to victory.

On 5/24/2018 at 11:20 AM, Sciencius said:

Fluff never overrules good game design.

18 hours ago, Chucknuckle said:

Negative ghost rider. This game is a Star Wars game. If it stops feeling and playing like a Star Wars game, then it will instead be a soulless mechanical competition that a lot of people will have no interest in playing. Making it feel like Star Wars is vastly more important than having a finely balanced game, but it needs to be said that this is not a zero sum game. You can have both.

However, in the rare case we run into a situation where the only way a component can be implemented is by either breaking the lore or by harming the game balance, then either that component should not be included in the game at all, or the game balance should suffer. The lore should never be the one to lose out.

This. This is the only miniatures game I have ever been even remotely interested in, and most likely ever will be. There is one reason for that: it's freaking STAR WARS! Yes, as I got more involved in the game I could clearly see there were imbalances, which meant that if I wanted to win more than 30% of the time, I had to compromise between what I wanted to fly (Luke, Wedge, Fett, Fel, etc.) and what actually had a real chance.

If I can fly my favorites and have a statistical 45% or better chance, then I'm willing to accept that challenge and see if my piloting and action decisions can put me over the top.

Edited by pickirk01

I personally love rock-paper-scissors type games and finding the best list to compensate for or work around your weaknesses.

17 hours ago, MajorJuggler said:

@Dengar5 hit the nail on the head when he pointed out how engineering design actually works. You are trying to argue that the only way to design something and converge to the optimal solution is by iterating a lot of times. That is.... really really really REALLY REALLY wrong. Design is about understanding the first principles and the fundamentals of what's actually going on, so you can give yourself a massive head start getting closer to your target optimization range. There are some engineers that approach the design process using your "1000 iterative trial and error" approach, but it frequently results in someone else having to fix the mistakes, and in the process figuring out the fundamentals anyway.

But fundamentally yes your point stands -- @OoALEJOoO is missing the point that some problems can be solved analytically, some problems can be solved empirically, and some problems, like X-wing balance, can be solved by both.

I beg to differ.

I'm afraid you won't be able to calculate X-wing point costs 100% analytically. Yes, you can create a simple model that gives you a ballpark figure for how much a ship should cost, but that's about it. Capturing the great level of variance in the branching decision-making of a human player (which is far more complex than a simple dice-rolling probability calculation) would need a far more complex analytical model. It would border on A.I. development, and one that even includes varying player skill, for that matter. Can it be done? Yes, but it would require a budget greater than play-testing hundreds of times would ($5000 conversion kits, anyone?). You said it yourself already: the data you have collected is experimental (from games played). Perhaps the model you fit to the test data is analytical, but the source of the data is experimental, and as such it is not a purely analytical model. I believe a model developed like yours (i.e., using experimental data) can be very accurate, and I am not implying by any means that requiring large sample quantities (which, by the way, is not iterating; that's a different thing) is the only way to get an accurate model. I am saying that for this particular problem (X-Wing), testing would be an easier way of developing such a model.

The bridge design comparison is misleading. Designing a bridge is challenging, yes, but the challenge mainly comes from economic constraints (budget, supply, etc.) and physical constraints (material properties, manufacturing limits, etc.). All the variance and unknowns in the design process are ironed out by accounting for the maximums (max number of cars/load possible at any given time, max wind loads, max seismic loads, etc.) and by putting specifications on the materials to be used (minimum yield strength, minimum thickness, etc.). Then on top of that you add a safety factor. You manage the risk posed by variance through over-design. The calculations are actually very straightforward, which is why the Brooklyn Bridge of Dengar5's example could be designed in 1867. This is why bridge engineering design can be done 100% analytically.
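The over-design recipe described here really is just a handful of deterministic operations. A toy version; every figure below is invented, not actual Brooklyn Bridge data:

```python
# Worst-case loads in newtons, invented for illustration:
max_live_load = 5_000_000    # maximum traffic load at any given time
max_wind_load = 1_200_000    # maximum lateral wind load

# A large 19th-century-style safety factor absorbs all remaining unknowns.
safety_factor = 6

required_capacity = (max_live_load + max_wind_load) * safety_factor
print(required_capacity)  # 37200000 N: one calculation, no test bridges
```

Sum the maximums, multiply by the safety factor, and the design target falls out in one pass; no iteration and no trial structures are needed.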

Edited by OoALEJOoO

The TL;DR for aspiring game designers: Balance and Fun are correlated, but just because something is balanced doesn't mean it's fun, and vice versa.

Non-TL;DR: anything can be brought into balance through points adjustments, but some mechanics are so broken due to lack of counterplay that the only way to balance them is to price them to the point that they are not feasible for competitive play.

All this means for us non-game designers is that if you see a card or upgrade that has very little counterplay to it, points adjustments are not going to change that without an errata.

So that leads to the question, are unfun mechanics with no counterplay okay to have around so long as they are not competitively costed?

21 minutes ago, OoALEJOoO said:

I beg to differ.

I'm afraid you won't be able to calculate X-wing point costs 100% analytically. Yes, you can create a simple model that gives you a ballpark figure for how much a ship should cost, but that's about it. Capturing the great level of variance in the branching decision-making of a human player (which is far more complex than a simple dice-rolling probability calculation) would need a far more complex analytical model. It would border on A.I. development, and one that even includes varying player skill, for that matter. Can it be done? Yes, but it would require a budget greater than play-testing hundreds of times would ($5000 conversion kits, anyone?). You said it yourself already: the data you have collected is experimental (from games played). Perhaps the model you fit to the test data is analytical, but the source of the data is experimental, and as such it is not a purely analytical model. I believe a model developed like yours (i.e., using experimental data) can be very accurate, and I am not implying by any means that requiring large sample quantities (which, by the way, is not iterating; that's a different thing) is the only way to get an accurate model. I am saying that for this particular problem (X-Wing), testing would be an easier way of developing such a model.

The bridge design comparison is misleading. Designing a bridge is challenging, yes, but the challenge mainly comes from economic constraints (budget, supply, etc.) and physical constraints (material properties, manufacturing limits, etc.). All the variance and unknowns in the design process are ironed out by accounting for the maximums (max number of cars/load possible at any given time, max wind loads, max seismic loads, etc.) and by putting specifications on the materials to be used (minimum yield strength, minimum thickness, etc.). Then on top of that you add a safety factor. You manage the risk posed by variance through over-design. The calculations are actually very straightforward, which is why the Brooklyn Bridge of Dengar5's example could be designed in 1867. This is why bridge engineering design can be done 100% analytically.

I did not say that I could calculate X-wing point costs 100% analytically to 100% pricing accuracy, I said that I can get them within about 5% of the ideal cost.

As for the effects of variance on design, it is apparent that you are presuming to be the subject matter expert and attempting to educate me... I don't know how to put this politely, but I believe the appropriate phrase is "you don't know what you don't know". There are a lot of elements of truth in your points, but you are making entirely qualitative arguments in an attempt to argue with / convince someone who has actually done the quantitative work behind all your statements. If this subject interests you, I would encourage you to put some work into actually drilling into and quantifying some of the claims you are making. You might be surprised at what you find. :-)

As for engineering design analogies, I understand them plenty well enough, I work in the field. :-)

No need to turn this into a pissing contest, I work in engineering as well. May the force be with you.