No, it's not playtesting. FFG simply releases broken stuff, sells loads of it, and then nerfs it

By haritos, in X-Wing

Quote

Also make no mistake, he is very much intentionally using words like that to paint himself as the reasonable adult and everyone who disagrees with him as a child.

People tend to vary in how verbally conscious they are of it. It took me a while to articulate it to my satisfaction. Some people know that patronising others feels good and tends to provoke them without fully understanding why.

2 hours ago, haritos said:

not a single person has found an explanation for defenders also missing their mark? Zuckuss they claim, was probably not foreseen due to the very specific requirements to set him up. How so for defenders? Or palp?

Balancing the product requires using math: either applying in-depth data analytics on playtesting data, or having a deep understanding about the fundamental equations that govern the game. You can't just hire someone who has a STEM degree and automatically expect that they could solve this problem. It's not as simple as tossing a set of equations and X-wing ships at a freshly minted Mathematics Bachelor of Science graduate and having them balance the game. Frank Brooks was an engineer (chemical, I believe) before he became an FFG game designer, but that doesn't directly translate to technical balance capability.

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

28 minutes ago, MajorJuggler said:

Balancing the product requires using math: either applying in-depth data analytics on playtesting data, or having a deep understanding about the fundamental equations that govern the game. You can't just hire someone who has a STEM degree and automatically expect that they could solve this problem. It's not as simple as tossing a set of equations and X-wing ships at a freshly minted Mathematics Bachelor of Science graduate and having them balance the game. Frank Brooks was an engineer (chemical, I believe) before he became an FFG game designer, but that doesn't directly translate to technical balance capability.

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

Ha! That's only your opinion. And the mere fact that it's based on deep analysed data does not make it better than mine, which is based on... feelings... Feelings!

If 2016 showed me something, it's that your brains aren't better than someone else's guts, even more so if this someone is self-endorsed with the flaming wrath of the righteous, burning with the fury of a thousand suns... Apparently not in a public forum (or a democracy).

Quote

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

Then add FFG's internal corporate culture to mix...

1 hour ago, MajorJuggler said:

Balancing the product requires using math: either applying in-depth data analytics on playtesting data, or having a deep understanding about the fundamental equations that govern the game. You can't just hire someone who has a STEM degree and automatically expect that they could solve this problem. It's not as simple as tossing a set of equations and X-wing ships at a freshly minted Mathematics Bachelor of Science graduate and having them balance the game. Frank Brooks was an engineer (chemical, I believe) before he became an FFG game designer, but that doesn't directly translate to technical balance capability.

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

I'm glad the community has you. Sometimes i feel like you're the only voice of objective reasoning the community has left.

Also, as a side note, i'd be curious to hear what you think Palpatine post-FAQ generates in true value. Mostly because i personally believe it is still higher than 8 points, and it'd go a long way of getting whiners who say that Palp "isn't 8 points and 2 crew slots good now...." to shut up.

52 minutes ago, Razgriz25thinf said:

I'm glad the community has you. Sometimes i feel like you're the only voice of objective reasoning the community has left.

Also, as a side note, i'd be curious to hear what you think Palpatine post-FAQ generates in true value. Mostly because i personally believe it is still higher than 8 points, and it'd go a long way of getting whiners who say that Palp "isn't 8 points and 2 crew slots good now...." to shut up.

Imo, I don't think it's worth it now, but it might be much more in line. Decent, but much worse.

Considerable match ups were very unfavorably affected: Palp Soontir Vader vs 4 TLTs for example.

6 hours ago, haritos said:

It de-legitimizes it for you because you already disagree with me, its all perspective. I find it cute so im gonna say it, i wont suck up to somebody to get a point across, this isn't politics :)

Speaking of cute, isn't it also adorable how not a single person has found an explanation for defenders also missing their mark? Zuckuss they claim, was probably not foreseen due to the very specific requirements to set him up. How so for defenders? Or palp?

I think that's the way to go if you want to deconstruct my argument/offer another explanation. Not point out that I made someone salty for finding him cute!

I think it's pretty cute that it's you of all people to critique my lack of an argument, given your response to everything has been some passive-aggressive bs. But I'll play your game.

If your crack pot theory is to be believed, then why would FFG wait so long to nerf Palp? Wouldn't it make more sense to nerf the power lists every time they planned on releasing new op-ness? Also, as far as I know, they've never done this for any other game, so why start with this one?

On 3/8/2017 at 7:55 PM, haritos said:

Again, it's incredibly cute that you actually think that there is some God like ability behind the most simple cash grabbing business model. Or that this is a conspiracy theory.

Is everyone here so naive to think it actually takes brains to think of a plan like that? Would you buy as many uboats if they weren't so strong? Would you EVER buy a raider? You actually find the simplest idea, which is to pack strong cards in new products to sell them, genius? Wow, you are cute.

Awww, that's 'cute', the kid doesn't understand sarcasm.

I never said they didn't pack great cards in new products; you seem to be struggling to read what I actually wrote. That's not very cute actually, just sad.

Go troll elsewhere kiddo.

14 hours ago, MajorJuggler said:

Balancing the product requires using math: either applying in-depth data analytics on playtesting data, or having a deep understanding about the fundamental equations that govern the game. You can't just hire someone who has a STEM degree and automatically expect that they could solve this problem. It's not as simple as tossing a set of equations and X-wing ships at a freshly minted Mathematics Bachelor of Science graduate and having them balance the game. Frank Brooks was an engineer (chemical, I believe) before he became an FFG game designer, but that doesn't directly translate to technical balance capability.

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

Finally, someone started forming arguments again. I got tired of hurt adults that can't get over the word cute.

I still don't agree with you though. As I said myself, balancing is very hard, and all the math in the world won't help you if you don't playtest. In the case of defenders, I find it incredibly hard to believe they couldn't assess the power of tie x7 or palp.

15 hours ago, Blue Five said:

People tend to vary in how verbally conscious they are of it. It took me a while to articulate it to my satisfaction. Some people know that patronising others feels good and tends to provoke them without fully understanding why.

I am truly honored that the two of you have struck up a conversation involving a deep analysis of 2 of the hundreds of words I've used. Your essay on cute babies was a bit disturbing, I also use the word cute on attractive women, am I doing it wrong?

You can answer me in a pm. Since your ad hominem attacks are kind of off topic.

And to close this parenthesis, calling your argument cute isn't ad hominem. It's there in the definition. I'm attacking your argument, not your character (except that one guy who got so obsessed with cute I decided to keep calling him that, sorry bro :() . I really don't think you guys are cute, I am finding your approach to the topic cute. And I will say it out loud, cause I'm not a politician trying to win over votes. You can also feel free to call my argument stupid if you feel it is.

I hope this ends this whole hurt saga and we can continue talking about how brilliantly FFG is making money in 2017. If you don't like it, steer clear of the topic and you won't have to put up with me.

WTB ARGUMENTS!

Edited by haritos

10 hours ago, Mattman7306 said:

If your crack pot theory is to be believed, then why would FFG wait so long to nerf Palp? Wouldn't it make more sense to nerf the power lists every time they planned on releasing new op-ness? Also, as far as I know, they've never done this for any other game, so why start with this one?

Because they were busy selling Raiders my friend.

Does that sound like a crazy conspiracy theory as well? Sell sell sell Raiders. OK people bought the Raiders, now I can nerf Palp and fix the game.

Why ELSE do you think they waited so long? I'm curious.

Why else do you think they fixed dead eye so fast? Cause everyone had it, and it wasn't selling 100$ Raiders.

And no, I'm not aware of any other FFG game that has such a greedy business model. I played the AGoT, Netrunner and 40k LCGs, and I paid 15$ a month for all the OP cards, period.

The only other greedy product is the brand new Star Wars Destiny, but let me not start another "conspiracy" theory of how this is all part of the new greedy FFG strategy.

Edited by haritos

15 hours ago, MajorJuggler said:

Balancing the product requires using math: either applying in-depth data analytics on playtesting data, or having a deep understanding about the fundamental equations that govern the game. You can't just hire someone who has a STEM degree and automatically expect that they could solve this problem. It's not as simple as tossing a set of equations and X-wing ships at a freshly minted Mathematics Bachelor of Science graduate and having them balance the game. Frank Brooks was an engineer (chemical, I believe) before he became an FFG game designer, but that doesn't directly translate to technical balance capability.

The "solutions" for how to balance a game like X-wing basically do not exist publicly, it is an area that requires more fundamental academic research. I took the approach of adapting a 100-year old mathematical combat model and re-deriving some of the fundamentals so it would apply to X-wing. I have >10k lines of MATLAB code dedicated to it, and I still really need to overhaul the entire approach. Vorpal Sword is doing his PhD dissertation on X-wing balance. It is not a trivial task.

While true, it also took me about half an hour on excel to work out Defenders had gone off the deep end in terms of efficiency.

That alone was only part of the problem, as the unpreventable nature of the Evade, the white K-turn and even Ryad's ridiculous ability all played their part and that's harder to math out.

The point stands - if it takes Deep Blue to work out if an upgrade is too good then it's probably not that far off the mark. It should have taken about halfway through the first playtest game with /x7 Ryad for the alarm bells to start ringing.

Quote

I really don't think you guys are cute, I am finding your approach to the topic cute.

You think our arguments have endearing and aesthetically pleasing appearances? I wasn't aware we could write them in calligraphy.

You're either being deliberately patronising in a manner that contributes nothing to a rational discussion or you don't get the concept of words.

Quote

You can also feel free to call my argument stupid if you feel it is.

Which would achieve what, exactly?

Quote

Your essay on cute babies was a bit disturbing, I also use the word cute on attractive women, am I doing it wrong?

Historically it does indeed come from the description of juvenile appearance: the first people to call women cute were drawing that exact slightly creepy comparison. People emulated the usage without understanding it and the definition broadened a bit and nowadays tends to mean endearing appearance. To which it refers depends on context. I very much doubt you were flirting with everyone who disagrees with you.

Quote

Since your ad hominem attacks are kind of off topic.

The fact that you saw it as an attack (it's not) is very interesting in and of itself.

Edited by Blue Five

Quote

Why ELSE do you think they waited so long? I'm curious.

To errata Palpatine they needed to -

  1. be certain the situation was dire enough to warrant breaking FFG's internal policy against errata.
  2. design an errataed card.
  3. playtest that errataed card.
  4. iterate the redesign.
  5. go through LFL.

Quote

Why else do you think they fixed dead eye so fast? Cause everyone had it, and it wasn't selling 100$ Raiders.

The edit was a build restriction which requires no playtesting. I challenge you to find an errata involving significant mechanical changes that was quick.

Edited by Blue Five

Corollary: as I understand it a lot of the playtesting is done on a volunteer basis by 'the great and the good' of X-Wing.

Two things: they're likely using a mixture of established competitive lists from the currently released Wave and theoretical 'these are going to be the best lists' that include the waves they've playtested but haven't been released. There has been more development behind those lists than the ones using the new upgrades. If near-optimal opponents are managing to beat sub-optimal playtest squads then are you going to spot issues?

Secondly, if 'the great and the good' can find ways to beat these new things being playtested then that may not be representative of the experience of the majority of players when the product is released. One of the worst things they can make is something very powerful but which can be beaten by perfect flying, because for 98% of players that card is not balanced even if the 2% good enough to be in playtest groups found that it was balanced. That's not about 'it's ok for good players to beat bad players' it's about the hundreds of GNK and store champs that don't feature Paul Heaver vs Duncan Howard where the power of the upgrade is not being matched by flying ability to defeat it.

17 hours ago, Razgriz25thinf said:

i'd be curious to hear what you think Palpatine post-FAQ generates in true value. Mostly because i personally believe it is still higher than 8 points, and it'd go a long way of getting whiners who say that Palp "isn't 8 points and 2 crew slots good now...." to shut up.

And how would you measure "true value" when you have to take into the equation the fact that one must now decide beforehand whether to use Palpatine or not?

Any calculations don't change the fact that a working card was made weak. And for what? Clearly not because Palpatine was OP or dominating the competitive scene. The card was not broken, so why errata it? I feel cheated.

2 hours ago, Blue Five said:

To errata Palpatine they needed to -

  1. be certain the situation was dire enough to warrant breaking FFG's internal policy against errata.
  2. design an errataed card.
  3. playtest that errataed card.
  4. iterate the redesign.
  5. go through LFL.

So basically do what they do every day when designing cards.

And this one was a card that is already out. You know the lists it's played in and there aren't that many combos since few ships can carry it.

Seriously, you think FFG are that bad at their job.

PS: Dont forget the most important step:

6 - estimate lost revenues due to fewer raider sales and assess whether fixing a card is worth losing sweet sweet cash

Edited by haritos

On 3/7/2017 at 0:20 PM, haritos said:

Playtesting can often miss something, that is something we all understand in designing games.

But are you seriously telling me that FFG, with now years of x wing design experience suddenly fails so spectacularly to hit the mark?

Let's not be naive. They made amazing cards. Everyone ran to buy scum. Now they nerf them and get to keep the cash.

I think you had a preferred ship or crew get whacked which gave your guy a quick and visceral retort, or perhaps you're just trolling. In either event, I will claim that you are wrong. There is one glaring proof that FFG wishes to create awesome ships that sell like hot-cakes and then nerf them as they dance happily to the bank. This proof is practically immune to cross-examination: The Punisher. It has the toughest, bad *** name of all released ships, but it is the biggest dud ever sold. Period.

And it makes me weep.....where's my tea.

2 hours ago, clanofwolves said:

I think you had a preferred ship or crew get whacked which gave your guy a quick and visceral retort, or perhaps you're just trolling.

nope

3 hours ago, haritos said:

So basically do what they do every day when designing cards.

And this one was a card that is already out. You know the lists it's played in and there aren't that many combos since few ships can carry it.

Seriously, you think FFG are that bad at their job.

PS: Dont forget the most important step:

6 - estimate lost revenues due to fewer raider sales and assess whether fixing a card is worth losing sweet sweet cash

If your argument is they figure out if making changes will hurt their business yes, of course they do that.

If your argument is that they designed the Emperor to break the game so that they would sell out the raider print run and then nerf him into the ground... there is no evidence for that and plenty of evidence to the contrary given how incredibly cautious they are about breaking the balance and regularly print cards that cost too much to be competitively viable.

I also think you severely overestimate the amount of playtesting they have the capacity for. FFG is a small company. X-wing is a seriously complicated game. I would consider it on par with trying to balance a MOBA or RTS. Even companies who are very good at that, can patch regularly and have tens of thousands of playtesters have a hard time getting anything close to perfect balance.

I will believe your theory when I see a pattern arise of attaching cards clearly designed to dominate the meta, followed by nerfs on release of new product with similarly broken cards. So far I see one year of releases which included many more ships than ever before, including some rapid product design/release for SW7.

7 hours ago, haritos said:

I still don't agree with you though. As I said myself, balancing is very hard, and all the math in the world won't help you if you don't playtest. In the case of defenders, I find it incredibly hard to believe they couldn't assess the power of tie x7 or palp.

FWIW, my pre-release predictions of ship / pilot effectiveness over the last several years have used 0 playtesting data, based instead entirely on math, and this approach has consistently been better at predicting ships' effectiveness than FFG's own design and playtest process. Certainly I could do even better with hard playtest data and using further data analytics on that data, but even a straight math approach has a lot of value.

6 hours ago, Stay On The Leader said:

While true, it also took me about half an hour on excel to work out Defenders had gone off the deep end in terms of efficiency.

That alone was only part of the problem, as the unpreventable nature of the Evade, the white K-turn and even Ryad's ridiculous ability all played their part and that's harder to math out.

The point stands - if it takes Deep Blue to work out if an upgrade is too good then it's probably not that far off the mark. It should have taken about halfway through the first playtest game with /x7 Ryad for the alarm bells to start ringing.

6 hours ago, Stay On The Leader said:

Corollary: as I understand it a lot of the playtesting is done on a volunteer basis by 'the great and the good' of X-Wing.

Two things: they're likely using a mixture of established competitive lists from the currently released Wave and theoretical 'these are going to be the best lists' that include the waves they've playtested but haven't been released. There has been more development behind those lists than the ones using the new upgrades. If near-optimal opponents are managing to beat sub-optimal playtest squads then are you going to spot issues?

Secondly, if 'the great and the good' can find ways to beat these new things being playtested then that may not be representative of the experience of the majority of players when the product is released. One of the worst things they can make is something very powerful but which can be beaten by perfect flying, because for 98% of players that card is not balanced even if the 2% good enough to be in playtest groups found that it was balanced. That's not about 'it's ok for good players to beat bad players' it's about the hundreds of GNK and store champs that don't feature Paul Heaver vs Duncan Howard where the power of the upgrade is not being matched by flying ability to defeat it.

Yeah, playtesting without going back and doing analytical number crunching on that data can, at times, be worse than not playtesting at all. Without using math to get an impartial view, you are susceptible to the human elements of favoritism and perception bias.

The flip side is that if you understand the fundamentals, you can get a really good first-order approximation of the x7 title's added value in about 30 seconds.

JoustingValueNew ~= JoustingValueOld * ( [(free evades) + 6] / 6 )^0.5

Assuming you already know the Defender's old jousting value (which I had already published as ~23 points), it's trivial to estimate that if x7 triggers three times per ship per game, it's worth 28 points in raw dice... which basically means the x7 Delta is getting the K-turn for free.
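For anyone who wants to plug in their own numbers, here is a minimal Python sketch of that first-order estimate. The function and parameter names are made up for illustration, and reading the 6 in the formula as the Defender's base durability (hull plus shields) is an assumption on my part; the formula itself only gives the constant.

```python
import math


def jousting_value_with_evades(old_jv, free_evades, base_durability=6.0):
    """First-order estimate of jousting value after adding free evades.

    old_jv          -- jousting value before the title (23 in the post above)
    free_evades     -- free evades actually used per ship per game
    base_durability -- the constant 6 from the formula (assumed here to be
                       hull + shields; that reading is not stated in the post)
    """
    durability_ratio = (free_evades + base_durability) / base_durability
    return old_jv * math.sqrt(durability_ratio)


if __name__ == "__main__":
    # Compare a few assumed usage rates against the ~23 point baseline.
    for evades in (0, 1, 2, 3):
        print(evades, round(jousting_value_with_evades(23, evades), 1))
```

With three used evades per game this lands just above 28, which is where the "28 points in raw dice" figure comes from.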

Edited by MajorJuggler

1 minute ago, MajorJuggler said:

Assuming you already know the Defender's old jousting value (which I had already published as ~23 points), it's trivial to estimate that if x7 triggers three times per ship per game, it's worth 28 points in raw dice... which basically means the x7 Delta is getting the K-turn for free.

Not just "triggers." It needs to have a game effect. (That doesn't necessarily mean "stopping a point of damage," except as you've calculated it. Affecting target priority also has value ... though that's very tough to math.)

18 minutes ago, Jeff Wilder said:

Not just "triggers." It needs to have a game effect. (That doesn't necessarily mean "stopping a point of damage," except as you've calculated it. Affecting target priority also has value ... though that's very tough to math.)

Yes, I implied triggered + shot at, so it's really the number of times that the free evades get used. Just use a lowball estimate of 2 free shields per ship per game: that's a very reasonable estimate that should set the floor of the ship cost, and results in a JV of 23*(8/6)^0.5 = 26.5. So a Delta is paying 1.5 points for a white K-turn. If a Delta can get 2 extra shields per game via x7 and leverage the K-turn to get extra shots on target about 11% of the time (relative to its target's firing rate), it just hit its 28 point jousting value. Or it can get 3 extra shields per game and not even need its white K-turn to hit its JV. Those are very easy metrics to make, and we haven't even talked about Vessery yet. It was trivial to demonstrate with zero playtesting that the x7 title was overpowered upon release, provided a proper understanding of the fundamentals.
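A rough way to check the 11% figure, as a Python sketch. It assumes that extra shots enter the same square root as durability does, so that jousting value scales with the square root of the firing-rate multiplier. That scaling is inferred from the numbers in this post, not a stated rule, and the function name is hypothetical.

```python
import math


def required_fire_multiplier(target_jv, current_jv):
    """Firing-rate multiplier needed to lift current_jv up to target_jv.

    Assumes JV scales with sqrt(firing-rate multiplier), mirroring how the
    durability ratio enters the first-order formula (an assumption).
    """
    return (target_jv / current_jv) ** 2


if __name__ == "__main__":
    # Floor case from the post: 2 used evades on a 23 point baseline.
    floor_jv = 23 * math.sqrt(8 / 6)
    extra = required_fire_multiplier(28, floor_jv) - 1
    print(f"floor JV {floor_jv:.2f}, extra shots needed {extra:.1%}")
```

Under that assumption the K-turn only has to buy roughly one extra shot in nine for the ship to reach its 28 point cost.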

You can get more precise numbers with a second order analysis that takes more into account, but a first order off-the-cuff calculation already demonstrates why the title is too good.

Changing target priority to shoot "around" the x7 title still helps the Defenders, though, as it results in otherwise non-ideal target priority. This is particularly important now post-FAQ. Jousting value math without focus fire is not trivial, but the above still sets the floor on what the ship price should be.

Edited by MajorJuggler

Just now, MajorJuggler said:

It was trivial to demonstrate with zero playtesting that the x7 title was overpowered upon release.

You're preaching to the choir. Just keep in mind that "got released broken" isn't congruent to "nobody saw the issue in playtest."

42 minutes ago, MajorJuggler said:

FWIW, my pre-release predictions of ship / pilot effectiveness over the last several years have used 0 playtesting data, based instead entirely on math, and this approach has consistently been better at predicting ships' effectiveness than FFG's own design and playtest process. Certainly I could do even better with hard playtest data and using further data analytics on that data, but even a straight math approach has a lot of value.

Yeah, playtesting without going back and doing analytical number crunching on that data can, at times, be worse than not playtesting at all. Without using math to get an impartial view, you are susceptible to the human elements of favoritism and perception bias.

The flip side is that if you understand the fundamentals, you can get a really good first-order approximation of the x7 title's added value in about 30 seconds.

JoustingValueNew ~= JoustingValueOld * ( [(free evades) + 6] / 6 )^0.5

Assuming you already know the Defender's old jousting value (which I had already published as ~23 points), it's trivial to estimate that if x7 triggers three times per ship per game, it's worth 28 points in raw dice... which basically means the x7 Delta is getting the K-turn for free.

Two things:

1. Mathing X-wing has many strengths and, I would imagine, many weaknesses. I am not sure how you would program it to iterate over every combination of abilities, so things like the U-boats would be difficult to math. X7, being raw efficiency, presents a much easier math target, and even without math you can see it is likely to be very strong. But the Defender was overcosted to begin with, and its need to fly at higher speeds is harder to math.

2. I have no idea what if any hard math tools are used to design X-wing. I would argue the game could easily be designed without them and just raw playtesting and the current game state would make sense.

Edited by Jetfire