Boost dice vs Upgrades to produce successes.

By LethalDose, in Game Mechanics

@Gribble

gribble said:

Yeah, the other thing to bear in mind is that it's not necessarily the probability of getting a single success or advantage that matters for any given roll, but more the probability of getting any particular outcome . There are some outcomes that are not possible (triumph) with just ability + boost, and some that may not increase in probability (double success, which isn't present on the boost die, but I suppose is simulated by one success on the ability die + one success on the boost die). Then again, the only way to get three successes (even if a vanishingly small chance) is with ability + boost, so it does seem a bit wonky there…

I think in your first line, you're talking about success and advantage symbols. In that case you're right, and it speaks to part of the reason why it's ridiculous to talk about "how many faces have successes on them".

It doesn't matter how many faces you rolled with successes on them, or how much more likely a single success becomes; it matters how many total successes you rolled. And THAT value is ONLY really useful if it's greater than the total failures that the roll produced. The system for determining even a binary success/failure outcome is so complex that dice roll simulation really is the easiest method for evaluating the mechanic by far.

I'm concentrating on the probability of succeeding or failing at a task (NOT the number of success or failures produced), because success at this task is the primary reason the character chose to take an action in the first place: Succeeding on this task is the best way I have to help resolve our current conflict.

And I appreciate that you can't get a triumph without proficiency dice, but that's only part of the roll, and I don't think it's anywhere near the most important part. Getting a triumph is GREAT, but it remains secondary to the character's ability to successfully complete tasks (I've said this before in this thread). During our play session, my players consistently said "wow, I seem to be failing on a lot more rolls than it feels like I should be based on my character's skill/proficiency dice/upgrade".

I think you have the causality of what I'm doing reversed. I am doing the math to explain why my simulations are providing the observed results. I am NOT doing simulations to show my math is right. And both of these activities are being performed because my players and I found the upgrade mechanic lacking compared to how we expected it to perform.

The proof is in the pudding, which is the purpose of the simulations I've provided. But if you're willing to post

gribble said:

I suspect that if you're after more successes or advantages - rather than just a higher chance of at least one success or advantage - then you're better off with upgrading rather than boosting.

then you either haven't read my posts in this thread (or at least the ones where I provide evidence), or you disbelieve the evidence that I've provided, because, once again, the following are demonstrably true facts:

  • Upgrading an ability die to a proficiency die leads to a smaller increase of the probability of producing a successful roll than a boost does.
  • Upgrading an ability die to a proficiency die has a smaller impact on the number of advantages rolled than a boost does.

Reality doesn't care what "you suspect", and I shared the results so you don't have to calculate them.

Hell, the reason I started this thread was because I had the same suspicion:

"Upgrading will provide more successes and advantages than boosting"

and I was theoretically and empirically proven wrong wrong WRONG.

-WJL

LethalDose said:

then you either haven't read my posts in this thread (or at least the ones where I provide evidence), or you disbelieve the evidence that I've provided, because, once again, the following are demonstrably true facts:
  • Upgrading an ability die to a proficiency die leads to a smaller increase of the probability of producing a successful roll than a boost does.
  • Upgrading an ability die to a proficiency die has a smaller impact on the number of advantages rolled than a boost does.

I haven't gone through your calculations in detail, so I can't speak to their correctness, but what I did read through were demonstrating the probabilities of generating at least one success/advantage. Apologies if you went beyond that, but my point was that proving you're more likely to roll at least one success isn't sufficient.

Don't forget that you're rolling against difficulty dice in practice, not just rolling ability, proficiency or boost dice, so you need to consider your chances of rolling 2, 3, etc, successes. If you're rolling against 2-3 difficulty dice, then the chances of rolling 1 success are irrelevant, as you'll probably need more than one success to succeed on the task.

gribble said:

I haven't gone through your calculations in detail, so I can't speak to their correctness, but what I did read through were demonstrating the probabilities of generating at least one success/advantage. Apologies if you went beyond that, but my point was that proving you're more likely to roll at least one success isn't sufficient.

Don't forget that you're rolling against difficulty dice in practice, not just rolling ability, proficiency or boost dice, so you need to consider your chances of rolling 2, 3, etc, successes. If you're rolling against 2-3 difficulty dice, then the chances of rolling 1 success are irrelevant, as you'll probably need more than one success to succeed on the task.

Any time I've provided a success probability, it refers to the probability that rolled successes > failures (there are 1 or more "net" successes), and every example I've provided in this thread is based on at least 10,000 simulations (50k after the OP) and reports the difficulty being rolled against. I've already addressed this exactly in my response post:

LethalDose said:


[…] it matters how many total successes you rolled. And THAT value is ONLY really useful if it's greater than the total failures that the roll produced…

and it's really what makes calculating and assessing success probability so damned difficult in this system: You're aiming at a moving target. TSR's old Alternity system had a similar quirk, though their 'pool' rarely consisted of more than 2 dice.

I guess just be aware that what I claim as my evidence is not calculated, but instead based on Monte Carlo simulation. Therefore my results aren't exactly reproducible (it's a stochastic, aka random, process), though with 50k simulations, I've found the results of subsequent runs never change by more than 0.5%. I've provided expected values of changes to net success caused by a mechanism (which is absolutely valid given that the result of any rolled die is independent of the others), so that may be where you were thinking I was providing the probability of a single success. But you're right, who gives a **** about the probability of getting an additional single success; it only matters in the context of the roll. "Expected value" is the probability-weighted average of a distribution, the measure of centrality we use in statistics in place of a sample mean; more info on the concept is available here.
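For anyone who wants to try this kind of Monte Carlo estimate themselves, here is a minimal sketch in Python. The face lists are assumptions, not something posted in this thread: the ability, proficiency and boost faces were chosen to be consistent with the single-die percentages reported here, and the difficulty die faces are assumed from the published dice and may not match the Beta. The pool (two ability dice against two difficulty dice) and the name success_rate are purely illustrative.

```python
import random

# ASSUMED faces, listed as the number of successes on each face.
ABILITY = [0, 1, 1, 2, 0, 0, 1, 0]
PROFICIENCY = [0, 1, 1, 2, 2, 0, 1, 1, 1, 0, 0, 1]  # last face is the Triumph (1 success)
BOOST = [0, 0, 1, 1, 0, 0]
# ASSUMED difficulty faces, listed as the number of failures on each face.
DIFFICULTY = [0, 1, 2, 0, 0, 0, 0, 1]


def success_rate(positive, negative, trials=50_000, seed=1):
    """Estimate P(successes > failures) by rolling the pool `trials` times."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        successes = sum(rng.choice(die) for die in positive)
        failures = sum(rng.choice(die) for die in negative)
        if successes > failures:
            wins += 1
    return wins / trials


base = success_rate([ABILITY, ABILITY], [DIFFICULTY, DIFFICULTY])
upgraded = success_rate([PROFICIENCY, ABILITY], [DIFFICULTY, DIFFICULTY])
boosted = success_rate([ABILITY, ABILITY, BOOST], [DIFFICULTY, DIFFICULTY])
print(f"base {base:.3f}  upgrade {upgraded:.3f}  boost {boosted:.3f}")
```

Changing the seed shifts each estimate slightly, which is the run-to-run wobble described above; advantages, threats and triumph effects are deliberately ignored in this sketch.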

-WJL

LethalDose said:

and it's really what makes calculating and assessing success probability so damned difficult in this system: You're aiming at a moving target.

Yeah, definitely. I guess the only real way to judge is how it feels in play, over the course of a few sessions (one or two probably isn't enough for a fair sample). Sounds like you've done that - unfortunately I haven't been able to do that yet, but plan to kick it off next Tuesday.

Now you made me all bothered by this issue.

I've noticed using the app, and then putting the stickers onto dice and rolling them, that I produce a serious amount of both threats and advantages. I haven't done the math, but it seems I often get results with either one success (remaining after cancellation) and then either lots of threats or advantages.

I have been wondering in my tiny head, and as I said I have not done the math so I can't take these things into account, so this is just a wild notion. What about reducing the number of advantages on the ability die, perhaps only by one, and adding another success? Similarly, one could reduce the number of advantages on the proficiency die and add the same number of successes. If this is mirrored on the difficulty and challenge dice (or not, the difficulty and challenge dice could remain untouched really), this should, according to my flawed logic, make successes "slightly" more frequent than advantages (which seem to pop up all the time), while simultaneously at least adjusting the boost from upgrading the ability die versus adding a boost die. Of course, as I can't state enough, I have not done the math, so I'm not sure what actual change this brings.

I am only curious why so many advantages pop up all the time (of course I might just be "spending" the good results on the dice).

Alternately one could just up the number of successes on the proficiency die to adjust for the boost die.

Of course, the logic at the base here is that adding a die, whatever sort, is better (i.e. creates greater change and more varied results) than just increasing a die type. Increasing the number of sides rolled is not necessarily better than adding to the number of dice (although adding a d6 to a d8 is the same as rolling 1d12 (or really 2d6). EDIT: No it's not… my arithmetic is not strong this late at night it seems; it's 2d7, or 1d14 if those existed, so adding a d6 creates a greater variation of results than replacing a d8 with a d12), although I'm not sure if this logic holds for the problem you have pointed out either.

Hm… these are but wine induced thoughts after watching MIB3…

@Jegergryte: Sorry to get under your skin with this issue. Well, a little sorry, but also a little glad to get conversation rolling.

As far as producing advantages and threats, that's definitely an issue that's been discussed in the dice mechanic thread, though discussion there seems to have died out. It'd be a great place to post your ideas to deal with that issue (adjusting symbols on die faces).

Though, again, there's little point in discussing how to fix it until the devs say "Yeah, it's a problem that bears attention and effort". ynnen and FFG_Sam Stewart have pointed out a lot of peripheral considerations when making these comparisons, but have not said the mechanic is not working as intended. As far as I've read (and I've looked), they've never stated any aspect of the dice system is not working as intended. They also haven't said that my simulated numbers look different than what they expect.

In other news, you're right in your assessment that the variance** of 1d6 [Var(1d6) = 2.92] is less than the variance of 2d6 [Var(2d6) = 5.83], but the variance of a 1d12 is much higher than either [Var(1d12) = 11.92]*. This can be intuitively understood when you look at the probabilities associated with the extreme values of these events:

  • 2d6 min and max are 2 and 12, respectively, and probability of both being 1/36.
  • 1d12 min and max are 1 and 12, respectively, and probability of both being 1/12

More likely extreme values lead to higher variance, typically.
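The three variances quoted above can be checked by brute-force enumeration. This is a small Python sketch (the function name variance is just for illustration) that lists every equally likely roll and computes the exact population variance of the total:

```python
from fractions import Fraction
from itertools import product


def variance(*sides):
    """Exact variance of the sum of fair dice with the given numbers of sides."""
    totals = [sum(roll) for roll in product(*(range(1, n + 1) for n in sides))]
    mean = Fraction(sum(totals), len(totals))
    return sum((x - mean) ** 2 for x in totals) / len(totals)


for label, sides in [("1d6", (6,)), ("2d6", (6, 6)), ("1d12", (12,))]:
    print(f"Var({label}) = {float(variance(*sides)):.2f}")
```

For a single die this agrees with the discrete uniform formula (n^2 - 1)/12, and for 2d6 the variance is simply twice that of 1d6 because the dice are independent.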

*Originally used Excel to get these numbers, but found it's wrong for what I'm trying to represent. Instead, I used the variance formulas for discrete uniform distributions provided in the wiki from the cited source.

** This refers to the variance of the distribution, not the variance of the mean, which is dependent on sample size.

But the variance produced by these dice doesn't have so much to do with the number of faces on the dice as it does with the results on the faces. I haven't reported the variance of the results for a few reasons: it's not very informative to the audience, it's invalid because of the lack of independence between success and advantage production, and it's not useful because the results are not normally distributed (though this may be debatable in larger pools by invoking the Central Limit Theorem). The only measures of dispersion I have reported are 90% probability intervals (PI90), which means 90% of the results lay within those bounds.
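As a rough illustration of what a PI90 is, here is a Python sketch that trims 5% of the simulated results from each tail and reports what is left. The helper name probability_interval, the three-ability-dice pool and the face list are all assumptions made for the example, not the exact procedure used for the numbers posted in this thread.

```python
import random

# ASSUMED ability die faces, listed as successes per face.
ABILITY_SUCCESSES = [0, 1, 1, 2, 0, 0, 1, 0]


def probability_interval(samples, coverage_pct=90):
    """Return (low, high) after trimming an equal share of results from each tail."""
    ordered = sorted(samples)
    cut = len(ordered) * (100 - coverage_pct) // 200  # results dropped per tail
    return ordered[cut], ordered[len(ordered) - 1 - cut]


rng = random.Random(7)
totals = [sum(rng.choice(ABILITY_SUCCESSES) for _ in range(3)) for _ in range(50_000)]
interval = probability_interval(totals)
print("PI90 for total successes on three ability dice:", interval)
```

Because the results are whole numbers, the interval usually covers somewhat more than 90% of the rolls; it describes the spread of individual rolls, not the precision of the estimated mean.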

Really, I find it's just easier and more informative to evaluate the PIs and point estimates than to try to calculate the variances produced by the simulation results. Also, since this is starting to converge with stuff I will be posting on my blog, I'll point out that the address is now linked in my profile. You may find it to be worthwhile.

-WJL

What I meant, which might have been unclear, is that replacing a die in a dice pool with one with 4 more faces is not - as far as my ramblings tried to figure out - as good as adding a die with 6. If you have a dice pool of three d8, adding a d6 will basically increase the dice pool to four dice, whereas improving a die does not increase the dice pool size as such.

I believe that having the "dice pool" in mind when discussing this is important.

And I might have used the term "variant results" in a way not specific to an academic field. Statistics is not my strength within my field of study, even if I work with it (granted "work with" means documenting and "cleaning" before online publication of surveys).

So, I decided to do a couple of quick tests of this. I used a random number generator in my little perl script and had it roll each die 1,000,000 times in each test.

The results are

Successes   Ability   Proficiency   Ability + Boost
0           49.85%    33.35%        33.33%
1           37.64%    49.92%        41.76%
2           12.51%    16.73%        20.73%
3            0.00%     0.00%         4.17%

Upgrading the die results in a significant improvement (33% vs 49% chance of no successes; 50% vs 37% for just one success). Adding a single Boost die to the Ability die also helped, in certain areas. Slightly better chance of 2 successes (20% vs 16% for the proficiency) and the first to have 3 successes as a possibility (4% chance vs 0%).

I could probably extend the script to do lots of test runs of different pools to see how they all fare (success vs failure, advantage vs threat, triumph vs despair).

Kallabecca said:

So, I decided to do a couple of quick tests of this. I used a random number generator in my little perl script and had it roll each die 1,000,000 times in each test.

The results are

Successes   Ability   Proficiency   Ability + Boost
0           49.85%    33.35%        33.33%
1           37.64%    49.92%        41.76%
2           12.51%    16.73%        20.73%
3            0.00%     0.00%         4.17%

Upgrading the die results in a significant improvement (33% vs 49% chance of no successes; 50% vs 37% for just one success). Adding a single Boost die to the Ability die also helped, in certain areas. Slightly better chance of 2 successes (20% vs 16% for the proficiency) and the first to have 3 successes as a possibility (4% chance vs 0%).

I could probably extend the script to do lots of test runs of different pools to see how they all fare (success vs failure, advantage vs threat, triumph vs despair).

Great to see that what I'm doing is being reproduced by some other interested parties. Validation is absolutely key and I welcome people to question my findings or propose alternate methods for comparison.

Based on your results, the dice produce expected successes as follows:

  • Ability ONLY: 0(.4985) + 1 (.3764) + 2(.1251) = 0.6266
  • Proficiency ONLY: 0(.3335) + 1 (.4992) + 2(.1673) = 0.8338
  • Ability + Boost: 0(.3333) + 1 (.4176) + 2(.2073) + 3(.0417)= 0.9573

Again based on these numbers, we see the upgrade produces .8338 - .6266 = .2072 additional successes, and the ability + boost die produces .9573 - .6266 = .3307 additional successes.

Looking at a small set of examples with 1 to 2 dice makes apparent some things I missed. The big one is that an ability die + a boost die is just as likely to produce no successes as a proficiency die, and is overall more likely to produce 2 or more successes. This comparison neglects the advantages produced, but is still an interesting finding.

Extending that thought, it can be shown that an ability die + boost die has a 1/8 * 2/6 = 2/48 = 1/24 (about 4%) probability of producing NO symbols, while a Proficiency die has a 1/12 (about 8%) chance of producing no symbols. Twice as high…
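The simulated percentages and the no-symbol odds above can also be worked out exactly by listing every combination of faces. A sketch in Python follows; the face lists are an assumption (chosen because they reproduce the simulated numbers in this thread), with "s" for a success, "a" for an advantage and "T" for a triumph, and the function names are made up for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# ASSUMED faces: "" is a blank face, "T" is a Triumph (which also counts as a success).
ABILITY = ["", "s", "s", "ss", "a", "a", "sa", "aa"]
PROFICIENCY = ["", "s", "s", "ss", "ss", "a", "sa", "sa", "sa", "aa", "aa", "T"]
BOOST = ["", "", "s", "sa", "aa", "a"]


def success_distribution(*dice):
    """Exact probability of each total number of successes for the pool."""
    rolls = list(product(*dice))
    counts = Counter()
    for roll in rolls:
        symbols = "".join(roll)
        counts[symbols.count("s") + symbols.count("T")] += 1
    return {k: Fraction(v, len(rolls)) for k, v in sorted(counts.items())}


def blank_probability(*dice):
    """Exact probability that the pool shows no symbols at all."""
    rolls = list(product(*dice))
    return Fraction(sum(1 for roll in rolls if not "".join(roll)), len(rolls))


for name, pool in [("Ability", (ABILITY,)), ("Proficiency", (PROFICIENCY,)),
                   ("Ability + Boost", (ABILITY, BOOST))]:
    dist = {k: f"{float(p):.2%}" for k, p in success_distribution(*pool).items()}
    print(name, dist, "no symbols:", blank_probability(*pool))
```

Enumeration is only practical for small pools like these; with difficulty dice and larger pools the number of combinations grows quickly, which is where simulation earns its keep.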

Again, Thanks Kallabecca! I'd love to see you expand your perl script and see how your results compare for some of the dice pools I've posted.

-WJL

Speaking of validation I should compare these simulated (based on 1,000,000) results to the initial expected values in the OP.

  • Expected value of increased successes produced by upgrade: 5/24 = .2083, compared to simulated by Kallabecca: .2072
  • Expected value of increased successes produced by boost + Ability: .3333 compared to simulated by Kallabecca: .3307

Seems pretty close.
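For reference, the two theoretical values can be reproduced with a few lines of Python. The per-face success counts are assumed (they are consistent with the simulated single-die percentages in this thread), and the Triumph face is counted as one success.

```python
from fractions import Fraction

# ASSUMED successes on each face of each die.
ABILITY = [0, 1, 1, 2, 0, 0, 1, 0]
PROFICIENCY = [0, 1, 1, 2, 2, 0, 1, 1, 1, 0, 0, 1]  # last face is the Triumph
BOOST = [0, 0, 1, 1, 0, 0]


def expected(faces):
    """Expected successes on one die: the average over its equally likely faces."""
    return Fraction(sum(faces), len(faces))


upgrade_gain = expected(PROFICIENCY) - expected(ABILITY)  # swap ability for proficiency
boost_gain = expected(BOOST)                              # add a boost die to the pool
print(upgrade_gain, boost_gain)  # prints: 5/24 1/3
```

The boost gain is just the boost die's own expected value because expected values of independent dice add.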

-WJL

OK. Advantages were easy to add (just had to remember there is the possibility of 4 for Ability + Boost).

Ability die
Count   Successes   Advantages
0       49.94%      50.00%
1       37.51%      37.48%
2       12.54%      12.52%
3        0.00%       0.00%
4        0.00%       0.00%

Proficiency die
Count   Successes   Advantages
0       33.27%      50.06%
1       50.07%      33.34%
2       16.66%      16.60%
3        0.00%       0.00%
4        0.00%       0.00%

Ability + Boost dice
Count   Successes   Advantages
0       33.30%      24.99%
1       41.73%      35.42%
2       20.77%      27.08%
3        4.20%      10.43%
4        0.00%       2.08%

Not sure if FFG forums support tables in posts :(

I don't track the individual results, just the accumulated results. So, the first set of data means that each time you roll the Ability die, 50% of the time it will have no successes and 50% of the time it will have no advantages (which fits, since half the die faces lack a success and half lack an advantage). It wouldn't be hard to track the actual possibilities; I just didn't bother with this simple script.

OK, and here's the breakdown by the actual roll results (order doesn't matter so ssa and sas are the same result).

my key: a = advantage, s=success, T = triumph, t=threat, f=failure, D=despair

Result   Ability   Proficiency   Ability + Boost
a        25.00%     8.27%        10.43%
aa       12.47%    16.69%        10.41%
aaa       0.00%     0.00%         6.20%
aaaa      0.00%     0.00%         2.08%
aaas      0.00%     0.00%         4.18%
aas       0.00%     0.00%        12.54%
aass      0.00%     0.00%         4.15%
as       12.56%    25.03%        14.61%
ass       0.00%     0.00%         8.30%
asss      0.00%     0.00%         2.09%
s        25.02%    16.66%        10.44%
ss       12.53%    16.69%         8.36%
sss       0.00%     0.00%         2.06%
Ts        0.00%     8.31%         0.00%

A Triumph is also a success, so the Ts is so that future changes to the script can handle things like successes - failures since the success part of a Triumph can be countered by a failure.
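The same breakdown can be produced exactly, without rolling anything, by enumerating face combinations. This Python sketch uses assumed face lists (consistent with the percentages posted in this thread) and the key from this post, writing the Triumph face as "Ts" because it carries a success; outcome_table is a made-up name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# ASSUMED faces, using the key above: a = advantage, s = success, T = triumph.
ABILITY = ["", "s", "s", "ss", "a", "a", "as", "aa"]
PROFICIENCY = ["", "s", "s", "ss", "ss", "a", "as", "as", "as", "aa", "aa", "Ts"]
BOOST = ["", "", "s", "as", "aa", "a"]


def outcome_table(*dice):
    """Exact probability of each order-independent set of symbols."""
    rolls = list(product(*dice))
    # Sorting the symbols makes "ssa" and "sas" count as the same result.
    counts = Counter("".join(sorted("".join(roll))) for roll in rolls)
    return {k: Fraction(v, len(rolls)) for k, v in sorted(counts.items())}


for outcome, prob in outcome_table(ABILITY, BOOST).items():
    print(f"{outcome or '(blank)':8s} {float(prob):6.2%}")
```

Unlike the tally in this post, the enumeration also lists the all-blank result, which is why the single-die columns in a simulated table add up to less than 100%.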

LethalDose said:

Speaking of validation I should compare these simulated (based on 1,000,000) results to the initial expected values in the OP.

  • Expected value of increased successes produced by upgrade: 5/24 = .2083, compared to simulated by Kallabecca: .2072
  • Expected value of increased successes produced by boost + Ability: .3333 compared to simulated by Kallabecca: .3307

Seems pretty close.

-WJL

That is known as the Central Limit Theorem. In a normally distributed population, the more samples you take the closer to the population mean and variance the results get. If I were to up the results another log you'd see the numbers close in on your expected result. Dice will also tend towards the mean as more dice are rolled since each die face has an approximately equal chance of coming up with each roll.

Kallabecca said:

That is known as the Central Limit Theorem. In a normally distributed population, the more samples you take the closer to the population mean and variance the results get. If I were to up the results another log you'd see the numbers close in on your expected result. Dice will also tend towards the mean as more dice are rolled since each die face has an approximately equal chance of coming up with each roll.

Technically, you're citing the weak law of large numbers: as the sample size increases, the sample mean converges to the expected value. While you're correct that the CLT states that as the sample size increases (i.e. goes to infinity), the distribution of the mean becomes normal, I urge you to be very careful about invoking it in situations involving simulated data.

When simulating results like this, you can generate (and have generated) an arbitrarily large volume of data, meaning that you can create an arbitrarily small confidence interval around your mean/expected value, which, yes, can be assumed to be normal because of the CLT. BUT that represents the distribution of the mean, not the distribution of the simulated data.
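To put rough numbers on how fast that interval shrinks, here is a Python sketch of the usual normal-approximation margin of error for a simulated probability. It assumes independent trials and uses p = 0.5 as the worst case; margin_of_error is a made-up helper name.

```python
import math


def margin_of_error(p, trials, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from `trials` rolls."""
    return z * math.sqrt(p * (1 - p) / trials)


for trials in (10_000, 50_000, 1_000_000):
    print(f"{trials:>9,} trials: +/- {margin_of_error(0.5, trials):.2%}")
```

Adding trials tightens the estimate of the probability, but it does nothing to the spread of the individual rolls, which is the distinction being drawn above.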

What I've found to be much more informative when discussing the source data is to report probability intervals (these are also called "credible intervals", if you're feeling particularly Bayesian). Basically, PIs give the values that bound (1-alpha) x 100% of the data. See some of my earlier posts for PIs.

My point in drawing attention to the estimated vs simulated values was just to show that your simulations are consistent with the values I've been presenting since the OP.

-WJL

I think this is an important thread. It demonstrates statistically that some talents and abilities are not costed or gauged correctly and that's a big problem. It honestly requires going back through the entire book with the knowledge that "Upgrades" are less valuable, and not more, than "Adding boosts" and correcting all abilities and maneuvers that used them. I honestly think this is the most significant and important find in the Beta so far. Bravo Lethaldose.

That depends… how many abilities need Triumph to fire off. The more abilities that need Triumph, the better the upgrades from ability to proficiency are (since they are the only dice with Triumph). This was just a blind test of one small condition. It hardly makes the case for needing to change everything, yet.

Kallabecca said:

That depends… how many abilities need Triumph to fire off. The more abilities that need Triumph, the better the upgrades from ability to proficiency are (since they are the only dice with Triumph).

This [I assume] is the exact reason that Sam Stewart stated above:

FFG_Sam Stewart said:


The benefits of a Triumph are not easily quantified. Yes, there's a lot in this book that talks about different ways you can spend a Triumph. However, these are just some of the ways Triumphs can be utilized, and GMs and players are encouraged to be creative…

The value of a triumph is, simply put, a f***ing b!tch to quantify. A previous poster insisted I needed to treat a triumph as advantages; to use his words:

koralas said:


[Triumphs] certainly do count as Advantages, or as I put it "super advantages", in that a Triumph provides a success, but it also provides an unlimited pool of Advantage when activating a single special ability; that is a single Triumph will activate one special effect of a weapon or score a critical wound regardless of the number of Advantages it would normally require. Further, while Advantages can be used to remove Strain, Triumph instead can remove wounds, some effects require a Triumph, and the GM or scenario may call out other uses for Triumph in the narrative (Page 20).

While I think "unlimited pool of advantage" is just nonsense, I tried it in a very limited sense as a way of providing SOME kind of quantitative comparison (i.e. to avoid comparing apples to oranges) between the two mechanisms. I found that when you consider the triumph to be worth about 5.5 advantages, the upgrade to the proficiency die is, on average, as effective as adding a boost die for producing advantages (not successes).

For the record, I am not stating that triumphs should count as 5.5 advantages! I'm merely stating the value at which equivalency is reached in this comparison!

There are actually very few abilities that require a triumph to activate; the talents Disorient and Knockdown are the ones that come to mind. Honestly, Knockdown typically just costs the target a maneuver to get up. This could be more useful since it's in a melee tree and provides a melee bonus, but there's also a big chance the prone target will just get up before the marauder has a second chance to attack him. There are so few ranks of Disorient that I'd be surprised if any character takes more than 2, which means spending a triumph on it is unlikely to cause more than 1 setback die two turns in a row.

Speaking of ways to spend triumph [non-narratively], I think it's kind of funny, based on what's been shown in this thread, reading 6-2 on page 133:

  • A Triumph or two adv can be spent to add a boost die to any ally's roll.
  • A Triumph can be spent to upgrade any ally's roll.

If you've read the thread, this seems weird. Also with only one rank in Disorient (the talent appears once in the gadgeteer tree and twice in the scout tree), a triumph can again be spent for the same benefit as two adv: Targeted enemy gains [setback] on his next skill check. Again, based on the math and simulations, these costs just seem borked.

Now, this weakness of the upgrade vs the boost may be by design. It may be the intent of the devs that players and GMs should use the triumphs for narrative purposes, not as mechanical boosts, because the potential to "Do something vital to turn the tide of a battle" seems to beat the crap out of an upgrade of ability to proficiency. I lend very little credibility to this theory, but it bears mention.

Besides, if the triumph is the king of all symbols, and it's the reason we need to upgrade… then using a triumph (a factual, extant, rolled triumph) to buy someone an additional 8% chance to roll a triumph… seems like a pretty raw deal. I understand that some situations require an individual to roll a triumph, but they're not frequent enough to justify this.

Uh, yeah… Long answer to the first quote. Hope this clarifies some things though…

-WJL

Taking what I did for the stats tests, I wrote a simple SW:EotE dice roller. You can see a