The new "whiteboarding": why DPR as a metric is broken.


Advice


8 people marked this as a favorite.
N N 959 wrote:
This reminds me of a friend of mine who insists Einstein's theory of general relativity is wrong. I told him he's going to have to prove it, to which he insists he doesn't have to prove anything.

Does your friend also call people intellectually dishonest when they disagree with him? The behavior sounds familiar.


Yes, DPR calculations need to be a lot more detailed in this edition, probably taking into account defense/mobility and maybe creating a new derived statistic. Yes, the dice matter a lot more, so static values can get lost.

But no, that doesn't mean it's broken. And no, it doesn't mean it's misinformation (in itself).

********
Misinformation is something that will always occur regardless of what a data set might say. One example is the misuse of the p-value and statistical significance, where the threshold is often set arbitrarily to 1/20 (due to tradition) or the value is treated incorrectly, which leads to wrong conclusions.


1 person marked this as a favorite.
Squiggit wrote:
Does your friend also call people intellectually dishonest when they disagree with him? The behavior sounds familiar.

If I tried to tell him that people who propound DPR calcs don't believe those calcs are generally applicable? Yes, he would absolutely call me out on that and he'd be right. Because that's what my statement was directed at, and that alone.


5 people marked this as a favorite.
krazmuze wrote:

@Strill

Yes, you clearly do not understand the law of large numbers if you think that the precise odds of a uniform die determine your performance in your RPG. The very simple fact is that the rolls you actually make are well into the high-variance regime, because you will always have a very low number of rolls within a level. This is why I insist that if you want to talk DPR, include the +/- variance so that I can see which options are not significantly different. I know that 1+/-2 is not significant and I will pick the option whose flavor I like more, but a 5+/-0.5 is worth taking unless I really dislike its flavor.

I never said that variance wouldn't affect my performance in an RPG. I'm saying that it's not rational to base your choices on random noise in someone else's game.

I'm also saying that it is incorrect to call two data sets "not significantly different" when you know with 100% certainty that the odds which produced them ARE different. When you say "Not significantly different", that means that the two data sets are most likely the same data set, and their differences are most likely due to variance. If you know the exact odds that produced each data set, and know that the odds for each set were different, you are lying if you say they're not significantly different, regardless of what the data sets are. This situation is not a valid application of statistical significance, because statistical significance is a tool for testing hypotheses on unknown systems, not for introducing false equivalency into known systems.

Quote:

As I said, I did the 40 rolls because someone else said all it takes is a few dozen rolls to overcome the variance of the die. This analysis shows that is clearly not true - the law of large numbers does apply.

The Pathfinder devs do know this - it is why they added level to everything, so that you can have a range of outcomes where the dice variance simply does not matter because you will always hit or always fail once you go beyond the threat range. That is because it is not fun when the kobold kills your legendary fighter.

Whereas D&D 5e decided to embrace that variance in the uniformity of the die and did away with the level stepping that 4e had, because random variance makes for more interesting improv storytelling. It is fun to remember the time your wizard crit the dragon with their dagger, even though that makes little sense.

Not sure how you can say variance is irrelevant when the two major RPGs have decided to take advantage of it in different directions. If they wanted to remove its significance, they would use the dice pools that wargames use.

As best I can tell, your argument about taking variance into account depends on the sample size, and how close your odds of success are to 0% and 100%. The closer your odds are to the extreme, the more reliable they are. The closer your odds are to 50%, the less reliable they are. Also, the more times you roll, the more reliable your results. The fewer times you roll, the less reliable. I guess that's useful to know, but it's not like that information is in any way hidden. You can easily calculate your odds of success on the fly to know what kind of variance to expect. Moreover, the strongest builds tend to be those with the most attacks, and the highest chances of success, so strong builds naturally tend to be the most reliable builds as well.
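As a rough illustration of that last point (a sketch of my own, assuming a simple binomial model of n independent rolls at success chance p), the 2-sigma spread of an observed success rate is about 2*sqrt(p*(1-p)/n), which is widest near 50% and shrinks as the number of rolls grows:

# Sketch: how the spread of an observed success rate depends on the true
# chance p and the number of rolls n (binomial model, 2-sigma band).
import math

for p in (0.05, 0.25, 0.50, 0.75, 0.95):   # true chance of success
    for n in (10, 40, 160):                # number of rolls
        two_sigma = 2 * math.sqrt(p * (1 - p) / n)
        print(f"p={p:.2f}, n={n:3d}: observed rate ~ {p:.2f} +/- {two_sigma:.2f}")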

If you're concerned that crits are skewing the average, then that just means we need to look at the mean DPR, in addition to average DPR.


Quote:
If you're concerned that crits are skewing the average, then that just means we need to look at the mean DPR, in addition to average DPR.

That should say "median DPR". Not "mean".


4 people marked this as a favorite.

So OP here... You all can argue how statistically significant DPR is, but just to point this out, that's not *actually* the point of the original post. The point of it is that even if DPR is statistically significant (I actually think it is), it's still not a good metric for predicting character effectiveness, or even a good measure of how much damage you'll *actually* do because other factors matter more in PF2 and you need to account for them.

In other words, I feel like some of you are missing the forest for the trees...


Pathfinder Lost Omens Subscriber
krazmuze wrote:

Then do the level simulation and prove me wrong. Every simulation I have seen either calculated the fractional odds - which is only true for an infinite simulation - or simulated 50,000 runs to get a precise average - which is not the reality of any player's level.

The fact is that IF the variance is greater than the difference in averages, then the build is not more important than the dice.

This is statistics 101:

3.7+/-1 and 3.6+/-1.2

You cannot conclude that A is better than B; you must instead conclude that they are not significantly different, because the ranges of the averages have significant overlap.

3.7+/-0.1 and 3.4+/-0.15

You absolutely can conclude that A is better than B, because the averages do not overlap (to whatever confidence you calculated - usually 95% confidence is used). You cannot, however, conclude by how much, as it could be 3.6 vs. 3.55 or it could be 3.8 vs. 3.25.

It is this very gambler's fallacy - thinking the average odds apply to you - that makes Vegas rich. The house can play the averages (because it makes all the plays); the player cannot (because they cannot play enough).

Right, so if you're going to be unlucky one way or unlucky another way, wouldn't you still want to know which is the better build if you're being unlucky?


tivadar27 wrote:

So OP here... You all can argue how statistically significant DPR is, but just to point this out, that's not *actually* the point of the original post. The point of it is that even if DPR is statistically significant (I actually think it is), it's still not a good metric for predicting character effectiveness, or even a good measure of how much damage you'll *actually* do because other factors matter more in PF2 and you need to account for them.

In other words, I feel like some of you are missing the forest for the trees...

DPR is a metric giving your character's effectiveness in a specific situation. It's useful information for assessing a character's efficiency and improving it. It's not the end-all, be-all of character definition. But using experience to disprove DPR accuracy is clearly a fallacy.


4 people marked this as a favorite.

I don't think the OP is trying to disprove the accuracy of DPR calculations. Rather, their point seems to be that the value of using them to guide your build choices is reduced compared to PF1, potentially to the point of being misleading in real game situations, because the way combat is run has changed compared to the old edition.

Previously, DPR could be roughly translated to the likelihood of one-shotting your opponent with a full attack upon winning initiative. Now, this chance is close to zero anyway, and instead each turn the combatants should choose between tactical maneuvering, applying buffs/debuffs, or straight-up damage, the latter only being the best choice when it has a chance to finish the opponent or other options have been exhausted.

So it's no longer just a contest of how much DPR you can get with various builds, but of how much DPR you are sacrificing to improve your defense, weaken the enemy, or become more versatile against different threats or weaknesses. Which is tricky to measure in numbers that can show whether the DPR trade-off is worth it or not.

I am looking forward to the evolution of the Beastmass challenge in PF2. I wonder if the devs run something like this in-house when considering the class/feat/spell balance.


1 person marked this as a favorite.
SuperBidi wrote:
tivadar27 wrote:

So OP here... You all can argue how statistically significant DPR is, but just to point this out, that's not *actually* the point of the original post. The point of it is that even if DPR is statistically significant (I actually think it is), it's still not a good metric for predicting character effectiveness, or even a good measure of how much damage you'll *actually* do because other factors matter more in PF2 and you need to account for them.

In other words, I feel like some of you are missing the forest for the trees...

DPR is a metric giving your character's effectiveness in a specific situation. It's useful information for assessing a character's efficiency and improving it. It's not the end-all, be-all of character definition. But using experience to disprove DPR accuracy is clearly a fallacy.

Basically what @CyberMephit said. More directly, DPR is a metric giving your character's effectiveness in a specific situation that by and large doesn't exist in PF2. The "swing-away" attitude of PF1 doesn't work in PF2 because it's a more dynamic system, and therefore DPR as a metric by itself is no longer a good indicator of very much.

Outside of using a strictly superior weapon, methods that improve DPR are typically going to come at a feat cost, and those feats could be things that give AC, improve saves, allow for debuffs... Improving DPR without examining the wider impact could very easily hurt your character.

Also, you may have been addressing someone else with the comment, but I never made any statements that discredited DPR accuracy. At worst, I said there was more variance with DPR in PF2 as compared to PF1, but that's a very different statement.


1 person marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber
Bandw2 wrote:
krazmuze wrote:

Then do the level simulation and prove me wrong. Every simulation I have seen either calculated the fractional odds - which is only true for an infinite simulation - or simulated 50,000 runs to get a precise average - which is not the reality of any player's level.

The fact is that IF the variance is greater than the difference in averages, then the build is not more important than the dice.

This is statistics 101:

3.7+/-1 and 3.6+/-1.2

You cannot conclude that A is better than B; you must instead conclude that they are not significantly different, because the ranges of the averages have significant overlap.

3.7+/-0.1 and 3.4+/-0.15

You absolutely can conclude that A is better than B, because the averages do not overlap (to whatever confidence you calculated - usually 95% confidence is used). You cannot, however, conclude by how much, as it could be 3.6 vs. 3.55 or it could be 3.8 vs. 3.25.

It is this very gambler's fallacy - thinking the average odds apply to you - that makes Vegas rich. The house can play the averages (because it makes all the plays); the player cannot (because they cannot play enough).

Right, so if you're going to be unlucky one way or unlucky another way, wouldn't you still want to know which is the better build if you're being unlucky?

Sure, but my point is you cannot do that unless someone gives you the DPR +/- variance. It turns out for the specific medic example (I will post histograms later) that most people will not see the benefit of the +1. The thing to realize is that an average means half the people do worse and half the people do better. The half who do worse with the +1 are not doing better than the half who do better without the +1. That is the gambler's fallacy people fall into when they think that if they follow DPR advice they will always do better with the bonus option; that is simply not the case!

I will post histograms that better show this; the best options are those where the bonus difference is such that the histograms do not significantly overlap. This happens when the bonus is large and/or the spread is narrow because you roll much more often.

With the medic example, you can only say you are doing better if you are very lucky with the +1 and the other cleric without the +1 is very unlucky. This is simply because the variance is greater than the modifier.
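For anyone who wants to check that claim themselves, here is a rough numpy sketch (my own illustration, using the same assumptions as the MATLAB one-liners further down the thread: DC 15, 40 checks per level, +7 vs. +6) that estimates how often the cleric without the +1 keeps up anyway:

# Rough sketch (not from the original post): two clerics each make 40 checks
# against DC 15 per "level", one at +7 and one at +6.
# How often does the +6 cleric do at least as well over a level?
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000                                   # simulated "levels"
rolls_a = rng.integers(1, 21, size=(trials, 40))   # d20s for the +7 cleric
rolls_b = rng.integers(1, 21, size=(trials, 40))   # d20s for the +6 cleric

succ_a = (rolls_a + 7 >= 15).sum(axis=1)           # successes per level at +7
succ_b = (rolls_b + 6 >= 15).sum(axis=1)           # successes per level at +6

print("mean successes: +7 cleric", succ_a.mean(), ", +6 cleric", succ_b.mean())
print("P(+6 cleric does at least as well over a level):", (succ_b >= succ_a).mean())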


1 person marked this as a favorite.
tivadar27 wrote:

Basically what @CyberMephit said. More directly, DPR is a metric giving your character's effectiveness in a specific situation that by and large doesn't exist in PF2. The "swing-away" attitude of PF1 doesn't work in PF2 because it's a more dynamic system, and therefore DPR as a metric by itself is no longer a good indicator of very much.

Outside of using a strictly superior weapon, methods that improve DPR are typically going to come at a feat cost, and those feats could be things that give AC, improve saves, allow for debuffs... Improving DPR without examining the wider impact could very easily hurt your character.

Not at all.

For example, if I want to build a critical-focused character using a pick, I'm very much interested in my DPR to assess:
- if it's viable compared to a more classical build.
- what class and feats would give me the best efficiency.
Without this metric, I will either have to pray it works or just go for a more classical build to be sure it'll work.

DPR is a very interesting metric.

PS: I was answering someone else about experience vs. metrics. It's just not easy to make multiple quotes in a single post.


7 people marked this as a favorite.
krazmuze wrote:
Sure, but my point is you cannot do that unless someone gives you the DPR +/- variance. It turns out for the specific medic example (I will post histograms later) that most people will not see the benefit of the +1. The thing to realize is that an average means half the people do worse and half the people do better.

I feel like there's some confusion on this point. A DPR calculation is not intended to compare how lucky two different people are, or how lucky a given person is at two different points in time. No one in this thread is claiming that if a DPR calculation says you are expected to do 10.5 damage in a round with that build, then when they sit down at the table and start rolling dice they're going to do 10.5 damage on that first round. They'll readily acknowledge that some fights the build will do literally 0 damage because the player can't roll higher than a 3 that fight. I've certainly seen it happen.

I agree distribution is important. I have read some threads where I feel picks are being oversold because of how much of their average damage comes from 5-15% of their rolls. Although oftentimes someone will pop into such a thread and point out that spikiness, and that a good portion of that damage might be wasted when the enemy is low on hit points, or if you don't happen to get a crit that particular fight. Simulations, or simply calculating the exact damage distribution, would probably help.

Something like a histogram of how many rolls produce X damage on the horizontal axis might be useful in conveying that, and seeing the few rolls producing excessive damage for picks. A pick's median or mode damage is very different from its mean damage. To be honest, even the 1 sigma +/- probably doesn't tell the whole story for that, so a graph is good.

However, it doesn't mean the statistical analysis is worthless or that people making a DPR calculation are spreading misinformation. Generally they'll say the expected or mean damage is X per round. And that has a statistical meaning, just as much as stating the mode or median, or drawing a probability distribution graph. As the OP notes, it might not tell the whole story, and perhaps you want to care about more than mean damage per round, but these are valid statements, and still useful for quick general comparisons, keeping the assumptions and what the number actually means in mind.

Not everyone wants to spend several hours writing up a full game simulation and producing multiple graphs as opposed to just playing the game. Or spend hours looking at charts, so boiling it down to a single number is sometimes useful.

I generally view a DPR calculation as trying to understand the relative effectiveness of builds, all else being equal. Among those "all else being equal" assumptions is that the rolls of the player don't change between build comparisons. It is a hypothetical: what if we could go back in time and swap this character sheet for that character sheet, but the player's rolls stay the same? We're trying to isolate player luck from the comparison, to get at just the build itself.

It is possible people are in very violent agreement, and perhaps misunderstanding what the other side is saying.

krazmuze wrote:
The half who do worse with the +1 are not doing better than the half who do better without the +1. That is the gambler's fallacy people fall into when they think that if they follow DPR advice they will always do better with the bonus option; that is simply not the case!

I'm trying to figure out who you're aiming this comment at in this thread. No one has said that if you roll a d20 and add 7, and then roll a different d20 and add 6, the d20+6 is guaranteed to be lower. What I think has been pointed out in this thread is that if you roll a 3 on a d20 and then add 7, it's bigger than if you add 6.

A player's rolls are in some sense independent of what is written down on the character sheet. Having a d20+6 versus a d20+7 doesn't change the fact that I rolled a 3 on the d20, for example. I personally care about the effects of the build, not the effects of the player or their luck. I can't change the player's rolls or luck. So combining the two muddies the water.

Or, another way to put it: I don't know what I will roll tonight in my game. Given that any permutation of, say, ten d20 rolls is as likely as any other permutation of ten d20 rolls, which character sheet gives me the largest ratio of successful permutations to total permutations? What gives me the best odds of success before I know what my rolls are? While non-trivial, given some assumptions that kind of thing is calculable, and it is what a DPR calculation is trying to do just for attack rolls, with a bunch of caveats thrown on top to keep the whole thing from becoming a PhD thesis.
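For what it's worth, the attack-roll piece of that calculation fits in a few lines. This is only a sketch with made-up numbers and a simplified crit rule (crit when the total beats AC by 10 or more; natural 1/20 step adjustments ignored), not anyone's exact method:

# Simplified expected-damage-per-Strike calculation (illustrative only).
def expected_damage(bonus: int, ac: int, avg_damage: float) -> float:
    faces = range(1, 21)                                         # the 20 faces of a d20
    p_crit = sum(1 for r in faces if r + bonus >= ac + 10) / 20  # beat AC by 10+
    p_hit = sum(1 for r in faces if ac <= r + bonus < ac + 10) / 20
    return p_hit * avg_damage + p_crit * 2 * avg_damage          # crits double damage

# Hypothetical example: +7 to hit vs. AC 18, averaging 6.5 damage per hit.
print(expected_damage(7, 18, 6.5))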


1 person marked this as a favorite.
SuperBidi wrote:


Not at all.
For example, if I want to build a critical-focused character using a pick, I'm very much interested in my DPR to assess:
- if it's viable compared to a more classical build.
- what class and feats would give me the best efficiency.
Without this metric, I will either have to pray it works or just go for a more classical build to be sure it'll work.

DPR is a very interesting metric.

PS: I was answering someone else about experience vs. metrics. It's just not easy to make multiple quotes in a single post.

Totally fair. You're right, there are situations where you can do this analysis and it's definitive* (*critical specializations matter). But you're citing the case I already mentioned: looking to switch between two different weapons :). Granted, here it's not d8 vs. d6; it's more complicated than that.

My point is more that someone comparing even a dual-wielding fighter to a free-hand fighter can't look at straight-up DPR and expect to glean a lot from it. Dual wield will almost certainly do more flat damage per single action, but free-hand tends to come with better defenses and the ability to make things flat-footed... In PF1, we'd say "who cares?"; in PF2, we can't as easily do that.

I'm mostly worried about people looking at a DPR thread, such as the one that compared across classes, and using it to say X does more damage than Y. I realize a lot of people won't do that, but I've already seen some do it (and am probably guilty of it to an extent myself).

It might seem counterintuitive, but better DPR doesn't imply better actual damage in PF2, as the situations where DPR (okay, this should probably be DPA now, 'damage per action') is a valid metric of actual overall damage (when you have 4 straight rounds to swing at an enemy, never do any debuffing, and are never at risk of being debuffed yourself) are going to be pretty rare. If you can damage an enemy *and* make them flat-footed for a whole turn, for example, that is likely to do more damage overall than simply doing 10% more damage on a single hit.

NOTE: I'd agree that there's probably some threshold to determine viability, at least when considering pure damage dealers (not paladins, for example). For example, if my free-hand fighter did 50% of the damage of a two-weapon fighter, I'd probably not consider it viable, even if it did come with extra perks. From what I've seen of a lot of the numbers, though, a lot of the time those differences tend to be much smaller.


2 people marked this as a favorite.
tivadar27 wrote:
My point is more that someone comparing even a dual-wielding fighter to a free-hand fighter can't look at straight-up DPR and expect to glean a lot from it. Dual wield will almost certainly do more flat damage per single action, but free-hand tends to come with better defenses and the ability to make things flat-footed... In PF1, we'd say "who cares?"; in PF2, we can't as easily do that.

Even that is a valid comparison, if you hesitate between two builds. You obviously expect a build with more utility to have a lower DPR, but if you don't know the difference, it's hard to choose between both builds.

Anyway, even in PF1, focusing solely on DPR to assess a character's efficiency was a mistake. It doesn't invalidate DPR, it just invalidates one-trick ponies.
Personally, I've never loved DPR-focused players (or offense-focused, to be more general). They tend to build characters with a tendency to take all the credit without helping the party much. It's a cooperative game!


1 person marked this as a favorite.
krazmuze wrote:
Bandw2 wrote:
krazmuze wrote:

Then do the level simulation and prove me wrong. Every simulation I have seen either calculated the fractional odds - which is only true for an infinite simulation - or simulated 50,000 runs to get a precise average - which is not the reality of any player's level.

The fact is that IF the variance is greater than the difference in averages, then the build is not more important than the dice.

This is statistics 101:

3.7+/-1 and 3.6+/-1.2

You cannot conclude that A is better than B; you must instead conclude that they are not significantly different, because the ranges of the averages have significant overlap.

3.7+/-0.1 and 3.4+/-0.15

You absolutely can conclude that A is better than B, because the averages do not overlap (to whatever confidence you calculated - usually 95% confidence is used). You cannot, however, conclude by how much, as it could be 3.6 vs. 3.55 or it could be 3.8 vs. 3.25.

It is this very gambler's fallacy - thinking the average odds apply to you - that makes Vegas rich. The house can play the averages (because it makes all the plays); the player cannot (because they cannot play enough).

Right, so if you're going to be unlucky one way or unlucky another way, wouldn't you still want to know which is the better build if you're being unlucky?
Sure, but my point is you cannot do that unless someone gives you the DPR +/- variance. It turns out for the specific medic example (I will post histograms later) that most people will not see the benefit of the +1. The thing to realize is that an average means half the people do worse and half the people do better. The half who do worse with the +1 are not doing better than the half who do better without the +1. That is the gambler's fallacy people fall into when they think that if they follow DPR advice they will always do better with the bonus option; that is simply not the case!

You're using deceptively ambiguous language. When you say "the half of the people that do worse with the +1", what do you mean by "worse"? Worse than whom? When you say "the half the people that do better with the +1", what do you mean by "better"? Better than whom? When people discuss DPR, they mean "If you choose this option, you'll perform x points better than you otherwise would have performed with the alternative option." It says nothing about how well anyone else's dice will roll.

No one's saying that if you take a +1, you'll always roll above-average, or that a player with a +0 won't get lucky and roll better than your +1. They're saying that you'll always get a result that's 1 higher than what you would've gotten otherwise. Comparing your rolls to other people who got lucky or unlucky, and thinking that their rolls have anything to do with yours is the gambler's fallacy.


1 person marked this as a favorite.
Strill wrote:
krazmuze wrote:
Bandw2 wrote:
krazmuze wrote:

Then do the level simulation and prove me wrong. Every simulation I have seen either calculated the fractional odds - which is only true for an infinite simulation - or simulated 50,000 runs to get a precise average - which is not the reality of any player's level.

The fact is that IF the variance is greater than the difference in averages, then the build is not more important than the dice.

This is statistics 101:

3.7+/-1 and 3.6+/-1.2

You cannot conclude that A is better than B; you must instead conclude that they are not significantly different, because the ranges of the averages have significant overlap.

3.7+/-0.1 and 3.4+/-0.15

You absolutely can conclude that A is better than B, because the averages do not overlap (to whatever confidence you calculated - usually 95% confidence is used). You cannot, however, conclude by how much, as it could be 3.6 vs. 3.55 or it could be 3.8 vs. 3.25.

It is this very gambler's fallacy - thinking the average odds apply to you - that makes Vegas rich. The house can play the averages (because it makes all the plays); the player cannot (because they cannot play enough).

Right, so if you're going to be unlucky one way or unlucky another way, wouldn't you still want to know which is the better build if you're being unlucky?
Sure, but my point is you cannot do that unless someone gives you the DPR +/- variance. It turns out for the specific medic example (I will post histograms later) that most people will not see the benefit of the +1. The thing to realize is that an average means half the people do worse and half the people do better. The half who do worse with the +1 are not doing better than the half who do better without the +1. That is the gambler's fallacy people fall into when they think that if they follow DPR advice they will always do better with the bonus option; that is simply not the case!
You're using deceptively ambiguous language. When you say "the half...

They’ve been rather clear this entire time; and the language has been anything but ambiguous. They have mostly been focusing on the ambiguous nature that DPR ends up being when put into practice, and that most people talking about about such things tend to gravitate towards how good something is in a vacuum rather than in practice. Two builds with little difference in numbers in practice aren’t any better than the other; but people will claim that there is a difference because the perfect whiteboard says there is, but in practice the difference doesn’t always show, if it shows at all. That’s why Kraz has been saying ‘gambler’s fallacy’ and ‘state the variance between competent builds’.

The variance bit would actually be much appreciated personally speaking.


Pumpkinhead11 wrote:
They’ve been rather clear this entire time; and the language has been anything but ambiguous

Agreed.


1 person marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

The "everybody" I speak of is the million clerics I already simulated, specifically to put the law of large numbers to work and remove the noise from the low number of clerics.

And when you do that analysis, you realize it is mathematically the case that accumulating infinite results for a uniform die yields a Gaussian distribution. I already showed that a million results return the exact same mean that the fractional-odds analysis predicts. So the million results can have normal statistics applied to obtain precise results. Thus, rather than showing the histogram, I can calculate the sigma, and that will have to stand in for the histogram since this is a text forum.

Since I am interested in the 95% range of the histogram I will take 2*std

The trained WIS+4 cleric has a 65+/-15% chance of succeeding at healing their party at each break.

mean(mean((randi([1,20],40,1e6)+7)>=15))
std(mean((randi([1,20],40,1e6)+7)>=15))*2

The trained WIS+3 cleric has a 60+/-15% chance of succeeding at healing their party at each break.

mean(mean((randi([1,20],40,1e6)+6)>=15))
std(mean((randi([1,20],40,1e6)+6)>=15))*2

Thus the variance of the die means that a +1 bonus modifier produces a range similar to the average-odds calculation for anything from a -2 to a +4 modifier. There is significant overlap of the 95% ranges of the histograms, so only the very (un)lucky will see a difference between these builds.

So let's use histogram analysis to come up with build advice where people will actually see differences; then they will not have to worry about the lesser cleric build outperforming them at the game store, or get into arguments about how DPR advice is wrong because their experience did not show that this was better than that.

The expert WIS+4 cleric has a 75+/-15% chance of succeeding at healing their party at each break over a level

mean(mean((randi([1,20],40,1e6)+9)>=15))
std(mean((randi([1,20],40,1e6)+9)>=15))*2

The trained WIS+0 cleric has a 45+/-15% chance of succeeding at healing their party at each break over a level.

mean(mean((randi([1,20],40,1e6)+3)>=15))
std(mean((randi([1,20],40,1e6)+3)>=15))*2

Now the histograms do not overlap each other except for their few-percent tails. Thus for these checks we can conclude it takes a modifier difference of +6 to overcome the d20 and guarantee better performance than most everyone else.

Now let's take this further and use this method to look at ATK +7 vs. AC 18. Again we will assume 40 checks - 4 rounds in each of 10 encounters over the level.

ATK 1 is 50+/-15%
mean(mean((randi([1,20],40,1e6))+7>=18))
std(mean((randi([1,20],40,1e6))+7>=18))*2

ATK 2 is 25+/-15%
mean(mean((randi([1,20],40,1e6)+2)>=18))
std(mean((randi([1,20],40,1e6))+2>=18))*2

ATK 3 is 0% (it is not possible to hit)

mean(mean((randi([1,20],40,1e6)-3)>=18))
std(mean((randi([1,20],40,1e6)-3)>=18))*2

Now you see the design of the system has indeed accounted for the histograms - they used the -5 for additional attacks because they want you to feel that in your build regardless of what your rolls are: the odds for each attack do not overlap. They give you options to narrow this to -2, because that means you will feel like all of your attacks have a similar chance of hitting and it was worth taking those feats.

Let's assume someone else made them flat-footed and I have an agile weapon, flurry, and the twin feat on my ranger. Now the histograms are starting to overlap, blurring the odds on the attacks.

ATK 1 is 60+/-15%
mean(mean((randi([1,20],40,1e6))+7>=16))
std(mean((randi([1,20],40,1e6))+7>=16))*2

ATK 2 is 50+/-15%
mean(mean((randi([1,20],40,1e6)+5)>=16))
std(mean((randi([1,20],40,1e6))+5>=16))*2

ATK 3&4 is 40+/-15%
mean(mean((randi([1,20],40,1e6)+3)>=16))
std(mean((randi([1,20],40,1e6)+3)>=16))*2

And this matches the experience I see with the ranger - the worst your first attack can be is 45% and the best the later attacks can be is 55%. So let someone else take care of the utility and defense; you SHOULD be blowing all your actions on attacks.

So, bottom line: when the variance is greater than the difference in modifiers, options will feel similar in play; when the variance is less than the difference in modifiers, they will feel different in play.

Yes I rounded everything to 5% because this game is not that granular, and it allows us to understand how it relates back to the bonus modifier.

Now it may not be the case that things are Gaussian once you consider critical-effect damage; you would have to study those histograms to know whether they are normal distributions. But that cannot be done with one-liner MATLAB code.


4 people marked this as a favorite.

I think I see what you're trying to explain, but you're using terminology incorrectly.

Mathematically speaking, variance of a set of numbers is the mean of the squared difference of each number from the mean of the set of numbers. What you're using here is twice the standard deviation (which you do note at one point but then change your terminology for some reason).

You're trying to give people a feel for the probability distribution, but that is not quite the same thing as saying the chance of a particular event occurring. You can integrate the probability distribution to determine the chance of a particular threshold being met for example, but saying 65%+/-15% "chance of succeeding in healing their party at each break" doesn't make much sense to me, again, mathematically speaking. The +/-15% only makes sense if you don't know the odds of your dice (or the true probability distribution) and all you have to go off of is an experimentally determined series of numbers.

But the fact is we do know the actual chance if we assume fair dice.

The chance of rolling an 8 or higher on a fair d20 is 65%, for example. That is literally the definition of a fair die: there is an equal probability of any given face showing on a roll. If that is not true, then it is not a fair die.

Or in other words, any mathematician who heard you say you have an 80% chance of rolling an 8 or higher on a d20 would point out you are cheating with weighted dice. :)

What you're talking about is the 2-sigma region (also known as plus and minus twice the standard deviation) around the mean of the probability distribution function. You're basically saying that 95% of all sequences of 40 d20+7>=15 rolls would be consistent with a success rate between 0.5 and 0.8. And of course, 5% of all 40-roll d20 sequences will actually fall outside that range.

Anyways, I suggest plotting the probability distribution functions for 40-roll d20 sequences, as that will make it clearer to more people reading this thread. If you overlap them in two different colors for 1d20+6 and 1d20+7 you can see the amount that overlaps and the amount that doesn't. An interesting number might be the ratio of the probability distribution that doesn't overlap to the sum of the two probability distributions (which would be basically the non-overlapping density / 2.0).

Already at 40 rolls, you can distinguish a 60% versus 55% chance pretty well, with approximately 25% of the probability distribution not overlapping. For example: Google doc with plot

Edit: I also suggest perhaps using free (as in beer and speech) software when giving code examples to other people. Not everyone has access to a MATLAB license already. I'm personally fond of numpy/scipy combined with matplotlib in python.

Edit 2: I can apparently do plots, but not basic arithmetic. Odds of 8 or higher on d20 is 65%, not 60%.
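Along those lines, a minimal numpy/scipy sketch of the 40-roll comparison (55% vs. 60% success chance, using the exact binomial distributions rather than simulation) might look like this; the matplotlib lines are left as comments for anyone who wants the picture:

# Free-software sketch of the 40-roll comparison: exact distributions of the
# number of successes out of 40 at 55% vs. 60%, and how much of the
# probability mass they share.
import numpy as np
from scipy.stats import binom

n = 40
k = np.arange(n + 1)                 # possible success counts out of 40
pmf_55 = binom.pmf(k, n, 0.55)
pmf_60 = binom.pmf(k, n, 0.60)

overlap = np.minimum(pmf_55, pmf_60).sum()
print(f"shared probability mass: {overlap:.2f}")
print(f"mass that does not overlap: {1 - overlap:.2f}")

# import matplotlib.pyplot as plt
# plt.bar(k - 0.2, pmf_55, width=0.4, label="55%")
# plt.bar(k + 0.2, pmf_60, width=0.4, label="60%")
# plt.legend(); plt.show()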


One thing I think would be needed for DPR is a significance test. If one option does 2.5 and another does 2.7, is that a significant enough difference that one option is better than the other?


At that point it's a matter of build and flavor. For example: the 2.7 could be a two-handed fighter while the 2.5 is a two-weapon ranger, and those have completely different play styles.


Well, but what about 3.2 vs. 2.7? At what point is the difference enough to be worth taking one option over the other?


1 person marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

@Hiruma Yes, I was using variance in the everyday English sense - 'your results will vary' - and not as the mathematical term; don't get so hung up on terminology. I could also have said deviation and it would not be entirely correct either, but I am certainly not going to say 'you must take plus or minus twice the standard deviation' every time. Plenty of people understood what I was saying anyway. Would it feel better if it said dice variation?

And I did not want to be bothered with uploading and linking charts, simply because I was not supposed to be wasting time at work, and MATLAB is what I use at work, so I just knocked out some one-liners to check my guesses. I thought it might be a +/-1, only to find out it was a +/-3 (the d20 equivalent of 15%).

Yes, it is easier to understand the chart version if you are not familiar with what a normal distribution looks like. I did not integrate the overlap, but from what I see it fits my statement that only the (un)lucky will see a difference with a +1; most people will not feel it is any different, and some will even feel they are doing worse.

The bottom line is that a +5/6 has little overlap, while a +1/2 is mostly overlap. So why are the small bonuses in the game if they do not feel that different in play? The point is stacking them up higher as you level with training, and combining them with other small bonuses of different types, so that they add up to something that feels like you overcame the dice and makes your build feel different in play. Leveling up and gaining system mastery. The histogram analysis gave me more appreciation for PF2e compared to D&D 5e, which takes the same +5 (max) and puts it to work as advantage but leaves it up to your DM to grant it.

Anyway, I have a quick-and-dirty two-liner idea for incorporating crits, but it will have to wait until I have free time at work again. All I need to do is multiply the success-array elements that are crits by 2, then multiply that by the damage (which will have its own variation), then take the mean/std (if the histogram is still normal).

The point of making it a MATLAB one-liner was to show it is not that hard to do if you have a matrix-math program; it is very easy and quick to knock out 40 million rolls.
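For the curious, that two-liner idea might look something like the following in numpy (my own sketch with hypothetical numbers: +7 vs. AC 18, d8+4 damage, crit when the total beats AC by 10 or more, natural 1/20 adjustments ignored):

# Sketch of "multiply the crits by 2, multiply by damage, take mean/std".
import numpy as np

rng = np.random.default_rng(1)
n_rolls, n_sims = 40, 100_000
bonus, ac = 7, 18

d20 = rng.integers(1, 21, size=(n_rolls, n_sims))
multiplier = (d20 + bonus >= ac).astype(float)            # 1 for a hit, 0 for a miss
multiplier[d20 + bonus >= ac + 10] = 2.0                  # crits count double
damage = rng.integers(1, 9, size=(n_rolls, n_sims)) + 4   # d8+4 per hit

per_level = (multiplier * damage).mean(axis=0)            # average damage per Strike over 40 rolls
print(f"{per_level.mean():.2f} +/- {2 * per_level.std():.2f} (2*std)")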


1 person marked this as a favorite.
Vidmaster7 wrote:
Well, but what about 3.2 vs. 2.7? At what point is the difference enough to be worth taking one option over the other?

It depends on what you're planning for your character. It's like having a faster car or a slower car. The slower car is not invalidated by its slower speed, especially if you mostly drive in towns.

If you consider your character's damage output one of their main contributions (or their main contribution) to combat, then a small difference in DPR can be important. If you just want to check that your character is viable, you'll accept a lower DPR as long as it's not ridiculously low. If you don't consider damage one of your main contributions, you may calculate DPR just to know whether attacking is a valid option or whether you should nearly always stick to your main role.
DPR is a piece of information, as much as AC or hit points.


SuperBidi wrote:
Vidmaster7 wrote:
Well, but what about 3.2 vs. 2.7? At what point is the difference enough to be worth taking one option over the other?

It depends on what you're planning for your character. It's like having a faster car or a slower car. The slower car is not invalidated by its slower speed, especially if you mostly drive in towns.

If you consider your character's damage output one of their main contributions (or their main contribution) to combat, then a small difference in DPR can be important. If you just want to check that your character is viable, you'll accept a lower DPR as long as it's not ridiculously low. If you don't consider damage one of your main contributions, you may calculate DPR just to know whether attacking is a valid option or whether you should nearly always stick to your main role.
DPR is a piece of information, as much as AC or hit points.

I agree, and you're kind of making the point I am trying, but failing, to make.

I think that with the new edition you will need to take into account a variety of factors beyond just DPR. For example, is 1 more point of damage per attack worth 1 point of AC, or 5 extra HP, etc.? It can get very complicated once you start thinking about movement, range, and so on. DPR still might be good for weapon selection if the weapons are identical in every other way, but then anyone can look at a d6 vs. a d10 and say one is better than the other. Then you have to start working out crit effects too. Whew, the whole thing sounds like a lot of work.


2 people marked this as a favorite.

I'm by no means a statistician or mathematician, but I'd like to take a crack at explaining, in very simple layman's terms, why small bonuses to hit are less important than people like to think. A lot of people are saying things like "higher average roll = better performance, it's really that simple," but it actually isn't.

I think there are a couple simple things that people overlook:
1) This one is fairly obvious, but I think it still bears repeating: a success is a success and a failure is a failure. Whether you need an 11 or a 10 to succeed makes no difference if you actually rolled a 12. Similarly, if the monster only has 6 hit points left, it doesn't matter if you deal 7 or 8 damage with your strike. So that +1 bonus is actually only relevant in a small number of cases.
2) You don't make an infinite number of rolls in a campaign. With a limited number of rolls, it is quite possible that the variance in your rolls will render the theoretically higher average result of a +1 bonus irrelevant. A lot of people are saying "Sure, but that's just bad luck. You can't account for that mathematically", but what they aren't getting is that you actually can, thanks to the magic of statistics.

So how does this matter for you, in practical terms? It's simple: for any given character over the course of a campaign, the variance in dice rolls is almost certainly going to be high enough to make that +1 bonus moot. That's right; once the campaign is over, if you were to go back over all the checks/saves/attacks you rolled to analyze the impact of your +1 bonus, you'd most likely discover that statistically it didn't impact your performance in a meaningful way. Meaning it probably doesn't matter if you start with an 18 or a 16 in your main stat. Meaning that min/maxing often doesn't actually result in a statistically significant improvement in your performance.


1 person marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

Just downloaded the MATLAB trial - only $149 for a home license. I paid more than that for Office, and I would rather use MATLAB than Excel for doing stats.

I will probably start a new thread for doing build stats that include the two-sigma range, for those who would like that information and are not interested in debating further why it does not matter....

I am doing it more for my own theorycrafting.


3 people marked this as a favorite.
Bardic Dave wrote:

2) You don't make an infinite number of rolls in a campaign. With a limited number of rolls, it is quite possible that the variance in your rolls will render the theoretically higher average result of a +1 bonus irrelevant. A lot of people are saying "Sure, but that's just bad luck. You can't account for that mathematically", but what they aren't getting is that you actually can, thanks to the magic of statistics.

You're right. We can calculate it. Let's do it.

How many rolls do you think you make with your primary statistic in a campaign? I'll assume we have 12.5 encounters per level, with approximately 2 equivalent-level enemies each. Call it 6 successful attacks per enemy, maybe 12 attempted attacks per enemy, so 24 attacks split between 4 players per encounter, so 6 rolls each? That means 75 attack rolls a level; assume a 1-12 level campaign for roughly 900 attack rolls with your primary attack stat.

900 factorial is going to be a bit rough to do for exact combination counts, so I'll switch over to simulation.

So I've thrown up a simulated plot for 1 million sets of 900 rolls in the same google doc I had a plot for 40, again with a 60% versus 55% success chance. The google doc is here: Google Doc. Second page has the 900 roll plot.

The big thing to note is that as you add more rolls over the course of a campaign, there is less and less overlap between the distributions - making it easier to tell the difference if you had a +1 bonus or not.

With 40 rolls, you have a 75% chance of there being no difference in your success ratio (i.e. the overlap between the distributions). However with 900, you have only about a 13% chance of having the exact same number of successes. Sometimes it might only be a 1 roll difference, but the two peaks are pretty distinct. The most probable outcome is that you've gotten 45 more successes out of 900 with that +1.

Another way to phrase it: the standard deviation of your success rate goes down as the square root of the number of samples. So if you're requiring the bonus to be larger than twice the standard deviation to be significant, then if you play the character long enough, a +1 will become statistically significant. At the point you're making 400 rolls, you're looking at a +1 being roughly the same size as twice the standard deviation. So noticeable. By 900 rolls, twice the standard deviation in the parlance presented here is +/-3.4%; +1 is equivalent to 5% and thus larger.
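To make that scaling concrete, here is a small sketch (assuming a base success chance around 55-60%, as in the plots above); it roughly reproduces the figures quoted, give or take rounding:

# At how many rolls does a +1 (a 5% shift) exceed twice the standard
# deviation of the observed success rate?
import math

p = 0.575                                   # rough midpoint of 55% and 60%
for n in (40, 100, 400, 900):
    two_sigma = 2 * math.sqrt(p * (1 - p) / n)
    print(f"n={n:4d}: 2*sigma = {two_sigma:.3f}   (+1 bonus = 0.050)")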

This isn't disproving anything about the OP's assertions of course.

All I'm saying is that a long campaign has a sufficient number of rolls that a +1 would probably be statistically noticeable if you went back and looked, all other things being equal. It probably won't determine the overall success of the campaign as a whole, since encounters usually have some slack to them. It also says nothing about what a +1 to a different stat buys you, which might be more in terms of survivability or out-of-combat success.


Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

But success in a campaign is quantized into actions, rounds, encounters, and levels - fail at any of those steps and you die.

So how badly do most people need to suffer in the short term before their luck balances things out again?

Accuracy here is defined as the range of percent success covering 95% of people (+/- two standard deviations) against a flat DC.

So how long do you want to gamble that the inaccuracy of the d20 will not erase the benefit of your modifier?

(percent variations rounded to d20 steps)

For 6 rolls the d20 accuracy is +/-7
For 9 rolls the d20 accuracy is +/-6
For 12 rolls the d20 accuracy is +/-5
For 20 rolls the d20 accuracy is +/-4
For 30 rolls the d20 accuracy is +/-3
For 61 rolls the d20 accuracy is +/-2
For 173 rolls the d20 accuracy is +/-1
For 1561 rolls the d20 is accurate

In other words, the +5 bonus does not make you reliable over an encounter, and the +1 bonus does not make you reliable over a level, but over a long campaign the d20 is accurate.
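That table can be reproduced from the binomial formula; the sketch below assumes the roughly 65% success chance of the earlier cleric example and rounds down to whole d20 steps (1 step = 5%), so the exact rounding may differ slightly from the numbers above:

# 2-sigma spread of the observed success rate, expressed in d20 steps.
import math

p = 0.65
for n in (6, 9, 12, 20, 30, 61, 173, 1561):
    steps = 2 * math.sqrt(p * (1 - p) / n) / 0.05
    print(f"{n:5d} rolls: +/-{int(steps)}")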


Hiruma Kai wrote:
Bardic Dave wrote:

2) You don't make an infinite number of rolls in a campaign. With a limited number of rolls, it is quite possible that the variance in your rolls will render the theoretically higher average result of a +1 bonus irrelevant. A lot of people are saying "Sure, but that's just bad luck. You can't account for that mathematically", but what they aren't getting is that you actually can, thanks to the magic of statistics.

You're right. We can calculate it. Let's do it.

How many rolls do you think you make with your primary statistic in a campaign? I'll assume we have 12.5 encounters per level, with approximately 2 equivalent-level enemies each. Call it 6 successful attacks per enemy, maybe 12 attempted attacks per enemy, so 24 attacks split between 4 players per encounter, so 6 rolls each? That means 75 attack rolls a level; assume a 1-12 level campaign for roughly 900 attack rolls with your primary attack stat.

900 factorial is going to be a bit rough to do for exact combination counts, so I'll switch over to simulation.

So I've thrown up a simulated plot for 1 million sets of 900 rolls in the same google doc I had a plot for 40, again with a 60% versus 55% success chance. The google doc is here: Google Doc. Second page has the 900 roll plot.

The big thing to note is that as you add more rolls over the course of a campaign, there is less and less overlap between the distributions - making it easier to tell the difference if you had a +1 bonus or not.

With 40 rolls, you have a 75% chance of there being no difference in your success ratio (i.e. the overlap between the distributions). However with 900, you have only about a 13% chance of having the exact same number of successes. Sometimes it might only be a 1 roll difference, but the two peaks are pretty distinct. The most probable outcome is that you've gotten 45 more successes out of 900 with that +1....

Thanks for this post. I think some of your assumptions are potentially slightly inflated (12.5 encounters, 6 attack rolls per encounter) but I appreciate that you took the time to do this, and I don’t disagree with your findings.


Also of note: if you start with a 16 in your main stat, you’re only behind by one *some of the time*, so I think your calculations likely lend credence to the idea that it won’t be statistically significant (in that particular case at least)


1 person marked this as a favorite.
Pathfinder Lost Omens Subscriber
krazmuze wrote:

The "everybody" I speak of is the million clerics I already simulated, specifically to put the law of large numbers to work and remove the noise from the low number of clerics.

And when you do that analysis, you realize it is mathematically the case that accumulating infinite results for a uniform die yields a Gaussian distribution. I already showed that a million results return the exact same mean that the fractional-odds analysis predicts. So the million results can have normal statistics applied to obtain precise results. Thus, rather than showing the histogram, I can calculate the sigma, and that will have to stand in for the histogram since this is a text forum.

Since I am interested in the 95% range of the histogram I will take 2*std

The trained WIS+4 cleric has a 65+/-15% chance of succeeding at healing their party at each break.

mean(mean((randi([1,20],40,1e6)+7)>=15))
std(mean((randi([1,20],40,1e6)+7)>=15))*2

The trained WIS+3 cleric has a 60+/-15% chance of succeeding at healing their party at each break.

mean(mean((randi([1,20],40,1e6)+6)>=15))
std(mean((randi([1,20],40,1e6)+6)>=15))*2

Thus the variance of the die means that a +1 bonus modifier produces a range similar to the average-odds calculation for anything from a -2 to a +4 modifier. There is significant overlap of the 95% ranges of the histograms, so only the very (un)lucky will see a difference between these builds.

So let's use histogram analysis to come up with build advice where people will actually see differences; then they will not have to worry about the lesser cleric build outperforming them at the game store, or get into arguments about how DPR advice is wrong because their experience did not show that this was better than that.

The expert WIS+4 cleric has a 75+/-15% chance of succeeding at healing their party at each break over a level

mean(mean((randi([1,20],40,1e6)+9)>=15))
std(mean((randi([1,20],40,1e6)+9)>=15))*2

The trained WIS+0 cleric has a 45+/-15% chance of succeeding at healing their party...

DPR is a heuristic, not an in-depth analysis. It was never meant to be accurate, just a way to compare various builds and their efficacy in certain situations.


Bardic Dave wrote:
Also of note: if you start with a 16 in your main stat, you’re only behind by one *some of the time*, so I think your calculations likely lend credence to the idea that it won’t be statistically significant (in that particular case at least)

Believe me there's a whole host of things I've ignored if you want to fully simulate attack rolls through a campaign. As you note I've ignored level ups. Crit successes can have a large effect on combat. My estimate of the number of rolls is very rough, no consideration to secondary benefits of a higher combat stat (+1 to hit and +1 to damage), and so forth.

So all the graph is saying is if you roll 900 rolls with the threshold of 10 or higher vs a threshold of 9 or higher, you'll likely be able to tell that there was in fact a 1 point difference. Does 5% more hits make the difference between overall success and failure in a campaign? I personally don't think so. At least for most APs.

Nor does this even begin to look at the trade offs one makes for a +1 to hit. Presumably, you'd have +1 to some other rolls, or more hit points which could potentially be more important than 5% more hits. At 1st level, 2 more Con for a human fighter is ~5.6% more hit points, plus a better fort save. Is 5% more hitpoints worth 5% more hits over the course of a campaign? Maybe? Depends on the player?

I also agree the way stat boosts work every 5 levels, they favor a more even spread at 1st level and make MAD characters much more viable than PF1. I've personally played characters in Starfinder with a 14 in their primary combat stat at 1st level, which has a similar stat upgrades by leveling. It worked out fine.

I figure over the course of a campaign, smart tactics and average builds will trump perfect optimization and poor tactics. I feel that was true even in PF1.

At the end of the day, DPR is just a piece of information, which players can do with what they want. Doing a simple DPR or DPA calculation is better than walking in blind and realizing after the fact the melee staff wizard is not working out as the melee gish build they thought it could be.


Hiruma Kai wrote:
Bardic Dave wrote:
Also of note: if you start with a 16 in your main stat, you’re only behind by one *some of the time*, so I think your calculations likely lend credence to the idea that it won’t be statistically significant (in that particular case at least)

...

Oh yeah, that wasn't meant as a critique of your methodology. I was just coming up with a new hypothesis about a particular area of concern for me, based on your calculations.


Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

I am changing to DRPE (Damage Range per Encounter). Assume 4 rounds, because that is the average the devs targeted in their math - and play seems to back that up. Eleven moderate encounters and one severe add up to a level's worth of XP, but it would be interesting to compare different severity mixes. Fewer encounters means your rolls over the level track the odds less closely; do the stronger bonuses help protect against that for the bosses?
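As a stripped-down sketch of that kind of encounter roll-up (assuming three Strikes a round at -0/-5/-10 MAP, d8+4 damage, crits simply doubling the total, and nothing else - all placeholder numbers):

import numpy as np

def encounter_damage(rng, rounds=4, atk=7, ac=18, die=8, static=4, maps=(0, -5, -10)):
    total = 0
    for _ in range(rounds):
        for pen in maps:
            result = rng.integers(1, 21) + atk + pen
            dmg = rng.integers(1, die + 1) + static
            if result >= ac + 10:
                total += 2 * dmg          # crit: double damage (nat 1/20 steps ignored for brevity)
            elif result >= ac:
                total += dmg
    return total

rng = np.random.default_rng(1)
samples = np.array([encounter_damage(rng) for _ in range(10000)])
print(f"DRPE ~ {samples.mean():.1f} +/- {samples.std():.1f}")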

The reason for doing encounter math is that the twin-blade ranger has to draw, draw, and Hunt Prey, and then each round has to move himself and his dog into flanking melee before he can wail away with all his Strikes and move away. Or is it worth ignoring the dog and getting an extra Strike, or trading the shield for an extra Strike, or hoping others provide the flanking and defense?

The longbow ranger only has to draw and Hunt Prey, and can just keep Striking for the rest of the encounter, maybe switching bows or prey.

That is an entirely different action economy in the encounter. Is it possible that over a level my unlucky bow ranger is outperformed by the lucky melee ranger? Or are the bonuses and actions such that their performance never overlaps? It is good to know how much encumbrance to suffer for arrow quivers if melee ends up not even being a good idea when lucky. Or can I save inventory room for loot and switch to melee when my few arrows are gone, because there is enough overlap in performance?

Average DPR cannot answer these questions; it cannot tell you how performance overlaps.


1 person marked this as a favorite.
Pathfinder Lost Omens Subscriber
krazmuze wrote:


Average DPR cannot answer these questions; it cannot tell you how performance overlaps.

I mean, it can if you run simulations for 4 rounds... Most analyses do look into start-up time, especially for rangers, who need to use an action each time they switch targets. So calculating which build is better at dealing with multiple targets can give you insight into who you should target in a fight: someone bad at switching targets should focus the big guy, while someone who loses little to action economy should clear the small guys first.

Basically, if you think DPR calculations are just "how much damage can I do in prime conditions," you are severely underestimating the types of calculations many people do and what their purpose is.

Likewise, many people try to calculate rounds to kill as a metric, which is heavily based on DPR calculations. If you're in a fight with many lower-leveled enemies, at what point does trading Hunt Prey for Raise a Shield change the ratio of rounds to kill for you versus them from one being more beneficial to the other?
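As a sketch of that rounds-to-kill framing (all the numbers here are placeholders, and damage is treated as its per-round average):

import math

def rounds_to_kill(target_hp, dpr):
    # crude expectation: how many average rounds to drop the target
    return math.ceil(target_hp / dpr)

# does trading some offense (Hunt Prey) for defense (Raise a Shield) still win the race?
print(rounds_to_kill(30, 14.0), "rounds for me vs", rounds_to_kill(36, 9.0), "for them")   # aggressive routine
print(rounds_to_kill(30, 10.5), "rounds for me vs", rounds_to_kill(36, 7.5), "for them")   # defensive routine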


Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

This redditor is doing a leveled combat simulator (50 rounds):

https://www.reddit.com/r/Pathfinder2e/comments/dau1pq/p2e_simulator_dev_update/

The charts currently do not have the deviation/variance bounds, as he still has to figure out how to chart them - things quickly start overlapping and blurring together. Which is pretty much the point of giving the deviation ranges: to show that options blur together. If you want accuracy, you have to give the deviation ranges; a more precise average calculated to the Nth decimal place is not more accurate information.
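One way to report those bounds without cluttering a chart is to quote percentiles per option, something like this (the one-line simulator here is just a stand-in; any per-encounter damage generator works):

import numpy as np

rng = np.random.default_rng(2)

def one_encounter():
    # stand-in simulator: 12 Strike attempts at +7 vs AC 18, d8+4 on a hit, crits ignored
    rolls = rng.integers(1, 21, 12)
    dmg = rng.integers(1, 9, 12) + 4
    return int(np.where(rolls + 7 >= 18, dmg, 0).sum())

samples = np.array([one_encounter() for _ in range(10000)])
print(np.percentile(samples, [5, 50, 95]))   # plot three lines per build instead of one average line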


1 person marked this as a favorite.
Pathfinder Lost Omens Subscriber
krazmuze wrote:

...

Deviation is, frankly, not useful information. It adds what most people already know: that the damage will land above or below these numbers in actual play.

Since you cannot replace a d20 with a d10+5 or some such, there is no part of a build that should take this into account. You cannot make a build that accounts for luck, so having the deviation isn't that useful.

Besides, I think the most useful stat is still rounds to kill, and how your rounds to kill compares to your enemy's, which uses DPR/DPA as its basis.


Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

At AC 19, encounter damage for the d8+4 longsword build is 23+/-11, while the d4+1 dagger build does 17+/-6.

Yes, the longsword's average damage is about a third higher than the dagger's, but this comes at the cost of nearly twice the variation - so if you are not lucky, the longsword can do as badly as the dagger.

What statistics can tell you is the AC breakpoint where most dagger players are doing just as well as the longsword players; below that AC, more longsword players are doing better than the dagger players.

Statistics CAN be used to quantify 'luck' - these analyses treat the results as roughly normal (Gaussian) distributions.

For more details I posted here, since the OP has been wanting to talk about utility other than DPR.

https://paizo.com/threads/rzs42scm?Encounter-Damage-Range-build-analysis#1
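To put a single number on how much those two ranges overlap, you can treat each build's encounter total as roughly normal and ask how often one out-rolls the other (this assumes the +/- figures above are one standard deviation):

from math import erf, sqrt

def p_a_beats_b(mean_a, sd_a, mean_b, sd_b):
    # P(A >= B) for two independent, roughly normal encounter totals
    z = (mean_a - mean_b) / sqrt(sd_a ** 2 + sd_b ** 2)
    return 0.5 * (1 + erf(z / sqrt(2)))

print(p_a_beats_b(17, 6, 23, 11))   # ~0.32: about a 1-in-3 chance the dagger build out-damages the longsword build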


Bandw2 wrote:
...

I agree with what you're saying, but I think he's trying to account for which builds have crippling worst-case scenarios and which builds have decent worst-case scenarios, but he's doing a bad job of explaining it.

For example, a pick build is gonna be pretty mediocre if your crits don't land on the right enemies, while other weapons might have stats that perform well consistently, and are less subject to luck.


Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

It is the old question: which is better, 1d12 or 2d6? The 2d6 is more reliable, so you are likely to do better with it than with the 1d12. 2d6 gives you the opportunity to cancel out a bad roll, but you are stuck with the bad roll on the 1d12. Now, the 1d12 is maybe more fun to play, as you can get more 12s, but that comes at the cost of 1s, while the 2d6 can never do worse than a 2. The 2d6 is more likely to be close to the average because it has a narrower histogram spread.
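For reference, the raw numbers behind that comparison (brute-force enumeration in Python):

from itertools import product
from statistics import pstdev

d12 = list(range(1, 13))
two_d6 = [a + b for a, b in product(range(1, 7), repeat=2)]

print(sum(d12) / len(d12), pstdev(d12))           # mean 6.5, spread ~3.45
print(sum(two_d6) / len(two_d6), pstdev(two_d6))  # mean 7.0, spread ~2.42
print(sum(r <= 3 for r in d12) / len(d12))        # chance of a 3 or less: 25%
print(sum(r <= 3 for r in two_d6) / len(two_d6))  # chance of a 3 or less: ~8.3%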

It is how Vegas makes money - they play forever, so they achieve the theoretical odds, which are always in their favor; you, however, just visited for the weekend, so your luck matters more than the odds. They know that if your luck is bad, you will call in sick on Monday to try to win some of it back. That is why they will comp you rooms to stick around: they know that no matter how long you stay, you can never play as long as they do, so they will always win.

1d6+6 is guaranteed to be more reliable than 1d6 - you will always do better - but with 1d6+1 you can never play long enough to be sure your bad luck was canceled out by good luck and the theoretical odds were achieved. It is likely that someone without the modifier is performing better than you, because there is significant overlap in the histograms of results.
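To put a number on 'how long is long enough', here is a quick Monte Carlo of how often a d6+1 roller's running total actually beats a plain d6 roller's after the same number of rolls:

import numpy as np

rng = np.random.default_rng(4)
trials = 20000
for n in (1, 5, 20, 100):
    a = rng.integers(1, 7, (trials, n)).sum(axis=1) + n   # n rolls of d6+1
    b = rng.integers(1, 7, (trials, n)).sum(axis=1)       # n rolls of plain d6
    print(n, (a > b).mean())
# roughly 0.58, 0.80, 0.96 and ~1.0: the overlap shrinks as the roll count grows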


3 people marked this as a favorite.
krazmuze wrote:

...

2d6. 2d6 is better than 1d12. It has a higher average. I... like, I'm the original poster, but if you're going to post proof like this to back your claim, I just don't know what to say. This argument has nothing to do with variance.


3 people marked this as a favorite.
Pathfinder Lost Omens Subscriber

1d6+1 is always better than 1d6... It's not like you ever rolled a different die; these calculations assume you're rolling the die once when you compare 1d6 versus 1d6+1. Like, I rolled a 4 - now compare 4 versus 4+1, etc.

Like, this doesn't make sense.

Most DPR calculations break down when you start dealing with applying conditions, etc. - like a rogue relying on Feint to get flat-footed - and flanking is even more unpredictable. How much of your DPR is wasted due to overkill? Etc.

Variance isn't the issue with DPR calculations.

To top it off, Pathfinder reduces the randomness relative to the average, since you get bonus damage dice instead of straight mods; as you level you'll have 2d12, then 3d12, etc...
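The overkill point is easy to fold into a simulation, though: just cap each hit at the target's remaining hit points (a sketch with placeholder numbers):

import numpy as np

rng = np.random.default_rng(5)

def damage_actually_applied(rng, target_hp=8, n_strikes=3, atk=7, ac=15, die=12, static=4):
    # damage that lands on one target, counting anything past 0 hp as wasted
    remaining, applied = target_hp, 0
    for _ in range(n_strikes):
        if remaining <= 0:
            break
        if rng.integers(1, 21) + atk >= ac:
            dealt = min(rng.integers(1, die + 1) + static, remaining)
            applied += dealt
            remaining -= dealt
    return applied

samples = [damage_actually_applied(rng) for _ in range(10000)]
print(np.mean(samples))   # compare against the raw expected damage of the same three Strikes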


1 person marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber

This is not about adding modifiers to your past rolls; it is about comparing the possibilities of everyone's future rolls.

d6+1 is in fact not always greater than d6 - simply because you are NOT rolling the average every time, but instead are more likely to see a deviation.

sum(randi(6,10,1)+1)=34
sum(randi(6,10,1))=48

Sure, a cherry-picked, two-player, ten-roll anecdote...does not tell you anything at all.

But compare many thousands of players over an entire level and look at the average encounter? Surely the tables running d6+1 should never be as bad as the tables running d6, right?

At AC 20, for a level-2 flurry twin-kukri ranger, the average encounter damage for d6 is 15+/-4, while for d6+1 it is 19+/-4.

What that means is that 95% of players using d6 scored <=19, while 50% of players using d6+1 scored <=19. The top 5% of d6 players are averaging >19 damage, while the worst 5% of d6+1 players are averaging <15 damage.

In simpler words, just over half of the d6+1 players did as poorly as the d6 players. That is only even odds - 50:50 - that a d6+1 player will do better than a d6 player.

Therefore it is not possible to say that d6+1 is statistically better.

Now, if we talk about a d6+2, only the worst 5% are being beaten by the best 5% of the d6 players.

I would have to run a d6+3 to have perfect odds that an average round always beats the d6.

However, if we look at AC 10, then all those crits doubling that fixed damage from +1 to +2 mean the +1 is always going to do better. In 5e this would not be the case, because only a nat 20 crits and you do not double modifiers. This is why PF2e is more deadly: the bosses can reliably crit you to death.


2 people marked this as a favorite.

Here's why variance is important:

Let's start with an extreme example:

1d100 is way higher average damage than 2d4, but if your enemies only have 2 hp, then 1d100 will finish them off in one attack 99% of the time, while 2d4 will kill them 100% of the time.

Let's take a less extreme example: 3d4+3 vs 1d12+4 (both average 10.5).

1d12+4 - % chance of doing at least:
1: 100%
2: 100%
3: 100%
4: 100%
5: 100%
6: 91.67%
7: 83.33%
8: 75.00%
9: 66.67%
10: 58.33%
11: 50.00%
12: 41.67%
13: 33.33%
14: 25.00%
15: 16.67%
16: 8.33%

3d4+3 - % chance of doing at least:
1: 100%
2: 100%
3: 100%
4: 100%
5: 100%
6: 100%
7: 98.44%
8: 93.75%
9: 84.38%
10: 68.75%
11: 50%
12: 31.25%
13: 15.63%
14: 6.25%
15: 1.56%

So against an opponent with 10 or less hp, the 3d4+3 is the better choice, while if you have a single action available against an opponent with 12-16 hp, you're better off with the 1d12+4.

Further Math with 2 strikes:

Assuming you have 2 actions available and just want to know which option has the higher % of doing at least X damage...

6d4+6 is better for 11-21, while 2d12+8 is better for 22-32.

Obviously this is a massive oversimplification of the math involved, but hopefully it helps to explain why variance in DPR is relevant.
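If anyone wants to reproduce or extend those tables, exact enumeration is only a few lines (Python sketch):

from itertools import product
from fractions import Fraction

def at_least_table(n_dice, sides, bonus):
    # exact 'chance of at least X' by enumerating every combination of the dice
    totals = [sum(r) + bonus for r in product(range(1, sides + 1), repeat=n_dice)]
    return {x: Fraction(sum(t >= x for t in totals), len(totals))
            for x in range(min(totals), max(totals) + 1)}

for x, p in at_least_table(3, 4, 3).items():   # the 3d4+3 column
    print(x, f"{float(p):.2%}")
# at_least_table(1, 12, 4) reproduces the 1d12+4 column;
# the two-Strike case is at_least_table(6, 4, 6) vs at_least_table(2, 12, 8)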


Pathfinder Maps, Pawns Subscriber; Pathfinder Roleplaying Game Superscriber; Starfinder Charter Superscriber

Interesting!


8 people marked this as a favorite.
krazmuze wrote:
d6+1 is in fact not always greater than d6 - simply because you are NOT rolling the average every time, but instead are more likely to see a deviation.

You... do realise that you aren't even talking to anyone here, right?

Literally nobody has claimed the thing that you're disputing here. Nobody has said that through some magic 1d6+1 will always roll better than 1d6, 100% of the time. It has been said, accurately, that 1d6+1 will always be better than 1d6, in that given the choice between the two there is no reason why 1d6 would be a better option.

As has happened several times now, the basic math you've taken it upon yourself to try and explain to everybody is something that only one person present has seemed to be struggling with, and it isn't the person you want it to be.


krazmuze wrote:
d6+1 is in fact not always greater than d6 - simply because you are NOT rolling the average every time, but instead are more likely to see a deviation.

That's the main issue with your reasoning. You're comparing two characters' efficiency, but nobody cares about that. It's your own DPR that matters. And if you calculate your own DPR, d6+1 is always strictly better than d6.

And when you take that into account, the variance drops like crazy, as your longsword ranger only needs a few attacks to get more damage than your dagger ranger at a six-sigma level.

You're trying to compare players' efficiency, while we just want to know our own efficiency.


2 people marked this as a favorite.
Pathfinder Adventure, Adventure Path, Lost Omens, Pathfinder Accessories, Pawns Subscriber
FowlJ wrote:


Literally nobody has claimed the thing that you're disputing here. Nobody has said that through some magic 1d6+1 will always roll better than 1d6, 100% of the time. It has been said, accurately, that 1d6+1 will always be better than 1d6, in that given the choice between the two there is no reason why 1d6 would be a better option.

You are saying nobody is saying this:

"1d6+1 will always roll better than 1d6, 100% of the time"

when it is exactly the same thing as saying this:

"that 1d6+1 will always be better than 1d6"

You cannot change the fact that at DC 20, d6+1 is only better than d6 for half the players. If you are fine knowing that the +1 improved your crappy rolls by 1, and want to remain ignorant of the fact that you are way below average, then great for you.

That does not stop the thousands of other players from coming online and saying that they are not seeing the benefit of the modifier compared to other players - because the fact is that half of them will not see the benefit.

Unlike you - happy that you made a build with an average damage boost, whether or not you actually realize it relative to other players - the half of the players who are not achieving that average boost are not happy, and they want to know why.

So when you see a histogram response, there really is no need for you to comment, because you only care about the median. Well, that information is on the chart as the 50% line; you have the information you need. Feel free to ignore the 5% and 95% bounds on the chart while others take advantage of them.

Even if you want to ignore all those people, the game is still a contest against all the NPCs your DM plays. Statistics tell you that even with a matched AC 20 and ATK +8, you would be a fool to take the bet of using a d6+1 while the boss is using a d6 and to go into it saying you will always win that fight - because half the time the boss WILL win the fight despite the numerical edge you are so proud of having. You should only take that bet if you have a d6+3 or it is an AC 10 fight.
