It doesn't particularly matter which model you use. They all require data to tune, and without a released game you don't have enough of it. Playtest data can get you part of the way there, but within the small sample size of a playtest group you will get skewed results. There is also the issue of a meta developing within that small group that in no way resembles the meta after release to a larger audience, which skews things further.
100% agree. If you don't account for all of this, you're in trouble. Data analytics done correctly, coupled with a solid understanding of the underlying fundamentals, can yield quite a bit of useful feedback.
As a sidenote, I find the whole "I know the answer but won't tell you because it's such an amazing idea that I'm going to keep it secret and make my own game with it" stance to be pretty much the Godwin's Law of game design discussion. You either have an argument to contribute or you don't. If it's an amazing secret you don't want the wider world to know, great: keep it a secret and don't mention it. If you want to use it to support a claim, reveal it. But you can't have your cake and eat it too.
Understood. The converse is when folks like Forgottenlore assert an absolute negative: "it can't be done!!!". In a debate, asserting an absolute negative amounts to philosophical suicide, especially without even knowing what "it" is.
To lead in the direction of your answer without getting into specifics: at a minimum, playtester Elo rankings and full game tape are fundamental prerequisites. Running data analytics on all that raw data is very powerful, but the how and why is where the secret sauce starts getting interesting. Getting a reasonable sample size is hard, as you pointed out, so normalizing and pruning the data are important. I hope that helps a little, or at least gets your imagination going.
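To make the Elo part concrete, here's a minimal sketch of what computing playtester rankings from the game tape might look like. The record format (winner/loser pairs), the 1500 starting rating, and the K-factor of 32 are my assumptions for illustration, not anything from the thread:

```python
def expected_score(r_a, r_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(ratings, games, k=32):
    """Apply Elo updates in game order.

    ratings: dict of player -> rating (unseen players default to 1500)
    games:   list of (winner, loser) tuples pulled from the game tape
    k:       K-factor controlling how fast ratings move (assumed value)
    """
    for winner, loser in games:
        r_w = ratings.setdefault(winner, 1500.0)
        r_l = ratings.setdefault(loser, 1500.0)
        e_w = expected_score(r_w, r_l)
        # Winner scored 1 against expectation e_w; loser scored 0
        # against expectation 1 - e_w.
        ratings[winner] = r_w + k * (1.0 - e_w)
        ratings[loser] = r_l - k * (1.0 - e_w)
    return ratings
```

With ratings in hand, win/loss data can be re-weighted by the skill gap in each match, which is one way to start normalizing a small, skewed playtest sample.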
There are also design-side equations, the MathWing stuff I have posted obviously being one of them.