The problem with using on-table performance to gauge whether upgrade cards are valuable is that it also factors in the skill of the commanders, and none of us are perfect logicians all the time.
We need theorycrafting to understand what to expect from a ship or an upgrade card if we are skilled enough to take advantage of its capabilities in a battle.
Example: people say Neb Bs suck. Yet I've won most of my Rebel games fielding 1 to 2 Neb Bs.
I would argue that variable player skill is a factor that pulls the data towards actual expected performance rather than tainting it.
If you want pure numerical impact, you can work that out with a calculator, some time and some assumptions. Playing different lists, against more or less skilled opponents etc., will (with sufficient data) actually give a more informed picture of a particular Admiral's worth. Sure, some people will always be on the high end of the curve and some on the low end, but it SHOULD be a bell curve around actual combat performance (all of which can be shown once data is collected).
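To make the bell-curve point concrete, here is a rough sketch with entirely made-up numbers. It assumes each game's score is the Admiral's "true" value plus a skill offset drawn from a normal distribution centred on zero, and that skill is unrelated to which Admiral a player picks. The names `TRUE_VALUE`, `SKILL_SPREAD` and `simulate_observed_scores` are hypothetical, invented for this illustration only.

```python
import random
import statistics


def simulate_observed_scores(true_value, n_games, skill_spread, seed=0):
    """Return per-game scores: the assumed true value plus a random skill offset."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_value + rng.gauss(0.0, skill_spread) for _ in range(n_games)]


# Hypothetical numbers for illustration only, not real game data.
TRUE_VALUE = 6.0    # assumed average score an Admiral "deserves"
SKILL_SPREAD = 2.0  # assumed standard deviation of the player-skill effect

few = simulate_observed_scores(TRUE_VALUE, 10, SKILL_SPREAD)
many = simulate_observed_scores(TRUE_VALUE, 10_000, SKILL_SPREAD)

# With few games the average can sit well off the true value;
# with many games the skill offsets largely cancel out.
print(f"mean over 10 games:     {statistics.mean(few):.2f}")
print(f"mean over 10,000 games: {statistics.mean(many):.2f}")
```

The caveat is the assumption itself: if stronger players systematically favour certain Admirals, the skill offsets no longer average to zero and the observed mean will drift away from the card's own contribution.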