When I test my gear, I like to create easily reproducible conditions (though they would not pass scientific muster) so I can compare equipment within what I consider an acceptable margin of error.
When I'm studying my canister stove, I factor in at least the following variables before making any assumptions:
1. The stove and how its underlying technology compares with similar stoves
2. Manufacturing defects: I must assume my stove has none (already not a scientific experiment), because I cannot afford to buy several stoves to control for them (my kid plays hockey; not happening)
3. Fuel source and its characteristics at varying temperatures and altitudes (measuring a purchased product is beyond me)
4. Altitude
5. Water purity, quantity, and temperature
6. Pot size, shape, mass, and temperature
7. Ambient temperature
Then measure:
1. Time from 40* to the boiling temperature for the atmospheric pressure at the test altitude
2. Heat output of stove to achieve boil (currently estimating)
3. Fuel consumed by weight
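The first two measurements can be roughed out in a few lines of Python. This is only a sketch: the rule of thumb that the boiling point drops about 1 degree F per 500' of elevation, the 473 g weight of a US pint of water, and the function names are all assumptions for illustration, not part of my test setup.

```python
# Rough arithmetic for a boil test. All constants are approximations.
GRAMS_PER_US_PINT = 473.176      # 1 US pint of water, about 473 g
SPECIFIC_HEAT_WATER = 4.186      # joules per gram per degree Celsius


def boiling_point_f(altitude_ft):
    """Rule of thumb (assumption): boiling point drops ~1 F per 500 ft."""
    return 212.0 - altitude_ft / 500.0


def heat_to_boil_joules(pints, start_f, altitude_ft):
    """Energy the water must absorb to go from start_f to the local boil."""
    delta_c = (boiling_point_f(altitude_ft) - start_f) * 5.0 / 9.0
    return pints * GRAMS_PER_US_PINT * SPECIFIC_HEAT_WATER * delta_c


def average_watts_into_water(pints, start_f, altitude_ft, seconds_to_boil):
    """Average power that reached the water, not the stove's rated output."""
    return heat_to_boil_joules(pints, start_f, altitude_ft) / seconds_to_boil
```

Heat lost to the pot, the wind, and the air is ignored, so the wattage figure understates what the stove actually puts out. Dividing it into the fuel weight burned gives a crude way to compare stoves under the same conditions.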
I leave a few bottles of water in my fridge so they will be 40* when I run my tests; this works better for me than tap water to achieve a consistent starting temperature. My plan is to carry a thermometer with me this summer to calculate the average water temperature of the streams I cross in the Cascades and determine how closely 40* refrigerated water represents what I will encounter in the field.
IMHO most fuel sources, with the exception of canisters and biofuel, are easy to control for in a test. I was recently (I'll humbly admit) on the losing end of a debate on the best way to test canisters. Personally, I have no interest in refilling a canister, nor do I have the skills and knowledge to measure the mixtures in commercially available fuel. This leaves too many variables unmeasured before I run my test, and I'm forced to assume that a Jetpower, Gigapower, MSR, or whatever canister will perform within 10% of what I observe in a controlled environment (huge leap of faith).
Since I do not have the resources to conduct a truly scientific test at all temperatures and altitudes, I compensate by allowing for a measured margin of error and back up my canister stove with an appropriate number of Esbit tablets, or on a longer trip, an extra canister.
For example, I know that my Soto stove will boil 14 pints of 40* water at 400' at 32* ambient temperature. If I'm on a lazy week-long solo trip in late spring at 1000' where I'm boiling water twice a day, I'll pack two extra Esbit tablets, won't bring water to a full boil, and will maximize the use of my cozy to mitigate the larger margin of error resulting from my inability to control for fuel consistency and unknown variances in stove output due to production inconsistencies.
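The margin-of-error reasoning comes down to simple arithmetic, sketched here in Python. It assumes the 14-pint figure is what one canister boils, applies the 10% variation assumed earlier, and uses a hypothetical 1 pint per boil; the function name is just for illustration.

```python
def fuel_shortfall_pints(tested_pints, boils, pints_per_boil, margin=0.10):
    """Pints of boiled water the trip needs beyond what the canister can be
    trusted for, after discounting the tested figure by the margin of error."""
    usable = tested_pints * (1.0 - margin)   # worst case within the margin
    needed = boils * pints_per_boil
    return max(0.0, needed - usable)


# Week-long trip, two boils a day, hypothetical 1 pint per boil.
shortfall = fuel_shortfall_pints(tested_pints=14, boils=7 * 2, pints_per_boil=1.0)
```

Any shortfall above zero is what the backup has to cover, whether that means Esbit tablets, a cozy, or stopping short of a full boil.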
I do not have the time, money, or laboratory resources to analyze my equipment to reduce the margin of error, and quite honestly it would never pay for me to make that investment.