StoveBench: A Stove Testing Protocol for Comparing the Performance of Backpacking Stoves

Posted Jan 6, 2019 at 12:22 pm

Yeah, I agree. A sample size of three with tight measurements is likely a fair initial test.

A sample size of 5 is what I used for 90% of my testing. Toss out the high and low as likely my error in preparing things. Average the remaining 3 to get a single data point. Median, mode, and stddev were usually pretty much ignored since they really only report on the test itself (known to be rather sloppy). All of this used a large measuring cup as the volume measurement for water, and a simple kitchen scale with an accuracy of 1 gram for canister fuel weights (all calibrated at work at Cornell’s ChemEng lab initially). Nowhere near your scale’s accuracy, though.
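The drop-the-high-and-low scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `trimmed_average` and the fuel weights are made up for the example, not measurements from this thread.

```python
def trimmed_average(values):
    """Drop the single highest and lowest value, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least 3 runs to drop the high and low")
    kept = sorted(values)[1:-1]  # discard one low and one high reading
    return sum(kept) / len(kept)


# Hypothetical fuel use in grams for five runs (invented numbers).
runs = [7, 8, 8, 9, 12]
print(f"trimmed average: {trimmed_average(runs):.2f} g")
```

With five runs this leaves three values to average, which matches the procedure in the post.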

In the field, I brought my scale once, and measured all burns for a couple weeks. There was a discrepancy between lab testing and field results on the MINUS side. It seems I was out in summer and all my water during testing was ice water! Nice to know, I guess. This validated the lab tests in my mind, though some conditions can change this. 0F environment, rain and wind, etc…

Jon Fong / Flat Cat Gear BPL Member

Posted Jan 6, 2019 at 3:34 pm

James Marco, you know that you shouldn’t throw out the high and low! You are artificially narrowing your results. The high and low indicate experimental uncertainty with the procedure/setup. Statistics are used to quantify that variability. My 2 cents.
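A small Python sketch of the narrowing effect, using invented fuel weights (not data from this thread) and the standard library's `statistics` module. The helper name `summarize` is hypothetical.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) for a list of runs."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical fuel use in grams for five runs (invented numbers).
runs = [7, 8, 8, 9, 12]
trimmed = sorted(runs)[1:-1]  # same runs with the high and low removed

mean_all, sd_all = summarize(runs)
mean_trim, sd_trim = summarize(trimmed)

print(f"all runs: {mean_all:.2f} +/- {sd_all:.2f} g")
print(f"trimmed:  {mean_trim:.2f} +/- {sd_trim:.2f} g")
```

For these numbers the spread reported after trimming is much smaller than the spread across all five runs, even though the experiment itself did not become any more repeatable.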

James Marco BPL Member

Posted Jan 6, 2019 at 5:32 pm

Jon, Yes, I know. I am familiar with statistics, statistical analysis, research methodology, research analysis, etc. though mostly in the guise of microbiology and medicine. But we are talking 45-50 year old info, there. But in a “kitchen lab” it didn’t really matter… I was finding a 10% repeatability to be a GOOD number between all the tests. Soo, what the hell… I am sure Ryan can get the repeatability accuracy down to within 5% or better using his setup. I already knew my stuff was guaranteed to be off, though it took a few tries before confidence was good due to the high and low values. (I ran a couple tests of 10 to check it and found that dropping the high and low actually increased the confidence to about 2stdev for the 8 remaining. Close enough…) I dropped the test number to 5 and continued to drop the high/low. I wouldn’t say high confidence, but good confidence. It gets VERY difficult to compute with small sample sizes and have any meaning (3 is not enough.)

Jon Fong / Flat Cat Gear BPL Member

Posted Jan 7, 2019 at 1:12 am

OK, a couple of other notes.

F = output ÷ input
F = [ V ⨉ ΔT ] ÷ [ t ⨉ M ]

The output looks like a scaled version of an energy: Mass * Specific Heat * delta temperature
Joules = MCp ΔT
That being said, [ V ⨉ ΔT ] ÷ t is proportional to watts
Watts = MCp ΔT/t
That means that F is proportional to Watts/M (with M here being the fuel mass); in other words, you are looking at effective power out per gram of fuel.
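A rough Python sketch of that relationship. The units are assumptions, since the thread does not state the ones StoveBench uses: volume in mL, ΔT in °C, time in seconds, fuel in grams, and 1 g of water per mL. The function names and the sample numbers are hypothetical.

```python
WATER_SPECIFIC_HEAT = 4.186  # J/(g*K), standard value for liquid water


def stovebench_f(volume_ml, delta_t, time_s, fuel_g):
    """F = (V * dT) / (t * M), as written in the post."""
    return (volume_ml * delta_t) / (time_s * fuel_g)


def power_into_water_w(volume_ml, delta_t, time_s):
    """Average power delivered to the water, in watts (M * Cp * dT / t)."""
    water_g = volume_ml * 1.0  # assumes 1 g per mL
    return water_g * WATER_SPECIFIC_HEAT * delta_t / time_s


# Invented example run: 500 mL raised 80 degrees in 240 s using 8 g of fuel.
f = stovebench_f(500, 80, 240, 8)
watts = power_into_water_w(500, 80, 240)
print(f"F = {f:.2f}, power = {watts:.0f} W, power per gram = {watts / 8:.1f} W/g")
print(f"(power per gram) / F = {(watts / 8) / f:.3f}")
```

Under these assumptions the ratio in the last line is just the specific heat of water, which is the constant that turns F into watts per gram of fuel.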

It just seems like, with a few more constants, the variables would actually have a physical meaning known in the world of physics (Roger?).

It seems like you are striving towards a metric looking at overall efficiency as a function of time
Metric = (Output / Input)/time or Efficiency/ time
Metric = [(Mass* Specific Heat* Delta T)/(Fuel Mass *Fuel Energy Density)]/Time

This would allow you to characterize stoves across various fuel types as most of the Fuel Energy Density are known.
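A minimal Python sketch of that metric, assuming the constants live in one lookup table. The fuel energy densities below are rough, rounded lower heating values included only for illustration and should be checked against a proper reference before use. All names and the sample numbers are hypothetical.

```python
WATER_SPECIFIC_HEAT = 4.186  # J/(g*K), standard value for liquid water

# Approximate fuel energy densities in J/g (assumed, rounded values).
FUEL_ENERGY_J_PER_G = {
    "canister_gas": 46_000,  # propane/butane blend, roughly 46 MJ/kg
    "ethanol": 27_000,       # roughly 27 MJ/kg
}


def efficiency(water_g, delta_t, fuel_g, fuel):
    """Heat absorbed by the water divided by the energy in the fuel burned."""
    heat_out = water_g * WATER_SPECIFIC_HEAT * delta_t
    heat_in = fuel_g * FUEL_ENERGY_J_PER_G[fuel]
    return heat_out / heat_in


def efficiency_per_time(water_g, delta_t, fuel_g, fuel, time_s):
    """The post's Metric: efficiency divided by the time taken."""
    return efficiency(water_g, delta_t, fuel_g, fuel) / time_s


# Invented example run: 500 g of water raised 80 degrees with 8 g of canister gas.
eff = efficiency(500, 80, 8, "canister_gas")
print(f"efficiency: {eff:.1%}")
print(f"efficiency per second: {efficiency_per_time(500, 80, 8, 'canister_gas', 240):.5f}")
```

Adding a new fuel type then means adding one entry to the table rather than changing the calculation.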

It may be more work, however; once you set up a test metric and the database starts to fill up, it is difficult to change. Luckily, most of the changes are actually constants.

My 2 cents.

James Marco BPL Member

Posted Jan 7, 2019 at 1:46 pm

Exactly. Using standard physical parameters already in existence will actually simplify overall database entries by simply placing constants in a single table/spreadsheet to be added to any calculations by reference…”pointers” if you will. I still worry that data collection is flawed though…

Roger Caffin BPL Member

Posted Jan 7, 2019 at 8:36 pm

“the variables would actually have a physical meaning that are known in the world of physics”
I don’t have any problems with the variables Ryan is using, but I am a geek …

F = output ÷ input
F = [ V ⨉ ΔT ] ÷ [ t ⨉ M ]
The first line is not quite the same as the second. In this context ‘input’ relates to M; the ‘t’ is extra. Simple fuel efficiency (water heated per fuel used) would be V ⨉ ΔT / M , but Ryan has included the time taken for the heating in his equation.

There are two good reasons for doing so. The first is that some would prefer a stove that boils their water in a short time rather than a long time, especially if it is freezing cold and the water is for coffee. Tales of waiting 15 minutes for a tiny alky stove are well-known.

The second is an interesting and tricky trade-off, to do with how much heat is lost up the side of the pot. As we know, if you run a stove flat out in the hope of a faster boil the efficiency will fall badly. Now, if the amount of fuel used doubles but the time halves, then the F value stays the same. You do however end up having to carry more fuel. Will the amount of fuel used balance this way against the time taken? I don’t know, but I have my doubts.
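The doubling-and-halving case can be checked with a couple of lines of Python. The numbers are invented and the units are arbitrary, since only the ratios matter here; `stovebench_f` is a hypothetical helper that follows the formula quoted above.

```python
def stovebench_f(volume, delta_t, time, fuel):
    """F = (V * dT) / (t * M), as quoted above."""
    return (volume * delta_t) / (time * fuel)


# Same water and temperature rise; the second run uses twice the fuel in half the time.
gentle = stovebench_f(500, 80, 240, 8)
flat_out = stovebench_f(500, 80, 120, 16)

print(gentle, flat_out)
print("same F:", gentle == flat_out)
print("fuel per boil:", 8, "vs", 16)
```

Both runs score the same F even though the second one burns twice the fuel per boil, so the single number hides the difference in fuel carried.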

So this brings us to the question of whether you can adequately represent a stove by a single parameter. Maybe, maybe not, but this sort of testing will give the enthusiast a LOT more information about a stove than you could get from the marketing spin (which never gives the actual test conditions anyhow). Me, I would be looking at the full data set.

Will any of this matter to the novice about to buy his first stove? Probably not.

My 2c
Cheers

Ken Thompson BPL Member

Posted Jan 7, 2019 at 9:22 pm

“Will any of this matter to the novice about to buy his first stove? Probably not.”

I guess that was my question about practical application. The engineers seem to be enjoying themselves. Have at it.

Roger Caffin BPL Member

Posted Jan 7, 2019 at 9:54 pm

Hi Ken

To be sure, the techno-geeks will lead the way. But we did that with the whole UL concept too, before it became mainstream.

Cheers

Rex Sanders BPL Member

Posted Jan 8, 2019 at 1:02 am

Consider this: techno-geeks led the way on the most recent revival of lightweight/ultralight backpacking starting in the late 1990s – which has mostly failed to take hold in the mainstream market (I differ with Roger on this point). And the lightweight backpacking boom in the late 1970s to early 1980s was very mainstream, but faded anyway.

We might not want to make the same mistakes again – in the case of stove metrics, by using test procedures, units, or jargon that most people can’t understand or use, or (most important) where they can’t see the value difference between one number and another.

Virtually everyone can understand “boil times”, for all the imperfections in how stove makers and home tinkerers misuse that concept.

More later.

— Rex

DAN-Y

Posted Jan 8, 2019 at 2:16 am

Joules = MCp ΔT
That being said then [ V ⨉ ΔT ] ÷ t is a scalar to watts
Watts = MCp ΔT/t
Metric = (Output / Input)/time or Efficiency/ time
Metric = [(Mass* Specific Heat* Delta T)/(Fuel Mass *Fuel Energy Density)]/Time

Exactly. Using standard physical parameters already in existence will actually simplify overall database entries by simply placing constants in a single table/spreadsheet to be added to any calculations by reference. Exactly! Ryan will get it down on a spreadsheet for us via StoveBench. Look forward to it. My 3 dollars worth.

Roger Caffin BPL Member

Posted Jan 8, 2019 at 2:40 am

Simplified answers generally don’t work. You need all the data.

If you want a stove for use always outdoors and you want to melt lots of snow in sunny weather, the MSR Reactor could be a good choice.

But if you want a stove you can use inside your tent at a gentle simmer – the Reactor would be a terrible choice. A single parameter cannot handle this.

Cheers

Jon Fong / Flat Cat Gear BPL Member

Posted Jan 8, 2019 at 4:09 am

Exactly, one parameter will not be able to characterize a stove.

That being said, StoveBench is close to characterizing some important ones. With very little work you can characterize the following:

Power, Efficiency, and your factor F, which almost sounds like Effectiveness (efficiency/time).

A few more parameters will likely pop up, such as Effectiveness per pound (of your cooking system).

Regardless of what Ryan does, I will probably calculate those parameters anyway. My 2 cents.

James Marco BPL Member

Posted Jan 8, 2019 at 12:26 pm

There are simply too many variables to translate most, if not all, stove testing to real life use on the trail. And, conversely, too many variables in real life trail use to translate to stove testing.

Perhaps the most glaring example is measurement protocols. We rarely, if ever, haul a scale into the back country. It is most usual to simply mark your pot (or use premarked graduations) for volumes. Yet, as StoveBench demonstrates, weighing is generally used by most of us for testing procedures. It doesn’t really matter in the field when the goal is simply to produce a hot cuppa when needed. In this I concur with Roger in that a single calculated number has little translated value to field use.

Each parameter in the environment would need to be accounted for. StoveBench does well with this. Even with our much loved heat shields/wind screens, pot lids, pit stoves, wind breaks, tent alcoves, etc., we cannot entirely mitigate the effects of wind. Anything more than a gentle ventilation breeze will affect accuracy. But there is more, of course, accounting for the rather large variability in data points despite our best efforts. Hell, even walking by a running stove can affect run-time by disrupting air flow patterns.

Let the test stand as an opening protocol. Do not attempt to calculate anything at this point, though as preliminary results they might be worth a mention. It is fun to play with numbers, averages, and statistics, but there are a few variables that may be out of range for some stoves, e.g., wide burners.

But do not mention efficiency. This test protocol is not very efficient nor does it test for efficiency, except at a very gross level among data points generated by this test protocol. Indeed, even these are suspect. But, simply reporting data wouldn’t give the geeks much to chew on would it?

Craig B BPL Member

Posted Jan 8, 2019 at 3:31 pm

Great test protocol. I also agree it’s about the best one could do for this subject despite other people’s comments about its flaws.

One quick note for people that want to attempt this at home which has not quite been directly stated by others. In regards to sample size, keep in mind there is always a learning curve for any new procedure you do. Doing a lab experiment like this requires consistent technique every time, and you likely won’t achieve that until at least the 3rd – 5th time. Yes all data is good, but only if the technique is consistent. So if the results of the first couple of runs are significantly different from subsequent runs, it’s ok to throw those out.

Roger Caffin BPL Member

Posted Jan 8, 2019 at 7:26 pm

A little protocol from the hard science community might help.
1. Come up with an idea.
2. Do some experiments to test the idea.
3. Write up a paper or report on the results.
4. Read the paper and see what experiments should have been done.
5. Do or repeat the experiments properly.
6. Check the new results against the conclusions in the paper.

This is never mentioned in the published papers!

Unfortunately, many people in the ‘softer’ areas like medicine, nutrition, psychiatry, etc, never get beyond the third step, which is why those areas have such a problem with irreproducible research. The current move towards having to make all the experimental data publicly available is essential.

Cheers

James Marco BPL Member

Posted Jan 8, 2019 at 9:37 pm

Ha, ha….yes. Even the so called hard sciences have trouble with releasing preliminary results too soon. Quarks, and cold fusion come to mind. Good cautions, Roger!

DAN-Y

Posted Jan 9, 2019 at 3:30 am

A crucial component of both canister and liquid fuel stoves is the needle valve, also called the control valve. This serves not one but two functions. Obviously it serves to regulate the flow of fuel to the burner, allowing you to simmer gently or boil vigorously, but it also serves as a full shut-off valve.

Will Roger or someone with his experience be doing an analysis of the needle/control valves as part of the TESTING PROTOCOL? How well the fuel is delivered to the burner is the most important part of determining the quality of a stove. It should be incorporated into the BPL testing protocol.

Roger Caffin BPL Member

Posted Jan 9, 2019 at 4:06 am

Hi Dan

Yes, the needle valve is important. It was discussed a little in the CO series (whence the photos), and rather more extensively in the V2 and V3 winter stove series. A very gentle taper in region D is desired, to give a fair control range but the quality of the corner E is also important as that corner (and the valve seat) determines how well and how easily the fuel may be shut off. Clearly, these comments apply to all fuel stoves: canister, white gas and kero.

Less obvious is what happens to the fuel as it goes through the valve. In most (all) cases there is a pressure drop, and in some cases there may also be an expansion and boiling, so that it is vapour, not liquid, which reaches the jet.

The boiling bit is important. If the fuel is not hot enough the expansion may cause a cooling, leaving some fuel components as a liquid. This can happen with a propane/butane mix in cold weather. In general, priming is needed to get the stove hot enough that all the fuel vapourises. Some stove mfrs have claimed that you don’t need to prime their white gas stoves: this may be so if the air temperature is >20 C, but otherwise it is just marketing spin.

What have I not covered here?
There is quite a bit of esoteric detail I have not mentioned, but a good design should hide all of that.

Cheers

Rex Sanders BPL Member

Posted Jan 9, 2019 at 6:22 am

Reproducibility is a problem in the hard sciences, too. And it’s likely to get worse before it gets better.

For example, I’m very skeptical of the “leaning tower of models” approach used in many fields. Take the output of one computer model, feed that into another computer model, and repeat until you don’t understand how much error you have, or why you get startling results. Publish the startling results, but don’t mention the other problem. Years later, if someone fails to replicate your findings, blame it on changes to the model code, or on unpublished parameters. Requiring open data publication won’t solve that problem.

Where were we …

With StoveBench at least we have a reasonably straightforward test protocol and few poorly understood models, none involving computers (however, beware the perils of Excel!).

I agree that making the raw(-ish) data publicly accessible is a Good Thing. But I’ve been arguing with scientists about data management for more than 40 years.

— Rex

James Marco BPL Member

Posted Jan 9, 2019 at 10:51 am

Rex, yes. The raw data becomes the important part of these tests. Raw data can be mined for all sorts of things down the road, often in unexpected ways. Calculated results, well, these are always iffy. As long as the basis for any calculations isn’t lost, sure,.. But as you say, the “leaning tower of models” can be a very bad thing.

Jerry C

Posted Jan 9, 2019 at 2:37 pm

There will never be a backpacking stove to match the Svea 123. The “whispers” and the alcohols will come and go, but my Svea is indestructible and totally reliable. Mine is now almost 60 years old, I have fired it up thousands of times without a single failure… I have twice witnessed the “O” ring on whisper lights fail resulting in a pool of burning gasoline…My sweet little Svea always works like a charm and welcomes meal time with its wonderful roar. But, then again, I’m 76 years old and have been packing since I was a Boy Scout…so what do I know? :)

James Marco BPL Member

Posted Jan 9, 2019 at 5:21 pm

Jerry, yup, the old 123 is a good reliable, durable stove that would suffer badly in these tests due to low power and time to boil testing, regardless of overall fuel efficiency. But all the data for StoveBench is canister “toppers”, right now. Even other items that are technically toppers, like a JetBoil with the general pot adapter for the standard pot will suffer due to reduced BTU/Kw/hr outputs, for example. Reactors, Windburners, etc all fall into this same test protocol trap and just in this one category of stoves.

I think we, as the community here, understand enough about stoves to come up with a rather standardized test that would allow testing across the board. For example, clearly there is SOME advantage to alky, lots of people use them. Under what conditions? Why is it worth the weight trade off? And so on… We miss soo much using a single narrow test protocol. On the opposite side of the scale, how can you even quantify “reliability”? Sure, you can say this stove has been around for 60 years with no problems, but how do you do that with a new stove? See what I am getting at?

David Y

Posted Jan 9, 2019 at 11:10 pm

I have tried to follow this conversation but can’t understand what it has to do with the Philmont forum???

Roger Caffin BPL Member

Posted Jan 9, 2019 at 11:20 pm

Dear David Y

I think you have your Forum channels crossed. This one has ZERO to do with Philmont.

Cheers

DAN-Y

Posted Jan 10, 2019 at 12:21 am

Stove #2 has a better needle valve than the #1 stove. The gas flow of #2 can be lowered to be equal in efficiency as #1. The number 1 stove can’t increase flow of gas for more energy to be equal to #2, but yet it gets a higher Stovebench score than #2.

I would purchase number 2.
