Viewing 18 posts - 1 through 18 (of 18 total)
  • Help! Any statisticians in?
  • djglover
    Free Member

    I have a statistical problem to answer and I need some help. I think I know the answer but as I have no previous experience or qualifications in statistics I’m having some problems conveying what I think the answer is.

    Here is the problem: a change in average energy consumption has been observed in one population who have had an energy saving intervention in the home versus a control group. The measurements taken follow the trend of a normal distribution and hence reflect a wide range of changes in behaviour. I can therefore only tell for the whole population what the aggregated change is; it is not possible to answer the question of how many people have changed consumption by up to 1% or up to 2%, because there are too many other variables at work that I am not able to isolate.

    Is the bolded statement true from a statistical point of view?

    damo2576
    Free Member

    True. Not possible.

    Examples of other variables you are not able to control for:
    – ambient temperature of each control group (one may be in a warmer place than the other)
    – disposition to temperature of each control group (one may contain lots of people who like their houses cooler)
    and so on

    djglover
    Free Member

    Thanks Damo

    sok
    Full Member

    To answer your question about whether or not you can tell how many people have changed their consumption, you’ll need the person/household level data; if you have this it’s easy to do. You can do this regardless of what type of study it is and what variables you have. The issue is how much of this effect you can attribute to the intervention. If there are lots of other variables at play then, unless you have the data available for these, you can’t really draw any conclusions.

    If this is a randomised controlled trial (RCT), where the membership of the control and intervention groups has been drawn from the same population and each person/household has then been randomly assigned to the intervention or control group, then this controls for both known and unknown confounders and other variables. As such any difference in the outcomes between the two groups can be attributed to the intervention.
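    The random assignment step described above could be sketched as follows. This is only an illustration: the household IDs, the `randomly_assign` helper and the fixed seed are all made up for the example and are not from the trial being discussed.

```python
import random

def randomly_assign(household_ids, seed=None):
    """Shuffle the households and split them into two equal-sized groups."""
    rng = random.Random(seed)       # a seed makes the assignment reproducible
    shuffled = list(household_ids)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical household identifiers, all drawn from the same population.
households = [f"H{i:03d}" for i in range(1, 21)]
intervention_ids, control_ids = randomly_assign(households, seed=42)

print(len(intervention_ids), len(control_ids))
```

    Because chance alone decides which group a household lands in, things like a new baby or new double glazing should, on average, be spread evenly across both groups.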

    If it’s not an RCT then unless you have data available on all these other factors then you can’t control for these. If these data are available you’d do your statistical analysis to control for these. This isn’t as good as controlling for them in your study design (as an RCT) but is good enough and done a lot as RCTs are expensive and difficult to do in most situations.

    djglover
    Free Member

    The control group was gathered after the trial started, but it is made up of people who are in the same age and affluence groups, property types, consumption strata and towns, so the groups are fairly robust I think.

    The problem I have is a director banging his fist on the desk asking how many people changed by 1%, 2% etc. and me saying, I can’t possibly answer that, sorry.

    Need a crash course in statistics and / or managing senior stakeholders 🙁

    sok
    Full Member

    Do you have the data at household level?

    djglover
    Free Member

    sok – I have the Experian FSS and some other data, but I don’t know at household level what other changes have been made that affect consumption (e.g. baby, new central heating, double glazing, etc.). So all I know is that a change has happened, but those changes have manifested themselves in the same shape (normal distribution) in the control group too, just at a lower average rate.

    magowen100
    Free Member

    If I understand this correctly, you have two randomised groups with significantly different normal distributions, in which case the numbers in each incremental band should be in the figures used to produce the histogram of the normal distribution curve.
    EDIT: Just read your reply to Sok – do you know that there is a significant difference in the means, or only that the curve is the same shape but the mean of that curve is lower in the control group?

    djglover
    Free Member

    Yes, I have 2 histograms that overlap and I know how many people are at the 1% point of change on that chart, but I don’t know why they are there. All I know is that the 2 groups are as near identical as I could make them, apart from one having had an intervention and the other not. So I can say, for example, “40 people changed consumption by 1%”. But I don’t think I can say “40 people changed their consumption by 1% because of the intervention”, because 38 people from the control group have also changed by 1%!

    EDIT – Can’t post my histogram as photo sharing websites are blocked, so hoping the explanation is clear!

    sok
    Full Member

    DJ
    You’ll never know about all these other factors so you just have to accept this. That’s the point of having a control group which is comparable to your intervention group, so you can assume (on average) any changes that happened in your intervention group also happened in the control group.
    Whether you can say what number (or %) of houses changed their consumption depends on whether you have data for them all individually or only for the group as a whole.
    If changes happened in the control group as well then you report both. You should be able to say (and it is valid to do so) something like:
    The intervention group lowered their energy consumption by x amount
    The control group lowered their consumption by y amount
    On average the intervention group lowered energy consumption by (x-y) amount more than the control group
    AND ALSO
    a% of houses in the intervention group lowered consumption by 5%
    b% of houses in the control group lowered consumption by 5%
    (a-b)% more houses lowered their consumption by 5% in the intervention group compared to control

    BUT you can only do this if you have the energy consumption for each household, not just the pooled data.
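    As a rough sketch of the comparison above, assuming you have one percentage reduction per household: the numbers below are invented, `summarise` is a hypothetical helper, and "lowered consumption by 5%" is read here as "by at least 5%", which is an assumption about what was meant.

```python
from statistics import mean

def summarise(reductions_pct, threshold_pct):
    """Return (mean reduction, share of households at or above the threshold)."""
    hits = sum(1 for r in reductions_pct if r >= threshold_pct)
    return mean(reductions_pct), hits / len(reductions_pct)

# Hypothetical per-household reductions in % (negative = consumption went up).
intervention = [6.0, 5.5, 1.0, -0.5, 7.0, 2.0, 5.0, -1.0]
control = [1.0, 5.2, 0.0, -1.5, 2.0, -0.5, 0.5, -2.0]

x, a = summarise(intervention, threshold_pct=5.0)
y, b = summarise(control, threshold_pct=5.0)

print(f"Intervention lowered consumption by {x:.2f}% on average")
print(f"Control lowered consumption by {y:.2f}% on average")
print(f"Extra average reduction (x - y): {x - y:.2f} points")
print(f"Extra share of houses at 5% or more (a - b): {(a - b) * 100:.1f} points")
```

    With only the pooled totals you could still compute x and y, but not a and b, which is why the household-level data matters.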

    sok
    Full Member

    email me your data, address in profile

    poly
    Free Member

    djglover, sok’s comment about randomised controlled trials still seems relevant even though you are saying “The control group was gathered after the trial started, but it is made up of people who are in the same age and affluence groups, property types and consumption strata and towns, so the groups are fairly robust I think.” Did the group with the “intervention” select/volunteer to add it? You might argue that they were more disposed to “energy saving” anyway than the control group who for whatever reason didn’t volunteer.

    However, from a commercial / career point of view there may be no harm in providing him with the data, written into a report which says, “here’s the data, but here’s why it might be wrong to read too much into it”.

    If you have all the data you should be able to extract the %age of households who changed consumption by 1, 2, 3% etc. and likewise for the control group? You should also be able to take the %age change and standard deviation for the populations as a whole and estimate the number which changed by a particular factor; that will be weighted to the whole population, so if you have enough data individual households won’t affect it. It’s a bit of a fudge, but that’s the sort of thing your “stakeholder” is looking for. If you simply tell him the trial can’t tell him that, he will blame you for designing a bad trial.
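    A minimal sketch of that "fudge", assuming the changes really are roughly normally distributed. The mean, standard deviation and group size below are invented for illustration, and `expected_count` is a hypothetical helper, not part of any analysis in this thread.

```python
from statistics import NormalDist

def expected_count(mean_pct, sd_pct, n_households, low_pct, high_pct):
    """Estimate how many households fall in [low_pct, high_pct] under a normal model."""
    dist = NormalDist(mu=mean_pct, sigma=sd_pct)
    share = dist.cdf(high_pct) - dist.cdf(low_pct)
    return n_households * share

# Hypothetical summary figures: mean reduction 2%, standard deviation 4%,
# 1000 households in the group.
between_1_and_2 = expected_count(2.0, 4.0, 1000, low_pct=1.0, high_pct=2.0)
print(f"Estimated households reducing by 1-2%: {between_1_and_2:.0f}")
```

    Running the same calculation for the control group and comparing the two estimates gives the rough-and-ready numbers, with the caveat that they are only as good as the normal assumption.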

    I would include a remark to the following effect, “This is a simplistic statistical analysis, and robust detailed analysis could only be performed by specialist expert statisticians. The Company would be advised to seek such independent advice before basing financial, investment, legal or strategic decisions on this information.”

    When I caveat stuff like that it usually pisses people off, but they don’t really have a good argument: if they want robust stats, employ a statistician; if they want rough and ready numbers, ask me to do it, but don’t complain if down the track someone says it’s all a bit ropey.

    I think what you need to do though is give him enough of a hook that he believes it’s worth engaging the professional consultant (assuming that is what you want) rather than simply saying, I don’t think we have the data to answer that question.

    EDIT: If the numbers really are this close “But I don’t think I can say “40 people changed their consumption by 1% because of the intervention”, because 38 people from the control group have also changed by 1%!” then I doubt you have enough data to say that there is a statistical difference between the two groups, i.e. that if you repeated the trial they wouldn’t be the other way round. That may not be the answer your senior stakeholder wants – and that is a much harder story to tell!
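    One way to check whether 40 versus 38 is a real difference is a two-proportion z-test. The sketch below assumes 1000 households in each group, which is a made-up figure since the group sizes are not given in the thread; `two_proportion_z_test` is a hypothetical helper written for this example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)  # shared rate under "no difference"
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# 40 and 38 come from the thread; 1000 per group is an assumption.
z, p_value = two_proportion_z_test(40, 1000, 38, 1000)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

    Under that assumed group size the p-value is far above the usual 0.05 cut-off, so the counts could easily come out the other way round if the trial were repeated.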

    djglover
    Free Member

    Thanks sok – your reply makes perfect sense, thanks for the offer but not sure I can email the data over without getting fired!

    I do have the individual data and have been able to say pretty much what you are saying in your last post; however, when you look at the 1% and 2% changes the numbers of people at that point on the charts are so small that I don’t think they are statistically robust, because it throws out some weird answers.

    Really appreciate your advice btw

    Edit – just seen poly’s edit and I think that confirms my thoughts above!

    magowen100
    Free Member

    Yes I have 2 histograms that overlap

    As a general rule of thumb the differences between the means are unlikely to be significant.
    As long as the groups were randomly assigned then all other factors are random and should affect both groups in equal measure so they can be discounted.

    sok
    Full Member

    As Poly says. You can do your analysis and say it’s suggestive of, but not absolutely due to, the effect of the intervention.

    djglover
    Free Member

    Poly – The trial participants didn’t self-select, and yes, I have suggested an external audit of the methodology too.

    Cheers folks very useful help!

    poly
    Free Member

    djglover – if you do get fired for giving the “wrong”* answer, whereabouts are you in the country and what do you do? You sound like the sort of person that we could do with on our ‘team’, bothered enough to try and find out rather than just fudge it.

    *wrong here being the answer your senior stakeholder doesn’t want, rather than actually wrong!

    djglover
    Free Member

    I work on the Smart Metering programme at British Gas looking at the commercial strategy, involved in EV tariffs and CLNR, so working on a lot of commercial stuff around smart / smart grids at the moment.

    London at the moment but currently open to job opportunities in the NW and West Yorks too.

    Poly – What does your team do? email is danjwilkinson@aol.com if you want to carry on the conversation off forum.
