Statisticians can you help?

geetee1972

Posts: 0

Free Member

Topic starter

I have a work problem I need some help with.

A client has asked me to look at data relating to discounting by their sales team based on actual sales price versus list price. I’ve got two sets of data, both large samples (300,000 or more points) and both are only about 1% of the total but they do purport to be representative of the same population.

One set is showing a median deviation from list of 0.8% and the other 2.2%. The difference between these two is fairly substantial but both deviations from list are very low.

What might cause this discrepancy?

Posted : 10/11/2017 10:17 am

bikebouy

Posts: 0

Full Member

"statistics, lies and more statistics"

or something like that.

Posted : 10/11/2017 10:18 am

thecaptain

Posts: 7505

Free Member

I'll take a look for 500 quid per day.

Posted : 10/11/2017 10:19 am

ultracrepidarian

Posts: 0

Free Member

2 sample T-test will tell you the probability of the samples being from the same population.

Posted : 10/11/2017 10:20 am

TheSouthernYeti

Posts: 0

Free Member

The samples not being random.

Posted : 10/11/2017 10:22 am

Junkyard

Posts: 5559

Free Member

They are different data sets, and I'm not sure why you think it's a discrepancy. The difference is actually there, and you seem to have shown that the two data sets are different.
I guess it shows the sets are not as typical as your client claimed, since they are not that similar. Basically, it's a sampling error.

Posted : 10/11/2017 10:23 am

perchypanther

Posts: 17313

Free Member

Pilot error?

Posted : 10/11/2017 10:23 am

grey_or_black

Posts: 36

Full Member

What might cause this discrepancy?

You haven't given any clues as to what the data really relate to (e.g., what sort of sales) and why you have been asked.

Looking at the above responses... You should be asking (or already know) how the samples were obtained for the two groups, check your calculations, understand the data beyond those median calculations...

Did you receive any other data apart from lots of discount percentages? If not, ask why they want you to do such a trivial calculation. Compare the additional data between the two groups to look for explanations. Ideas: are the bigger discounts linked to seasonal demand, more experienced sales personnel, or bulk purchases?

Additionally, just comparing the medians will not be interpretable. Essentially you'll be comparing 1 data point from each group [precisely so if you have an odd number of data per group], but what about the other data? You'll get a better idea of the entirety of the discounts if you also consider some/all of minimum and maximum, interquartile ranges, mean and SD...
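
As a rough illustration of that last point, here is a minimal Python sketch that puts the median alongside the minimum, maximum, quartiles, mean and SD for two groups. The `summarise` helper and both samples are made up for illustration: the data are synthetic draws from beta distributions, chosen only so that the medians land near 0.8% and 2.2%. They are not the client's data.

```python
import random
import statistics

def summarise(discounts):
    """Return basic summary statistics for a list of discount fractions."""
    q1, q2, q3 = statistics.quantiles(discounts, n=4)  # quartile cut points
    return {
        "n": len(discounts),
        "min": min(discounts),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(discounts),
        "mean": statistics.mean(discounts),
        "sd": statistics.stdev(discounts),
    }

# Synthetic stand-in data (an assumption, NOT the real sales data):
# right-skewed discount fractions between 0 and 1.
rng = random.Random(0)
sample_a = [rng.betavariate(1.0, 86.0) for _ in range(5000)]
sample_b = [rng.betavariate(1.0, 31.0) for _ in range(5000)]

summary_a = summarise(sample_a)
summary_b = summarise(sample_b)
for name, summary in (("A", summary_a), ("B", summary_b)):
    print(name, {key: round(value, 4) for key, value in summary.items()})
```

With skewed data like this, the mean sits well above the median, which is one reason a single median per group hides most of the picture.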

Posted : 10/11/2017 11:11 am

reggiegasket

Posts: 6332

Free Member

as said, a two-sample T Test will tell you how statistically different the two samples are. And they look quite different, as you say.

As to why they are different... who knows. You don't say why you have two samples, whether they were collected at different times, whether they are different products, whether the prices of the list are equivalent, or what. Without some information on that, there's no way we can even attempt to answer this question.

Posted : 10/11/2017 1:35 pm

andy4d

Posts: 2217

Full Member

I reckon you have a 50:50 chance of getting your answer.

Posted : 10/11/2017 2:47 pm

TiRed

Posts: 17327

Full Member

You can't do a t-test on the raw data because the data will not be normally distributed - unless some discounts were negative (and that is highly unlikely provided your customers are not mugs). I suspect the distributions are more likely to be beta distributions, since the discount is bounded on [0, 1). I would model these and then test for the difference in distribution. An alternative is to normalise the data by transformation, which might involve a logit transformation.

Statistics will not tell you WHY the discounts are different. It will only give you the likelihood that a difference of the observed magnitude might be observed by chance alone.
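
A minimal sketch of the transformation route, under stated assumptions: discounts are stored as fractions, zeros are clipped by a small `eps` so the log-odds stay finite (a spike of exact zeros would undermine this), and the comparison uses a large-sample normal approximation to Welch's unequal-variance t-test rather than an exact t distribution. The helper names and the synthetic beta-distributed samples are hypothetical, and fitting beta distributions directly is not shown.

```python
import math
import random
import statistics

def logit(p, eps=1e-6):
    """Log-odds of p, clipped to [eps, 1 - eps] so 0 and 1 stay finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def welch_z(xs, ys):
    """Welch-style statistic and two-sided p-value from a normal
    approximation (only reasonable when both samples are large)."""
    se = math.sqrt(statistics.variance(xs) / len(xs)
                   + statistics.variance(ys) / len(ys))
    z = (statistics.mean(xs) - statistics.mean(ys)) / se
    p_value = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value

# Synthetic stand-in data (an assumption, NOT the real sales data).
rng = random.Random(1)
sample_a = [rng.betavariate(1.0, 86.0) for _ in range(5000)]
sample_b = [rng.betavariate(1.0, 31.0) for _ in range(5000)]

# Compare the groups on the logit scale instead of the raw scale.
z, p_value = welch_z([logit(d) for d in sample_a],
                     [logit(d) for d in sample_b])
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
```

With hundreds of thousands of points per group, almost any difference will come out as statistically significant, so the size of the difference matters more than the p-value.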

Posted : 10/11/2017 3:50 pm

km79

Posts: 0

Free Member

60% of the time it works every time.

Posted : 10/11/2017 4:15 pm

CharlieMungus

Posts: 0

Free Member

as said, a two-sample T Test will tell you how statistically different the two samples are

I don't think that was said, and it's not true.

It's not clear what you are trying to find out. But given that you have access to the full data, why are you trying to infer anything from the samples?

If you are unhappy with inferential statistics, you could try some Monte Carlo simulation to give you a sense of how low the deviations are.
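
One possible reading of the Monte Carlo suggestion is a permutation test on the difference in medians: pool the two samples, reshuffle them many times, and see how often a random split gives a gap as large as the observed one. This is only a sketch under assumptions. It needs the raw values rather than summaries, the function name `permutation_test_median` is hypothetical, and the data are synthetic beta draws standing in for real discounts.

```python
import random
import statistics

def permutation_test_median(xs, ys, n_perm=1000, seed=0):
    """Monte Carlo permutation test for a difference in medians.

    Returns the observed absolute difference and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(xs) - statistics.median(ys))
    pooled = list(xs) + list(ys)
    n_x = len(xs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled values
        diff = abs(statistics.median(pooled[:n_x])
                   - statistics.median(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic stand-in data (an assumption, NOT the real sales data).
rng = random.Random(2)
sample_a = [rng.betavariate(1.0, 86.0) for _ in range(500)]
sample_b = [rng.betavariate(1.0, 31.0) for _ in range(500)]

observed, p_value = permutation_test_median(sample_a, sample_b)
print(f"observed median gap = {observed:.4f}, p = {p_value:.4f}")
```

The smallest p-value this can report is 1 / (n_perm + 1), so more permutations are needed to resolve very small probabilities.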

Posted : 10/11/2017 4:24 pm

geetee1972

Posts: 0

Free Member

Topic starter

I don't have the raw data and no, the discounts are only ever one way (wouldn't it be nice if that were not the case, eh).

Interpreting why there are discrepancies isn't my brief; I'm actually looking at this in the context of job performance and capability, looking at the impact of how training a sales person improves their performance and we're doing that based on discounting. I just noticed the two data sets looked quite different.

I guess I need to bounce that issue back to the client.

Thanks for the help though, greatly appreciated.

Posted : 10/11/2017 4:45 pm

Cougar

Posts: 78369

Full Member

I'll take a look for 500 quid per day.

Plus VAT?

Posted : 10/11/2017 4:52 pm

CharlieMungus

Posts: 0

Free Member

Then a one-tailed t-test.

Posted : 10/11/2017 10:18 pm

wilburt

Posts: 0

Free Member

Statistical analysis will tell you what is different.

You’ll need to understand the operation to work out why it’s different.

I’m struggling with what industry has 60,000,000 manually discountable sales!

Posted : 10/11/2017 10:52 pm

hols2

Posts: 0

Free Member

I’m struggling with what industry has 60,000,000 manually discountable sales!

[image]

Posted : 11/11/2017 12:08 am

geetee1972

Posts: 0

Free Member

Topic starter

I’m struggling with what industry has 60,000,000 manually discountable sales!

Big tobacco. They still sell onesies and twosies off the back of a moped to the shacks in places like Thailand.

So yeah, it does look rather a lot like:

[image]

Posted : 11/11/2017 1:24 pm
