Viewing 22 posts - 1 through 22 (of 22 total)
  • Mathematicians/statisticians – help
  • molgrips
    Free Member

    I want a function that will give me values that approximately follow a normal distribution. In other words, if I were to call the function lots of times and produce a graph of the incidence of each value, it’d produce a normal curve.

    Not really sure what to google for exactly.. but it should be fairly straightforward, no? Any ideas?

    mogrim
    Full Member

    Try a Gaussian random number generator

    mikewsmith
    Free Member

    in excel
    =NORM.INV(RAND(),2,3)

    Plot a frequency histogram

    rusty90
    Free Member

    You want to generate a series of random numbers that are normally distributed according to a given mean and variance?
    The Box-Muller transform is the usual way to do this

    How do you want to do this? C/C++? Excel?

    Some links to start with :
    http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution
    http://stackoverflow.com/questions/2325472/generate-random-numbers-following-a-normal-distribution-in-c-c
    http://smallbusiness.chron.com/generate-random-variable-normal-distribution-excel-74203.html

    molgrips
    Free Member

    Rusty – yes. Using Java.

    There’s an Apache Commons math function for it, but I can’t use libraries I don’t already have, and I only seem to have Math 1.2.
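
One option that needs no extra library, for what it's worth: the JDK's own `java.util.Random` has a `nextGaussian()` method that returns values with mean 0 and standard deviation 1. A minimal sketch, where the class name `GaussianDemo`, the helper `sample` and the example figures (mean 100, SD 15) are all made up for illustration:

```java
import java.util.Random;

// Hypothetical demo class; only java.util.Random comes from the JDK.
final class GaussianDemo {
    // Draws one value from a normal distribution with the given mean and standard deviation.
    static double sample(Random rng, double mean, double stdDev) {
        // nextGaussian() is standard normal (mean 0, SD 1), so scale it, then shift it.
        return mean + stdDev * rng.nextGaussian();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(rng, 100.0, 15.0)); // illustrative mean and SD
        }
    }
}
```

Passing the `Random` in rather than creating one per call keeps the stream of numbers in one place, which is an assumption about how you'd want to structure it rather than a requirement.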

    mogrim
    Full Member

    Summing two random numbers should generate a normal distribution, assuming your generator is correct.

    elliptic
    Free Member

    Summing two random numbers should generate a normal distribution

    Not if you start with uniform randoms (i.e. your basic built-in PRNG). If the two inputs are already normal then yes, but why bother 🙂

    The sum of N uniform randoms approaches normal as N goes to infinity, but for N=2 you only have a triangular distribution.
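
To make that concrete, here is a rough sketch of the sum-of-uniforms idea. The class and method names are invented for illustration. The only facts relied on are that a uniform on [0,1] has mean 1/2 and variance 1/12, so a sum of N of them has mean N/2 and variance N/12.

```java
import java.util.Random;

// Hypothetical helper, not from any library.
final class UniformSum {
    // Sums n uniform [0,1) draws, then rescales so the result has mean 0 and SD 1.
    static double approxStandardNormal(Random rng, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += rng.nextDouble();
        }
        // Subtract the mean (n/2) and divide by the standard deviation sqrt(n/12).
        return (sum - n / 2.0) / Math.sqrt(n / 12.0);
    }
}
```

With n = 2 the histogram is a triangle. With n = 12 the scaling factor is exactly 1 and the shape is hard to tell from a bell curve by eye, although the tails are cut off at ±6.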

    molgrips
    Free Member

    Imagine I’m generating a random set of people for test data. I want to give them a height, but to be realistic the height should be on a normal distribution rather than actually random.

    mogrim
    Full Member

    The sum of N uniform randoms approaches normal as N goes to infinity, but for N=2 you only have a triangular distribution.

    Good point, hadn’t thought it through fully… Without getting to infinity N=100 might be good enough for molgrips, though. What do you need to do with your distribution?

    elliptic
    Free Member

    Sounds like all that’s needed is a vaguely normal-ish spread between some reasonable min and max heights, so small N will be fine (see http://en.wikipedia.org/wiki/Irwin-Hall_distribution).

    Say you have a uniform generator in [0,1]: use N=3 and sum, giving a distribution in [0,3].

    Add that to a min height (say 4 feet) and you have heights in the range 4 to 7 feet with a peak in the middle at 5′ 6″.

    Or bodge the numbers to suit 🙂
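
As a rough Java sketch of that recipe (the class and method names are made up; the 4 ft minimum and N=3 are the figures from the post):

```java
import java.util.Random;

// Hypothetical sketch of the N=3 sum-of-uniforms recipe for heights.
final class HeightSketch {
    static final double MIN_HEIGHT_FEET = 4.0;

    // The sum of three uniform [0,1) draws lies in [0,3) and peaks at 1.5,
    // so adding the minimum gives heights in [4,7) feet peaking at 5.5.
    static double randomHeightFeet(Random rng) {
        double sum = rng.nextDouble() + rng.nextDouble() + rng.nextDouble();
        return MIN_HEIGHT_FEET + sum;
    }
}
```

To bodge the numbers to suit, change the minimum and multiply the sum by (max - min) / 3.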

    mikewsmith
    Free Member

    /hack solution

    one excel workbook with 1000 cols x 10,000 rows using excel random generation to do the normal dist and just read from a col as required.

    molgrips
    Free Member

    Sounds like all that’s needed is a vaguely normal-ish spread between some reasonable min and max heights,

    Yes, absolutely.

    What do you need to do with your distribution?

    As above – faking real people for test data. Heights is just an example – it’s not actually height in this case. I’ve no idea if what I am modelling is actually a normal distribution anyway.. it probably isn’t.. but it’ll do 🙂 Values I’m looking for are integers as it happens.

    mikewsmith
    Free Member

    hell, in that case see my hack: just generate enough data, upload it somewhere (a DB or a big array) and pick values from it

    molgrips
    Free Member

    No, I want a function that does it.

    mikewsmith
    Free Member

    fair enough, higher level question is why? If it’s just for test data do it the simple way.
    I use excel to knock up most of my test random data because it’s quick and easy

    mogrim
    Full Member

    And repeatable, which is always nice when you’re testing.

    molgrips
    Free Member

    Because I want to run, have it generate tons of data and drop it on a queue without me having to do anything. I don’t want to be farting about with excel when a few lines of Java would do the same thing without intervention.

    Also don’t care about repeatability – this is just noise data, the real test cases will be inserted into it.

    mikewsmith
    Free Member

    Yep, fix the numbers and loop through the sheet. Even more repeatable: when we use RNS we loop back through the same stream, starting at the same point. Unless you can make your function start at the same point, you’re at a loss. You can create 10,000×1000 random numbers from a normal dist, paste them into one sheet and reuse it. If it’s testing then you will be using real data, so no need for the function. Unless you want to solve the problem of finding the function, just solve the problem of creating the random data.

    chambord
    Free Member

    And repeatable, which is always nice when you’re testing.

    Surely it’s easier to just seed the generator with a constant?
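
A minimal sketch of what seeding with a constant looks like in Java. The class name, method name and seed value are arbitrary choices for illustration. `java.util.Random` documents that two instances created with the same seed produce the same sequence of values.

```java
import java.util.Random;

// Hypothetical demo of repeatable random data via a fixed seed.
final class SeedDemo {
    // Returns the first `count` draws (integers in 0..99) from a generator seeded with `seed`.
    static int[] firstDraws(long seed, int count) {
        Random rng = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextInt(100);
        }
        return out;
    }
}
```

Calling `firstDraws(42L, 10)` twice gives the same ten numbers each time, so a test run can be replayed without storing the data anywhere.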

    Rusty90’s links are IMO the way to go here – the polar form of Box-Muller is less than 10 lines of code. Here is an implementation adapted from the C version on Stack Overflow, already using Java’s Math function names:

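    // Polar form of Box-Muller: returns a sample with mean 0 and standard deviation 1.
    static double sampleNormal() {
        // Pick a point uniformly in the square [-1, 1) x [-1, 1).
        double u = Math.random() * 2 - 1;
        double v = Math.random() * 2 - 1;
        double r = u * u + v * v;
        // Try again if the point is at the origin or outside the unit circle.
        if (r == 0 || r > 1) return sampleNormal();
        double c = Math.sqrt(-2 * Math.log(r) / r);
        return u * c;
    }

    molgrips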
    Free Member

    I saw that one chambord.. looks neat and tidy but it’s recursive. Will give it a try though.

    That presumably gives numbers between 0 and 1 with average 0.5?

    chambord
    Free Member

    Yes it’s recursive to throw away values which are outside the unit circle, so will have a recursive call ~ 2/10 times (Back of mental envelope maths here, unlikely to be accurate 🙂 )

    EDIT: I believe mean of 0 with stdev of 1.

    molgrips
    Free Member

    So I can multiply the figure by whatever SD I want, then add an offset to create the mean I want. Great, thanks 🙂
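
Putting the thread's pieces together, a hedged sketch: the polar method written as a loop (which sidesteps the recursion worry), then scaled, shifted and rounded to an integer. The class and method names are invented for illustration, and rounding with `Math.round` is an assumption about how the integers should be produced. Points are rejected when they fall outside the unit circle, which happens with probability 1 - π/4, roughly 21% of the time.

```java
import java.util.Random;

// Hypothetical helper combining the polar method with scale-and-shift.
final class NormalInts {
    // Polar form of Box-Muller as a loop: returns a sample with mean 0 and SD 1.
    static double sampleStandardNormal(Random rng) {
        double u, v, r;
        do {
            u = rng.nextDouble() * 2 - 1;
            v = rng.nextDouble() * 2 - 1;
            r = u * u + v * v;
        } while (r == 0 || r >= 1); // reject the origin and points outside the unit circle
        return u * Math.sqrt(-2 * Math.log(r) / r);
    }

    // Multiplies by the wanted SD, adds the wanted mean, and rounds to the nearest integer.
    static long sampleInt(Random rng, double mean, double stdDev) {
        return Math.round(mean + stdDev * sampleStandardNormal(rng));
    }
}
```

If negative or out-of-range values make no sense for the test data, they would need clamping or redrawing, since a normal distribution has no hard limits.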

