Using the random module

Make sure to import everything we need for this notebook first.

import matplotlib.pyplot as plt
import random
import string

... but first, let’s create our own random function

Python’s standard random module generates pseudo-random numbers: the values look random, but they are produced by a deterministic function. There are many pseudo-random generator algorithms (the random module uses the Mersenne Twister under the hood), and we can even write our own pseudo-random number generator with our own implementation of “randomness”.

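class PseudoRandom:
    def __init__(self, seed=1):
        # The seed is the starting state of the generator.
        self.seed = seed

    def random(self):
        # Multiply the state by 7 and keep the remainder modulo 13.
        self.seed = (self.seed * 7) % 13
        return self.seed


random_numbers = PseudoRandom(123)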

[random_numbers.random() for _ in range(10)]
[3, 8, 4, 2, 1, 7, 10, 5, 9, 11]

Our PseudoRandom class takes an initial value called a seed. Each time random() is called, the current seed is multiplied by 7 and divided by 13, and the remainder becomes the new seed, which is also the value returned. This is our function for “randomness”. It is deterministic: if two people instantiate the class with the same seed, they get the same numbers in the same order on every call to random().
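
The determinism described above can be checked directly. The following is a minimal sketch that repeats the class so it runs on its own; the variable names (first, second, period) are only illustrative. It also counts how many calls it takes before the sequence starts repeating, which has to happen because the generator can only return values below 13.

```python
class PseudoRandom:
    """Toy generator from above: next = (current * 7) % 13."""

    def __init__(self, seed=1):
        self.seed = seed

    def random(self):
        self.seed = (self.seed * 7) % 13
        return self.seed


# Two generators started from the same seed produce the same sequence.
first = PseudoRandom(123)
second = PseudoRandom(123)
first_values = [first.random() for _ in range(10)]
second_values = [second.random() for _ in range(10)]
print(first_values == second_values)  # True

# Count the calls needed until the first value comes back around.
generator = PseudoRandom(123)
start = generator.random()
period = 1
while generator.random() != start:
    period += 1
print(period)  # 12
```

A cycle of only 12 values is far too short for real use; it is one reason production generators such as the Mersenne Twister work with a much larger internal state.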

The above example gives us a basic understanding of how Python’s random module works. The module provides a range of pseudo-random number functions that we can use for many purposes. In the next sections, I will refer to pseudo-random numbers as just random numbers for brevity.

Functions in the random module

The random() function takes no arguments and returns a float from the standard uniform distribution (numbers from 0 up to, but not including, 1). The generator behind it can be seeded with random.seed(). If no seed is provided, it is seeded from a source of randomness provided by your OS, falling back to the current system time (you can read more about pseudo-random generators here).

# Unseeded random numbers
print(random.random())
print(random.random())
print(random.random())
0.5183017773452869
0.6361336021259365
0.2376951871669345

# Seeded random numbers
random.seed(1234)
print(random.random())
print(random.random())
print(random.random(), end='\n\n')

random.seed(1234)
print(random.random())
print(random.random())
print(random.random())
0.9664535356921388
0.4407325991753527
0.007491470058587191

0.9664535356921388
0.4407325991753527
0.007491470058587191
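
Calling random.seed() resets the generator shared by all of the module-level functions. When that is inconvenient, one option is to create a separate random.Random instance, which keeps its own state. The sketch below assumes nothing beyond the standard library; the variable names are only illustrative.

```python
import random

# Each random.Random instance carries its own internal state,
# so seeding one does not disturb the module-level functions.
generator_a = random.Random(1234)
generator_b = random.Random(1234)

values_a = [generator_a.random() for _ in range(3)]
values_b = [generator_b.random() for _ in range(3)]
print(values_a == values_b)  # same seed, same sequence

# The module-level functions are backed by a hidden shared instance,
# so seeding it with the same value gives the same numbers again.
random.seed(1234)
module_values = [random.random() for _ in range(3)]
print(module_values == values_a)
```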

Random numbers from a uniform distribution

In a uniform distribution, every number in the range has the same probability of being picked. The random() function picks a number from 0 up to (but not including) 1.

To generate a random float within a custom range of (min, max) values from a uniform distribution, we use the uniform() function. This accepts two parameters: the minimum value and the maximum value. The following will generate a number between 1 and 2 (the upper endpoint 2 may or may not be included, depending on floating-point rounding).

# random float
random.uniform(1, 2)
1.910975962449124
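
The Python documentation describes uniform(a, b) in terms of the equation a + (b - a) * random(). As a rough check, and assuming the standard CPython implementation, we can reset the seed and compute the same value by hand. The names low, high, from_uniform and by_hand are only illustrative.

```python
import random

low, high = 1, 2

random.seed(42)
from_uniform = random.uniform(low, high)

# Reset to the same state and apply the documented formula by hand.
random.seed(42)
by_hand = low + (high - low) * random.random()

print(from_uniform == by_hand)
```

Seen this way, uniform() just stretches and shifts the output of random() to cover the requested range.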

To generate a random integer within a range of (min, max) values from a uniform distribution, we pass the range to the randrange() function. This will generate a number between the minimum value (inclusive) and the maximum value (exclusive).

random.randrange(1, 100)
75
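
To see the inclusive and exclusive bounds in practice, we can draw many values and look at the extremes. This is only a sketch with an arbitrary fixed seed. It also uses randint(), which is not covered above: it is the random module's variant that includes both endpoints.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

draws = [random.randrange(1, 100) for _ in range(10_000)]
print(min(draws) >= 1)   # the lower bound 1 is allowed
print(max(draws) <= 99)  # the upper bound 100 is never returned

# randint(a, b) includes both endpoints, like randrange(a, b + 1).
dice = [random.randint(1, 6) for _ in range(10_000)]
print(sorted(set(dice)))
```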

Now that we know some functions to generate random integers or floats, we can use a list comprehension to generate several random values that can serve as our dataset.

For example, we can generate 1000 random integers from 0 to 99 (randrange() excludes the upper bound of 100) in a uniform distribution.

data = [random.randrange(0, 100) for _ in range(1000)]

We can plot our dataset with matplotlib to visually check the distribution of our random numbers.

_ = plt.hist(data, 100)
[Figure: histogram of the 1000 random integers]

If we run the dataset cell again, we’ll get a different result.

data = [random.randrange(0, 100) for _ in range(1000)]
_ = plt.hist(data, 100)
[Figure: histogram of the regenerated random integers]

Notice that while the data generated by the two runs differ, their distribution is the same: both are uniform distributions of integers from 0 to 99. We can visualise the same kind of distribution for floats from 0 to 100 generated by the uniform() function.

data = [random.uniform(0, 100) for _ in range(1000)]
_ = plt.hist(data, 100)
[Figure: histogram of 1000 random floats from uniform(0, 100)]
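
The histograms give a visual check. A rough numerical check is also possible by counting how often each integer appears. The sketch below uses collections.Counter and an arbitrary fixed seed; with 1000 draws spread over 100 possible values, a perfectly even split would give 10 draws per value, and real counts scatter around that.

```python
import random
from collections import Counter

random.seed(2024)  # arbitrary seed, only to make the run repeatable
data = [random.randrange(0, 100) for _ in range(1000)]

counts = Counter(data)
expected = len(data) / 100  # 10 draws per value if perfectly even

print(f"distinct values seen: {len(counts)}")
print(f"smallest count: {min(counts.values())}")
print(f"largest count: {max(counts.values())}")
print(f"expected count per value: {expected}")
```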

Random numbers from a normal distribution

The normal distribution is also called the Gaussian distribution. Numbers near the mean of the distribution are more likely to be picked than numbers far from it. This gives the plot a bell shape centred on the mean. To generate a number from a normal distribution, we use the gauss() function and specify the mean and the standard deviation.

For example, if we want a number from a normal distribution where the mean is 0 and the standard deviation is 2, we write it as:

random.gauss(0, 2)
1.3036865737825454

We can plot the distribution using matplotlib. The following plots 1000 numbers drawn with a mean of 0 and a standard deviation of 2. The graph shows the bell peaking near the mean. The standard deviation of 2 sets the width of the bell: roughly two thirds of the values fall between -2 and 2.

data = [random.gauss(0, 2) for _ in range(1000)]
n, bins, _ = plt.hist(data, 10)
[Figure: histogram of 1000 values from gauss(0, 2)]
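
To back up the plot with numbers, we can compare the sample's mean and standard deviation with the values passed to gauss(). This is a minimal sketch using the standard statistics module, with an arbitrary seed and a larger sample of 10,000 values so that the estimates are steadier.

```python
import random
import statistics

random.seed(7)  # arbitrary seed, only to make the run repeatable
mean, std_dev = 0, 2
data = [random.gauss(mean, std_dev) for _ in range(10_000)]

sample_mean = statistics.mean(data)
sample_std = statistics.stdev(data)

# Share of values within one standard deviation of the mean.
within_one_std = sum(abs(x - mean) <= std_dev for x in data) / len(data)

print(f"sample mean: {sample_mean:.3f}")
print(f"sample standard deviation: {sample_std:.3f}")
print(f"share within one standard deviation: {within_one_std:.1%}")
```

For a normal distribution, about 68% of values lie within one standard deviation of the mean, so the last figure should land close to that.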