Using the numpy library

In the previous section, we generated random numbers using functions in the random module, and then produced several of them at once using a list comprehension.

import random

print([random.randrange(1, 100) for _ in range(10)])
[3, 4, 74, 90, 57, 32, 35, 46, 80, 95]

You’ll notice that the functions we used from the random module return one value at a time, so we need a for-loop or list comprehension each time we want an array of values. Numpy is a more powerful library that operates on arrays, and it has its own functions to generate n-dimensional arrays of random values.

import numpy as np

Numpy has its own random module that is different from Python’s standard random module. When we ask for more than one value, numpy’s random functions return a numpy array, which is numpy’s basic data structure. When no size is given, they return a single number instead.

np.random.randn(5)
array([ 1.3547577 ,  0.75760898, -0.87585406,  0.28555026, -0.34940374])
np.random.randn(5)
array([ 1.12593429, -0.1631145 ,  1.24165682, -0.08268092,  1.4330829 ])
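
To make the difference between the two return types concrete, here is a small sketch. The variable names single and several are only for illustration.

```python
import numpy as np

# With no size argument, randn returns one number (a plain Python float).
single = np.random.randn()

# With a size argument, randn returns a numpy array of that length.
several = np.random.randn(5)

print(type(single))
print(type(several))
print(several.shape)
```

The array form is the one we will use for the rest of this section.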

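Similar to the random module, we can also provide a seed to numpy.random. It’s important to note that the two modules keep separate generators: seeding Python’s standard random module with random.seed() has no effect on numpy.random, so numpy has to be seeded on its own with np.random.seed(). Calling np.random.seed() with no argument reseeds numpy with a fresh, unpredictable value.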

np.random.seed(1234)
np.random.randn(5)
array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 , -0.72058873])
np.random.seed(1234)
np.random.randn(5)
array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 , -0.72058873])
np.random.seed()
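
The following sketch checks that the two seeds are independent. It assumes nothing beyond the two modules already introduced, and the variable names are only for illustration.

```python
import random
import numpy as np

# Seeding the standard-library module does not touch numpy's generator,
# so two numpy draws made after the same random.seed() call still differ.
random.seed(1234)
first_np = np.random.randn(3)
random.seed(1234)
second_np = np.random.randn(3)
print(np.array_equal(first_np, second_np))

# Seeding numpy itself makes the numpy draws repeatable.
np.random.seed(1234)
third_np = np.random.randn(3)
np.random.seed(1234)
fourth_np = np.random.randn(3)
print(np.array_equal(third_np, fourth_np))

# Reseed without an argument so later examples are unpredictable again.
np.random.seed()
```

The first comparison should print False and the second True. If a program uses both modules and needs reproducible results, each one has to be seeded.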

Random n-dimensional arrays

Random numpy arrays with a uniform distribution

Numpy operates on arrays of any shape, so we can expect that its random module can also generate arrays of any given shape. The np.random.rand() function accepts any number of integer parameters, one per dimension, which together give the shape of the array. For example, we can generate a 2-dimensional array of random values by providing 2 parameters: the size of the 1st dimension, and then the size of the 2nd dimension. Note that np.random.rand() draws from the standard uniform distribution: every value is at least 0 and strictly less than 1.

np.random.rand(2,3)
array([[0.78256733, 0.66596987, 0.63084097],
       [0.95023572, 0.82580178, 0.79041558]])

If we want a 3-dimensional array, we just add the size of the 3rd dimension as another parameter:

np.random.rand(2,3,4)
array([[[0.10018053, 0.48311475, 0.20444325, 0.74305773],
        [0.66097182, 0.33444742, 0.46479434, 0.62770008],
        [0.00211702, 0.10030268, 0.82633628, 0.91458606]],

       [[0.91055989, 0.28460354, 0.28966579, 0.85894546],
        [0.83913847, 0.88538952, 0.82179712, 0.70011575],
        [0.8745724 , 0.68607236, 0.76411051, 0.52641683]]])

And so on, and so forth:

np.random.rand(2,3,4,5)
array([[[[4.39728218e-01, 6.81714585e-02, 3.63685632e-01,
          1.73465430e-01, 3.31392568e-01],
         [7.62389052e-01, 7.46403921e-01, 8.08893832e-01,
          2.64610039e-01, 8.02545976e-01],
         [6.51192177e-02, 8.05140588e-02, 2.64364131e-01,
          6.01561707e-01, 3.05559011e-01],
         [1.71400588e-01, 5.23408126e-01, 2.94170359e-01,
          2.58420830e-01, 3.34847307e-01]],

        [[9.32957768e-01, 6.64108265e-01, 9.97870234e-01,
          5.48709756e-02, 2.05880462e-01],
         [8.26114945e-01, 4.77620689e-01, 6.99764253e-01,
          6.22693185e-01, 8.32489893e-01],
         [4.34334539e-01, 3.70103251e-01, 7.92470151e-01,
          3.97743585e-01, 8.08249075e-01],
         [4.48806315e-01, 9.24352132e-01, 2.62913546e-01,
          4.63069599e-01, 7.46106676e-01]],

        [[9.48963250e-01, 4.85455680e-01, 3.70933007e-01,
          6.87090579e-01, 1.65995723e-01],
         [6.78604659e-01, 2.93027906e-01, 7.06606837e-02,
          1.53527295e-01, 3.82786037e-01],
         [7.89366828e-01, 2.91269430e-04, 2.74789722e-02,
          6.56570674e-01, 6.05654313e-01],
         [6.63014679e-01, 4.33346891e-01, 1.56866240e-01,
          2.57037110e-01, 9.91205736e-01]]],


       [[[3.52365615e-01, 2.05995701e-01, 4.77944940e-01,
          1.49893942e-01, 4.00400801e-01],
         [2.89476524e-02, 6.83789880e-02, 7.26673872e-01,
          1.26646804e-02, 8.06607734e-01],
         [7.61889846e-01, 5.97918778e-01, 3.16354724e-01,
          9.14989878e-01, 5.75292311e-01],
         [6.78568405e-01, 3.37172607e-01, 6.58893205e-01,
          3.58429814e-01, 9.41435433e-01]],

        [[8.76187298e-01, 2.14238567e-01, 3.81911423e-01,
          7.80074962e-01, 7.64660368e-01],
         [7.36184819e-01, 1.61875622e-01, 3.56934978e-01,
          1.10252455e-02, 7.31190309e-01],
         [1.33872670e-01, 7.54485499e-01, 1.10970980e-01,
          2.73665596e-01, 3.15975309e-01],
         [8.06795361e-01, 1.88548813e-01, 8.08481472e-01,
          4.95769802e-01, 7.94118807e-01]],

        [[1.76695342e-02, 2.83043045e-02, 2.98035885e-01,
          3.77628664e-02, 6.84294814e-01],
         [9.60215255e-01, 1.89209700e-01, 3.37747127e-01,
          9.53157834e-01, 5.72804328e-01],
         [7.25736522e-01, 9.00880849e-01, 3.97499758e-01,
          2.66902872e-01, 3.14820849e-01],
         [9.31249096e-01, 4.15995094e-01, 6.11641348e-01,
          1.58363568e-01, 4.12470962e-01]]]])

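We can visualize the output of the rand function with a histogram. Here we draw 1,000 values and group them into 100 bins. The bars come out roughly level, because every value between 0 and 1 is equally likely.

import matplotlib.pyplot as plt

data_uniform = np.random.rand(1000)
n, bins, _ = plt.hist(data_uniform, 100)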
[Figure: histogram of 1,000 values from np.random.rand, 100 bins]
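
If we only want the numbers behind the histogram, without drawing anything, np.histogram returns the bin counts and bin edges directly. This is a sketch with 10 bins instead of 100, chosen only to keep the printed output short.

```python
import numpy as np

data_uniform = np.random.rand(1000)

# counts[i] is the number of values that fall between edges[i] and edges[i + 1].
counts, edges = np.histogram(data_uniform, bins=10, range=(0.0, 1.0))

print(counts)        # ten counts, each expected to be somewhere near 100
print(edges)         # eleven bin edges from 0.0 to 1.0
print(counts.sum())  # every one of the 1000 values lands in some bin
```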

To generate a numpy array of integers, we use the np.random.randint() function, which takes the minimum value, the maximum value, and the array size. The maximum is exclusive, so it is never returned. If the maximum is omitted, the first parameter is treated as the exclusive maximum and the values start at 0. If the size is not specified, the function will return only one number. The following example generates a numpy array of 1,000 integers whose values range from 1 to 99.

data_uniform = np.random.randint(1, 100, 1000)
n, bins, _ = plt.hist(data_uniform, 100)
[Figure: histogram of 1,000 integers from np.random.randint(1, 100), 100 bins]
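
The different ways of calling randint are easy to mix up, so here is a short sketch of each one. The variable names are only for illustration.

```python
import numpy as np

# No size: a single integer from 1 to 99.
one_value = np.random.randint(1, 100)

# Size as a number: a 1-dimensional array of 10 integers from 1 to 99.
ten_values = np.random.randint(1, 100, 10)

# Size as a tuple: a 2-dimensional array with 2 rows and 3 columns.
grid = np.random.randint(1, 100, size=(2, 3))

# Only one bound given: it is the exclusive maximum, so values run from 0 to 4.
below_five = np.random.randint(5, size=1000)

print(one_value)
print(ten_values)
print(grid.shape)
print(below_five.min(), below_five.max())
```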

Random numpy arrays with a normal distribution

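Generating a random numpy array of numbers drawn from the standard normal distribution (mean 0, standard deviation 1) is as simple as using the np.random.randn() function. Like rand, it accepts the size of each dimension as a separate parameter.

import matplotlib.pyplot as plt

data_normal = np.random.randn(1000)
n, bins, _ = plt.hist(data_normal, 100)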
[Figure: histogram of 1,000 values from np.random.randn, 100 bins]
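
Since randn always uses a mean of 0 and a standard deviation of 1, a normal distribution with other parameters can be obtained by scaling and shifting its output. The sketch below assumes a target mean of 10 and a standard deviation of 2. Both values are made up for illustration.

```python
import numpy as np

# Hypothetical target parameters, chosen only for this example.
mean = 10.0
std = 2.0

# Draw from the standard normal distribution, then stretch and shift.
standard = np.random.randn(100_000)
shifted = mean + std * standard

# With this many samples, the results should be close to 10 and 2.
print(shifted.mean())
print(shifted.std())
```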