WEBVTT

00:00.150 --> 00:04.920
Hello again! In this video, we are going to look at how to get random numbers in modern C++.

00:06.120 --> 00:11.760
C++11 introduced a completely different way of working with random numbers, which is based around

00:11.880 --> 00:12.450
classes.

00:15.280 --> 00:17.320
These are defined in the <random> header.

00:17.840 --> 00:22.000
There are random number engine classes, which will generate random numbers.

00:22.780 --> 00:26.390
There are distribution classes which will re-scale their input.

00:27.190 --> 00:32.140
And there is also random_device, which you can use for seeding a random number engine.

00:34.470 --> 00:38.460
These are all implemented as functors, with function call operators.

00:39.480 --> 00:45.330
The constructor of the random number engine sets up its initial state, which determines the sequence of pseudo-random numbers. When

00:45.330 --> 00:51.210
we call the function call operator the random number engine will return the next number from the sequence.

00:52.800 --> 00:59.060
So here we are creating an object, default_random_engine, so that will generate the sequence of random

00:59.070 --> 01:05.490
numbers. And then we call the function call operator. And, each time, that will return the next number in

01:05.490 --> 01:06.060
the sequence.

01:07.650 --> 01:09.690
So there we are, five random integers.
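
NOTE
A minimal sketch of the kind of code described here, not the code shown in the video.
The helper name first_values is hypothetical. It creates a default-seeded engine and
collects the values returned by its function call operator.
```cpp
#include <random>
#include <vector>
// Hypothetical helper: returns the first `count` outputs of a default-seeded engine.
std::vector<std::default_random_engine::result_type> first_values(int count) {
    std::default_random_engine eng; // default seed, so the sequence is the same on every run
    std::vector<std::default_random_engine::result_type> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(eng()); // the function call operator returns the next number
    }
    return out;
}
```
Calling first_values(5) gives five integers. The actual values depend on the library,
because default_random_engine is implementation-defined.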

01:10.920 --> 01:13.920
A distribution class is also implemented as a functor.

01:14.190 --> 01:15.540
There will be an example, in a minute!

01:16.440 --> 01:18.820
The constructor takes the range as arguments.

01:18.840 --> 01:25.440
So if we put 1 and 100, then the range would be between 1 and 100, inclusive.

01:26.520 --> 01:31.770
The function call operator of the distribution takes a function object as argument.

01:32.190 --> 01:34.470
So typically that would be the random number engine.

01:35.280 --> 01:38.540
Then that will call the function call operator on the random number engine.

01:39.480 --> 01:43.530
Then that return value will be re-scaled, and fitted into the distribution.
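
NOTE
The steps just described could be sketched as follows. The helper name draw_values is
hypothetical, and the range 1 to 100 is taken from the example mentioned earlier.
```cpp
#include <random>
#include <vector>
// Hypothetical helper: `count` integers in the range 1 to 100, inclusive.
std::vector<int> draw_values(int count) {
    std::default_random_engine eng;                  // the function object passed to the distribution
    std::uniform_int_distribution<int> dist(1, 100); // the constructor takes the range
    std::vector<int> out;
    for (int i = 0; i < count; ++i) {
        out.push_back(dist(eng)); // dist calls eng() and re-scales the result
    }
    return out;
}
```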

01:45.390 --> 01:47.700
And by the way, distributions are general purpose.

01:47.700 --> 01:49.830
They are not just for random numbers.

01:50.340 --> 01:55.440
If you have any numerical data sequence, you can write a functor which will return the next element in

01:55.440 --> 01:55.830
the function

01:55.830 --> 01:56.520
call operator.

01:56.880 --> 01:58.380
And you can use that with a distribution, as long as it also provides the result_type, min() and max() members that a distribution expects.

01:58.830 --> 02:03.690
So that is very useful, if you are writing applications which involve statistics.
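
NOTE
A rough sketch of a hand-written functor used with a distribution. SteppingSource and
scaled_value are hypothetical names. The source is a plain arithmetic sequence, chosen
only because it is simple. It is not random, so the output is not random either.
```cpp
#include <cstdint>
#include <limits>
#include <random>
// Hypothetical data source: steps through a fixed arithmetic sequence.
struct SteppingSource {
    using result_type = std::uint32_t;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return std::numeric_limits<result_type>::max(); }
    result_type operator()() {
        value += 0x9E3779B9u; // odd step, so every 32-bit value is visited before the sequence repeats
        return value;
    }
    result_type value = 0;
};
// Feeds the custom source to a standard distribution and returns a value from 1 to 6.
int scaled_value(SteppingSource& src) {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(src); // the distribution calls src() and re-scales the result
}
```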

02:05.670 --> 02:11.970
The two main random number engine classes are: default_random_engine, which is implementation-defined.

02:12.360 --> 02:15.420
It may be a fairly simple generator, similar to a typical rand() implementation.

02:17.300 --> 02:19.940
And the other one is mt19937.

02:20.270 --> 02:21.470
Which is a catchy name!

02:22.250 --> 02:27.440
This is the so-called "Mersenne Twister" algorithm, which has a period of 2 to the power of 19937

02:27.440 --> 02:33.650
minus 1. As opposed to rand() implementations, whose period may only be 2 to the power of 16 minus

02:33.650 --> 02:33.890
1.

02:34.730 --> 02:35.690
This is very fast

02:35.690 --> 02:36.740
at generating numbers.

02:37.820 --> 02:41.600
Its output is statistically very good, but it is not cryptographically secure: with enough outputs, the next number can be predicted.

02:42.380 --> 02:46.370
It has a lot of state, so it is relatively slow to initialize and to copy.

02:46.940 --> 02:49.970
And for most applications, this is the best general choice.
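
NOTE
A small sketch showing that mt19937 is fully specified, unlike default_random_engine.
The helper name ten_thousandth_value is hypothetical. The C++ standard states that the
10000th output of a default-constructed mt19937 is 4123659995.
```cpp
#include <random>
// Hypothetical helper: the 10000th output of a default-constructed mt19937.
std::mt19937::result_type ten_thousandth_value() {
    std::mt19937 eng;  // default seed
    eng.discard(9999); // skip the first 9999 outputs
    return eng();      // the 10000th output
}
```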

02:53.020 --> 02:55.480
There is a wide range of distribution types.

02:55.990 --> 02:58.240
We can have Bernoulli, normal, Poisson and so on.
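
NOTE
A brief sketch of three of these distribution types. The Samples struct, the helper
name draw_samples and all the parameter values are made up for illustration.
```cpp
#include <random>
// Hypothetical record holding one sample from each distribution.
struct Samples {
    bool coin;
    double height;
    int arrivals;
};
Samples draw_samples(std::mt19937& eng) {
    std::bernoulli_distribution coin(0.5);                // true with probability 0.5
    std::normal_distribution<double> height(170.0, 10.0); // mean 170, standard deviation 10
    std::poisson_distribution<int> arrivals(4.0);         // mean of 4 events
    return Samples{coin(eng), height(eng), arrivals(eng)};
}
```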

02:59.560 --> 03:04.420
Usually when we are dealing with random numbers, we want them to be uniformly distributed, which means

03:04.420 --> 03:08.190
that all the values in the range are equally likely. For that,

03:08.200 --> 03:13.660
we use uniform_int_distribution for integers. And we need to give the type.

03:14.380 --> 03:17.770
The constructor will take the range of values.

03:18.430 --> 03:22.960
And there is also a uniform_real_distribution for floating-point numbers.

03:23.770 --> 03:25.090
So let's look at an example.

03:25.750 --> 03:30.970
So we're doing basically the same code we had before, but this time we're using the Mersenne twister as

03:30.970 --> 03:32.040
the engine.

03:32.620 --> 03:37.120
And we are also going to use a uniform_int_distribution to re-scale the values.

03:37.720 --> 03:45.280
We are going to re-scale them to be ints between 0 and 10. And then we call the distribution's function

03:45.280 --> 03:46.030
call operator.

03:46.690 --> 03:51.100
This will call the engine's function call operator to get the random numbers, and then it is going

03:51.100 --> 03:55.600
to re-scale them. And then we do the same thing again, but with doubles.

03:59.130 --> 04:04.890
So there we are, we have 5 random integers between 0 and 10, and 5 random floating-point numbers

04:04.890 --> 04:06.390
between 0 and 10.
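
NOTE
A sketch along the lines of this example, not the code shown in the video. The helper
names five_ints and five_doubles are hypothetical.
```cpp
#include <random>
#include <vector>
// Hypothetical helper: five ints between 0 and 10, inclusive.
std::vector<int> five_ints(std::mt19937& eng) {
    std::uniform_int_distribution<int> dist(0, 10);
    std::vector<int> out;
    for (int i = 0; i < 5; ++i) {
        out.push_back(dist(eng)); // dist calls eng() and re-scales the result
    }
    return out;
}
// Hypothetical helper: five doubles between 0 and 10.
std::vector<double> five_doubles(std::mt19937& eng) {
    std::uniform_real_distribution<double> dist(0.0, 10.0);
    std::vector<double> out;
    for (int i = 0; i < 5; ++i) {
        out.push_back(dist(eng));
    }
    return out;
}
```
Both helpers take the engine by reference, so they share one sequence instead of each
restarting it.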

04:11.680 --> 04:18.040
And the final class is random_device. This will generate true random numbers using hardware.

04:18.400 --> 04:20.470
This is based on the system's entropy data.

04:20.800 --> 04:26.140
So this is things like process ID, CPU temperature and so on, which get scrambled together,

04:27.820 --> 04:29.320
but only if it is supported.

04:29.590 --> 04:34.990
If you have a system which does not produce entropy data, then it will fall back to using a pseudo-random

04:34.990 --> 04:35.380
number.

04:36.010 --> 04:42.340
That was the case with older versions of GNU C++ on Windows, which did not use entropy data, even if the system

04:42.340 --> 04:42.970
provided it.

04:45.640 --> 04:50.800
And also, the data depends on things which are happening on the system. If nothing happens on the

04:50.800 --> 04:56.230
system, then all the entropy gets used up. And you have to wait until some more entropy becomes available.

04:56.710 --> 05:02.110
So this can be very slow, but if it is fully implemented, then it is actually completely crypto

05:02.110 --> 05:02.800
secure.

05:08.100 --> 05:13.530
random_device is implemented as a functor again, so we just call the function call operator to

05:13.530 --> 05:14.880
get the next random number.
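
NOTE
A minimal sketch of calling random_device directly. The helper name hardware_value is
hypothetical. Whether the value really comes from hardware depends on the library and
the platform.
```cpp
#include <random>
// Hypothetical helper: one value from the system's random device.
unsigned int hardware_value() {
    std::random_device rd; // may throw if no random device is available
    return rd();           // the function call operator returns the next number
}
```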

05:17.390 --> 05:19.670
Because it has these performance limitations,

05:19.910 --> 05:24.800
it is not really suitable as a generator of large quantities of random numbers.

05:25.820 --> 05:29.840
The idea is that you would use the random_device to generate a seed.

05:30.530 --> 05:35.990
So for example, you would create an engine object, and then you would pass the result of calling the random_device

05:36.320 --> 05:37.010
as argument.

05:37.520 --> 05:42.830
So this would return a true random number, and then that will seed the random number engine.
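
NOTE
The seeding idea could be sketched like this. The helper name make_seeded_engine is
hypothetical.
```cpp
#include <random>
// Hypothetical helper: an mt19937 seeded from the system's random device.
std::mt19937 make_seeded_engine() {
    std::random_device rd;
    return std::mt19937(rd()); // rd() is called once and its result becomes the seed
}
```
The random_device is used only once here, so its performance limitations matter much
less.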

05:45.640 --> 05:51.070
So, some advice. I would recommend using mt19937, unless you have a particular

05:51.070 --> 05:56.380
reason to use one of the others. If you are going to use random_device, check the documentation to

05:56.380 --> 05:59.140
make sure it does actually do what it should do.

05:59.530 --> 06:06.820
That is, that it does actually use hardware random numbers. When you create objects of the engine and the distribution,

06:06.820 --> 06:12.310
make them static, because creating and copying these objects is fairly time consuming.

06:12.640 --> 06:16.750
If you have them as local variables in a function, then every time you call the function, the engine

06:16.750 --> 06:19.090
has to be initialized, which will take time.

06:19.450 --> 06:21.100
And also, it will restart the sequence.

06:21.640 --> 06:24.550
So you can end up getting the same random numbers every time you call the function!

06:25.540 --> 06:28.630
And besides, you are probably only going to need one object per program anyway.
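
NOTE
A sketch contrasting local and static objects. The function names roll_local and
roll_static are hypothetical, and the sketch assumes single-threaded use, because
calling roll_static from several threads at once would be a data race.
```cpp
#include <random>
// The engine and distribution are re-created on every call, so the sequence restarts each time.
int roll_local() {
    std::mt19937 eng;
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(eng);
}
// The engine and distribution are created once and keep their state between calls.
int roll_static() {
    static std::mt19937 eng;
    static std::uniform_int_distribution<int> dist(1, 6);
    return dist(eng);
}
```
Every call to roll_local returns the same value, which is the problem described above.
Calls to roll_static continue through the sequence.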

06:29.920 --> 06:31.270
And finally, a health warning.

06:31.270 --> 06:37.120
This is all aimed at people writing general-purpose applications. Random numbers are a bit like

06:37.120 --> 06:40.870
floating-point numbers, in that they seem fairly straightforward, but when you examine them in detail,

06:40.900 --> 06:44.470
there are all sorts of hidden problems and scope for things to go wrong.

06:44.920 --> 06:48.790
So if you are doing something really important, you do need to know what you are doing.

06:49.390 --> 06:50.890
So bear that in mind.

06:52.090 --> 06:53.770
Okay, so that is it for this video.

06:54.340 --> 06:55.210
I will see you next time.

06:55.210 --> 06:57.360
But until then, keep coding!
