WEBVTT

00:01.680 --> 00:09.150
Hello again! In this video, we're going to talk about pointers and memory. A pointer is a variable which

00:09.150 --> 00:13.830
represents an address in memory, so the value of this pointer is the address.

00:14.700 --> 00:19.290
This could be a variable on the stack or some memory that we've allocated on the heap.

00:19.920 --> 00:25.560
If we're working at a low level, then dealing with the operating system often involves working with

00:25.560 --> 00:26.070
pointers.

00:26.610 --> 00:32.580
It's also quite common for devices which are attached to the computer to be memory-mapped, which means that

00:32.580 --> 00:35.220
the operating system gives you an address in memory.

00:35.550 --> 00:38.760
And if you write to that address, it sends data to the device.

00:39.150 --> 00:41.880
And if you read from it, it gets data from the device.

00:42.750 --> 00:46.140
So pointers are pretty important and they are quite central in C.

00:48.450 --> 00:54.870
To create a pointer variable, we put an asterisk after the type name, so int star p, that declares p

00:55.140 --> 00:56.300
as a pointer to int.

00:58.210 --> 01:03.790
To initialize or assign to a pointer, we need to give it an address. So, for example, we could

01:03.790 --> 01:07.960
take the address of some stack variable, i, so p1 is a pointer to i.

01:09.160 --> 01:15.270
p1 will be the address of i and then to get access to the data in that memory,

01:15.650 --> 01:20.680
we dereference the pointer. So we put an asterisk in front of the name of the pointer and that will

01:20.680 --> 01:24.430
print out the data in there, which we expect to be one.
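
NOTE
A minimal sketch of the declaration, address-of and dereference steps just described. The names i and p1 follow the narration; the helper read_through_pointer is hypothetical and exists only so the snippet can be called.
```cpp
#include <iostream>
// Hypothetical helper: stores the address of a local int in a pointer
// and reads the value back by dereferencing it.
int read_through_pointer() {
    int i{1};        // an ordinary variable on the stack
    int* p1 = &i;    // p1 holds the address of i
    std::cout << "p1 = " << p1 << '\n';   // prints an address, usually shown in hexadecimal
    std::cout << "*p1 = " << *p1 << '\n'; // prints the int stored at that address
    return *p1;      // dereference: the value of i
}
```
Calling read_through_pointer() returns 1; the printed address may differ from run to run.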

01:30.550 --> 01:32.560
We can also allocate memory on the heap.

01:32.830 --> 01:35.080
C++ has an operator called "new".

01:35.890 --> 01:38.760
If you've used the malloc function in C, this is similar,

01:38.800 --> 01:39.970
but not quite the same.

01:41.830 --> 01:45.400
So new, followed by the type, will allocate memory to store that type.

01:45.880 --> 01:51.580
So in this case, we're allocating memory to store an int. new will return the address at the start of this

01:51.580 --> 01:51.970
memory.

01:52.390 --> 01:54.250
So p2 is the address of this int.

01:56.860 --> 01:59.230
This memory is going to be default initialized.

01:59.260 --> 02:05.140
So if it's a built-in type, it'll have an indeterminate value, which will be whatever the data in there

02:05.470 --> 02:06.100
happened to be.

02:06.910 --> 02:09.940
If it's a class, it'll call the default constructor for that class.

02:11.470 --> 02:17.530
We can pass arguments to new, like that, and these arguments will be forwarded to the constructor of

02:17.530 --> 02:20.500
the class, or used to initialize the built-in type.

02:21.160 --> 02:23.680
So in this case, we're going to allocate memory to store an int.

02:24.100 --> 02:26.380
And it's going to be initialized with the value 36.

02:28.180 --> 02:32.150
Here we're using the universal initialization syntax in C++.

02:32.960 --> 02:37.040
If you're using an older version of C++, then you have to use round brackets instead.
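
NOTE
A short sketch of the three forms of new mentioned so far, under the assumption that a C++11 or later compiler is used. The names p2 and p3 follow the narration; p3_old and the helper allocate_examples are hypothetical. The delete calls are included only so the sketch does not leak.
```cpp
#include <iostream>
// Hypothetical helper showing three ways to allocate a single int with new.
int allocate_examples() {
    int* p2 = new int;          // default-initialized: the value is indeterminate, so it is not read here
    int* p3 = new int{36};      // brace initialization (C++11 and later)
    int* p3_old = new int(36);  // round brackets, also valid before C++11
    int total = *p3 + *p3_old;
    std::cout << "*p3 = " << *p3 << ", *p3_old = " << *p3_old << '\n';
    delete p2;                  // each new is paired with a delete
    delete p3;
    delete p3_old;
    return total;
}
```
Both initialized ints hold 36, so allocate_examples() returns 72.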

02:42.180 --> 02:45.530
So let's try this out. So we have an int, i.

02:45.600 --> 02:49.770
We are taking its address and storing that in the pointer p1. Then we are printing out the pointer

02:49.920 --> 02:51.990
and the dereferenced pointer.

02:53.220 --> 02:56.820
Then we allocate a couple of pointers to int.

02:58.600 --> 03:04.090
One which is not initialized and one which is. And actually, let's print out the values of those, so...

03:21.250 --> 03:28.090
OK, so p1 is the address of i, so that is some hexadecimal number, which represents an address in memory.

03:29.080 --> 03:29.550
star p1

03:29.640 --> 03:33.400
is the data in i, which is one. p2

03:33.400 --> 03:36.760
is this memory which we didn't initialize, so the data in it could have any value.

03:36.760 --> 03:39.040
And in fact, it has a pretty strange looking value.

03:40.060 --> 03:43.900
With p3, we did initialize it, and the data in there has the value 36.

03:47.620 --> 03:49.610
When we allocate memory from the heap,

03:49.630 --> 03:51.430
this will remain allocated to the program.

03:51.430 --> 03:53.320
So the program is going to be using that memory.

03:53.590 --> 03:58.690
It's not going to be available to anything else on the computer, and that will be the case until the

03:58.690 --> 03:59.770
memory is released.

04:01.730 --> 04:07.550
So if the program doesn't explicitly release it, then that memory is not available for anything else.

04:12.880 --> 04:18.460
The operating system will restrict the amount of memory that a program uses, so the operating system

04:18.460 --> 04:20.860
won't allow a program to hog all the memory on the computer.

04:21.640 --> 04:27.400
If you try to allocate more than the amount that you're allowed, then the operating system may refuse

04:27.400 --> 04:29.200
to perform the allocation.

04:29.530 --> 04:34.480
So instead of getting a pointer to allocated memory, new throws std::bad_alloc, or with the nothrow form you'll get the value null, which is a special

04:34.990 --> 04:38.620
kind of pointer that doesn't represent valid memory.
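
NOTE
A hedged sketch of checking for a failed allocation. In standard C++ a plain new expression reports failure by throwing std::bad_alloc; the std::nothrow form used here returns a null pointer instead. The helper try_allocate is hypothetical.
```cpp
#include <cstddef>
#include <new>
// Hypothetical helper: tries to allocate count ints without throwing.
// Returns true if the allocation succeeded.
bool try_allocate(std::size_t count) {
    int* p = new (std::nothrow) int[count];  // yields nullptr on failure instead of throwing
    if (p == nullptr) {
        return false;   // the request was refused
    }
    delete[] p;         // release the block again
    return true;
}
```
A small request such as try_allocate(10) would normally succeed; whether a very large request fails depends on the platform.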

04:46.730 --> 04:49.100
So failing to release memory can be a problem.

04:49.340 --> 04:52.490
And one easy way for that to happen is what's called a memory leak.

04:53.000 --> 04:54.740
So we've got a function here.

04:55.490 --> 04:57.680
We have a pointer to allocated memory.

04:57.680 --> 05:01.580
So p4 is a pointer to an int with value 42.

05:01.880 --> 05:03.470
Then we do something with the code in here.

05:04.950 --> 05:06.930
p4 is actually a local variable.

05:07.320 --> 05:11.700
It's a pointer, but the actual variable itself is still a local variable.

05:12.210 --> 05:16.440
And when we get to the end of the scope, this variable is going to be destroyed.

05:17.280 --> 05:19.290
So the allocated memory will still exist.

05:19.290 --> 05:23.910
But the pointer [variable] we're using to keep track of it no longer exists.

05:24.510 --> 05:28.230
So we now no longer have any way of contacting that memory.

05:28.230 --> 05:31.500
So that memory is just floating in space,

05:31.500 --> 05:33.540
really! We can't use it and we can't release it.

05:33.930 --> 05:38.670
So it's taking up memory but not doing anything useful. And that's known as a memory leak.

05:44.100 --> 05:45.510
So here is that code.

05:45.570 --> 05:50.460
So we have this function, which allocates memory, but doesn't release it, and it also loses track

05:50.460 --> 05:51.120
of the pointer.

05:51.240 --> 05:53.910
So we can't delete it in the main function.

05:54.690 --> 05:58.320
So we're calling the function here, but we don't have any way in main to release that memory.

06:00.910 --> 06:07.240
OK, so it seems to run normally, but in fact, this memory here is being wasted. By the time the function

06:07.240 --> 06:08.740
returns, we can't actually get to it.

06:11.210 --> 06:16.190
One way would be to return the pointer from this function, and then we can release it in here. Another

06:16.190 --> 06:21.290
way would just be to actually release the memory when we finish using it. And that would be the preferred

06:21.290 --> 06:21.860
solution.

06:27.290 --> 06:29.430
OK, so how do we release memory?

06:29.870 --> 06:32.510
C++ provides the delete operator.

06:32.840 --> 06:38.660
So if we put delete, followed by a pointer to some memory that was allocated by new, then this

06:38.660 --> 06:40.220
will cause the memory to be released.

06:41.000 --> 06:47.060
First of all, the destructors will be called for the objects in that memory and then the memory will

06:47.060 --> 06:47.570
be released.

06:47.750 --> 06:51.920
So you do get a chance to clean up the objects before they get destroyed.

06:54.440 --> 07:00.230
Again, this variable p is still a stack object. It may represent memory that's allocated on the heap, but the

07:00.230 --> 07:02.030
actual variable itself is on the stack.

07:02.960 --> 07:08.690
So once the memory's been released, the variable will continue to exist until the end of that scope.

07:09.710 --> 07:12.230
So it is actually possible for the program to access the released memory through it.

07:12.830 --> 07:16.760
And if that does happen, then that's undefined behavior.

07:16.760 --> 07:19.820
So the program will probably crash or behave very strangely.

07:20.270 --> 07:23.030
And the reason is that p is not pointing to

07:23.030 --> 07:25.190
memory that is available for use.

07:27.000 --> 07:29.040
We say that p is a "dangling pointer".
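
NOTE
One common defensive habit, not shown in the video and so offered here only as an assumption, is to set the pointer to nullptr straight after delete, so it can be tested before any later use. The helpers release_and_clear and pointer_is_cleared are hypothetical.
```cpp
// Hypothetical helper: releases the int and clears the pointer so that it
// no longer holds the address of released memory.
void release_and_clear(int*& p) {
    delete p;      // deleting a null pointer is allowed and does nothing
    p = nullptr;   // p no longer dangles; it can be tested before use
}
// Hypothetical usage: returns true if the pointer is null after release.
bool pointer_is_cleared() {
    int* p = new int{42};
    release_and_clear(p);
    return p == nullptr;
}
```
This only protects the one pointer variable that is cleared; any copies of the address made elsewhere would still dangle.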

07:34.360 --> 07:40.030
So it's very important that when you allocate memory with new, then you have a corresponding call

07:40.120 --> 07:45.310
to delete when you finish using the memory, to avoid memory leaks and dangling pointers.

07:49.850 --> 07:55.250
So this is how we fix the problem with the memory leak. We call delete p4.

07:55.280 --> 07:58.040
So this will release the memory that was allocated.
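
NOTE
A minimal sketch of the fixed function, assuming it looks roughly like the one described in the narration. The name p4 and the value 42 come from the video; the function name use_and_release and the "do something" step are hypothetical.
```cpp
#include <iostream>
// Hypothetical version of the function from the video: it allocates an int,
// uses it, and releases it before the local pointer p4 goes out of scope.
int use_and_release() {
    int* p4 = new int{42};   // memory allocated on the heap
    int value = *p4;         // "do something" with the data
    std::cout << "*p4 = " << value << '\n';
    delete p4;               // without this line the memory would leak
    return value;
}
```
use_and_release() returns 42, and the heap memory has been released by the time it returns.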

08:04.400 --> 08:08.960
OK, so it still seems to run normally, but this time we don't have any memory leaks.

08:12.020 --> 08:14.600
And to show you the problem with a dangling pointer.

08:18.380 --> 08:23.780
So let's try to access this pointer after it's been released, so let's try to write to it, so we're

08:23.780 --> 08:27.110
going to assign to the data that was in this memory.

08:28.430 --> 08:33.320
So here we allocate the memory, we initialize it with the value 42, then we release that memory.

08:33.980 --> 08:39.230
Then we try to write some data into this memory, even though it no longer belongs to us.

08:39.890 --> 08:40.970
So what happens?

08:43.860 --> 08:47.280
OK, so I think you can see there was a bit of a pause after that.

08:50.540 --> 08:53.060
If we put in a print statement, it might be a bit clearer.

09:04.160 --> 09:09.710
OK, so it didn't actually execute this print statement, so that indicates that the program has crashed,

09:10.190 --> 09:12.230
probably while executing this statement.

09:14.150 --> 09:15.500
So that's something to watch out for.

09:21.620 --> 09:25.430
And then finally, I mentioned you can have more than one object in the allocated memory.

09:26.060 --> 09:30.080
You can actually allocate a block of memory and access it as if it were an array.

09:31.950 --> 09:37.080
So in that case, you put square brackets after the type and then you put the number of elements inside

09:37.080 --> 09:37.860
the square brackets.

09:38.910 --> 09:45.420
So here we're going to allocate enough memory to hold 20 ints and then pa is going to be a pointer to the start

09:45.420 --> 09:46.050
of this memory.

09:46.530 --> 09:49.320
So that's going to be a pointer to the first element in this array.

09:50.160 --> 09:54.300
Then we can just go through and use the normal array syntax with square brackets.

09:58.780 --> 10:01.640
If we're doing that, we have to use the right form of delete.

10:01.660 --> 10:06.400
There are actually two forms of delete; the second matches the second form of new.

10:06.880 --> 10:11.320
And the second form of delete has a pair of square brackets between the delete and the variable.

10:11.980 --> 10:17.950
So if we use this form, it'll tell the program that we're deleting the entire array and not just the

10:17.950 --> 10:18.460
variable

10:18.850 --> 10:21.280
that happens to be at the start of the memory.

10:23.110 --> 10:27.340
If we use the other form, that's undefined behavior; typically only the first element is deleted.

10:27.400 --> 10:30.580
So the first element will be destroyed, and the memory for it will be released.

10:31.000 --> 10:36.610
But the elements after that will still be in memory, and it's possible that we may have messed up the

10:36.610 --> 10:39.700
internal data structures in the memory management of the program.

10:40.150 --> 10:43.030
So this could cause the program to behave strangely or crash.
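
NOTE
A short sketch of the array form of new paired with the array form of delete. The name pa and the size 20 follow the narration; the helper fill_and_sum is hypothetical.
```cpp
#include <iostream>
// Hypothetical helper: allocates 20 ints as an array, fills them with 0..19,
// prints them, and returns their sum.
int fill_and_sum() {
    const int count = 20;
    int* pa = new int[count];   // pa points to the first of 20 ints
    for (int i = 0; i < count; ++i) {
        pa[i] = i;              // normal array syntax; valid indices are 0..19
    }
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        std::cout << pa[i] << ' ';
        sum += pa[i];
    }
    std::cout << '\n';
    delete[] pa;                // array form of delete matches new int[count]
    return sum;
}
```
The elements are 0 to 19, so fill_and_sum() returns 190.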

10:44.950 --> 10:46.000
So let's try this out.

10:46.060 --> 10:51.040
So we're allocating our array of 20 ints, then we're going to go through and populate them.

10:52.620 --> 10:54.710
Then we print out the value of each element.

10:56.050 --> 11:01.330
And then finally, we're going to release the array. So we should expect to see the numbers from one to

11:01.330 --> 11:03.790
20, we're allocating - sorry, from zero to 19.

11:06.580 --> 11:07.450
OK, so we... (sorry, slight typo!)

11:11.790 --> 11:16.560
(Only in a print statement, so it doesn't really matter!)

11:17.580 --> 11:20.520
So we're allocating memory for the array and we populate the array.

11:21.510 --> 11:27.900
Then we print out the array elements, then we release the memory for the array, and then we finish.

11:28.470 --> 11:28.860
Okay.

11:29.340 --> 11:34.830
So what are some of the things that could go wrong? We could use the wrong form of delete.

11:38.530 --> 11:41.740
Right, so it actually got to the end of the program, so we got away with that.

11:41.980 --> 11:45.910
But we may not necessarily get away with that every time, so do be careful with that.

11:53.180 --> 11:59.750
And then we only have 20 elements in this array, so if we try to access the 21st element, we're

12:00.410 --> 12:03.350
accessing some memory that doesn't belong to us.

12:04.710 --> 12:09.060
Okay, so something different has happened, and right, we've got this strange value here.

12:10.720 --> 12:16.360
This is not actually an element in the array. So when we defined the array we had 0, 1, 2, 3, 4,

12:16.360 --> 12:22.360
5, 6, 7, 8, 9 up to 19 and then the memory after that is not actually part of the array.

12:22.930 --> 12:27.970
So when we read it, we got whatever happened to be in that memory, which converts to that

12:27.970 --> 12:28.480
as an int.

12:32.820 --> 12:33.700
So put that back.

12:33.930 --> 12:38.430
And then if we try writing past the end of the array, we're going to write to some memory which doesn't

12:38.430 --> 12:39.090
belong to us.

12:41.820 --> 12:44.130
OK, that's definitely not right!

12:44.940 --> 12:50.820
So we've actually got a runtime error. So we have heap corruption detected. The application wrote

12:50.820 --> 12:52.790
to memory after end of heap buffer.

12:53.880 --> 12:59.100
So you actually get, or at least with Visual C++ in debug mode, you actually get something that tells

12:59.100 --> 13:00.210
you what the error was.

13:01.140 --> 13:05.310
So... I wish compilers had been like this when I was learning C++!

13:05.930 --> 13:06.690
So anyway.

13:11.050 --> 13:11.440
Okay.

13:11.500 --> 13:13.540
So that's it for this video.

13:14.170 --> 13:15.160
I'll see you next time.

13:15.220 --> 13:17.170
But meanwhile, keep coding!
