WEBVTT

00:00.150 --> 00:06.660
Hello again! In this video, we are going to look at the tree data structure. C++ uses this for implementing

00:06.660 --> 00:11.610
associative containers in the standard library, so it is useful to have some idea of how it works.

00:12.600 --> 00:15.690
If you already know this, then you can just skim through this video.

00:16.170 --> 00:21.060
But if you are not familiar with trees, or you want to go over them again, then I hope this will help

00:21.060 --> 00:26.760
you. The elements in a tree are similar to the elements in a linked list.

00:27.270 --> 00:29.850
We have a separate node for each element.

00:30.390 --> 00:36.900
This has a key which will identify the element, and two pointers, which point to other elements in

00:36.900 --> 00:39.900
the data structure. In the tree,

00:39.930 --> 00:46.290
these are called left and right, and the position of the element in the tree depends on the value

00:46.290 --> 00:47.010
of its key.

00:49.110 --> 00:55.110
The left pointer will be to an element which has a lower value of the key than ours, and the right

00:55.110 --> 00:58.980
pointer will be to an element which has a higher value of the key.

01:02.840 --> 01:08.720
The tree data structure looks like this, and I know it is not like a real tree! In computer science,

01:08.870 --> 01:12.560
trees always have their root at the top and the branches at the bottom.

01:12.980 --> 01:16.280
So we start here, and then we grow downwards and outwards.

01:18.170 --> 01:21.590
The first element, which is added to the tree, is called the root.

01:21.920 --> 01:26.060
So here is our root, with the key 6, and left and right pointers.

01:28.080 --> 01:34.890
If we add an element with the key seven, 7 is greater than 6, so we need to be on the right of 6

01:35.970 --> 01:41.640
and then we set the right pointer of 6 to point to this new element.

01:43.110 --> 01:49.200
If the key is 4, then 4 will need to go to the left of 6, because it is lower than 6, and the

01:49.200 --> 01:51.240
left pointer will point to this element.

01:53.270 --> 01:59.660
What happens if we have an element with key 5? So, 5 is less than 6, so we go left, but 5

01:59.660 --> 02:00.590
is greater than 4,

02:00.800 --> 02:04.370
so we go right. And then, for an element with key 3,

02:04.790 --> 02:06.050
3 is less than 6,

02:06.080 --> 02:08.460
So we go left, and 3 is less than 4.

02:08.480 --> 02:09.620
So we go left again.

02:11.460 --> 02:18.180
So we now have five elements, and 3 was the last one which was added. So inserting an element,

02:18.510 --> 02:21.450
once you know where to insert it, just involves modifying a pointer.

02:22.230 --> 02:24.660
Erasing an element can be a little bit more complex.

02:24.660 --> 02:29.610
If we erase this element 4, then we need to make sure that 3 and 5 go into the right position.

02:31.880 --> 02:36.650
The main advantage of this structure is that locating and searching for elements is very fast.

02:37.190 --> 02:39.950
Let's say we want to find this element with the key 3.

02:40.640 --> 02:44.210
So we start at the root. 3 is less than 6, go left.

02:44.810 --> 02:46.700
3 is less than 4, go left again.

02:46.710 --> 02:47.810
And we found it.

02:48.440 --> 02:53.840
So we only needed to look at three elements to find this, even though there are five elements in the container,

02:54.170 --> 02:55.850
and this was the last one which was added.

02:56.450 --> 02:58.880
So this is quicker than searching a sequential container.

03:04.620 --> 03:07.410
So adding or removing an element is very fast.

03:07.800 --> 03:11.320
There are no memory operations, apart from the actual allocation of the node.

03:12.120 --> 03:16.260
Only one other element needs to be modified, and that is just a pointer assignment.

03:16.890 --> 03:19.620
And there is no moving around of elements within the tree.

03:21.370 --> 03:23.380
And also finding elements is very fast.

03:23.740 --> 03:28.900
We just need to compare the data and dereference a pointer for each element, until we get there.

03:31.100 --> 03:37.820
There is just one potential drawback to trees.  If we have too many elements on one branch, or on one

03:37.820 --> 03:43.280
side of the tree, then the tree becomes unbalanced and operations take longer.

03:44.510 --> 03:47.420
If this gets too bad, we may need to rebalance the tree.

03:49.410 --> 03:55.560
To do that, we need to choose a different element as the root, and then move all the elements around

03:55.560 --> 03:58.470
it, making sure they have their correct relative positions.

03:59.220 --> 04:00.750
And that will make the tree balanced again.

04:02.720 --> 04:07.550
There are some tree structures which will actually do this automatically. These are "balanced" trees

04:07.550 --> 04:08.990
or "self-balancing" trees.

04:09.350 --> 04:13.430
They will notice when the tree gets too badly unbalanced and they will rebalance

04:13.430 --> 04:17.960
it. Examples of these are the red-black tree and the AVL tree.

04:18.650 --> 04:24.320
I am not going to go into any detail about how to actually implement a tree yourself, but it is actually

04:24.320 --> 04:26.530
a good exercise in using pointers, if you want to try it.

04:27.230 --> 04:28.880
And there is plenty of material around.

04:30.020 --> 04:31.610
Okay, so that is it for this video.

04:31.880 --> 04:32.710
I will see you next time.

04:32.990 --> 04:34.760
Until then, keep coding!
