1
00:00:00,540 --> 00:00:02,790
So now that we've loaded our dataset.

2
00:00:03,300 --> 00:00:06,990
Let's take a look at exploring and understanding our dataset.

3
00:00:06,990 --> 00:00:10,170
We call this inspecting our data set, which is a very good practice to do.

4
00:00:10,680 --> 00:00:12,750
Here's a block of code that I wrote.

5
00:00:12,780 --> 00:00:17,940
It's a very simple block of code that basically just prints the shape of the training set, the

6
00:00:17,940 --> 00:00:23,460
length of each of the training and test datasets and their labels, as well as the dimensions

7
00:00:23,460 --> 00:00:28,950
of one of the image samples, and the shape of all the training labels as well.

8
00:00:28,950 --> 00:00:34,830
So you can take a look at it here as a sanity check, just to make sure everything is as

9
00:00:34,830 --> 00:00:35,370
it seems.

10
00:00:35,880 --> 00:00:42,480
So we know from the previous lesson that the MNIST dataset has 60,000 training images, and each image

11
00:00:42,480 --> 00:00:44,970
is a 28 by 28 pixel grayscale image.

12
00:00:45,450 --> 00:00:48,270
So you can see when we print out those things here.

13
00:00:48,630 --> 00:00:53,870
Remember when we downloaded our dataset here, we actually put it in these variables already.

14
00:00:53,880 --> 00:00:59,760
So mnist.load_data loads this dataset and puts the data in these two tuples, x_train and y_train, and x_test

15
00:00:59,760 --> 00:01:01,680
and y_test, which we can access directly.

16
00:01:02,580 --> 00:01:03,780
And we have it here.

17
00:01:04,230 --> 00:01:13,530
So we have 60,000 by 28 by 28: 60,000 samples, 60,000 labels, 10,000 samples, 10,000 labels, and the dimensions

18
00:01:13,530 --> 00:01:15,450
of one image, 28 by 28.

19
00:01:15,660 --> 00:01:18,750
The number of labels in the training set is 60,000, which you've seen before.

20
00:01:19,260 --> 00:01:21,660
Likewise for the test set and image sizes.

21
00:01:22,230 --> 00:01:24,390
So all that's good so far.
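
For reference, the inspection described above might look roughly like the sketch below. The notebook code itself is not shown in this transcript, so the helper summarize_split is hypothetical, and the variable names x_train, y_train, x_test and y_test are assumed. Zero-filled NumPy arrays with MNIST's shapes stand in for the real data so the sketch runs without a download.

```python
import numpy as np


def summarize_split(name, images, labels):
    """Print and return basic facts about one split (hypothetical helper)."""
    summary = {
        "images_shape": images.shape,     # e.g. (60000, 28, 28)
        "num_samples": len(images),
        "num_labels": len(labels),
        "sample_shape": images[0].shape,  # dimensions of a single image
        "labels_shape": labels.shape,
    }
    print(f"{name}: images {images.shape}, {len(images)} samples, "
          f"{len(labels)} labels, one image {images[0].shape}, "
          f"labels {labels.shape}")
    return summary


# Placeholder arrays with MNIST's shapes (assumption). In the lesson these would
# come from something like:
# (x_train, y_train), (x_test, y_test) = tensorflow.keras.datasets.mnist.load_data()
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.zeros((60000,), dtype=np.uint8)
x_test = np.zeros((10000, 28, 28), dtype=np.uint8)
y_test = np.zeros((10000,), dtype=np.uint8)

train_summary = summarize_split("train", x_train, y_train)
test_summary = summarize_split("test", x_test, y_test)
```

If the printed shapes differ from what you expect (60,000 and 10,000 samples of 28 by 28), that is the cue to stop and check the loading step before going further.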

22
00:01:24,930 --> 00:01:30,900
Now let's take a look at visualizing the data. Previously, in PyTorch, we did something very similar.

23
00:01:31,740 --> 00:01:37,620
It's a bit easier to do in Keras because we don't need to convert the data into

24
00:01:37,620 --> 00:01:39,530
an iterator to visualize it.

25
00:01:39,540 --> 00:01:46,680
We can simply access random samples using this random function here, using NumPy

26
00:01:46,890 --> 00:01:51,930
to get a random number between zero and the length of the training dataset, which is 60,000.

27
00:01:52,560 --> 00:01:56,520
And we can just grab that index, pull the sample out of it here.

28
00:01:56,820 --> 00:02:04,500
And that sample is an image that we can easily visualize using the Matplotlib imshow function that

29
00:02:04,500 --> 00:02:05,220
we use here.

30
00:02:05,640 --> 00:02:08,940
And we just use OpenCV to convert it from BGR to RGB.

31
00:02:09,540 --> 00:02:10,500
And here we go.
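
As a rough sketch of the random-sample step just described: the function name show_random_sample and the placeholder arrays are assumptions for illustration, not the notebook's code. It draws the image with Matplotlib's gray colormap directly rather than going through the OpenCV BGR-to-RGB conversion mentioned in the lesson.

```python
import numpy as np
import matplotlib.pyplot as plt


def show_random_sample(images, labels):
    """Pick one random image, draw it in grayscale, return (index, figure)."""
    # Random index between 0 (inclusive) and len(images) (exclusive).
    idx = int(np.random.randint(0, len(images)))
    fig, ax = plt.subplots()
    ax.imshow(images[idx], cmap="gray")
    ax.set_title(f"index {idx}, label {labels[idx]}")
    ax.axis("off")
    return idx, fig


# Placeholder data shaped like a small slice of MNIST (assumption); with the
# real dataset you would pass x_train and y_train instead.
x_demo = np.random.randint(0, 256, size=(100, 28, 28)).astype(np.uint8)
y_demo = np.arange(100) % 10

idx, fig = show_random_sample(x_demo, y_demo)
# In a notebook the figure renders inline; in a script, call plt.show().
```

Running it again picks a different index each time, which is a cheap way to eyeball several samples.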

32
00:02:10,860 --> 00:02:13,800
So let's run this and get some new random samples.

33
00:02:14,460 --> 00:02:16,980
You can see we have seven two five five.

34
00:02:17,640 --> 00:02:20,520
Oh, you can change this to any number of samples, actually.

35
00:02:20,670 --> 00:02:23,970
Let's set it to six, so the loop stops at six.

36
00:02:25,800 --> 00:02:28,440
And yeah, that's fine.

37
00:02:28,450 --> 00:02:31,440
And you can actually plot these in subplots if you wanted to.

38
00:02:31,450 --> 00:02:36,090
You can just copy the code here, put it here, and use the subplot function so you can

39
00:02:36,090 --> 00:02:42,060
generate a nice grid of images, similar to how torchvision has multi-plot support.

40
00:02:43,260 --> 00:02:46,500
So let's take a look at doing the same thing with subplots.

41
00:02:46,500 --> 00:02:48,040
Actually, the subplot is done for you here.

42
00:02:48,060 --> 00:02:49,740
I forgot I put this in this lesson.

43
00:02:49,770 --> 00:02:50,580
Sorry about that.

44
00:02:51,180 --> 00:02:52,450
However, it's pretty cool to see.

45
00:02:52,530 --> 00:02:58,680
And what I did here as well, in the title you can see on top here, is put the ground truth label for

46
00:02:58,680 --> 00:02:59,130
you guys.

47
00:02:59,580 --> 00:03:07,140
So to see how we did this, we simply create the plt.figure object, then specify the

48
00:03:07,140 --> 00:03:09,360
size, because it's going to be quite big.

49
00:03:10,020 --> 00:03:11,070
Otherwise it would have been tiny.

50
00:03:12,000 --> 00:03:14,190
And because I want to see it nice and clear.

51
00:03:15,030 --> 00:03:21,210
So then we do the same subplot iteration thing, where we specify

52
00:03:21,220 --> 00:03:22,650
the columns and rows.

53
00:03:23,070 --> 00:03:28,170
We set the title. Now, actually, we're not picking a random index here;

54
00:03:28,170 --> 00:03:31,620
we're plotting the first 50 images from our training data.

55
00:03:32,010 --> 00:03:39,570
We also just pull the first 50 labels, sorry, and we just plot them here as the title.

56
00:03:40,110 --> 00:03:45,330
We turn off the axis so it's just a nice, clean image, and we just show everything here, and we

57
00:03:45,330 --> 00:03:49,500
assign the colormap to gray to get a black-and-white type of

58
00:03:49,500 --> 00:03:49,830
look.

59
00:03:50,250 --> 00:03:51,300
And that's it.
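
The subplot grid walked through above could be sketched as follows. The function name plot_first_images, the figure size, the 10-column layout and the placeholder arrays are all assumptions, since the transcript only says that a figure is created with a larger size, the first 50 images are plotted with their labels as titles, the axes are turned off, and a gray colormap is used.

```python
import math

import numpy as np
import matplotlib.pyplot as plt


def plot_first_images(images, labels, count=50, columns=10):
    """Plot the first `count` images in a grid, titled with their labels."""
    rows = math.ceil(count / columns)
    # A larger figure size keeps each small 28x28 image readable.
    fig = plt.figure(figsize=(16, 10))
    for i in range(count):
        ax = fig.add_subplot(rows, columns, i + 1)  # subplot positions start at 1
        ax.set_title(str(labels[i]))                # ground truth label as title
        ax.axis("off")                              # hide ticks and frame
        ax.imshow(images[i], cmap="gray")           # black-and-white look
    return fig


# Placeholder data (assumption); with the real dataset pass x_train and y_train.
x_demo = np.random.randint(0, 256, size=(50, 28, 28)).astype(np.uint8)
y_demo = np.arange(50) % 10

fig_grid = plot_first_images(x_demo, y_demo)
# In a notebook the figure renders inline; in a script, call plt.show().
```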

60
00:03:51,720 --> 00:03:57,480
That's quite simple, and it's quite nice to visualize your data before training your CNNs. It's

61
00:03:57,480 --> 00:04:01,050
always good to do this, just to double check everything as a sanity check.

62
00:04:01,380 --> 00:04:02,790
So I'll stop there for now.

63
00:04:02,790 --> 00:04:08,190
And then I'll join you in the next section where we talk about pre-processing our data similar to how

64
00:04:08,190 --> 00:04:11,250
we did the transforms in PyTorch. In TensorFlow Keras,

65
00:04:11,640 --> 00:04:12,900
we do it slightly differently.

66
00:04:12,900 --> 00:04:14,370
We have to do it a bit more manually.

67
00:04:14,730 --> 00:04:18,180
But don't worry, it's quite easy stuff, so I'll see you in the next section.
