1
00:00:00,690 --> 00:00:01,920
So welcome back to the lesson.

2
00:00:02,220 --> 00:00:05,280
So now we're going to talk about transforms.

3
00:00:05,880 --> 00:00:08,640
So what are transforms, and why are they used?

4
00:00:08,700 --> 00:00:10,830
Well, let's take a look at something here.

5
00:00:11,900 --> 00:00:14,670
Let's look at the code before we go over the explanation of all of this.

6
00:00:15,270 --> 00:00:21,230
Now, when we load data into our deep learning libraries, we want to normalize and standardize that

7
00:00:21,240 --> 00:00:22,770
data for several reasons.

8
00:00:23,160 --> 00:00:29,010
Having data that's not normalized can cause a lot of issues in training and place a lot more importance

9
00:00:29,010 --> 00:00:33,120
on certain variables, or pixels in our case, when it shouldn't.

10
00:00:33,540 --> 00:00:39,840
So if you want to normalize the values, a good way to do it is just to scale the values, because initially,

11
00:00:40,320 --> 00:00:45,420
as the comment on this line says, our raw pixel values in the MNIST dataset range from 0 to 255, which you

12
00:00:45,420 --> 00:00:49,350
would have seen in the OpenCV lessons; that's basically the default.

13
00:00:49,930 --> 00:00:56,580
That's the scale used to represent the intensities of colors, or the intensity of grays in a grayscale image.

14
00:00:57,150 --> 00:00:59,160
So it has this range of 0 to 255.

15
00:01:00,120 --> 00:01:07,590
And when we have to normalize it, a good normalization practice is to set the range from one to minus

16
00:01:07,590 --> 00:01:07,890
one.

17
00:01:08,520 --> 00:01:15,780
That way, we have that negative-positive balance, and none of the pixel values can

18
00:01:15,780 --> 00:01:18,700
have a super high weighting compared to the others.

19
00:01:18,720 --> 00:01:24,300
It's just something that deep learning practitioners have observed: normalization improves things

20
00:01:24,300 --> 00:01:27,000
drastically, so it's always a good practice to normalize.

21
00:01:27,030 --> 00:01:29,100
Now, you don't have to, but it's a very good practice.

22
00:01:29,640 --> 00:01:31,320
So what do transforms do?

23
00:01:31,830 --> 00:01:37,980
Basically, this is a torchvision function that allows us to create a pipeline.

24
00:01:38,190 --> 00:01:45,870
And this pipeline basically performs a series of operations

25
00:01:45,870 --> 00:01:49,770
on the image before sending it into the neural net to train.

26
00:01:50,700 --> 00:01:57,930
So the first function that we have to use is transforms.ToTensor.

27
00:01:58,500 --> 00:02:06,930
This converts the image that we loaded into a PyTorch tensor, and a PyTorch tensor is

28
00:02:06,930 --> 00:02:07,910
basically the same.

29
00:02:08,320 --> 00:02:12,360
Remember your arrays and lists and NumPy representations of images?

30
00:02:12,750 --> 00:02:16,370
A PyTorch tensor is basically the torch version of that.

31
00:02:16,380 --> 00:02:22,410
It's just a matrix, but it's in tensor form, which allows us to use these tensors

32
00:02:22,410 --> 00:02:23,160
on GPUs.

33
00:02:24,210 --> 00:02:29,730
So that's the first normalization-type operation that we use.

34
00:02:30,270 --> 00:02:33,900
The second one is the normalization one, which we just spoke about.
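The pipeline idea can be sketched in plain Python. This is only a stand-in, not the torchvision implementation: `compose`, `to_unit_range`, and `normalize` are hypothetical helpers, and the image is a flat list of pixel values rather than a tensor. In torchvision, the corresponding pipeline would be written as `transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])`, where `ToTensor` also rescales 8-bit pixel values from 0 to 255 down to 0.0 to 1.0.

```python
from functools import reduce


def compose(steps):
    """Return a function that applies each step in order (hypothetical helper)."""
    def pipeline(image):
        return reduce(lambda data, step: step(data), steps, image)
    return pipeline


def to_unit_range(pixels):
    """Rescale 8-bit pixel values (0 to 255) to floats in 0.0 to 1.0."""
    return [p / 255 for p in pixels]


def normalize(mean, std):
    """Build a step that computes (value - mean) / std for every pixel."""
    def step(pixels):
        return [(p - mean) / std for p in pixels]
    return step


# Same order as in the lesson: convert first, then normalize.
transform = compose([to_unit_range, normalize(0.5, 0.5)])

result = transform([0, 255])  # darkest and brightest raw pixel values
print(result)  # [-1.0, 1.0]
```

The order of the steps matters here: the 0.5 mean and standard deviation only map onto the minus one to one range because the values were rescaled to 0.0 to 1.0 first.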

35
00:02:34,620 --> 00:02:36,900
Now these parameters could look confusing.

36
00:02:37,020 --> 00:02:38,820
Why 0.5 and 0.5?

37
00:02:39,300 --> 00:02:44,160
Well, there's a reason for that, and that's because we want to normalize between one and minus one.

38
00:02:44,550 --> 00:02:47,670
So how does setting this to 0.5 and 0.5 do that?

39
00:02:47,790 --> 00:02:54,660
Well, this normalization function takes tuples as its input type.

40
00:02:54,670 --> 00:02:58,950
This one is for the mean, and this one is for the standard deviation.

41
00:02:59,580 --> 00:03:03,990
So because we're using a grayscale image, we just put a comma here to satisfy

42
00:03:03,990 --> 00:03:05,430
the tuple data type standard.

43
00:03:06,000 --> 00:03:11,880
And we don't put in the other two 0.5s, because if you look at this line here, for color images,

44
00:03:11,880 --> 00:03:17,310
or even this line here, for RGB color images, we would use 0.5, 0.5, 0.5 and

45
00:03:17,310 --> 00:03:23,790
0.5, 0.5, 0.5, meaning that we're setting the mean for all the RGB channels to 0.5 and

46
00:03:23,790 --> 00:03:28,430
the standard deviation for each channel to 0.5 as well.

47
00:03:28,950 --> 00:03:30,270
That's why we have these values.
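The trailing comma matters because of how Python parses parentheses. A small sketch of the grayscale and RGB arguments follows; the variable names are hypothetical and only illustrate the shape of the values.

```python
# One value per channel: a single channel for grayscale, three for RGB.
gray_mean = (0.5,)          # trailing comma makes this a one-element tuple
not_a_tuple = (0.5)         # without the comma this is just the float 0.5

rgb_mean = (0.5, 0.5, 0.5)  # one mean per R, G, B channel
rgb_std = (0.5, 0.5, 0.5)   # one standard deviation per R, G, B channel

print(type(gray_mean).__name__)    # tuple
print(type(not_a_tuple).__name__)  # float
print(len(rgb_mean), len(rgb_std))  # 3 3
```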

48
00:03:30,270 --> 00:03:32,920
And now, this is how the normalization is done.

49
00:03:32,940 --> 00:03:33,390
Why

50
00:03:33,420 --> 00:03:34,050
0.5?

51
00:03:34,210 --> 00:03:41,490
Well, the output image is equal to the input image minus the mean, divided by

52
00:03:41,490 --> 00:03:42,450
the standard deviation.

53
00:03:42,900 --> 00:03:45,630
This is how we scale the values between one and minus one.

54
00:03:46,290 --> 00:03:52,950
So the minimum value would be 0 minus 0.5, divided by 0.5, which is equal to minus one,

55
00:03:52,950 --> 00:03:55,200
and the max value uses this formula as well.

56
00:03:55,530 --> 00:03:56,820
And that's how we end up.

57
00:03:56,940 --> 00:04:02,040
We use this formula to scale the image data between one and minus one.
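As a worked check of the formula, here is a minimal sketch. It assumes the pixel values have already been rescaled from 0 to 255 down to 0.0 to 1.0 before normalization, and the function name `normalize_pixel` is hypothetical.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Apply output = (input - mean) / std to a single pixel value."""
    return (value - mean) / std


low = normalize_pixel(0.0)   # darkest pixel:   (0.0 - 0.5) / 0.5
mid = normalize_pixel(0.5)   # mid gray:        (0.5 - 0.5) / 0.5
high = normalize_pixel(1.0)  # brightest pixel: (1.0 - 0.5) / 0.5

print(low, mid, high)  # -1.0 0.0 1.0
```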

58
00:04:02,160 --> 00:04:02,910
It's quite simple.

59
00:04:03,780 --> 00:04:09,330
So that takes care of our normalization, or the transformation that does normalization here.

60
00:04:09,780 --> 00:04:15,360
You can do a lot more different transformations here, but we'll talk about that in later lessons.

61
00:04:15,570 --> 00:04:17,910
But for now, this is what we're going to need to use.

62
00:04:18,810 --> 00:04:23,820
So let's stop there, and in the next section, we'll take a look and inspect our dataset.

63
00:04:24,180 --> 00:04:28,110
We'll visualize it and understand the dimensions of it.

64
00:04:28,500 --> 00:04:30,600
So I'll see you in the next section.
