1
00:00:00,600 --> 00:00:02,570
Hi and welcome back to the course.

2
00:00:02,640 --> 00:00:07,860
So in this section, we'll take a look at a course overview because it's a very big course.

3
00:00:08,370 --> 00:00:14,580
But firstly, before we begin, let's take a step back and let's ask the question how do we do computer

4
00:00:14,580 --> 00:00:15,000
vision?

5
00:00:15,030 --> 00:00:16,650
What makes it possible effectively?

6
00:00:17,100 --> 00:00:19,290
So you will need a programming language.

7
00:00:19,290 --> 00:00:24,030
That's how we actually put together our computer vision pipelines and algorithms.

8
00:00:24,450 --> 00:00:27,950
So we will need a very good programming language to do this.

9
00:00:27,960 --> 00:00:32,130
And the best one right now for computer vision is definitely Python.

10
00:00:32,580 --> 00:00:38,490
Now you can do a lot of the things I discuss in this course in MATLAB or C++ or Java.

11
00:00:39,030 --> 00:00:45,780
However, MATLAB isn't free, and C++ and Java are much more low level and a nightmare to work with,

12
00:00:45,780 --> 00:00:48,120
in my opinion, sometimes at least compared to Python.

13
00:00:48,660 --> 00:00:53,020
So Python is our language of choice, and the reason for that,

14
00:00:53,040 --> 00:00:58,640
and I'll give some more reasons now, is that it's very easy to learn and it's accessible. It's also the

15
00:00:58,710 --> 00:01:04,590
language of artificial intelligence. All of the modern-day data science and deep learning libraries like

16
00:01:04,590 --> 00:01:11,790
PyTorch or TensorFlow are basically built to work with Python, or with Python in mind. Actually, even

17
00:01:11,790 --> 00:01:17,610
though they were coded in C++, they're designed to work in Python, as Python has a wrapper that

18
00:01:17,610 --> 00:01:20,340
basically executes the C++ code underneath.

19
00:01:20,670 --> 00:01:27,570
So you don't even have to know that there's C++ going on under the hood. And OpenCV has a very good Python

20
00:01:27,570 --> 00:01:32,580
library, which is what we'll be using for the classical computer vision part of this course.

21
00:01:33,150 --> 00:01:37,410
So when I say classical computer vision, what am I actually talking about?

22
00:01:37,920 --> 00:01:43,920
Well, basically this encompasses all of the computer vision algorithms that don't involve machine learning.

23
00:01:44,400 --> 00:01:50,040
There are a lot of different mathematical techniques that we can use to do things like edge detection,

24
00:01:50,400 --> 00:01:53,760
even some simple cascade classifiers.

25
00:01:54,270 --> 00:01:56,370
Basically, I mean, they do use machine learning, to be fair.

26
00:01:56,700 --> 00:02:02,160
But there are a lot of things we can do within OpenCV that are considered classical computer vision,

27
00:02:02,160 --> 00:02:06,960
and it's very important for your foundational knowledge of computer vision.

28
00:02:07,680 --> 00:02:09,990
So what about deep learning?

29
00:02:10,140 --> 00:02:16,770
Well, deep learning changed the game, as I mentioned in the previous section, because basically all

30
00:02:16,770 --> 00:02:22,170
of these traditional classical computer vision algorithms, they could only get us so far.

31
00:02:22,440 --> 00:02:25,260
There were a lot of things we could do, but there were a lot of things we couldn't do.

32
00:02:25,830 --> 00:02:27,420
And deep learning made it possible.

33
00:02:27,430 --> 00:02:33,810
Things like image classification, where we had many more classes than we were used to, like things

34
00:02:33,810 --> 00:02:39,750
like even handwritten digits. Deep learning basically dominated those benchmarks for handwritten

35
00:02:39,750 --> 00:02:40,350
digits.

36
00:02:40,710 --> 00:02:45,960
Then there were things like the CIFAR and ImageNet datasets with thousands or hundreds of classes, and

37
00:02:46,320 --> 00:02:50,730
deep learning basically excelled at those types of objectives.

38
00:02:51,360 --> 00:02:57,000
So the reason why deep learning has taken off so much right now is mainly because of two reasons.

39
00:02:57,510 --> 00:03:02,940
And when I say taken off, in the last maybe five to 10 years, especially in the last five years, deep

40
00:03:02,940 --> 00:03:09,030
learning has had phenomenal success in industrial and everyday applications, not just research anymore.

41
00:03:09,480 --> 00:03:13,350
And that's because of libraries like TensorFlow, Keras and Theano.

42
00:03:13,380 --> 00:03:16,050
They basically kicked off the deep learning revolution.

43
00:03:16,560 --> 00:03:21,090
Right now, though, it's only basically PyTorch and TensorFlow that are being used.

44
00:03:21,570 --> 00:03:24,880
So those are the mature deep learning libraries that we have right now.

45
00:03:24,900 --> 00:03:26,820
PyTorch was made by Facebook.

46
00:03:27,150 --> 00:03:33,540
TensorFlow was made by Google, and Keras was a layer that sat on top of TensorFlow.

47
00:03:33,840 --> 00:03:39,290
And Google has recently incorporated Keras into TensorFlow to make it much more accessible.

48
00:03:39,300 --> 00:03:41,100
In my opinion, it's easier to work with.

49
00:03:41,520 --> 00:03:47,940
However, you don't get the fine grained control that PyTorch offers, and the big reason why deep learning

50
00:03:47,940 --> 00:03:54,510
is taking off is actually because GPUs are now so accessible, and NVIDIA GPUs in particular.

51
00:03:54,720 --> 00:04:01,620
Even though you can use AMD, there are some libraries like Vulkan that can support AMD deep learning,

52
00:04:01,620 --> 00:04:07,500
but predominantly 99 percent of deep learning is done on NVIDIA CUDA GPUs, or 99.99

53
00:04:07,500 --> 00:04:08,190
percent.

54
00:04:08,640 --> 00:04:14,670
And that's because CUDA libraries allow us to train our deep learning networks much, much quicker because

55
00:04:14,670 --> 00:04:21,750
of all the low-level coding that they've done to make CUDA such an important library for deep learning.

56
00:04:22,620 --> 00:04:27,030
So here are some deep learning examples. You have image classification.

57
00:04:27,030 --> 00:04:28,410
You can have object detection.

58
00:04:28,410 --> 00:04:33,540
These are all things that deep learning excels at, even deepfakes too, but especially deepfakes.

59
00:04:34,140 --> 00:04:40,500
You can also use deep learning for things like tumor identification or any anomalies in medical imaging,

60
00:04:40,950 --> 00:04:46,210
as well as style transfer, things like body pose estimation, segmentation.

61
00:04:46,230 --> 00:04:49,770
Look how well this segmentation model has worked in the scene.

62
00:04:50,430 --> 00:04:55,470
And then these are things like more advanced body pose estimation that we can get a mesh around the

63
00:04:55,470 --> 00:04:57,840
skeleton and identify each limb.

64
00:04:57,870 --> 00:04:58,680
It's quite good.

65
00:05:00,000 --> 00:05:01,320
And you can see that being played here.

66
00:05:02,580 --> 00:05:09,510
So if we were to separate what's deep learning and what's classical: well, deep learning basically

67
00:05:09,510 --> 00:05:15,400
adapts to new images quite well, assuming that they're similar to the data it was trained on. In classical

68
00:05:15,420 --> 00:05:21,930
computer vision, small changes in an image can have big negative effects because it's basically hardcoded.

69
00:05:21,990 --> 00:05:27,600
A lot of it requires us to manually tune things on these algorithms, which isn't the best

70
00:05:27,630 --> 00:05:30,840
if you have a real-world situation, whereas deep learning

71
00:05:31,260 --> 00:05:36,720
will excel, actually, once it's trained on various lighting conditions. Deep learning, basically,

72
00:05:37,020 --> 00:05:39,300
also, as I said, requires models to be trained.

73
00:05:39,790 --> 00:05:44,910
However, classical computer vision is usually a hard-coded algorithm, so it doesn't require any

74
00:05:44,910 --> 00:05:48,720
training, and it runs fairly well and should be used.

75
00:05:49,170 --> 00:05:54,900
It can be run on GPUs, but traditionally it's always run on CPUs because it's easier and it lends itself

76
00:05:54,900 --> 00:05:57,660
naturally to x86 processors.

77
00:05:58,680 --> 00:06:05,040
However, deep learning definitely requires a GPU, but that's a good thing because you can get so much

78
00:06:05,130 --> 00:06:09,330
computational improvement in performance or speed up using GPU hardware.

79
00:06:09,870 --> 00:06:13,230
So let's take a look at our OpenCV outline.

80
00:06:13,260 --> 00:06:19,050
So as you can see, I'm not going to read all of this because it's 40 different topics in OpenCV,

81
00:06:19,470 --> 00:06:21,360
but we cover a wide range.

82
00:06:21,360 --> 00:06:23,580
We go through all of the basics here.

83
00:06:24,180 --> 00:06:26,280
So you get all the foundational knowledge here.

84
00:06:26,760 --> 00:06:28,740
Then we start to do things like edge detection.

85
00:06:29,190 --> 00:06:30,960
Contouring is quite important as well.

86
00:06:31,500 --> 00:06:33,480
Line, circle and blob detection.

87
00:06:33,930 --> 00:06:35,910
Then we use Haar cascade classifiers.

88
00:06:35,910 --> 00:06:39,690
Perspective transforms, k-means clustering for dominant colors.

89
00:06:39,690 --> 00:06:41,460
Image similarity comparisons.

90
00:06:42,590 --> 00:06:48,170
Then there's a bunch of other things here with a little tracking, with optical flow, mean shift, facial landmarks,

91
00:06:48,170 --> 00:06:51,380
and even some simple facial recognition algorithms can be used here.

92
00:06:52,190 --> 00:06:59,870
Background removal. We use some OCR libraries like PyTesseract and EasyOCR. Barcode reading. We can

93
00:06:59,870 --> 00:07:03,470
import YOLO into OpenCV, YOLO version three, and run it.

94
00:07:03,470 --> 00:07:05,600
That's quite easy and nice to use.

95
00:07:06,170 --> 00:07:10,010
We can do a bunch of other things, some computational photography as well.

96
00:07:10,370 --> 00:07:17,040
So then we go into working with video because this is a very important part of computer vision.

97
00:07:17,060 --> 00:07:21,560
A lot of people know how to work with images, but they don't know how to work with video.

98
00:07:21,570 --> 00:07:27,890
So I go through a few topics on how to load video, how to implement functions on it, how to import RTSP,

99
00:07:28,250 --> 00:07:34,280
USB and IP streams, how to automatically reconnect to a stream, how to capture, like if you have a video

100
00:07:34,280 --> 00:07:39,770
playing on your computer, you can capture screenshots live and bring them into OpenCV, as well as importing

101
00:07:39,770 --> 00:07:40,720
YouTube videos.

102
00:07:40,790 --> 00:07:43,340
So it's a very useful section, in my opinion.

103
00:07:44,370 --> 00:07:47,260
Then we have the deep learning outline, which is quite big.

104
00:07:47,270 --> 00:07:53,360
So firstly, we do a deep dive into deep learning and computer vision covering all of these topics in

105
00:07:53,360 --> 00:07:53,900
detail.

106
00:07:54,950 --> 00:08:01,730
Then we take a look at basically building our own convolutional neural network models using both PyTorch

107
00:08:01,730 --> 00:08:03,020
and TensorFlow Keras.

108
00:08:03,560 --> 00:08:11,750
We basically go through all of the processes you need to know to load data, train a model

109
00:08:11,750 --> 00:08:13,250
and run inference on that model.

110
00:08:14,270 --> 00:08:18,380
Then we compare libraries, we analyze the performance of our models.

111
00:08:18,860 --> 00:08:23,330
Then we learn how to improve our models with regularization and avoiding overfitting.

112
00:08:23,900 --> 00:08:30,410
Then we go into understanding what CNNs see, so you can get a better idea of how your model is working

113
00:08:30,410 --> 00:08:32,690
or how it's being trained and what it's responding to.

114
00:08:33,770 --> 00:08:37,250
Next, we take a look at more advanced CNN topics.

115
00:08:37,370 --> 00:08:43,970
We basically go over some design principles, then take a look at several modern CNN architectures.

116
00:08:44,390 --> 00:08:48,980
Some of these are older models, basically traditional networks like these three here.

117
00:08:49,430 --> 00:08:52,830
However, we take a look at some modern ones: ResNets.

118
00:08:52,850 --> 00:08:56,390
They're not that modern, but are one of the best. DenseNet, EfficientNet.

119
00:08:56,990 --> 00:09:03,320
Then we take a look at how to implement them, how to do top-1 to top-5 accuracies, implement callbacks

120
00:09:03,320 --> 00:09:03,830
as well.

121
00:09:04,370 --> 00:09:09,980
Then we take a look at a PyTorch library called PyTorch Lightning, which is a very useful library

122
00:09:09,980 --> 00:09:16,250
and allows us to do some complicated things like training on multiple GPUs. It's easy to implement

123
00:09:16,250 --> 00:09:17,450
transfer learning on it.

124
00:09:17,780 --> 00:09:20,600
It does auto batch selection, auto learning rate selection.

125
00:09:20,960 --> 00:09:22,280
So it's quite a useful library.

126
00:09:22,850 --> 00:09:26,600
Next, we take a look at transfer learning in both TensorFlow Keras and PyTorch.

127
00:09:27,110 --> 00:09:33,890
Then we do Google Deep Dream, neural style transfer, autoencoders, and take a look at generative adversarial

128
00:09:33,890 --> 00:09:39,170
networks, where we do some pretty cool projects like generating anime characters or the ArcaneGAN project.

129
00:09:40,220 --> 00:09:46,580
Then we take a look at Siamese networks, which directly lead to facial recognition, and we use

130
00:09:46,580 --> 00:09:50,000
DeepFace as well to enhance our facial recognition algorithms.

131
00:09:50,570 --> 00:09:54,050
Then we take a look at a lot of different object detectors.

132
00:09:54,410 --> 00:10:00,080
We basically cover the history and give a good overview of object detectors.

133
00:10:00,470 --> 00:10:06,110
Then we start using facial detection to assess these and the big one.

134
00:10:06,170 --> 00:10:06,710
YOLO.

135
00:10:07,070 --> 00:10:12,460
We actually take a look at YOLO versions three, four and five, as well as PyTorch implementations

136
00:10:12,470 --> 00:10:13,780
and Darknet implementations.

137
00:10:14,210 --> 00:10:19,370
And we do a number of different projects, as you can see here, with all of these different GitHub

138
00:10:19,370 --> 00:10:23,600
datasets. It's a very good, very thorough object detection outline.

139
00:10:24,650 --> 00:10:31,400
Then we take a look at deep segmentation, where we look at U-Net, SegNet and DeepLab version three, which

140
00:10:31,400 --> 00:10:32,270
isn't mentioned here.

141
00:10:32,270 --> 00:10:35,600
But it is in the course, as well as Mask R-CNNs.

142
00:10:36,530 --> 00:10:41,540
Then we take a look at Deep SORT, which is deep tracking, which is a very important topic.

143
00:10:42,530 --> 00:10:47,510
Then we take a look at some miscellaneous tutorials where we make deepfakes.

144
00:10:47,510 --> 00:10:54,320
Take a look at vision transformers, depth estimation, image similarity, image captioning, video

145
00:10:54,320 --> 00:11:01,640
classification, point cloud classification and segmentation, 3D image classification with CT scans and

146
00:11:01,640 --> 00:11:07,820
an X-ray pneumonia classification project, an OCR model for CAPTCHAs, as well as a web app.

147
00:11:08,000 --> 00:11:11,780
We do a REST API and then a Flask web app as well.

148
00:11:12,350 --> 00:11:14,600
So that's it for the overview.

149
00:11:15,200 --> 00:11:21,290
In the next section, I'll start kicking off a basic introduction to computer vision, so stay tuned

150
00:11:21,290 --> 00:11:21,690
for that.

151
00:11:21,770 --> 00:11:22,580
Thank you for watching.