1
00:00:05,380 --> 00:00:10,210
In this lesson we will see how to use GPT-4 Vision.

2
00:00:13,750 --> 00:00:19,990
So using GPT-4 Vision is as easy as using GPT-4.

3
00:00:20,020 --> 00:00:23,590
In fact, it is the same user interface.

4
00:00:23,590 --> 00:00:25,780
You just have to open

5
00:00:25,780 --> 00:00:26,800
ChatGPT-4.

6
00:00:26,800 --> 00:00:27,790
Remember that

7
00:00:27,790 --> 00:00:36,160
ChatGPT-4 is the premium version of ChatGPT, while the regular free ChatGPT you have right

8
00:00:36,160 --> 00:00:38,800
now is the 3.5 version.

9
00:00:38,800 --> 00:00:49,360
And if you use the premium version, which in my opinion is very reasonably priced,

10
00:00:49,450 --> 00:00:53,110
this is the ChatGPT-4 version.

11
00:00:53,110 --> 00:01:00,880
So in the ChatGPT-4 version you have GPT-4 Vision included.

12
00:01:00,880 --> 00:01:12,580
So starting to use GPT-4 Vision is as easy as clicking the file upload button you have

13
00:01:12,580 --> 00:01:22,690
at the bottom of the screen in ChatGPT and loading an image. You can load a PNG file, a JPG

14
00:01:22,690 --> 00:01:35,830
file, a WebP file, and also non-animated GIF files, with a current limit of 20 MB per file.

15
00:01:35,950 --> 00:01:40,000
You can also load a table and that's it.

16
00:01:40,000 --> 00:01:44,770
You can start asking questions or requesting any action on it.

17
00:01:44,770 --> 00:01:55,870
So using GPT-4 Vision is extremely easy once you have GPT-4 available on your computer.

18
00:01:55,870 --> 00:01:59,620
But what can we do with GPT-4 Vision?

19
00:01:59,620 --> 00:02:04,510
Let's see the main use cases for it in the next lesson.

20
00:02:04,510 --> 00:02:11,890
This is going to be super interesting for you, not just to understand what GPT-4 Vision can do,

21
00:02:11,890 --> 00:02:23,170
but to understand what you can do with multimodal LLM applications that use GPT-4 Vision as the foundation

22
00:02:23,170 --> 00:02:23,770
model.

23
00:02:23,770 --> 00:02:26,080
We will see this in the next lesson.

