1
00:00:03,590 --> 00:00:12,890
So this is a very interesting and important chapter, where we are going to explore how to use a

2
00:00:12,890 --> 00:00:17,030
LangChain application with an API.

3
00:00:17,030 --> 00:00:22,400
So our application can start communicating with the external world.

4
00:00:22,400 --> 00:00:22,970
Right.

5
00:00:22,970 --> 00:00:28,640
And we can connect it and integrate it with other pieces of software.

6
00:00:28,640 --> 00:00:39,230
So what we are going to do in this exercise is to connect a basic RAG application with a FastAPI

7
00:00:39,680 --> 00:00:40,430
API.

8
00:00:40,580 --> 00:00:49,070
So FastAPI is probably the most used framework for building APIs in the Python ecosystem.

9
00:00:49,340 --> 00:00:56,210
You have different options, but this one is one of the most popular in the LangChain

10
00:00:56,210 --> 00:00:56,780
world.

11
00:00:57,110 --> 00:01:03,530
So you will see that we will use the FastAPI package together with LangChain to create the API.

12
00:01:03,530 --> 00:01:08,150
And we will use Uvicorn to create the local server.

13
00:01:09,670 --> 00:01:21,190
The creation of the RAG application is familiar to you, so we will start by loading the .env file

14
00:01:21,190 --> 00:01:26,980
with the credentials, in order to be able to communicate with the OpenAI API.

15
00:01:26,980 --> 00:01:29,890
And then we will create the RAG application.

16
00:01:29,890 --> 00:01:32,680
We will create the LLM instance.

17
00:01:32,680 --> 00:01:34,810
We will import the loader.

18
00:01:34,810 --> 00:01:40,570
We will load a text file that is in the data directory.

19
00:01:40,570 --> 00:01:46,600
This is an article about startups.

20
00:01:46,600 --> 00:01:54,010
And the next thing we do is split this document into small chunks.

21
00:01:54,010 --> 00:02:03,580
Then we transform the chunks into embeddings, and we load the content into a vector database.

22
00:02:03,790 --> 00:02:09,250
Then we create a RetrievalQA chain.

23
00:02:09,250 --> 00:02:15,550
And with this chain we can start asking questions about our private document.
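To make the steps just described more concrete (split, embed, store, retrieve), here is a minimal sketch in plain Python. It is only a toy stand-in and not the LangChain code used in the exercise: the function names are hypothetical, the "embedding" is a simple word count rather than a real model embedding, and the "vector database" is just a Python list.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Cut text into overlapping character windows (stand-in for a text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector, not a real model embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]
```

In a real RAG application, the retrieved chunks would then be passed to the LLM together with the question; that final step is left out of this sketch.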

24
00:02:16,510 --> 00:02:24,400
So how do we connect this basic RAG application with FastAPI?

25
00:02:25,090 --> 00:02:27,370
So the first thing we need to do, if you don't have it,

26
00:02:27,370 --> 00:02:32,260
is to install the FastAPI package. You know how to do it.

27
00:02:32,260 --> 00:02:38,590
If you are using a terminal, you just write pip install fastapi, and if you want

28
00:02:38,590 --> 00:02:41,320
to do it via a Jupyter notebook,

29
00:02:41,320 --> 00:02:46,330
you can write the exclamation mark before it and then execute the cell.

30
00:02:46,330 --> 00:02:47,290
And that's it.

31
00:02:47,290 --> 00:02:50,680
You have here the pound sign because we want to comment this out.

32
00:02:50,710 --> 00:02:54,400
We don't want to do this operation because we have done it already.

33
00:02:54,670 --> 00:03:01,150
So then you import the FastAPI and HTTPException classes.

34
00:03:01,150 --> 00:03:02,680
You create the application.

35
00:03:02,680 --> 00:03:05,590
And in this case we are creating an endpoint.

36
00:03:05,590 --> 00:03:07,060
It's a POST endpoint.

37
00:03:07,060 --> 00:03:15,250
And you see that this endpoint is using the chain we just created in order to get the data.

38
00:03:15,250 --> 00:03:22,810
So once you have that, you can install Uvicorn to run the local server.

39
00:03:22,810 --> 00:03:28,600
And once you have that, this is the way you configure it using Jupyter Notebook.

40
00:03:28,600 --> 00:03:34,600
If you are using, for example, a Visual Studio Code editor instead of a Jupyter notebook, you can

41
00:03:34,600 --> 00:03:36,220
use this other approach.

42
00:03:36,220 --> 00:03:38,350
If you have any problem with this process,

43
00:03:38,350 --> 00:03:46,510
with the server, you just go to the usual places, ChatGPT, Google, or Stack Overflow, and ask about

44
00:03:46,510 --> 00:03:51,040
any doubt or problem you may have in the process, and you will find the answer there.

45
00:03:51,040 --> 00:03:58,360
But if everything is successful, and I think it is going to be because this is quite simple to do,

46
00:03:58,480 --> 00:04:07,540
then you will have a situation like this, where you can start the local server at this address

47
00:04:07,540 --> 00:04:08,650
and here.

48
00:04:10,510 --> 00:04:12,280
Vuiyasawa said this.

49
00:04:14,740 --> 00:04:23,320
You will see something like this, and you can click here on POST and "Try it out", and you can enter here

50
00:04:23,320 --> 00:04:24,490
your question.

51
00:04:24,640 --> 00:04:31,180
It would be like: what is this article

52
00:04:33,140 --> 00:04:36,650
about, in less than

53
00:04:38,900 --> 00:04:40,670
100 words?

54
00:04:42,830 --> 00:04:43,820
Execute.

55
00:04:46,920 --> 00:04:48,540
And here you have the response.
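The same request that the documentation page sends can also be made from Python code, which is how another application would talk to the API. The sketch below uses only the standard library and assumes a server listening at http://127.0.0.1:8000 with a hypothetical `/ask` endpoint that accepts a JSON body with a `question` field; adjust both to match your own endpoint.

```python
import json
import urllib.request


def build_question_request(
    question: str, base_url: str = "http://127.0.0.1:8000"
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the hypothetical /ask endpoint."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_server(question: str) -> dict:
    """Send the question to the running local server and decode the JSON reply."""
    with urllib.request.urlopen(build_question_request(question)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local server running, calling `ask_server("What is this article about in less than 100 words?")` would return the decoded JSON response.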

56
00:04:49,300 --> 00:04:58,240
Okay, so the most important thing in this chapter is to understand that our LLM application can end

57
00:04:58,240 --> 00:04:59,890
up in an API.

58
00:04:59,890 --> 00:05:07,300
So we can start plugging our LLM applications into other external applications to do

59
00:05:08,100 --> 00:05:10,140
a whole lot of different things.

60
00:05:10,440 --> 00:05:16,380
Remember that you can go to the documentation to know more, or to experiment further with this approach,

61
00:05:16,380 --> 00:05:26,760
and to find the special use case that is best for you, the particular API or approach

62
00:05:26,760 --> 00:05:29,340
that works better for your use case.