WEBVTT

00:00.450 --> 00:03.240
-: Hey guys, Eden here, I hope you enjoyed the course

00:03.240 --> 00:06.060
and I would really appreciate it if you could write me a review.

00:06.060 --> 00:07.770
This really, really helps.

00:07.770 --> 00:10.920
Now I want to do a recap of this entire course,

00:10.920 --> 00:13.860
what we've done so far, and where to go next.

00:13.860 --> 00:15.330
So the main idea for this course

00:15.330 --> 00:17.190
was to give you an introduction

00:17.190 --> 00:18.750
and an elaborate introduction

00:18.750 --> 00:21.090
to LLM application development.

00:21.090 --> 00:23.970
We covered two super important patterns

00:23.970 --> 00:27.030
when it comes to developing LLM applications.

00:27.030 --> 00:28.830
The first one is using agents

00:28.830 --> 00:32.490
and leveraging the reasoning capabilities of the LLM,

00:32.490 --> 00:37.490
creating agents which can execute non-deterministic actions,

00:37.680 --> 00:39.630
so this was the first part.

00:39.630 --> 00:41.280
And the second part was dedicated

00:41.280 --> 00:45.060
to retrieval augmentation, utilizing vector stores

00:45.060 --> 00:47.280
and semantic search and embeddings

00:47.280 --> 00:50.910
in order to chat over our proprietary data

00:50.910 --> 00:52.860
and leveraging this capability.

00:52.860 --> 00:55.800
So every LLM application is either using this pattern

00:55.800 --> 00:59.520
or that pattern or simply making a call to the LLM directly,

00:59.520 --> 01:00.990
and you now have the tools

01:00.990 --> 01:02.760
in order to create those applications.

01:02.760 --> 01:04.800
So now let's talk about what's next

01:04.800 --> 01:07.440
and what I suggest you do in your learning journey

01:07.440 --> 01:09.333
of LLM application development.

01:10.620 --> 01:12.960
So if you've been following along with this course,

01:12.960 --> 01:16.290
you probably noticed that developing an LLM application

01:16.290 --> 01:18.060
is not that simple.

01:18.060 --> 01:19.950
Coming up with this perfect prompt,

01:19.950 --> 01:22.590
which would get the best response from the LLM,

01:22.590 --> 01:26.520
requires a lot of work and investing a lot of time.

01:26.520 --> 01:29.370
And what happens if the LLM suddenly changes

01:29.370 --> 01:32.010
and now the prompt is not working as expected?

01:32.010 --> 01:35.790
And what if we want to consider other LLMs and other models

01:35.790 --> 01:40.380
which are faster, maybe cheaper, or maybe more secure?

01:40.380 --> 01:42.450
So we would need to take this prompt

01:42.450 --> 01:44.940
and adjust it to the other LLM.

01:44.940 --> 01:47.909
So prompt management is a really important issue

01:47.909 --> 01:50.970
when it comes to LLM application development.

01:50.970 --> 01:53.340
We also need to monitor our LLMs.

01:53.340 --> 01:56.130
So we want to see how fast we are getting our response,

01:56.130 --> 01:57.390
what's the latency?

01:57.390 --> 01:59.850
How much does each request cost us

01:59.850 --> 02:03.300
and how much are we going to pay to the LLM vendors?

02:03.300 --> 02:06.360
We also have debugging, when something goes wrong,

02:06.360 --> 02:08.340
and we want to understand why the LLM

02:08.340 --> 02:10.560
is not returning the correct response

02:10.560 --> 02:12.330
and the response that we want.

02:12.330 --> 02:13.740
And that can be a challenge,

02:13.740 --> 02:16.050
especially when we're dealing with agents.

02:16.050 --> 02:17.640
And then, of course, evaluation.

02:17.640 --> 02:19.560
Even if we use the LLM,

02:19.560 --> 02:21.420
how can we know that the response

02:21.420 --> 02:23.760
that we get from the LLM is good?

02:23.760 --> 02:25.380
So we want to have tools

02:25.380 --> 02:27.840
that help us automate all those processes,

02:27.840 --> 02:29.550
because if we're going to evaluate

02:29.550 --> 02:31.290
the LLM response manually,

02:31.290 --> 02:35.040
this would take a lot of time and it doesn't scale.

02:35.040 --> 02:38.220
So all of those problems are being bundled together

02:38.220 --> 02:42.810
into a new evolving field called LLMOps, LLM operations.

02:42.810 --> 02:45.390
And there are several popular tools

02:45.390 --> 02:47.250
which can help us solve these problems

02:47.250 --> 02:50.790
and perform better LLM operations.

02:50.790 --> 02:53.853
One famous one is LangSmith by LangChain.

02:54.930 --> 02:57.750
And LangSmith is a unified platform

02:57.750 --> 02:59.580
that enables developers

02:59.580 --> 03:01.950
to build production-grade LLM applications.

03:01.950 --> 03:04.830
And it helps with debugging, testing, evaluating,

03:04.830 --> 03:07.470
and monitoring all of the applications,

03:07.470 --> 03:09.300
allowing us developers

03:09.300 --> 03:13.260
to have quick and efficient development life cycles.

03:13.260 --> 03:15.780
So LangSmith currently is not open source,

03:15.780 --> 03:18.000
so this is something you should consider,

03:18.000 --> 03:20.460
but if you want an open source alternative,

03:20.460 --> 03:21.928
you can use Pezzo

03:21.928 --> 03:25.320
and Pezzo really helps with prompt management and tracing

03:25.320 --> 03:28.350
and monitoring our LLM operations.

03:28.350 --> 03:31.800
So we should also check out Pezzo, in my opinion.

03:31.800 --> 03:34.680
I want to talk about LLM security for a moment.

03:34.680 --> 03:35.820
So in this course,

03:35.820 --> 03:38.430
we haven't really talked about this subject.

03:38.430 --> 03:39.480
So there's a difference

03:39.480 --> 03:41.790
between developing an LLM application locally

03:41.790 --> 03:44.100
and deploying it into production,

03:44.100 --> 03:46.410
because once it's deployed in production

03:46.410 --> 03:48.600
and we have real customers using it,

03:48.600 --> 03:52.230
then we need to make sure that our LLM application

03:52.230 --> 03:53.880
is threat safe.

03:53.880 --> 03:56.010
And that's a pretty challenging thing to do

03:56.010 --> 03:59.490
because large language models and their usage

03:59.490 --> 04:01.890
bring a lot of new attack vectors.

04:01.890 --> 04:06.030
So things like prompt injection and agents accessing data

04:06.030 --> 04:07.590
that they shouldn't access.

04:07.590 --> 04:10.200
So those things are real world problems

04:10.200 --> 04:12.420
that we need to take into account.

04:12.420 --> 04:15.507
So recently, LangChain made a huge repository change

04:15.507 --> 04:18.030
where all the code that was not safe

04:18.030 --> 04:20.610
and held new vulnerabilities

04:20.610 --> 04:23.970
was moved to this experimental directory.

04:23.970 --> 04:27.690
So overall, security is something important in my opinion,

04:27.690 --> 04:29.160
and maybe I'm just mentioning it

04:29.160 --> 04:31.710
because I come from a security background,

04:31.710 --> 04:34.470
but I think it's important to know and to explore,

04:34.470 --> 04:37.290
especially when we are dealing with LLMs.

04:37.290 --> 04:41.940
Now let's talk about resources and where I suggest you go

04:41.940 --> 04:43.380
to find new information

04:43.380 --> 04:47.340
when it comes to generative AI application development,

04:47.340 --> 04:51.630
and LangChain, LangSmith, other frameworks, all of that.

04:51.630 --> 04:55.170
So really the two things I highly suggest you do,

04:55.170 --> 04:58.380
and in my opinion, it would cover most of it,

04:58.380 --> 05:01.050
are, one, to follow the LangChain blog,

05:01.050 --> 05:03.510
since every week they release a new blog post,

05:03.510 --> 05:08.430
and new ideas, new implementations about generative AI

05:08.430 --> 05:10.800
and gen AI application development.

05:10.800 --> 05:14.010
And I think it's an excellent source of information

05:14.010 --> 05:15.720
for you to keep going.

05:15.720 --> 05:19.110
And the other thing I highly suggest is joining Twitter

05:19.110 --> 05:22.260
and reading about generative AI there.

05:22.260 --> 05:24.360
There are a lot of people that post things

05:24.360 --> 05:28.710
about research and about new applications,

05:28.710 --> 05:32.750
new use cases, and new ways to optimize things.

05:32.750 --> 05:35.160
So I highly recommend joining Twitter

05:35.160 --> 05:38.190
and following it there because all the news,

05:38.190 --> 05:42.120
all of that, is being streamed directly to Twitter.

05:42.120 --> 05:43.500
So that's pretty much it.

05:43.500 --> 05:45.482
I do update this course frequently.

05:45.482 --> 05:48.480
So when something new and important comes out,

05:48.480 --> 05:50.460
I do update this course.

05:50.460 --> 05:52.650
And I hope you enjoyed this course,

05:52.650 --> 05:54.510
and if you do want to support me,

05:54.510 --> 05:56.310
I'd appreciate a Udemy review.

05:56.310 --> 05:58.740
This really helps me reach people around the world,

05:58.740 --> 06:00.753
and thank you so much.