WEBVTT

00:00.080 --> 00:00.840
There you have it.

00:00.840 --> 00:06.280
Those are the five tricks that allow us to take calls to LLMs, with their inputs and the way we interpret the

00:06.280 --> 00:06.840
outputs,

00:07.080 --> 00:14.160
and do it in a way that gives the impression that we have something out there carrying out tasks autonomously

00:14.160 --> 00:14.880
for us.

00:15.040 --> 00:19.160
And don't worry if you didn't get all of this, because we're going to be doing this again and

00:19.160 --> 00:23.840
again and again, and by the end of it it's all going to connect wonderfully, but hopefully this is giving

00:23.840 --> 00:25.680
you some intuition for it.

00:25.800 --> 00:26.920
So what about the trap?

00:26.920 --> 00:28.920
I told you there were five tricks and a trap.

00:29.000 --> 00:34.120
So this trap is something that I find personally galling, and it happens all the time.

00:34.120 --> 00:38.520
And so I wanted to warn you about it and get your take on it.

00:38.560 --> 00:40.880
And I'm calling it the human trap.

00:41.120 --> 00:45.600
The proper word people use for this is anthropomorphizing.

00:45.880 --> 00:53.840
It's the problem of people approaching agentic AI by treating LLMs as if they're humans, as if they

00:53.840 --> 00:55.760
have roles and responsibilities.

00:55.760 --> 00:57.960
And let me tell you more about what I mean.

00:58.160 --> 01:03.710
It's so common when I hear business people have an objective, they want to automate a business process,

01:03.710 --> 01:05.830
something we'll be doing a lot on this course.

01:05.990 --> 01:11.990
And the first, go-to thought for business people, and for technology people too, for engineers

01:11.990 --> 01:19.750
who should know better, is to say, okay, let's create an agent architecture, an agent architecture,

01:19.750 --> 01:23.830
which is a diagram with different agents with lines between them.

01:24.030 --> 01:30.230
And I'm going to assign roles and responsibilities to these agents just by sort of analogy with the

01:30.230 --> 01:33.590
way humans go about doing this as if they are people.

01:33.590 --> 01:39.870
I'm going to have agents that represent different jobs, and I'm guilty of doing this myself.

01:39.870 --> 01:44.430
I do this in some of my courses teaching agentic engineering, and we build things like

01:44.430 --> 01:47.350
a trading floor and we have traders and researchers.

01:47.350 --> 01:50.070
It's so natural to go towards that.

01:50.070 --> 01:53.630
And when you're doing toy projects and demos, it's fine.

01:53.630 --> 01:59.310
You can do that because it's great fun to see it in progress, but it's not a disciplined way to do

01:59.310 --> 01:59.670
it.

01:59.710 --> 02:02.350
And it has lots and lots of problems.

02:02.350 --> 02:03.260
So what are the problems?

02:03.300 --> 02:10.900
Well, the big problem is that you have to keep in mind that what LLMs are good at doing is generating

02:10.900 --> 02:12.820
realistic content.

02:12.820 --> 02:17.220
That's what they're trained to do, generating stuff that seems compelling.

02:17.380 --> 02:22.260
And so if you give it a job, you say you are an evaluation agent.

02:22.260 --> 02:24.940
Your job is to evaluate what came before.

02:25.100 --> 02:26.900
You should give it marks out of ten.

02:27.140 --> 02:29.780
Then it will do that because that's what it's meant to do.

02:29.820 --> 02:33.660
And it will come up with a reason, because you've told it to give a reason.

02:33.860 --> 02:36.300
It doesn't mean that it's doing it well.

02:36.300 --> 02:40.380
It doesn't mean that this evaluation is aligned with your objectives.

02:40.500 --> 02:43.060
It just means it's going to follow the script.

02:43.060 --> 02:45.580
It's going to do what it's told to do in the prompt.

02:45.740 --> 02:50.780
So the danger, the risk of this, is that you can fool yourself into thinking you have this

02:50.780 --> 02:55.940
whole sort of cadre, this whole group of different agents all doing their thing, all generating

02:55.940 --> 02:58.860
very reasonable, very realistic content.

02:58.860 --> 03:01.580
But it might all be LLM slop.

03:01.620 --> 03:04.370
It might be nonsense that apparently is doing something.

03:04.370 --> 03:06.650
They're all collaborating together.

03:06.650 --> 03:11.970
They're all assuming their roles, but they're not actually solving your task in an accurate way.

03:12.010 --> 03:16.490
So it's all very well me complaining, but what am I suggesting is the answer?

03:16.730 --> 03:18.530
Well, here's the right way to do it.

03:18.570 --> 03:25.290
When you divide up your problem into different agents, you should do it because it solves your problem

03:25.290 --> 03:30.770
better, not because it sounds like those are the roles you would have, but because you've tried it

03:30.770 --> 03:32.370
and that's an improvement.

03:32.370 --> 03:34.290
You've got some chunky problem.

03:34.290 --> 03:37.010
It makes sense to divide it into some steps.

03:37.010 --> 03:40.850
You try it and you get better outcomes and then you stick with it.

03:40.850 --> 03:42.210
That's the way to do it.

03:42.330 --> 03:48.170
Scientifically, with experiments, and most importantly, the magical word is that word

03:48.170 --> 03:49.410
in that last bullet there.

03:49.610 --> 03:50.650
Evaluate.

03:50.650 --> 03:56.890
You need to have a way to measure your outcomes, and you should divide things into smaller steps or

03:56.890 --> 04:03.410
organize your agents differently because you get better evaluations, because it results in superior

04:03.450 --> 04:04.210
performance.

04:04.320 --> 04:10.040
That's the right way to do it, not just because it sounds like that's the right kind of responsibility.

04:10.080 --> 04:15.880
Now, often it's a good starting point to start by analogy with some human organization.

04:15.880 --> 04:20.200
If we divided up roles and responsibilities for humans, we probably had some reason for

04:20.240 --> 04:20.640
that.

04:20.640 --> 04:26.520
Maybe it's a decent starting point, but you should treat it as your starting point and always start

04:26.520 --> 04:28.240
as simple as possible.

04:28.360 --> 04:32.920
Start with just one role, then divide it up and start assigning more responsibilities.

04:32.920 --> 04:38.640
Experiment with that because it's going to help you get better outcomes and then measure it.

04:38.640 --> 04:40.000
That's the right way to do it.

04:40.000 --> 04:41.720
That's how to avoid the trap.

04:41.720 --> 04:45.200
And that's a wrap on the theory around Agentic AI.

04:45.240 --> 04:51.840
I hope this has given you some good intuition as we dive into building agentic workflows, and also some

04:51.840 --> 04:57.560
real-world lessons learned from actually building agentic AI in anger.

04:57.960 --> 05:03.320
And now it's time for us to go back to n8n, starting with the navigation.

05:03.320 --> 05:06.240
The big-picture building blocks of n8n.