WEBVTT

00:00.040 --> 00:03.520
In this video you're going to explore ChatGPT agent mode.

00:03.560 --> 00:11.280
This came out and basically it allows you to combine computer use so ChatGPT can spin up a computer

00:11.440 --> 00:13.800
and then it can do operations on that.

00:13.840 --> 00:21.040
It also has access to a browser, so you can technically go and tell it to autonomously do some web

00:21.040 --> 00:23.160
search or some manual data entry.

00:23.280 --> 00:27.400
And if you give it some of the right logins, it can do work on your behalf.

00:27.720 --> 00:30.640
Let's have a look at how you can access this inside of ChatGPT.

00:31.040 --> 00:32.280
So go to ChatGPT.

00:33.160 --> 00:38.200
And what you'll need to do is click on the plus symbol and then turn on Agent Mode.

00:38.240 --> 00:42.720
Now you can see when you turn on Agent mode you have a certain number of credits.

00:42.920 --> 00:48.440
In my case, I've got 34 that remain and that will be reset on August the 23rd.

00:48.440 --> 00:50.560
So you get a certain amount of credits each month.

00:50.760 --> 00:55.520
And then once you've turned on Agent Mode, it actually has a bunch of recommendations that you can

00:55.520 --> 00:56.080
see here.

00:56.080 --> 00:58.640
For example, we could create a presentation.

00:58.640 --> 01:04.570
So we could do develop a go to market strategy for robotic pet launch.

01:04.570 --> 01:05.410
So I'm going to click that.

01:05.410 --> 01:07.850
I'm going to click and hit enter on this.

01:07.850 --> 01:13.250
And then what's going to happen behind the scenes is that ChatGPT is going to set up its own desktop.

01:13.250 --> 01:17.970
And that desktop is essentially a Linux machine that lives in the cloud.

01:17.970 --> 01:23.890
And then because of that, it has access to things like Python and a browser.

01:24.170 --> 01:31.370
It also, as you can see here, gives you a live preview of what ChatGPT is doing in the flow.

01:31.490 --> 01:33.610
So you can see it's doing some searching.

01:33.770 --> 01:35.930
It's then going to do a reading mode.

01:35.930 --> 01:41.970
And if you look on the bottom of this, we can actually scroll back and you can see specifically what

01:42.010 --> 01:44.810
different steps the agent decided to take.

01:44.850 --> 01:46.450
And that's called a trajectory.

01:46.450 --> 01:49.610
So what trajectory did the agent take.

01:49.810 --> 01:52.210
Now interestingly go to the triple dots.

01:52.250 --> 01:55.890
Click on that and you'll see there's three things here at the moment.

01:56.570 --> 02:02.650
You can see the activity gives you a breakdown of what this specific version has worked on.

02:02.930 --> 02:06.850
That's quite useful to see how often it's searching, how often it's reviewing.

02:07.130 --> 02:11.530
If we go and do stop, then obviously that will stop the autonomous agent.

02:11.730 --> 02:15.210
And the other thing that we can do is we can take over the browser.

02:15.410 --> 02:18.090
Now I'm going to show you one that I did earlier.

02:18.130 --> 02:19.530
So let me go and show you that.

02:19.530 --> 02:28.610
So I've got a broadband affordability analysis and you can see that basically it went and worked

02:28.610 --> 02:32.770
for 17 minutes and actually built a bunch of slides.

02:32.770 --> 02:38.650
And you can see it going and working on the terminal here and doing lots of code and generating kind

02:38.650 --> 02:40.770
of different types of visualizations.

02:41.090 --> 02:42.170
And it worked.

02:42.170 --> 02:47.130
And then it actually created a presentation and I could download this as a PDF.

02:47.490 --> 02:53.530
So again, this is the kind of thing that you get when you can use ChatGPT to generate presentations.

02:53.730 --> 02:59.330
And you can see this is what it generated on different types of broadband and the monthly cost and all

02:59.330 --> 03:00.050
that kind of stuff.

03:00.180 --> 03:02.100
That's what a report looks like when it's finished.

03:02.140 --> 03:08.180
Now, obviously if we go to the go to market strategy, you can see that ChatGPT is working and it's

03:08.180 --> 03:09.180
going to take a while.

03:09.180 --> 03:13.060
So now it's actually in the machine that's running lots of different types of commands.

03:13.060 --> 03:17.100
So that's quite useful for seeing what's happening on the activity view.

03:17.380 --> 03:20.180
And obviously if you want you can look at the desktop view as well.

03:20.580 --> 03:21.060
Great.

03:21.060 --> 03:24.700
So we've gone through I've shown you a report at the end.

03:25.100 --> 03:29.140
And obviously we could wait until this report's done, but it's going to roughly produce some kind of

03:29.180 --> 03:29.860
PowerPoint.

03:29.900 --> 03:35.900
The key insight I want you to take away from this is that you could use this for anything, and we could

03:35.900 --> 03:42.460
go a step further, and we could not only use web search and Microsoft Excel, Microsoft PowerPoint,

03:42.500 --> 03:46.700
but we can also give ChatGPT agent access to our own websites.

03:46.820 --> 03:53.100
So if I go and create a new chat, and then we'll go back and we'll go into agent mode, and then what

03:53.100 --> 03:56.380
I'm going to do is I'm going to click, I'm actually going to just not use the suggestion.

03:56.380 --> 04:03.980
So I'm going to say I want to spin up a virtual VM and I will log into a website.

04:04.740 --> 04:09.180
And what I'm going to show you is this other feature of agent mode that we haven't seen yet.

04:09.340 --> 04:09.540
Right.

04:09.580 --> 04:13.500
So go into the triple dots, click Take Over browser and then you can control the browser.

04:13.500 --> 04:16.980
So I can go in here I can sign into X.com.

04:17.460 --> 04:22.500
And if you've already signed in on your ChatGPT account then it should have access to that.

04:22.500 --> 04:25.380
So you can see here I actually have access to this.

04:25.380 --> 04:27.100
And we can go and like this post.

04:27.540 --> 04:29.460
So I've signed in with my account.

04:29.500 --> 04:31.340
Obviously you can sign into anything you want here.

04:31.340 --> 04:33.220
You could go to Facebook what have you.

04:33.660 --> 04:36.540
The power of this is you just sign in and then it's going to remember.

04:36.540 --> 04:39.340
You click Finish controlling on the bottom right.

04:39.620 --> 04:48.580
Then we can say I want you to create three posts on X.com saying hello world from ChatGPT agent mode.

04:49.340 --> 04:50.860
Okay, so here's the powerful thing.

04:50.900 --> 04:55.100
Now it's able to use logins that you've previously provided.

04:55.300 --> 04:58.580
And you can basically have that so it can do some work.

04:58.740 --> 05:01.540
So it's decided that that page is facebook.com.

05:01.540 --> 05:07.020
But you can now see that it's gone to X.com and we can actually see it's going to go up here and it's

05:07.020 --> 05:08.100
going to compose.

05:08.220 --> 05:11.060
You will see that it asks you shall I click to publish it?

05:11.060 --> 05:12.540
And we'll say continue.

05:12.860 --> 05:17.260
So sometimes it will ask for approval, which is absolutely fine.

05:18.660 --> 05:23.260
We could probably be more specific with the prompt, like if you have to do anything then don't worry

05:23.260 --> 05:23.860
about it.

05:24.180 --> 05:28.180
And essentially yeah, like you can see here, shall I click to publish it.

05:28.500 --> 05:29.180
And there you go.

05:29.180 --> 05:32.100
So now it's working in the second process here.

05:32.380 --> 05:35.300
And again yeah you can rate these responses.

05:35.580 --> 05:36.620
We can click to continue.

05:36.660 --> 05:39.340
I'll say continue and post without permission.

05:39.380 --> 05:42.340
Let's see if it does that or whether it actually needs permission.

05:42.660 --> 05:45.900
So I'm going to just tell it continue and post without permission.

05:46.340 --> 05:46.980
So let's see.

05:46.980 --> 05:47.740
Let's have a look.

05:51.700 --> 05:52.060
Yeah.

05:52.060 --> 05:55.140
So you can see without further confirmation prevents.

05:55.180 --> 05:55.420
Yeah.

05:55.460 --> 05:56.740
So would you like to modify it.

05:56.740 --> 05:57.180
So yeah.

05:57.220 --> 05:57.460
Yeah.

05:57.500 --> 06:05.470
Modify it and make it unique for three messages, but we'll do five messages then.

06:06.270 --> 06:08.150
And don't ask, just do.

06:08.470 --> 06:12.350
So we're just going to use some prompting techniques to specifically lock it in and just tell it.

06:12.390 --> 06:13.190
It's absolutely fine.

06:13.230 --> 06:14.070
Go and do the work.

06:14.110 --> 06:16.710
And that'd be the type of thing that you'll put in the initial prompt.

06:16.710 --> 06:21.790
When you're doing whatever task you need to do is just telling it, don't ask for permission.

06:21.950 --> 06:22.590
Just do the work.

06:22.590 --> 06:27.310
Now, obviously, you've got to be very careful with that because you're giving a lot more agency and

06:27.310 --> 06:32.550
you're giving a lot of delegation to agents, but just be aware that often OpenAI

06:32.550 --> 06:33.990
will respect your prompts

06:33.990 --> 06:35.710
if you're saying just do the work.

06:36.030 --> 06:43.030
So I'm going to go and open a fresh tab and have a look on X.com and let's see what we've got.

06:43.030 --> 06:45.110
So we've got one, we've got two.

06:45.790 --> 06:46.550
How are we doing.

06:46.550 --> 06:48.350
So it says that it's thinking.

06:48.350 --> 06:49.470
It says it's thinking.

06:49.950 --> 06:50.550
So here you go.

06:50.590 --> 06:52.470
So it's decided it's going to modify these.

06:52.510 --> 06:53.590
It's clicking send.

06:53.590 --> 06:54.670
So you can see here.

06:55.510 --> 06:59.680
So now that I'm going to do that and then we're what are we doing here?

06:59.720 --> 07:01.280
So you've got a bunch of these.

07:01.280 --> 07:02.360
Hello, world.

07:02.360 --> 07:03.600
Hello, world.

07:04.400 --> 07:05.080
Let's see.

07:05.320 --> 07:08.120
I'm sending the message hello, world from post two.

07:09.040 --> 07:09.520
Here you go.

07:09.560 --> 07:13.280
So post three of five and then it's clicking paste.

07:13.280 --> 07:14.400
And there we go.

07:14.400 --> 07:18.840
So we've got our third message from ChatGPT agent.

07:18.880 --> 07:22.560
The agent mode the agent model is now working.

07:22.560 --> 07:23.360
So here you go.

07:23.400 --> 07:27.880
So we learned about the fact that you could use Agent Mode to build PowerPoints or presentations.

07:27.880 --> 07:30.400
It has complete access to a virtual machine.

07:30.680 --> 07:38.040
You've learned about how to add in additional types of websites by using that Take Over browser option, and

07:38.040 --> 07:42.560
you've also learned about how to do some clever prompt engineering techniques so that we don't have

07:42.560 --> 07:44.520
to keep clicking that continue button.

07:44.920 --> 07:45.240
All right.

07:45.240 --> 07:50.240
The final thing I just want to say on this is you can also schedule these and you can say how often

07:50.240 --> 07:51.200
you want them to come.

07:51.480 --> 07:56.240
And you can give a range of instructions and you can tell it when to repeat as well.

07:56.240 --> 08:00.040
So scheduling is also possible and I will see you in the next one.