WEBVTT

00:00:00.160 --> 00:00:04.100
When working with Claude Code
or with any AI

00:00:04.160 --> 00:00:07.979
agents, it's a good idea to give them ways

00:00:08.080 --> 00:00:11.780
of validating their work. That could be

00:00:11.820 --> 00:00:14.680
automated tests. I'll get back to that.

00:00:14.700 --> 00:00:18.500
But it can also be browser access,
at least if you're building a web

00:00:18.560 --> 00:00:22.500
app where they can access a browser
and test something.

00:00:22.540 --> 00:00:25.430
So for example, here,
I got my development server up and running.

00:00:25.460 --> 00:00:29.200
For me, it's running on port 3001.
By default, it would be

00:00:29.240 --> 00:00:32.880
port 3000.
And I could now tell Claude Code

00:00:32.890 --> 00:00:36.860
to test the application it built.
So I could say, "Test

00:00:36.880 --> 00:00:40.810
the application you built using the
Playwright plugin or MCP," it's kind

00:00:40.820 --> 00:00:44.680
of the same thing,
"test all main features step-by-step

00:00:44.720 --> 00:00:48.620
and verify they work correctly." And I also tell it
that the application server is

00:00:48.660 --> 00:00:51.960
running, and in my case,
it's running on port 3001.

00:00:51.970 --> 00:00:55.310
Playwright is a tool that
was initially built to help with

00:00:55.360 --> 00:00:59.210
end-to-end testing of web applications,
and that still is kind of

00:00:59.220 --> 00:01:02.540
its main purpose,
but with the rise of coding

00:01:02.620 --> 00:01:06.560
agents,
it now is also very popular for giving them

00:01:06.660 --> 00:01:10.380
access so that they can interactively
explore a website and

00:01:10.440 --> 00:01:14.020
work with it. Now here,
I'll not send this in plan mode, but

00:01:14.060 --> 00:01:17.980
instead with accept edits on.
And what this should do

00:01:18.080 --> 00:01:22.000
now is use that Playwright integration
which we added through that

00:01:22.060 --> 00:01:25.420
plugin to then spin up a browser window
and

00:01:25.580 --> 00:01:29.200
navigate it on its own
and test the application on its

00:01:29.280 --> 00:01:33.200
own. So here,
it's initially asking me for permission to use the

00:01:33.260 --> 00:01:37.100
Playwright tool,
and I don't want it to ask again for

00:01:37.160 --> 00:01:39.940
that. And it did now open this browser.

00:01:39.980 --> 00:01:43.840
So this browser window here
was opened by Claude Code and navigated to

00:01:43.900 --> 00:01:47.660
localhost:3001. Now, since this
is the first time it's

00:01:47.720 --> 00:01:51.480
using this plugin,
it's asking me for all kinds of permissions for

00:01:51.500 --> 00:01:55.480
Playwright, and unless you enabled
that mode where you dangerously grant

00:01:55.560 --> 00:01:58.250
all permissions,
you'll have to initially allow them all

00:01:58.300 --> 00:02:02.040
step-by-step. But of course,
that will get better once it has used that

00:02:02.050 --> 00:02:05.690
plugin a bit more.
And it can now indeed navigate around and

00:02:05.740 --> 00:02:08.410
navigate to this authentication page here.

00:02:08.460 --> 00:02:12.160
Now it's asking for more permissions to
fill in that form field.

00:02:12.220 --> 00:02:16.070
So I'll allow that,
and you see it filled in these form fields. As a next

00:02:16.100 --> 00:02:17.989
step, it will likely click this button.

00:02:18.020 --> 00:02:21.900
Now it wants to take a snapshot of the
page so that it can look at it because as

00:02:21.940 --> 00:02:25.920
mentioned before,
Claude Code has image vision, so you can give

00:02:25.940 --> 00:02:29.800
it images, but it can also, of course,
take a look at images

00:02:29.880 --> 00:02:33.700
itself.
It can also take a look at the network requests

00:02:33.740 --> 00:02:37.640
or anything like that. And as you can see,
it now made it to the dashboard and it

00:02:37.680 --> 00:02:40.940
will keep on interacting with this site,
now creating a new note.

00:02:40.980 --> 00:02:44.810
And that, of course,
is a very powerful capability since it allows

00:02:44.840 --> 00:02:48.520
Claude Code to test its UI changes and its

00:02:48.560 --> 00:02:52.500
website changes on its own,
and it creates a feedback

00:02:52.720 --> 00:02:56.040
loop where Claude Code can then on its own
detect

00:02:56.180 --> 00:02:59.780
issues, fix those issues,
test those changes and so

00:02:59.900 --> 00:03:03.220
on. It's worth noting though
that browser access is

00:03:03.280 --> 00:03:07.040
quite token intensive because all these
tool

00:03:07.520 --> 00:03:11.440
descriptions it's creating,
all these tool calls do cost quite a

00:03:11.460 --> 00:03:13.880
bit of tokens. They can get quite long.

00:03:13.940 --> 00:03:17.800
Looking at all those images
that it eventually creates costs

00:03:18.000 --> 00:03:21.960
tokens.
So you should kind of be careful about using this

00:03:22.000 --> 00:03:25.820
feature, but of course, it
is a very powerful feature at the same time,

00:03:25.900 --> 00:03:28.460
which should definitely be in your tool
set.

00:03:28.480 --> 00:03:32.070
And as you see,
it's even able to open new tabs to test the

00:03:32.140 --> 00:03:34.180
sharing feature here, for example.

00:03:34.240 --> 00:03:38.040
So now it concluded its testing,
closed the browser, and it

00:03:38.080 --> 00:03:41.020
found some problems which it
is now trying to fix.

00:03:41.060 --> 00:03:44.880
And once it has implemented those fixes,
it'll of course also try to

00:03:44.920 --> 00:03:48.660
test them again in the browser. And again,
that can be a very powerful feedback

00:03:48.720 --> 00:03:49.820
loop.
