WEBVTT

00:00.240 --> 00:02.670
-: Hey, you're gonna talk about claim detection

00:02.670 --> 00:07.110
because this is a common issue in creating content

00:07.110 --> 00:10.170
with AI that it makes things up, hallucinates,

00:10.170 --> 00:13.380
and it comes up with quotes that didn't actually happen.

00:13.380 --> 00:15.480
It makes up statistics that are outta date.

00:15.480 --> 00:19.050
It's typically something that is a weak point of AI.

00:19.050 --> 00:22.320
But the interesting thing is that you can actually use AI

00:22.320 --> 00:24.360
to solve AI's own problems. (laughs)

00:24.360 --> 00:26.580
Everything just becomes AI.

00:26.580 --> 00:28.920
What we have here is some article text.

00:28.920 --> 00:33.120
I made this, it's an article based on a real article

00:33.120 --> 00:34.560
here in the New York Times.

00:34.560 --> 00:36.270
I copied and pasted it into ChatGPT,

00:36.270 --> 00:38.070
and I asked it to rewrite the story

00:38.070 --> 00:40.470
but add a few fake things in here.

00:40.470 --> 00:43.620
The fake things that it's added are that it's said

00:43.620 --> 00:46.350
that the court has not, yeah, that the court

00:46.350 --> 00:49.380
has voided his pay package, Elon Musk's pay package.

00:49.380 --> 00:51.240
And in reality they didn't.

00:51.240 --> 00:54.870
We also added in this fake quote from Greg Varallo,

00:54.870 --> 00:57.570
this quote from the lawyer of the disenchanted shareholders.

00:57.570 --> 00:58.590
It's fabricated.

00:58.590 --> 01:00.900
And then we made up a statistic as well.

01:00.900 --> 01:05.040
So saying that we'd, he would own 20% of Tesla,

01:05.040 --> 01:07.770
the actual figure wasn't quoted.

01:07.770 --> 01:08.850
Cool. Those are the fake things.

01:08.850 --> 01:11.010
We'll see if it finds those.

01:11.010 --> 01:13.320
It's the New York Times, so their journalism's pretty good.

01:13.320 --> 01:15.120
You would expect all the other things to be true,

01:15.120 --> 01:17.310
but that's not true of all journalism

01:17.310 --> 01:18.900
either AI generated or human.

01:18.900 --> 01:20.250
So it's useful for us to be able

01:20.250 --> 01:22.230
to detect when claims are being made

01:22.230 --> 01:25.290
and then investigate them after, again, using AI.

01:25.290 --> 01:26.397
You're gonna run that.

01:26.397 --> 01:28.740
Now I have a basic prompt that I put together

01:28.740 --> 01:31.230
here as a diligent copy editor, your role is

01:31.230 --> 01:33.000
detecting claims that need to be fact checked

01:33.000 --> 01:34.950
and I've given some specifics.

01:34.950 --> 01:36.660
A claim could be a statistic, a quote,

01:36.660 --> 01:38.640
an anecdote, an instruction.

01:38.640 --> 01:40.680
So if it tells you this is how

01:40.680 --> 01:42.690
this function works in Excel, right?

01:42.690 --> 01:43.860
That can actually be a problem.

01:43.860 --> 01:44.850
Like it might make that up.

01:44.850 --> 01:46.260
I've seen that happen in the past.

01:46.260 --> 01:48.660
Or say any information that's stated is fact,

01:48.660 --> 01:51.690
which may be out of date incorrect or imagined.

01:51.690 --> 01:55.620
Asking it to return JSON with an unmodified excerpt

01:55.620 --> 01:58.800
from the text, the type of claim being detected,

01:58.800 --> 02:02.310
and some reasoning why the claim might need verification.

02:02.310 --> 02:05.250
So a bit of chain of thought going on here.

02:05.250 --> 02:08.490
I'm using GPT-4o, it makes mistakes,

02:08.490 --> 02:10.500
but it's very fast, very cheap.

02:10.500 --> 02:12.720
So I prefer for this type of task.

02:12.720 --> 02:15.450
And then I'm just passing in the article text here,

02:15.450 --> 02:17.310
and then the temperature I'm putting at 0.2

02:17.310 --> 02:19.800
'cause I don't want it to make up the excerpts.

02:19.800 --> 02:21.930
I've had that as a problem in the past,

02:21.930 --> 02:23.490
but you know, should be good here

02:23.490 --> 02:25.020
if we keep the temperature low,

02:25.020 --> 02:29.583
and then if we run that can see what excerpts we get back.

02:39.750 --> 02:42.607
Okay, so here's an example of an excerpt.

02:42.607 --> 02:44.910
"For months, many Tesla investors are worried about how

02:44.910 --> 02:47.340
engaged, who would be," blah blah blah.

02:47.340 --> 02:49.890
There's an anecdote and then reasoning "the claim that a

02:49.890 --> 02:51.600
judge in Delaware voided Elon Musk's

02:51.600 --> 02:53.460
pay package needs verification."

02:53.460 --> 02:55.440
Yeah, that was one of our fake ones, right?

02:55.440 --> 02:57.240
It says, especially after a judge in Delaware

02:57.240 --> 02:58.740
voided his pay package.

02:58.740 --> 03:00.703
So it has correctly found that.

03:00.703 --> 03:02.280
I also found the statistics,

03:02.280 --> 03:04.230
the shares are worth 48 billion,

03:04.230 --> 03:07.680
so we need to look up the latest value of the shares.

03:07.680 --> 03:10.260
Yeah, you can see that picked up the percentage

03:10.260 --> 03:12.720
of Tesla here, which is pretty good as well.

03:12.720 --> 03:15.930
So I picked up a lot of the claims, you know, both true ones

03:15.930 --> 03:18.870
and also the fake ones that we put in here.

03:18.870 --> 03:21.660
All right, we need to get that into actual JSON.

03:21.660 --> 03:24.870
So in order to do that, we just basically

03:24.870 --> 03:29.370
remove this like beginning part from here.

03:29.370 --> 03:33.360
See it says triple tick JSON, get rid of that,

03:33.360 --> 03:35.460
and get rid of the triple ticks at the end.

03:35.460 --> 03:40.200
And then we have a Python object that we can work with now.

03:40.200 --> 03:41.340
So that's the first claim.

03:41.340 --> 03:44.310
I'm using Tavily to look up these research claims.

03:44.310 --> 03:49.310
Tavily is an AI tool basically to search the web.

03:49.620 --> 03:53.520
It gives web search capabilities to your AI agents.

03:53.520 --> 03:55.020
There are other options here.

03:55.020 --> 03:58.440
Like you could use perplexities API has search

03:58.440 --> 04:01.830
and Google's AI has search as well.

04:01.830 --> 04:04.980
And then you could also use SERP API,

04:04.980 --> 04:07.260
I think is another one that people use for this.

04:07.260 --> 04:09.270
And some of them have free credits, some of them don't.

04:09.270 --> 04:12.900
But Tavily, I've been recommended by a few people

04:12.900 --> 04:15.450
it has a thousand requests for free per month.

04:15.450 --> 04:16.290
It's worth setting it up,

04:16.290 --> 04:18.570
and you don't need a credit card or anything.

04:18.570 --> 04:21.124
Cool, here's Tavily Client, you just import that.

04:21.124 --> 04:24.600
I'm just gonna run this, it's gonna ask me for my API key.

04:24.600 --> 04:27.150
If I go into Tavily here,

04:27.150 --> 04:29.013
I'm just gonna regenerate this,

04:30.030 --> 04:31.080
gonna copy it.

04:31.080 --> 04:33.360
You don't wanna show anyone your API key,

04:33.360 --> 04:36.690
and just paste that in there and hit enter,

04:36.690 --> 04:38.580
and then hopefully that will run.

04:38.580 --> 04:41.130
Here we go. So here's the basic search.

04:41.130 --> 04:42.660
I've just asked it to validate its claim.

04:42.660 --> 04:46.560
I've formatted it slightly so I just put the claim,

04:46.560 --> 04:49.620
and then the reasoning into a string,

04:49.620 --> 04:51.720
and then I put that into here.

04:51.720 --> 04:53.850
It just says validate this claim.

04:53.850 --> 04:55.980
You can see this is the query that I gave it.

04:55.980 --> 04:58.380
And you can see this is the one about

04:58.380 --> 05:00.210
the pay package being voided.

05:00.210 --> 05:02.490
So that's the question I asked.

05:02.490 --> 05:04.260
And then these are the results.

05:04.260 --> 05:06.360
So it's done basically a Google search,

05:06.360 --> 05:09.090
and you can see the content of those searches.

05:09.090 --> 05:10.470
So it has the links.

05:10.470 --> 05:12.810
So if you want to be able to provide citations

05:12.810 --> 05:15.480
and things, this is really great, it's really helpful,

05:15.480 --> 05:17.160
and you've got the content here as well

05:17.160 --> 05:20.250
as the score in terms of how relevant they are, right?

05:20.250 --> 05:21.630
We have the context that's really useful,

05:21.630 --> 05:23.100
but I'm gonna be even lazier here.

05:23.100 --> 05:25.050
Because I don't wanna have to make another call,

05:25.050 --> 05:27.600
and say, "Okay, based on this context,

05:27.600 --> 05:29.160
is this claim correct?"

05:29.160 --> 05:30.480
You know, maybe it depends on

05:30.480 --> 05:32.310
what application you're designing,

05:32.310 --> 05:34.920
but for me I just want to check and get an answer.

05:34.920 --> 05:37.350
So I'm gonna use the Q &amp; A search,

05:37.350 --> 05:39.510
which is basically just another call

05:39.510 --> 05:42.240
in underneath it will do this search,

05:42.240 --> 05:44.280
and then it will give you back the answer.

05:44.280 --> 05:46.680
So it says the claim that a judge in Delaware voided

05:46.680 --> 05:49.710
on Elon Musk's pay package is not valid.

05:49.710 --> 05:51.990
Tesla shareholders recently voted to reaffirm

05:51.990 --> 05:54.150
a $56 billion compensation package.

05:54.150 --> 05:56.970
So this actually just happened a couple days ago.

05:56.970 --> 05:59.310
So it's really impressive that, you know, we're able

05:59.310 --> 06:01.380
to pull out this up-to-date information.

06:01.380 --> 06:04.950
And one of the reasons why AI makes fake claims is

06:04.950 --> 06:07.260
because they don't know what the up-to-date information is.

06:07.260 --> 06:09.600
Really powerful to be able to go

06:09.600 --> 06:12.780
and fact check your sources again with AI, right?

06:12.780 --> 06:14.070
So you know, don't have to have

06:14.070 --> 06:16.380
a human do this fact checking.

06:16.380 --> 06:18.780
When you wanna run this across lots of different ones,

06:18.780 --> 06:20.490
you could do a for loop here,

06:20.490 --> 06:22.920
I'm just looping through every single one,

06:22.920 --> 06:24.750
and then I'll have all my claims checked.

06:24.750 --> 06:27.750
You could also structure it in a different way

06:27.750 --> 06:32.750
so that you would get maybe a yes no answer, for example.

06:33.450 --> 06:36.480
That'd be really helpful because then you could

06:36.480 --> 06:39.570
display an invalid claim in your user interface

06:39.570 --> 06:42.090
instead if you do run into an issue.

06:42.090 --> 06:43.710
There's also error handling as well,

06:43.710 --> 06:46.500
so I've noticed sometimes with Tavily it gets,

06:46.500 --> 06:49.290
I get this like bad request for URL sometimes,

06:49.290 --> 06:51.330
or you might have rate limiting,

06:51.330 --> 06:53.400
so maybe you use multiple solutions

06:53.400 --> 06:55.020
and see which ones work best for you.

06:55.020 --> 06:58.470
But ultimately what you get here is you have the excerpt,

06:58.470 --> 07:01.320
you have the type of claim, so you can see what types

07:01.320 --> 07:04.980
of claims are being made and then you would see the context.

07:04.980 --> 07:05.813
So here we go.

07:05.813 --> 07:08.910
There's no verify information to support the claim

07:08.910 --> 07:11.220
that the judge voided the pay packet.

07:11.220 --> 07:14.730
Here it claims that the shares are worth 48 billion,

07:14.730 --> 07:16.530
because the risk cannot be verified,

07:17.460 --> 07:20.160
does not specify the exact value of the shares.

07:20.160 --> 07:22.410
What you could do now if you were building

07:22.410 --> 07:25.380
an AI application with this is you can have

07:25.380 --> 07:27.900
another cause to LLM for each one of these

07:27.900 --> 07:30.390
and just say, "Okay, based on the context,

07:30.390 --> 07:33.690
do we need to alert the user to check this claim?"

07:33.690 --> 07:38.040
So maybe in user interface you have some visual highlighting

07:38.040 --> 07:40.950
that you say, "Hey, these are the claims you need to look at

07:40.950 --> 07:44.250
and rewrite potentially or make 'em up to date."

07:44.250 --> 07:45.840
You could also use this as a trigger.

07:45.840 --> 07:49.080
So you could say, "Send this back to GPT-4 in order

07:49.080 --> 07:52.337
to rewrite this excerpt if there is some claim

07:52.337 --> 07:55.260
that we have the context to fix."

07:55.260 --> 07:56.640
Yeah, there's lots you can do with this.

07:56.640 --> 07:59.280
This is more of a mental model for you to think about

07:59.280 --> 08:01.830
using for lots of different purposes.

08:01.830 --> 08:05.820
But in particular, I use this for SEO generated content

08:05.820 --> 08:07.740
just to make sure that it is accurate,

08:07.740 --> 08:10.088
and it's not putting more fake information

08:10.088 --> 08:11.100
(laughs) out there on the internet,

08:11.100 --> 08:12.630
because that's one of the biggest

08:12.630 --> 08:14.130
bugbears people have with AI.

08:14.130 --> 08:16.430
And hopefully you can help avoid it with this.
