WEBVTT

1
00:00:00.000 --> 00:00:02.190
<v ->Hey, in this video, you're gonna have a look and explore</v>

2
00:00:02.190 --> 00:00:04.770
this concept called hallucinations.

3
00:00:04.770 --> 00:00:06.600
We'll look at what hallucinations are

4
00:00:06.600 --> 00:00:07.860
and how you can avoid them

5
00:00:07.860 --> 00:00:10.740
when you're using large language models such as ChatGPT.

6
00:00:10.740 --> 00:00:13.530
Firstly, hallucinations are basically

7
00:00:13.530 --> 00:00:16.920
some type of factually incorrect text

8
00:00:16.920 --> 00:00:18.690
that has been created by the LLM

9
00:00:18.690 --> 00:00:19.770
or the large language model.

10
00:00:19.770 --> 00:00:22.890
Large language models are simply predicting the next token

11
00:00:22.890 --> 00:00:24.810
and therefore they can often produce

12
00:00:24.810 --> 00:00:26.850
factually incorrect text.

13
00:00:26.850 --> 00:00:29.130
There was a study done in 2022

14
00:00:29.130 --> 00:00:32.880
about how ChatGPT was creating non-factual references

15
00:00:32.880 --> 00:00:34.410
within the science community.

16
00:00:34.410 --> 00:00:36.180
And this can be a really large problem

17
00:00:36.180 --> 00:00:38.490
in areas like law and healthcare.

18
00:00:38.490 --> 00:00:41.610
So let's have a look at an example inside of ChatGPT

19
00:00:41.610 --> 00:00:43.650
where we're telling it to create some fake

20
00:00:43.650 --> 00:00:45.570
or hallucinated content

21
00:00:45.570 --> 00:00:48.870
and how we can tell ChatGPT to avoid hallucinating.

22
00:00:48.870 --> 00:00:50.587
So you can see here we've had a prompt,

23
00:00:50.587 --> 00:00:53.700
"Who was the first man to visit Jupiter?"

24
00:00:53.700 --> 00:00:55.170
And I've just said, "Make something up."

25
00:00:55.170 --> 00:00:57.360
This is obviously not factual, it's not correct.

26
00:00:57.360 --> 00:00:59.617
Now obviously, I've had to say something like,

27
00:00:59.617 --> 00:01:00.630
"Make something up,"

28
00:01:00.630 --> 00:01:03.750
but there will be scenarios where you ask ChatGPT something

29
00:01:03.750 --> 00:01:05.040
and it will come up with something

30
00:01:05.040 --> 00:01:07.080
that's not technically true.

31
00:01:07.080 --> 00:01:08.250
How can we get over that?

32
00:01:08.250 --> 00:01:11.580
How can we ground ChatGPT in some knowledge

33
00:01:11.580 --> 00:01:13.320
that's either from your own company

34
00:01:13.320 --> 00:01:14.640
or from your personal information?

35
00:01:14.640 --> 00:01:17.310
So I'm gonna give you a technique that you can use for this.

36
00:01:17.310 --> 00:01:20.740
So I'm gonna go to Google and type "digital marketing"

37
00:01:22.200 --> 00:01:23.940
and I'm gonna take some information here.

38
00:01:23.940 --> 00:01:25.080
And this could be, you know,

39
00:01:25.080 --> 00:01:27.420
you might be using information from your own company

40
00:01:27.420 --> 00:01:29.160
or it could be a PDF.

41
00:01:29.160 --> 00:01:30.780
I'm gonna create a new chat.

42
00:01:30.780 --> 00:01:33.210
I'm gonna triple quote

43
00:01:33.210 --> 00:01:36.450
and then I'm gonna paste this information in like this.

44
00:01:36.450 --> 00:01:37.717
And I'm gonna say,

45
00:01:37.717 --> 00:01:40.030
"I want you to answer

46
00:01:41.130 --> 00:01:43.410
the following query

47
00:01:43.410 --> 00:01:48.180
by only referencing the documents above

48
00:01:48.180 --> 00:01:51.960
in triple quote characters.

49
00:01:51.960 --> 00:01:56.190
If you don't have an answer,

50
00:01:56.190 --> 00:01:58.143
simply respond with,

51
00:01:59.190 --> 00:02:00.507
I don't know."

52
00:02:01.440 --> 00:02:06.210
And then we can ask ChatGPT, "What is the weather?" right?

53
00:02:06.210 --> 00:02:09.150
And basically you'll see that we've now grounded ChatGPT

54
00:02:09.150 --> 00:02:11.880
in some knowledge right from the internet.

55
00:02:11.880 --> 00:02:16.880
So we can also say, "What is social media marketing?"

56
00:02:17.490 --> 00:02:19.500
And what you'll see is that the answer

57
00:02:19.500 --> 00:02:21.540
came from the knowledge within those triple quotes,

58
00:02:21.540 --> 00:02:23.790
so it was actually factually grounded

59
00:02:23.790 --> 00:02:26.910
in the information that I chose to ground it in.

60
00:02:26.910 --> 00:02:28.080
So this is a way where

61
00:02:28.080 --> 00:02:29.970
if you've got some company information

62
00:02:29.970 --> 00:02:32.850
or you have some specific documents

63
00:02:32.850 --> 00:02:34.380
or certain types of knowledge,

64
00:02:34.380 --> 00:02:36.660
you could put that inside triple quotes

65
00:02:36.660 --> 00:02:37.987
and you can say to ChatGPT,

66
00:02:37.987 --> 00:02:40.500
"I only want you to answer from this knowledge."

67
00:02:40.500 --> 00:02:42.270
And then you're grounding ChatGPT

68
00:02:42.270 --> 00:02:44.370
not in knowledge that it knows internally,

69
00:02:44.370 --> 00:02:46.410
but in knowledge that you've decided is important

70
00:02:46.410 --> 00:02:47.850
for answering your question.

71
00:02:47.850 --> 00:02:50.550
So this is one common use case

72
00:02:50.550 --> 00:02:54.060
and technique that you can use to overcome hallucinations

73
00:02:54.060 --> 00:02:55.980
for when factual consistency

74
00:02:55.980 --> 00:02:57.990
is really important in the answers.

75
00:02:57.990 --> 00:03:00.510
Another way that you can factually ground a response

76
00:03:00.510 --> 00:03:04.260
inside of ChatGPT is adding a file to a conversation.

77
00:03:04.260 --> 00:03:06.390
For example, we can go and hit the plus button here

78
00:03:06.390 --> 00:03:08.340
and then go to add photos and files.

79
00:03:08.340 --> 00:03:11.250
I'm gonna put a file that you can download in this lecture

80
00:03:11.250 --> 00:03:12.480
to follow along with.

81
00:03:12.480 --> 00:03:15.180
Go and download a digital company profile.

82
00:03:15.180 --> 00:03:16.777
And then I'm gonna say,

83
00:03:16.777 --> 00:03:21.537
"What is the company Aurora?"

84
00:03:22.380 --> 00:03:25.770
And because you've added a file, you can ground ChatGPT

85
00:03:25.770 --> 00:03:28.410
to answer specifically from that documentation.

86
00:03:28.410 --> 00:03:29.940
You can add lots of different files

87
00:03:29.940 --> 00:03:32.700
and also images directly inside of ChatGPT

88
00:03:32.700 --> 00:03:34.470
to help ensure that it doesn't hallucinate.

89
00:03:34.470 --> 00:03:36.300
In the next video, we're gonna have a look at the difference

90
00:03:36.300 --> 00:03:38.670
between two types of models that have come out recently.

91
00:03:38.670 --> 00:03:42.030
So there are chat models, which are the older type of models,

92
00:03:42.030 --> 00:03:45.150
and the newer type of models, which are reasoning models.

93
00:03:45.150 --> 00:03:47.250
And we'll look at the differences between those two

94
00:03:47.250 --> 00:03:48.750
and how you can get started with them.

95
00:03:48.750 --> 00:03:50.313
Cool, see you in the next one.

