WEBVTT

0
00:00.630 --> 00:01.740
In the last lesson,

1
00:01.800 --> 00:06.800
we managed to get our API to work and we got back some live data.

2
00:07.800 --> 00:08.910
But when we ran it,

3
00:09.030 --> 00:13.500
we saw that some of the text that we were getting back was formatted really

4
00:13.500 --> 00:17.970
strangely with these pound signs and ampersands,

5
00:18.090 --> 00:21.290
and it's not the actual text that we see.

6
00:21.920 --> 00:24.290
So what's happening here? Well,

7
00:24.380 --> 00:28.310
what we're actually seeing here are called HTML entities,

8
00:28.880 --> 00:33.880
and they're a way of replacing certain characters in HTML so that they don't

9
00:35.060 --> 00:38.990
get confused with HTML code. So for example,

10
00:39.020 --> 00:44.020
the less than symbol could be a part of HTML code.

11
00:44.600 --> 00:48.590
And instead of using that, we have to use the

12
00:48.650 --> 00:51.890
&lt and then a semicolon: &lt;

13
00:52.850 --> 00:55.100
So if we look down this table,

14
00:55.370 --> 01:00.370
we can see that this &quot; actually stands for a double quotation mark.

15
01:01.940 --> 01:05.930
And that would make sense because it's saying "Mario Kart

16
01:05.930 --> 01:07.400
64"

17
01:08.060 --> 01:11.570
and this &#039;,

18
01:11.570 --> 01:16.570
if we look up in this list, is actually a single quotation mark.

19
01:17.570 --> 01:20.960
And that would make sense as well, because it would be "Stalin's death".
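
NOTE
A minimal sketch of why these entities exist, assuming Python's standard html module.
The sample strings are illustrative and are not taken from the API response.
html.escape() goes in the opposite direction to what the lesson needs: it replaces
characters that could be mistaken for HTML code with their entities.
```python
import html
# "<" could start an HTML tag, so it is written as the entity &lt;
escaped_comparison = html.escape("5 < 10")
# Double quotes become &quot;
escaped_title = html.escape('"Mario Kart 64"')
# Python writes the apostrophe as &#x27;, the hexadecimal form of &#039;
escaped_name = html.escape("Stalin's death")
print(escaped_comparison)
print(escaped_title)
print(escaped_name)
```
Both &#039; and &#x27; refer to the same apostrophe character, one in decimal and one in hexadecimal.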

20
01:21.740 --> 01:26.270
So how do we get hold of the actual human readable text? Well,

21
01:26.420 --> 01:30.740
we can use this tool called FreeFormatter to

22
01:31.160 --> 01:36.050
unescape the HTML results that we're getting back from our API.

23
01:36.890 --> 01:40.040
I've copied and pasted this part we've got here.

24
01:41.540 --> 01:43.790
And if I go ahead and click on unescape,

25
01:44.060 --> 01:49.060
you can see that it formats it into the original human readable format

26
01:49.430 --> 01:53.480
and now it says in "Mario Kart 64" Waluigi is a playable character.

27
01:54.110 --> 01:58.580
And if I paste the Cold War ended with Joseph Stalin, blah, blah, blah,

28
01:59.030 --> 02:02.120
death and I click unescape, then you can see 

29
02:02.120 --> 02:06.500
it says the Cold War ended with Joseph Stalin's death and it replaces that with

30
02:06.560 --> 02:08.450
an apostrophe. Now,

31
02:08.450 --> 02:13.070
essentially we know what to Google and that's kind of the first step towards

32
02:13.070 --> 02:14.240
solving any problem.

33
02:14.720 --> 02:18.620
So if you Google for unescaping HTML entities in Python,

34
02:18.860 --> 02:22.940
then the first result we get on Stack Overflow gives us the answer.

35
02:23.540 --> 02:28.340
We have to import the html module and use one of the functions in that module

36
02:28.340 --> 02:33.110
called unescape in order to unescape the text that we're getting back.
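
NOTE
A short sketch of the fix described here, using html.unescape() from the standard library.
The two raw strings are modelled on the examples read out in the lesson.
Their exact wording and punctuation are assumptions.
```python
import html
# Strings shaped like the escaped text the API returns (illustrative).
raw_mario = "In &quot;Mario Kart 64&quot;, Waluigi is a playable character."
raw_stalin = "The Cold War ended with Joseph Stalin&#039;s death."
# unescape() converts the entities back into the characters they stand for.
clean_mario = html.unescape(raw_mario)
clean_stalin = html.unescape(raw_stalin)
print(clean_mario)
print(clean_stalin)
```
Text that contains no entities passes through html.unescape() unchanged, so it should be safe to apply to every question.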

37
02:33.950 --> 02:37.790
The part we're interested in is in our quiz brain,

38
02:38.270 --> 02:42.260
because that's the part where we format it into our user answer.

39
02:43.130 --> 02:45.330
Let's change the question

40
02:45.380 --> 02:50.380
text, q_text, to be equal to self.current_question.text,

41
02:50.870 --> 02:52.640
so this part that we have here

42
02:52.970 --> 02:57.970
which is being put into our input and we can use this q_text instead.

43
02:59.800 --> 03:03.640
But instead of using just the text that we get back from the API,

44
03:03.970 --> 03:06.550
we're going to import the html module.

46
03:08.970 --> 03:13.970
<v 0>And we're going to use the function inside this html module called unescape</v>

47
03:14.700 --> 03:18.150
to unescape this string that we get from the API.
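
NOTE
One way the change in the quiz brain might look, as a sketch only.
The Question class, the next_prompt() method name and the sample question are hypothetical.
Only q_text, self.current_question.text and html.unescape() come from the lesson.
The course code passes this string to input(); here it is returned so the example runs without user interaction.
```python
import html
class Question:
    """Hypothetical stand-in for the course's question model."""
    def __init__(self, text, answer):
        self.text = text
        self.answer = answer
class QuizBrain:
    """Minimal sketch; only the prompt-building part is shown."""
    def __init__(self, question_list):
        self.question_number = 0
        self.question_list = question_list
        self.current_question = None
    def next_prompt(self):
        # Pick the next question and unescape its text before displaying it.
        self.current_question = self.question_list[self.question_number]
        self.question_number += 1
        q_text = html.unescape(self.current_question.text)
        return f"Q.{self.question_number}: {q_text} (True/False): "
# Illustrative question using the escaped apostrophe from the lesson.
quiz = QuizBrain([Question("The Cold War ended with Joseph Stalin&#039;s death.", "False")])
prompt = quiz.next_prompt()
print(prompt)
```
Unescaping at the point where the prompt is built leaves the stored question data untouched.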

48
03:18.930 --> 03:23.700
And now if I run this code again, you can see that this time,

49
03:23.730 --> 03:28.730
no matter what is inside the string, say an apostrophe in this case or a double

50
03:31.860 --> 03:35.580
quote in this case, they're all being formatted correctly.

51
03:36.780 --> 03:37.613
There you have it.

52
03:38.070 --> 03:42.270
We started off with some strange characters and after a bit of Googling around,

53
03:42.270 --> 03:47.190
we found the solution to turn them into human readable text. As a programmer,

54
03:47.220 --> 03:49.500
this is a skill that you have to really hone.

55
03:49.800 --> 03:53.040
This is something that is going to take you to the next level

56
03:53.250 --> 03:55.680
to this intermediate++ level.

57
03:56.010 --> 03:59.040
You have to find solutions to your own problems,

58
03:59.340 --> 04:01.020
and Google is your best friend.