WEBVTT

0
00:01.750 --> 00:02.770
All right, so here we are.

1
00:03.000 --> 00:09.180
I just want to discuss what happens when we include a character that is not specified in the encoding...

2
00:09.180 --> 00:12.270
type that we tell the browser that we accept.

3
00:12.990 --> 00:13.770
That's quite interesting.

4
00:14.370 --> 00:19.470
Let's just start off by creating a very simple form. The "action" we don't need...

5
00:20.040 --> 00:22.320
and we're going to be getting into the action attribute later.

6
00:23.160 --> 00:25.920
Now, let me just show you a GET request.

7
00:27.490 --> 00:31.570
I want us to have our attribute, "accept-charset". 

8
00:32.610 --> 00:35.510
Let's just start off by saying it's utf-8.

9
00:37.100 --> 00:42.620
Very simple. Then what I want to do is I want us to use the character theta.

10
00:43.940 --> 00:46.280
Theta, I've just copied it from a Word document...

11
00:46.510 --> 00:47.110
that's all I've done.

12
00:48.330 --> 00:54.920
This is theta. And the reason I'm using theta is that it is not included in the ISO character set.

13
00:55.620 --> 00:58.110
So I just want to show you what happens. Anyway...

14
00:58.440 --> 01:00.960
now I want us to have our input here.

15
01:01.530 --> 01:03.850
The next thing we need on our input is a name...

16
01:03.870 --> 01:06.720
otherwise it won't be submitted to the server.

17
01:06.720 --> 01:09.810
And let's just give it a name of "yourName".

18
01:11.920 --> 01:16.210
Very simple. And then lastly, if we save this and you look at our browser right now, we don't have

19
01:16.210 --> 01:16.890
a submit button.

20
01:17.440 --> 01:20.950
So all I want to do is create a submit button. We're going to see this later on in the course...

21
01:21.400 --> 01:26.710
but for now, just know we can define another type on this input element being "submit".

22
01:27.700 --> 01:32.680
And we can give it a value of "submit me", or just "submit", to keep it simple.

23
01:34.080 --> 01:35.390
So far, so good, right?

24
01:35.490 --> 01:41.190
And what happens if we do type theta into this text box? Let me just zoom in a little bit so you can...

25
01:41.190 --> 01:42.930
see, we're putting theta in here. 

26
01:43.860 --> 01:50.310
If we click submit, you can see in the URL, you've literally got the name of our variable, which...

27
01:50.310 --> 01:53.890
we've called yourName, and we've assigned it the value theta.
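
NOTE
A minimal sketch, not shown on screen in the lecture, of the query string a UTF-8 form submission puts on the wire.
It assumes the pasted character is lowercase theta (U+03B8); the field name "yourName" is the one used in the lecture.
The address bar may display the raw character even though the bytes sent are percent-encoded.
```python
from urllib.parse import urlencode
# Assumption: the character typed into the text box is lowercase theta, U+03B8.
theta = "\u03b8"
# UTF-8 encodes theta as the two bytes CE B8; each byte becomes a %XX escape.
query = urlencode({"yourName": theta}, encoding="utf-8")
print(query)  # yourName=%CE%B8
```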

28
01:54.510 --> 01:59.910
And what's really cool, is that this is allowed because we've specified utf-8 as our encoding type...

29
01:59.920 --> 02:05.100
and as we've seen, it's got over 1.1 million values that it can represent, or characters...

30
02:05.100 --> 02:07.950
I should say. And theta is definitely one of them.
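
NOTE
A quick check of the "over 1.1 million" figure, as a sketch rather than anything shown in the lecture.
It counts the Unicode code points UTF-8 can encode, and assumes "theta" means lowercase theta (U+03B8).
```python
# Unicode code points run from U+0000 to U+10FFFF (17 planes of 65,536 each).
total_code_points = 0x10FFFF + 1
# Surrogates U+D800..U+DFFF are reserved for UTF-16 and are not valid in UTF-8.
surrogates = 0xDFFF - 0xD800 + 1
encodable = total_code_points - surrogates
print(encodable)  # 1112064
# Assumption: lowercase theta, which UTF-8 stores in two bytes.
theta_bytes = "\u03b8".encode("utf-8")
print(len(theta_bytes))  # 2
```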

31
02:08.700 --> 02:15.330
The other thing you should bear in mind is that the "method" and "accept-charset" values we set here are...

32
02:15.330 --> 02:16.320
the default values.

33
02:16.560 --> 02:19.080
So actually we could delete everything...

34
02:19.080 --> 02:22.710
we could refresh the page, let me just delete the URL...

35
02:24.400 --> 02:28.630
put in theta again, click submit, and we get exactly the same result.

36
02:29.350 --> 02:34.660
So this only becomes important, of course, when you want to start messing around with the accept-charset

37
02:34.740 --> 02:37.250
attribute, which we're going to do now, by the way.

38
02:37.840 --> 02:41.770
So let's define our character encoding now, accept-charset...

39
02:42.580 --> 02:45.370
but now this time, I want us to use ISO. 

40
02:48.340 --> 02:54.580
Now things are going to get interesting. Let's delete the URL, so it's very clean. Let's say the...

41
02:54.580 --> 02:58.410
user types theta, and let the user now click submit.

42
02:58.420 --> 02:59.370
What do you think is going to happen?

43
03:02.210 --> 03:07.390
Well, look at that: we get thrown back a weird-looking URL with a whole lot of percentage...

44
03:07.400 --> 03:07.880
signs.

45
03:09.000 --> 03:14.610
And it's very hard for the server, if not impossible sometimes, to actually determine what the character...

46
03:14.610 --> 03:16.710
encoding of such a request was.

47
03:17.310 --> 03:17.850
Why?

48
03:18.270 --> 03:24.780
Because the server usually assumes that it is the same as the encoding the page containing the form...

49
03:24.780 --> 03:25.560
was served in.

50
03:26.130 --> 03:31.590
In other words, the server is going to assume in our instance here, that the encoding type is utf-8, 

51
03:31.980 --> 03:37.530
and it's going to try and decode that information using utf-8, and it's not going to work.
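
NOTE
A hedged sketch of the server side, not taken from the lecture. The query string below is an assumption about what
Chrome sent for lowercase theta (U+03B8) under an ISO-8859-1 form. In this particular case UTF-8 decoding does not
raise an error; it simply yields text that is not theta, which is why the server cannot reliably recover the input.
```python
from html import unescape
from urllib.parse import parse_qs
# Assumption: the query string produced for theta when the form asks for ISO-8859-1.
raw_query = "yourName=%26%23952%3B"
# A server that assumes UTF-8 decodes this cleanly, but not to theta.
fields = parse_qs(raw_query, encoding="utf-8")
received = fields["yourName"][0]
print(received)  # &#952;
# Unescaping recovers theta, but the server cannot tell this apart from
# a user who literally typed the six characters "&#952;".
recovered = unescape(received)
print(recovered == "\u03b8")  # True
```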

52
03:37.710 --> 03:42.540
So to cut a long story short, once you start adding characters outside of your defined character...

53
03:42.540 --> 03:43.260
encoding type...

54
03:43.680 --> 03:46.410
a lot of strange things start to happen.

55
03:46.710 --> 03:49.350
And it's also very browser dependent, I might add.

56
03:49.680 --> 03:51.450
Different browsers do different things.

57
03:51.720 --> 03:55.800
Browsers might replace the unsupported characters with useless question marks,

58
03:56.160 --> 04:00.990
they may attempt to fix the characters, for example, smart quotes to regular quotes...

59
04:01.560 --> 04:07.920
some browsers might replace the character with a character entity reference, or some browsers might

60
04:07.920 --> 04:13.980
send it anyway as a different character encoding than what you specified in the form, mixing it in with...

61
04:13.980 --> 04:19.680
the original encoding of your entire document. Whew, I know, I've just said a lot.

62
04:20.130 --> 04:23.640
And bear in mind the example we looked at, was on Chrome...

63
04:23.640 --> 04:30.300
and what Chrome has tried to do, is it's trying to replace the unknown character with a character entity

64
04:30.300 --> 04:31.620
reference in the browser.
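
NOTE
A minimal sketch of the substitution described here, written in Python rather than browser code.
Assumptions: "ISO" means ISO-8859-1, the character is lowercase theta (U+03B8), and the field is "yourName".
The unencodable character is swapped for a numeric character reference, which is then percent-encoded.
```python
from urllib.parse import quote
# Assumptions: ISO-8859-1 as the target encoding, lowercase theta as the input.
theta = "\u03b8"
# Theta has no ISO-8859-1 byte, so it is replaced by a numeric character reference.
fallback = theta.encode("iso-8859-1", errors="xmlcharrefreplace")
print(fallback)  # b'&#952;'
# "&", "#" and ";" are reserved in a query string, so each is percent-encoded.
query = "yourName=" + quote(fallback, safe="")
print(query)  # yourName=%26%23952%3B
```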

65
04:31.950 --> 04:33.420
I know that might be a bit confusing.

66
04:33.420 --> 04:33.950
Don't worry.

67
04:34.230 --> 04:41.790
In the next lecture, let me just explain a bit more exactly why we see these weird numbers with percentage...

68
04:41.790 --> 04:42.700
signs everywhere.

69
04:43.020 --> 04:46.440
This is really, really interesting and it is very advanced.

70
04:46.440 --> 04:51.330
And I know we are very early on in the course, so I don't want to scare you. If you're not quite following

71
04:51.330 --> 04:51.480
me...

72
04:51.480 --> 04:52.080
it's okay.

73
04:53.050 --> 04:56.230
We're going to get through it together. I'll see you in the next lecture.