1
00:00:05,190 --> 00:00:08,280
In this lecture, I'm going to talk about Jupyter Notebook.

2
00:00:09,240 --> 00:00:15,300
Occasionally, I get this strange question from students asking why I don't do things in Jupyter notebook.

3
00:00:15,900 --> 00:00:18,120
Let me explain why I think this is strange.

4
00:00:18,870 --> 00:00:24,780
Firstly, it's my position that it's completely unnecessary and actually doesn't change anything to

5
00:00:24,780 --> 00:00:26,340
use Jupyter Notebook or not.

6
00:00:27,090 --> 00:00:33,120
Let me repeat that: using Jupyter Notebook is exactly the same as not using Jupyter Notebook.

7
00:00:33,390 --> 00:00:36,570
There's no difference other than the fact that it looks different.

8
00:00:36,870 --> 00:00:39,210
In other words, the screen is a different color.

9
00:00:39,870 --> 00:00:42,060
Obviously, such a difference is trivial.

10
00:00:42,720 --> 00:00:46,160
In this lecture, I'm going to demonstrate how that's the case.

11
00:00:51,270 --> 00:00:58,050
One major reason I dislike Jupyter Notebook is because it causes too many students to be unaware of

12
00:00:58,050 --> 00:00:59,550
how Python really works.

13
00:01:00,120 --> 00:01:06,030
If you're only comfortable inside the notebook and when you see Python code in a text file or anywhere else

14
00:01:06,330 --> 00:01:09,180
you get scared or intimidated, that's not good.

15
00:01:10,050 --> 00:01:14,760
That Python code that's in the text file is exactly the same as what would be in the notebook.

16
00:01:14,790 --> 00:01:16,890
There's really nothing scary about it.

17
00:01:17,610 --> 00:01:24,450
Programmers working at real jobs eventually need to write code that's going to be deployed and run automatically.

18
00:01:25,050 --> 00:01:31,650
In other words, your final code is going to sit in a Python file that runs by itself without Jupyter

19
00:01:31,650 --> 00:01:32,280
notebook.

20
00:01:33,240 --> 00:01:39,330
So if you're going to have any hope of using these skills in a real job, you had better be comfortable

21
00:01:39,330 --> 00:01:42,630
with writing Python code outside of Jupyter Notebook.

22
00:01:43,320 --> 00:01:49,200
You had also better be aware that there's actually zero difference between writing code in a Jupyter

23
00:01:49,200 --> 00:01:52,290
notebook and writing code in IPython or the console.

24
00:01:57,300 --> 00:02:02,700
Here's one example I like of how you might use Python in the quote unquote real world.

25
00:02:03,390 --> 00:02:08,400
Let's say you write a script that emails your boss to tell him you're going to be late for work.

26
00:02:09,030 --> 00:02:14,220
And let's say you don't actually want to send this email manually, but you want it to get sent automatically

27
00:02:14,220 --> 00:02:20,220
every Friday morning so that your boss doesn't yell at you for coming in to work late after you party

28
00:02:20,220 --> 00:02:21,420
too hard Thursday night.

29
00:02:22,170 --> 00:02:23,580
Well, that's very simple.

30
00:02:24,120 --> 00:02:28,800
All I have to do is, on my server, create what's called a crontab. In it,

31
00:02:29,070 --> 00:02:34,860
I just enter the code for when I want this command to run and then to the right of that, I specify

32
00:02:34,860 --> 00:02:36,300
the command that I want to run.

33
00:02:37,110 --> 00:02:38,730
That's just "python", space.

34
00:02:38,730 --> 00:02:44,220
And then the script name. As you can see, it's simply how you would run this Python script from the

35
00:02:44,220 --> 00:02:45,060
command line.

36
00:02:45,690 --> 00:02:52,350
And now, every Friday at 9:45 a.m., this script is going to send the same email to your boss.

37
00:02:52,410 --> 00:02:53,580
It will tell him you'll be late.

38
00:02:55,150 --> 00:02:56,740
Well, let's not get off track here.

39
00:02:56,770 --> 00:03:01,600
The point of this is you really don't want to be using Jupyter Notebook for something like this.

40
00:03:06,750 --> 00:03:13,170
I think one perceived advantage of Jupyter Notebook is that you can see the results of intermediate calculations.

41
00:03:13,800 --> 00:03:19,920
However, this is merely a perceived advantage and not a real advantage because you can do the exact

42
00:03:19,920 --> 00:03:22,920
same thing even when you're not inside the notebook.

43
00:03:23,900 --> 00:03:30,110
Firstly, as I'm sure you've seen by now, IPython also prints out the results after you enter a command.

44
00:03:31,040 --> 00:03:36,860
IPython is called a REPL, which stands for read-eval-print loop, and that's generally what they

45
00:03:36,860 --> 00:03:37,440
all do.

46
00:03:37,490 --> 00:03:38,900
No matter what language you're in.

47
00:03:39,260 --> 00:03:45,560
So whether that's Python, Ruby or any other language. The key word here is "print".

48
00:03:45,950 --> 00:03:46,640
Why is that?

49
00:03:47,450 --> 00:03:53,540
Well, one of my golden rules for writing and debugging code is when in doubt, print it out.

50
00:03:54,200 --> 00:03:59,720
I can't tell you how many times I've gotten a question on the Q&A when it could have been easily answered

51
00:03:59,720 --> 00:04:03,110
by inserting a print statement into the existing code.

52
00:04:08,280 --> 00:04:11,400
Anyway, what's the point of this long discussion about printing things out?

53
00:04:12,120 --> 00:04:17,550
Well, it's that if you think Jupyter Notebook is the only program that helps you do this, you need

54
00:04:17,550 --> 00:04:19,380
to expand your horizons a little bit.

55
00:04:19,980 --> 00:04:25,560
You should, in fact, always be doing this. If you're not using an abundant amount of print statements

56
00:04:25,560 --> 00:04:27,750
while coding, you are not doing it right.

57
00:04:28,470 --> 00:04:31,020
Remember, programming is not philosophy.

58
00:04:31,440 --> 00:04:33,810
You're not supposed to be running a program in your head.

59
00:04:34,260 --> 00:04:38,520
That's like trying to do long division in your head when you have a calculator in your hand.

60
00:04:39,970 --> 00:04:42,820
So just by printing things out, you can be more efficient.

61
00:04:43,330 --> 00:04:48,400
Stop trying to guess what a program will do and just let the program itself tell you what it's doing.

62
00:04:49,330 --> 00:04:52,900
The key takeaway here is you should always be printing things out.

63
00:04:52,930 --> 00:04:59,140
The fact that Jupyter Notebook shows you the result of each block of code is not simply a happy surprise.

64
00:05:04,190 --> 00:05:10,250
But another important lesson here is that if you want to use Jupyter Notebook, there's absolutely nothing

65
00:05:10,250 --> 00:05:11,660
stopping you from doing so.

66
00:05:12,230 --> 00:05:18,830
In other words, using Jupyter Notebook is 100 percent compatible with everything we are already doing.

67
00:05:19,640 --> 00:05:26,630
In fact, if you recall, your goal in these courses is not to run my code, but to write your own code.

68
00:05:26,840 --> 00:05:32,720
And of course, since it's your code, you can write it however you want, including in Jupyter Notebook.

69
00:05:33,470 --> 00:05:34,730
In the rest of this lecture,

70
00:05:34,730 --> 00:05:40,610
I'm going to prove to you that you can take any script from our course repository, which we know runs

71
00:05:40,610 --> 00:05:47,420
in the console because that's how I always demonstrate it and show you that this exact same code runs

72
00:05:47,420 --> 00:05:48,470
in Jupyter Notebook.

73
00:05:49,190 --> 00:05:49,850
Let's begin.

74
00:05:56,730 --> 00:06:03,370
OK, so let's say I'm in the folder numpy class, and I'm interested in the code inside classification

75
00:06:03,390 --> 00:06:10,080
example.py. As you can see, what I have right now is this code inside a text editor.

76
00:06:10,890 --> 00:06:16,470
Now, if you're not aware of what a text editor is, it's just a program that shows you the contents

77
00:06:16,470 --> 00:06:19,410
of a text file and lets you edit those contents.

78
00:06:19,680 --> 00:06:22,080
It's the ideal program for writing code.

79
00:06:22,830 --> 00:06:28,110
Now, occasionally, if you're writing in a language like Java or Swift, you might want to use an IDE.

80
00:06:28,410 --> 00:06:30,120
But even then, it's totally optional.

81
00:06:30,750 --> 00:06:35,400
These days, I prefer to write Java in a plain text editor like Sublime Text as well.

82
00:06:36,600 --> 00:06:41,280
In any case, normally one does not need to use an IDE for writing Python code.

83
00:06:42,870 --> 00:06:44,370
Now about Jupyter Notebook.

84
00:06:44,760 --> 00:06:51,510
Well, let's start up a Jupyter notebook, so I'm going to type "jupyter notebook".

85
00:06:53,590 --> 00:06:57,610
All right, so now I've got Jupyter notebook running, so I'm going to start a new notebook.

86
00:07:01,000 --> 00:07:07,450
Also, notice that we started the notebook in the same directory as the relevant Python file, so that's

87
00:07:07,450 --> 00:07:09,220
something you want to keep in mind for the future.

88
00:07:10,500 --> 00:07:15,840
Now, at this point, what I'm going to do is I'm going to prove that everything in this Python file

89
00:07:16,170 --> 00:07:20,070
works exactly the same in the notebook as it does in the console.

90
00:07:23,610 --> 00:07:25,170
So let's start with the imports.

91
00:07:27,810 --> 00:07:28,920
Let's grab this one too.

92
00:07:32,630 --> 00:07:33,110
All right.

93
00:07:33,450 --> 00:07:34,370
Paste that in.

94
00:07:36,370 --> 00:07:36,760
Run it.

95
00:07:37,060 --> 00:07:38,410
So everything's fine so far.

96
00:07:39,730 --> 00:07:41,200
Now let's load in the data.

97
00:07:47,720 --> 00:07:50,720
OK, we've loaded in the data.

98
00:07:51,740 --> 00:07:56,510
Now, I'm not sure what its type is, so I can check it by using the type function.

99
00:07:57,950 --> 00:07:58,820
So let me try that.

100
00:08:03,780 --> 00:08:10,080
Cool, so I get sklearn.utils.Bunch, so that's the type of the variable data.

101
00:08:12,410 --> 00:08:18,320
Now, previously, you'll recall that we did this example in IPython, but as you can see, the result is

102
00:08:18,320 --> 00:08:20,000
the same in Jupyter notebook.

103
00:08:21,460 --> 00:08:27,070
Alternatively, if you wanted to run this Python file in the console, so you wanted to type.

104
00:08:28,940 --> 00:08:29,810
Let's say.

105
00:08:39,150 --> 00:08:45,900
So let's say you wanted to type in "python classification_example.py", then you could just add

106
00:08:45,900 --> 00:08:50,430
some print statements if you wanted to show those same lines while this file was running.

107
00:08:53,980 --> 00:08:58,690
Now, I'm not going to go through such elaborate detail for the rest of this example, since you've

108
00:08:58,690 --> 00:08:59,440
already seen it.

109
00:08:59,800 --> 00:09:01,010
So let's just get through the rest

110
00:09:01,030 --> 00:09:01,750
line by line.

111
00:09:05,760 --> 00:09:09,590
Let's say I want to check out the keys in the data variable.

112
00:09:11,990 --> 00:09:12,460
OK.

113
00:09:12,590 --> 00:09:13,640
Looks good so far.

114
00:09:15,560 --> 00:09:18,440
Let's check the shape of the data attribute.

115
00:09:20,900 --> 00:09:21,350
OK.

116
00:09:21,470 --> 00:09:22,520
Looks good so far.

117
00:09:24,560 --> 00:09:25,820
Check the targets.

118
00:09:29,580 --> 00:09:29,890
OK.

119
00:09:30,180 --> 00:09:38,010
Still what we expect. The target names: looks good.

120
00:09:41,310 --> 00:09:45,000
Target shape: it should be 569. Yeah.

121
00:09:47,090 --> 00:09:49,340
And let's check out the feature names.

122
00:09:52,990 --> 00:09:53,350
OK.

123
00:09:53,950 --> 00:09:55,090
That's the same as well.

124
00:09:56,890 --> 00:09:58,960
Now let's do our train-test split.

125
00:10:03,700 --> 00:10:04,150
OK.

126
00:10:06,310 --> 00:10:10,060
Now, let's instantiate and fit our model.

127
00:10:15,420 --> 00:10:19,980
All right, now, let's check the train score.

128
00:10:23,540 --> 00:10:24,860
And the test score.

129
00:10:30,250 --> 00:10:30,790
All right.

130
00:10:32,410 --> 00:10:35,230
Now, let's see how we can make new predictions.

131
00:10:38,790 --> 00:10:44,580
So you see, if you assign to a variable, it doesn't print the result, but if you have an expression,

132
00:10:46,620 --> 00:10:47,850
then it does print the result.

133
00:10:50,990 --> 00:10:57,560
So here are some predictions, and this was an alternative way to calculate the accuracy of the predictions.

134
00:10:58,590 --> 00:11:07,160
So we have the same answer as before, and we also have this other example where we can use a neural

135
00:11:07,160 --> 00:11:08,420
network to do the same thing.

136
00:11:08,990 --> 00:11:13,700
So let's build the model, train it and print the train score.

137
00:11:16,090 --> 00:11:18,790
And let's print the test score as well.

138
00:11:24,020 --> 00:11:30,260
All right, so everything works the exact same way as it does without using the notebook.

139
00:11:37,490 --> 00:11:45,980
Now, another thing I could have done was I could have just taken this whole thing and copied and pasted

140
00:11:45,980 --> 00:11:47,210
it in and run it.

141
00:11:47,720 --> 00:11:51,080
So that's another possible thing you can do.

142
00:11:52,070 --> 00:11:57,500
But the downside of that is you don't get to see the intermediate outputs, but again, that's what

143
00:11:57,500 --> 00:11:58,610
print statements are for.

144
00:12:00,870 --> 00:12:03,840
All right, so what can we conclude from this exercise?

145
00:12:04,380 --> 00:12:11,070
Well, we can see that this code runs exactly the same inside Jupyter notebook as it does everywhere

146
00:12:11,070 --> 00:12:11,580
else.

147
00:12:12,180 --> 00:12:18,510
That's why I always say Python code is Python code, no matter where it is. If you want to use

148
00:12:18,510 --> 00:12:24,150
Jupyter Notebook to run the course code, there's absolutely nothing stopping you from doing so.

