WEBVTT

00:00.990 --> 00:07.170
Hello again! In this video, we are going to look at files and streams. You probably have some idea of

00:07.170 --> 00:08.700
what a file means.

00:09.210 --> 00:12.300
C++ has a very minimalistic interpretation.

00:12.780 --> 00:17.310
A file is just a sequence of bytes, which has some file name to identify it.

00:17.940 --> 00:24.180
So we do not need to know whether the data is stored on a disk or an SD card or some physical device

00:24.180 --> 00:25.350
that is pretending to be a file.

00:26.010 --> 00:30.480
We do not know how exactly it is stored, and we do not care what it is meant to represent.

00:33.530 --> 00:38.060
The C++ library provides fstream objects for interacting with files.

00:38.600 --> 00:43.850
These are very similar to the iostream objects that we have used for interacting with the console, to send

00:44.300 --> 00:46.490
output to the console and read input from it.

00:48.180 --> 00:54.570
These fstream objects always access files sequentially, so that means one byte after another.

00:55.020 --> 00:59.280
You cannot jump forwards or go backwards, and they are always in the same order.

01:01.380 --> 01:07.470
The stream does not know how much data it is going to send or receive, and the stream has no concept of

01:08.100 --> 01:09.270
structure to the data.

01:09.900 --> 01:14.280
So this means, in particular, that fstreams do not understand file formats.

01:17.110 --> 01:20.260
There are four main things you can do with an fstream object.

01:20.590 --> 01:21.430
You can open it.

01:21.760 --> 01:24.400
So this will start interacting with a file.

01:25.390 --> 01:27.970
This will connect the stream object to the file, if you like.

01:28.420 --> 01:31.540
So after that, the file is available for use by your program.

01:32.560 --> 01:33.900
You can read.

01:34.030 --> 01:37.540
So this means that data is copied from the file into your program's memory.

01:38.440 --> 01:38.800
You can

01:38.890 --> 01:39.250
write,

01:39.280 --> 01:44.930
so that means that data is copied from your program's memory to the file. And you can close the file.

01:45.460 --> 01:49.930
So after you have finished using your file, you can end your interaction session.

01:50.380 --> 01:53.590
And this will close the connection between the stream object and the file.

01:54.010 --> 02:00.070
And then that file is no longer available for use by the program, unless an fstream object opens it

02:00.130 --> 02:00.400
again.

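NOTE
A minimal sketch of the four operations just described: open, write, close, then open again, read and close.
The function name write_then_read and the file name passed to it are assumptions made for illustration; they do not come from the video.
```cpp
#include <fstream>
#include <string>
// Hypothetical helper: writes one line to a file, then reads it back.
std::string write_then_read(const std::string& filename, const std::string& text) {
    std::ofstream out;
    out.open(filename);            // open: connect the stream object to the file
    if (!out.is_open()) return ""; // the file could not be opened
    out << text << '\n';           // write: copy data from the program's memory to the file
    out.close();                   // close: end the session, so buffered data is written out
    std::ifstream in;
    in.open(filename);             // the file is available again once a stream reopens it
    if (!in.is_open()) return "";
    std::string line;
    std::getline(in, line);        // read: copy data from the file into the program's memory
    in.close();
    return line;
}
```
Calling write_then_read("demo.txt", "hello") from main() should give back "hello", provided the program is allowed to create demo.txt in its working directory.
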
02:07.280 --> 02:13.250
Each of these operations is performed by calling a function in the operating system's API. The Application

02:13.250 --> 02:14.540
Programming Interface.

02:15.200 --> 02:20.570
So the program is going to stop and wait and then the operating system is going to do the read or whatever.

02:21.140 --> 02:28.010
And then when it finishes, the call to the API will return. And the program will then resume executing

02:28.010 --> 02:28.910
your instructions.

02:29.390 --> 02:34.370
So while this call is being performed, your program has to stop and wait.

02:34.610 --> 02:36.620
It cannot execute any of your code.

02:40.790 --> 02:46.560
We cannot use a file until it has been opened. Once we have finished using a file,

02:46.580 --> 02:47.540
we should close it.

02:48.620 --> 02:53.420
If you are writing to a file, this will make sure that all the data that you wrote is actually saved

02:53.720 --> 02:54.230
correctly.

02:55.070 --> 03:01.070
And also, operating systems have a limit on the number of files a program can have open at any one

03:01.070 --> 03:01.490
time.

03:02.210 --> 03:05.540
If you try to open too many files, the operating system will stop you.

03:05.930 --> 03:07.820
So this is a bit like allocating memory.

03:08.210 --> 03:13.070
If your program allocates too much memory, the operating system will stop you from allocating more

03:13.070 --> 03:16.670
memory. I actually fell into this once,

03:16.670 --> 03:20.360
many years ago. I had a loop that opened a file, but did not close it.

03:21.440 --> 03:25.790
And then eventually, after 40 iterations or whatever it was, the operating system said,

03:26.180 --> 03:26.990
"Sorry, no more!"

03:29.990 --> 03:35.480
When you compile a C++ program into a binary, the compiler will add some extra runtime code to

03:35.480 --> 03:35.990
the program.

03:36.380 --> 03:38.930
So this will set the program up and then call main().

03:39.440 --> 03:43.940
And then when main() returns, this code will do any clearing up that is needed.

03:44.300 --> 03:48.740
And one of the things it does is that it will go through all the files that the program has opened

03:49.220 --> 03:50.450
and make sure they are closed.

03:51.800 --> 03:55.180
So in theory, you do not actually need to close any files.

03:55.190 --> 03:59.300
But, as with releasing memory, it is a good idea to do it as soon as you can.

03:59.780 --> 04:05.870
It is actually more important with files, because other programs can access files and - well, you can have

04:05.870 --> 04:09.620
file locking, but C++ does not support that directly.

04:10.160 --> 04:16.070
So you could have other programs trying to use your file. So you really want to have your file

04:16.070 --> 04:17.870
open for the shortest possible time.

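NOTE
One possible way to keep a file open for the shortest possible time is to confine the stream object to a small block.
This is only a sketch: the helper name append_line and the choice of append mode are assumptions, not something shown in the video.
```cpp
#include <fstream>
#include <string>
// Hypothetical helper: appends one line, keeping the file open only inside the inner block.
bool append_line(const std::string& filename, const std::string& text) {
    bool ok = false;
    {
        std::ofstream out(filename, std::ios::app);  // the constructor opens the file
        if (out) {
            out << text << '\n';
            out.close();                             // explicit close, so we can check that it worked
            ok = !out.fail();
        }
    }   // if close() had not been called, the stream's destructor would close the file here
    // ... any longer-running work would go here, without holding the file open ...
    return ok;
}
```
Each call opens the file, adds one line and closes it again, so other programs are only locked out, or confused, for a very short time.
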
04:23.800 --> 04:26.830
And then one final point before we go on.

04:27.460 --> 04:34.210
When you send or receive data with a file, it is not going to be actually sent one byte at a time.

04:35.290 --> 04:39.910
The reason is that interacting with physical devices is very time-consuming.

04:40.420 --> 04:46.120
You know, it takes thousands of processor instructions. So you really want to minimize the number

04:46.120 --> 04:47.110
of actual transfers.

04:48.520 --> 04:52.030
So what happens is that the data is gathered up into a memory buffer.

04:52.360 --> 04:56.680
And then when it reaches the optimal size, the data gets sent off to the device.

04:59.470 --> 05:04.210
So this reduces the number of actual transfers, so that reduces the number of times your program has

05:04.210 --> 05:04.510
to stop and wait.

05:05.380 --> 05:11.400
However, there is a disadvantage to this. If your program waits too long between transferring data,

05:11.860 --> 05:18.850
then the discrepancy between what the file says and what your program thinks the file says will get bigger

05:18.850 --> 05:19.400
and bigger.

05:19.900 --> 05:21.580
And that can cause problems.

05:21.910 --> 05:22.840
But we will come back to that.

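NOTE
The video leaves the remedy for later. As a hedged illustration, the standard library does let a program empty the buffer early by calling flush() on an output stream.
The helper name write_flush_read is an assumption, and the second stream is only a stand-in for another program reading the same file.
```cpp
#include <fstream>
#include <string>
// Hypothetical helper: writes a line, flushes it, and reads the file back while the writer is still open.
std::string write_flush_read(const std::string& filename, const std::string& text) {
    std::ofstream out(filename);
    if (!out) return "";          // the file could not be opened
    out << text << '\n';          // the data may still be sitting in the stream's memory buffer
    out.flush();                  // hand the buffered data to the operating system now
    std::ifstream in(filename);   // a second stream on the same file, standing in for another program
    std::string line;
    std::getline(in, line);
    return line;                  // both streams are closed by their destructors here
}
```
With the flush in place the reader sees the line that was written. Without it, the reader may well find an empty file, which is the kind of discrepancy described above.
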
05:23.470 --> 05:25.150
Okay, so that's it for this video.

05:25.480 --> 05:26.290
I'll see you next time.

05:26.290 --> 05:28.510
But meanwhile, keep coding!
