1
00:00:00,000 --> 00:00:02,370
Part of the vision of TensorFlow

2
00:00:02,370 --> 00:00:03,540
to make machine learning and

3
00:00:03,540 --> 00:00:06,779
deep learning easier to
learn and easier to use,

4
00:00:06,779 --> 00:00:09,900
is the concept of having
built-in datasets.

5
00:00:09,900 --> 00:00:11,910
You've seen a little bit
of a preview of

6
00:00:11,910 --> 00:00:13,860
them way back in the first course,

7
00:00:13,860 --> 00:00:17,220
when Fashion MNIST was
available to you without you

8
00:00:17,220 --> 00:00:18,600
needing to download and split

9
00:00:18,600 --> 00:00:21,460
the data into
training and test sets.

10
00:00:21,680 --> 00:00:25,515
Expanding on this,
there's a library called

11
00:00:25,515 --> 00:00:29,160
TensorFlow Datasets,
or TFDS for short,

12
00:00:29,160 --> 00:00:31,200
and that contains many datasets

13
00:00:31,200 --> 00:00:33,045
in lots of different categories.

14
00:00:33,045 --> 00:00:36,120
Here are some examples,
and while we

15
00:00:36,120 --> 00:00:37,440
can see that there are many

16
00:00:37,440 --> 00:00:39,295
different datasets
for different types of data,

17
00:00:39,295 --> 00:00:42,290
particularly image-based,
there are also a few for text,

18
00:00:42,290 --> 00:00:46,440
and we'll be using
the IMDB reviews dataset next.

19
00:00:46,580 --> 00:00:49,160
This dataset is ideal

20
00:00:49,160 --> 00:00:51,560
because it contains
a large body of text,

21
00:00:51,560 --> 00:00:53,990
50,000 movie reviews which

22
00:00:53,990 --> 00:00:56,240
are categorized as
positive or negative.

23
00:00:56,240 --> 00:00:59,584
It was authored by
Andrew Maas et al. at Stanford,

24
00:00:59,584 --> 00:01:02,760
and you can learn more
about it at this link.