1
00:00:01,150 --> 00:00:02,950
Chapter three, Project one.

2
00:00:04,090 --> 00:00:09,850
This is recognizing gestures with a light sensor, specifically model training and deployment.

3
00:00:10,270 --> 00:00:16,090
Now, in the last lesson, we got the data samples we need to train the model. Before it can be used

4
00:00:16,090 --> 00:00:17,050
by a neural network,

5
00:00:17,230 --> 00:00:19,030
data needs to be cleaned up.

6
00:00:19,600 --> 00:00:22,030
Sometimes you just change the range of data.

7
00:00:22,630 --> 00:00:27,790
For example, we might change the range from 0 to 1000 into 0 to 1.

8
00:00:28,480 --> 00:00:32,410
Neural networks work better with smaller numbers than large ones.
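As a minimal sketch of the range change just described, assuming the raw readings really fall between 0 and 1000 (the function name `rescale` and the sample readings are made up for illustration):

```python
def rescale(values, old_min=0.0, old_max=1000.0):
    """Map readings from [old_min, old_max] onto [0, 1] (min-max scaling)."""
    span = old_max - old_min
    return [(v - old_min) / span for v in values]


# Hypothetical raw light-sensor readings, not data from the lesson.
readings = [0, 250, 500, 1000]
print(rescale(readings))
```

Multiplying every reading by a fixed scaling factor such as 0.001 has the same effect when the minimum is 0.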

9
00:00:33,160 --> 00:00:40,060
People who work on TinyML often use more sophisticated pre-processing techniques to get so-called

10
00:00:40,060 --> 00:00:44,170
features from raw data, which speeds up the training process.

11
00:00:44,740 --> 00:00:50,710
And in this lesson, we will learn more about the pre-processing functions available in Edge Impulse and

12
00:00:50,710 --> 00:00:52,930
train simple, fully connected networks.

13
00:00:53,800 --> 00:00:58,870
One preparation you should make: make sure you have Arduino IDE installed and working

14
00:00:58,870 --> 00:01:01,760
with Wio Terminal. After you have collected the samples,

15
00:01:01,780 --> 00:01:05,440
it is time to design an impulse. Now, impulse

16
00:01:05,440 --> 00:01:13,080
here is the word Edge Impulse uses to denote a data processing and training pipeline. Press Create

17
00:01:13,090 --> 00:01:15,010
impulse and set window length

18
00:01:16,150 --> 00:01:20,010
to 1000 ms and window length increase to 50 ms.

19
00:01:21,300 --> 00:01:26,620
Now, these settings mean that each time an inference is performed, we are going to take

20
00:01:26,620 --> 00:01:29,080
sensor measurements for 1000 ms.

21
00:01:30,010 --> 00:01:36,700
The frequency was set to 40 Hz during data collection, which means 40 measurements a second.

22
00:01:37,870 --> 00:01:44,950
So in a nutshell, your device is going to collect 40 data samples in a time window of 1000 ms.

23
00:01:45,940 --> 00:01:52,840
It will take these data samples, process them, and feed them to a neural network to get an answer.
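The arithmetic behind those numbers can be sketched as follows. The function names are made up for illustration; the only inputs are the 40 Hz frequency and the 1000 ms window mentioned in the lesson.

```python
def samples_per_window(frequency_hz, window_ms):
    """Number of sensor readings that fit into one window."""
    return round(frequency_hz * window_ms / 1000)


def interval_ms(frequency_hz):
    """Time between two consecutive readings, in milliseconds."""
    return 1000 / frequency_hz


print(samples_per_window(40, 1000))  # readings per window
print(interval_ms(40))               # delay between readings in ms
```

At 40 Hz a reading is taken every 25 ms, so a 1000 ms window holds 40 readings.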

24
00:01:53,170 --> 00:01:57,550
We, of course, use the same window size during training.

25
00:01:58,300 --> 00:02:04,150
And now for this proof of concept project, we are going to try three different processing blocks with

26
00:02:04,150 --> 00:02:11,700
default parameters, except for adding scaling. The Flatten block computes the average, minimum,

27
00:02:11,770 --> 00:02:14,890
maximum, and other functions of raw data within the time window.

28
00:02:16,180 --> 00:02:21,700
The Spectral Features block extracts the frequency and power characteristics of a signal

29
00:02:21,730 --> 00:02:29,800
over time. Finally, the Raw Data block sends raw data to the

30
00:02:29,800 --> 00:02:32,110
neural network learning block,

31
00:02:32,410 --> 00:02:34,360
optionally normalizing the data.

32
00:02:35,320 --> 00:02:37,210
The Flatten block is the first thing we try today.

33
00:02:37,750 --> 00:02:42,310
Add this block, then add Neural Network (Keras) as a learning block.

34
00:02:42,730 --> 00:02:47,590
Check Flatten as input features and click on Save Impulse to save the changes you just made.

35
00:02:48,310 --> 00:02:55,690
You chose Flatten as the processing block, so go to the next tab, which has the name of the processing

36
00:02:55,690 --> 00:02:56,500
block you chose,

37
00:02:57,070 --> 00:03:02,830
Flatten. There is a scaling field; enter 0.001 there and leave the rest of the parameters

38
00:03:02,830 --> 00:03:04,270
the same. When you're done,

39
00:03:04,660 --> 00:03:06,820
click Save parameters, then Generate features.

40
00:03:07,810 --> 00:03:14,560
Feature visualization is a particularly useful tool in the Edge Impulse web interface, as it allows

41
00:03:14,560 --> 00:03:19,240
users to get graphical insights into how the data looks after pre-processing.

42
00:03:19,630 --> 00:03:23,500
For example, this is the data after the Flatten processing block.

43
00:03:23,950 --> 00:03:25,720
You can see it here in the documentation.

44
00:03:26,850 --> 00:03:31,950
In this picture, we can see that the data points for different classes are roughly divided.

45
00:03:32,550 --> 00:03:38,910
But there is a lot of overlap, as you can see here, between rock and other classes, which will cause

46
00:03:38,910 --> 00:03:42,270
problems and lower the accuracy for these classes.

47
00:03:42,900 --> 00:03:47,190
After you've generated and looked at the features, go to the classifier

48
00:03:47,190 --> 00:03:54,480
tab. Train a simple, fully connected network with two hidden layers, 20 and 10 neurons in each hidden

49
00:03:54,480 --> 00:04:02,910
layer, for 500 epochs with a learning rate of 1e-4. Again, this network has two hidden layers, with 20 and

50
00:04:02,910 --> 00:04:04,110
10 neurons respectively.

51
00:04:04,620 --> 00:04:09,490
Test results will be shown in a confusion matrix, which looks something like this.

52
00:04:10,140 --> 00:04:16,770
And then right after that, go back to the Create Impulse tab, delete the Flatten block and choose the Spectral Features

53
00:04:16,770 --> 00:04:17,160
block.

54
00:04:17,640 --> 00:04:19,140
Generate the features.

55
00:04:19,680 --> 00:04:26,790
Now you need to remember to set scaling to 0.001 and train the neural network on the Spectral Features

56
00:04:26,790 --> 00:04:28,470
data. Here,

57
00:04:28,470 --> 00:04:30,120
you should see a slight improvement.

58
00:04:31,250 --> 00:04:37,760
Now, both the Flatten and Spectral Features blocks are actually not the best processing methods for Rock-Paper-Scissors

59
00:04:37,760 --> 00:04:38,750
gesture recognition.

60
00:04:39,770 --> 00:04:44,270
Think about it: if we want to classify Rock-Paper-Scissors gestures,

61
00:04:44,450 --> 00:04:50,090
we just need to keep track of how many times and for how long the light sensor has been getting values that

62
00:04:50,090 --> 00:04:51,230
are lower than normal.

63
00:04:52,190 --> 00:04:58,100
If it's one long time, then it's rock, a fist passing above the sensor.

64
00:04:58,700 --> 00:04:59,590
Then there are scissors:

65
00:04:59,630 --> 00:05:04,970
if there are two times, it is scissors. Paper is anything more than that.

66
00:05:05,330 --> 00:05:09,150
Sounds easy, but preserving time series data

67
00:05:09,170 --> 00:05:14,270
is really important for the neural network to be able to learn this relationship in data points.
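To make the rule concrete, here is a hand-written sketch of it: one dip below normal is rock, two are scissors, more is paper. This is only an illustration of what the network has to learn from ordered data, not the model trained in the lesson; the function names, the threshold of 500 and the sample windows are all assumptions.

```python
def dip_lengths(window, threshold):
    """Return the length, in samples, of each run of readings below threshold."""
    dips, run = [], 0
    for value in window:
        if value < threshold:
            run += 1          # still inside a dip
        elif run:
            dips.append(run)  # a dip just ended
            run = 0
    if run:                   # window ended while still inside a dip
        dips.append(run)
    return dips


def guess_gesture(window, threshold):
    """Classify by the number of dips: 1 is rock, 2 is scissors, more is paper."""
    count = len(dip_lengths(window, threshold))
    if count == 0:
        return "none"
    if count == 1:
        return "rock"
    if count == 2:
        return "scissors"
    return "paper"


# Hypothetical windows: 900 is normal light, 200 is the hand over the sensor.
print(guess_gesture([900, 200, 200, 200, 200, 900], threshold=500))
print(guess_gesture([900, 200, 900, 200, 900], threshold=500))
```

The rule only works because the readings stay in their original order, which is the point made about time series data.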

68
00:05:15,260 --> 00:05:20,480
Now, the Flatten and Spectral Features processing blocks actually both remove the time relationship

69
00:05:20,480 --> 00:05:21,380
in each window.

70
00:05:21,950 --> 00:05:29,180
The Flatten block simply turns the raw values, which are in order, into average, minimum and maximum,

71
00:05:29,180 --> 00:05:33,020
and other values that are calculated over all the values in the time window,

72
00:05:33,530 --> 00:05:40,550
no matter how they are arranged. As for the Spectral Features block, it didn't

73
00:05:40,550 --> 00:05:45,710
work that well for this task because the duration of each gesture was too short.
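A small sketch shows why order-free statistics lose the gesture. It assumes a simplified flatten step that computes only average, minimum and maximum; the real Edge Impulse block offers more statistics, and the two windows are invented.

```python
import statistics


def flatten_features(window):
    """Simplified stand-in for a flatten step: (average, minimum, maximum)."""
    return (statistics.mean(window), min(window), max(window))


# Two hypothetical windows with the same values in a different order.
one_long_dip = [900, 200, 200, 200, 200, 900, 900, 900]    # rock-like
two_short_dips = [900, 200, 200, 900, 900, 200, 200, 900]  # scissors-like

print(flatten_features(one_long_dip) == flatten_features(two_short_dips))
```

Both windows produce identical features, so a classifier that only sees these three numbers cannot tell one long dip from two short ones.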

74
00:05:46,400 --> 00:05:52,160
That means that the best way to get the best performance is to use the Raw Data block.

75
00:05:52,820 --> 00:05:55,130
This will keep the time series data intact.

76
00:05:55,820 --> 00:06:02,450
Take a look at the sample project, where we use raw data with a 1D convolutional network instead of a fully connected

77
00:06:02,450 --> 00:06:02,900
network.

78
00:06:03,620 --> 00:06:08,990
We were able to get 92.4 percent accuracy on the same data.

79
00:06:09,680 --> 00:06:15,770
For now, we're going to use a simple, fully connected model and Spectral Features processing to train

80
00:06:15,770 --> 00:06:21,050
the model and look at the features, even though both the Flatten and Spectral Features

81
00:06:21,050 --> 00:06:25,910
blocks are actually not the best processing methods, as I've said earlier.

82
00:06:26,660 --> 00:06:30,140
Then the final results after training were: Flatten plus FC,

83
00:06:30,650 --> 00:06:37,760
69.9 percent accuracy; Spectral Features plus FC, 70.4 percent accuracy;

84
00:06:38,180 --> 00:06:44,450
and Raw Data plus Conv1D, 92.4 percent accuracy.

85
00:06:45,320 --> 00:06:50,930
We will discuss convolutions and how useful they are in later articles

86
00:06:50,930 --> 00:06:52,190
about sound processing.

87
00:06:52,580 --> 00:06:56,870
For now, we're going to use a simple, fully connected model and Spectral Features processing.

88
00:06:57,680 --> 00:07:02,990
After you train the model, you can use the Live classification tab to see how well it works.

89
00:07:03,440 --> 00:07:09,470
Now this will take a sample of data from your device and classify it with the model that's on Edge Impulse.

90
00:07:10,010 --> 00:07:16,280
Now, we tried three different gestures and saw that the accuracy is actually good enough to show

91
00:07:16,580 --> 00:07:17,960
that the idea is real.

92
00:07:18,670 --> 00:07:22,190
Now the next step is deployment on the device.

93
00:07:22,460 --> 00:07:29,030
After clicking on Deployment, choose Arduino library and download it, then extract the archive and place

94
00:07:29,030 --> 00:07:31,160
it in your Arduino Libraries folder.

95
00:07:31,550 --> 00:07:40,250
Open Arduino IDE and choose the static buffer sketch, which is located in File, then Examples,

96
00:07:40,670 --> 00:07:43,070
then the name of your project.

97
00:07:43,700 --> 00:07:50,330
And then you can see the static buffer sketch, which already has all the boilerplate code for classification

98
00:07:50,690 --> 00:07:51,890
with your model in place.

99
00:07:52,580 --> 00:07:58,850
Now, the only thing for us to fill in is the on-device data acquisition. We will use a simple for loop

100
00:07:59,210 --> 00:08:01,160
with a delay to account for the frequency.

101
00:08:01,520 --> 00:08:08,360
And if you remember, we had a 25 ms delay when gathering data for the training dataset.

102
00:08:08,570 --> 00:08:11,510
Now, ms stands for millisecond.
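The timing logic of that for loop can be sketched as below. On the device this would be written in the Arduino sketch itself; the Python version only illustrates the idea. The name `acquire_window`, the injected `read_sensor` and `sleep` callables, and the constant reading of 512 are all assumptions.

```python
import time


def acquire_window(read_sensor, n_samples=40, delay_ms=25, sleep=time.sleep):
    """Fill one window by reading the sensor n_samples times, delay_ms apart."""
    buffer = []
    for _ in range(n_samples):
        buffer.append(float(read_sensor()))  # one light-sensor reading
        sleep(delay_ms / 1000)               # wait to keep roughly 40 Hz
    return buffer


# Stand-in sensor returning a constant; sleeping is skipped so the demo is instant.
window = acquire_window(lambda: 512, sleep=lambda seconds: None)
print(len(window))
```

With the default arguments the loop collects 40 readings over about one second, matching the window size used for training.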

103
00:08:11,990 --> 00:08:13,340
There are better ways to do this.

104
00:08:13,940 --> 00:08:19,820
If we were to do this, we could use a sensor data buffer that would let us do inference more often.

105
00:08:20,870 --> 00:08:26,300
There will be more lessons in the course about that. Now, after making your changes to this sample code,

106
00:08:26,720 --> 00:08:30,950
upload it to Wio Terminal and open the serial monitor.

107
00:08:31,280 --> 00:08:37,400
When you move your hand while making a gesture, you can see the probability results on the serial monitor,

108
00:08:37,670 --> 00:08:42,240
and the same goes for paper and for scissors.

109
00:08:42,740 --> 00:08:49,670
Now, while it was just a proof of concept demonstration, it actually really shows TinyML is onto

110
00:08:49,670 --> 00:08:50,870
something big, right?

111
00:08:51,500 --> 00:08:52,580
You probably knew

112
00:08:52,790 --> 00:08:56,120
it is possible to recognize these gestures with a camera sensor,

113
00:08:56,480 --> 00:09:03,650
even if the image is downscaled a lot. But recognizing gestures with just one pixel is an entirely different

114
00:09:03,650 --> 00:09:04,100
level.

115
00:09:04,790 --> 00:09:12,680
Now try increasing or decreasing the number of neurons in the first and second hidden layers and see

116
00:09:12,680 --> 00:09:15,650
how that affects the accuracy and inference time.
