1
00:00:11,120 --> 00:00:16,280
So in this lecture, we'll be discussing the relationship between state space models and simple RNNs.

2
00:00:16,850 --> 00:00:21,710
Let's begin by reviewing the state space model so that we begin this lecture at a common point.

3
00:00:22,820 --> 00:00:28,970
OK, so the state space model is a linear model that specifies how some hidden state vector X of T is

4
00:00:28,970 --> 00:00:32,690
computed from the previous hidden state vector X of T minus one

5
00:00:33,020 --> 00:00:35,120
And some control input vector U of T.

6
00:00:35,720 --> 00:00:39,150
Note that these are related by matrices. By convention,

7
00:00:39,170 --> 00:00:41,030
we call these matrices A and B.

8
00:00:41,810 --> 00:00:47,390
We also have another equation telling us how the hidden state is related to our observation vector y

9
00:00:47,390 --> 00:00:53,180
of T. Sometimes y of T can also be directly affected by the control input U of T,

10
00:00:53,600 --> 00:00:58,100
but very often this is simply left out. In the second equation,

11
00:00:58,190 --> 00:01:00,890
by convention, we call the matrices C and D.
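
Written out, the two equations described above take roughly the following form. This is a sketch in standard discrete-time notation, reconstructed from the spoken description; the exact layout on the lecture slide is an assumption.

```latex
\begin{align}
x(t) &= A\,x(t-1) + B\,u(t) \\
y(t) &= C\,x(t) + D\,u(t)
\end{align}
```

When the control input does not affect the observation directly, the $D\,u(t)$ term is dropped.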

12
00:01:05,620 --> 00:01:10,330
So just as a quick reminder of how such a state space model might be used in the real world,

13
00:01:10,810 --> 00:01:14,410
recall that this is the set of equations that we use in control systems.

14
00:01:15,010 --> 00:01:19,750
An example of a real-world problem is the inverted pendulum, which you've seen if you've taken my

15
00:01:19,750 --> 00:01:26,020
courses on reinforcement learning. What's cool about typical courses on control systems is that, unlike

16
00:01:26,020 --> 00:01:29,590
reinforcement learning, we often get to work with real world projects.

17
00:01:30,130 --> 00:01:35,620
So in the image here, you can see an example of a live physical cart-pole system that can be controlled

18
00:01:35,680 --> 00:01:37,600
using the techniques of control theory.

19
00:01:38,680 --> 00:01:43,990
So as an example of how this relates to our equations, we might represent our state vector with four

20
00:01:43,990 --> 00:01:50,050
measurements, which are horizontal displacement, horizontal velocity, angle from the vertical, and angular

21
00:01:50,050 --> 00:01:50,710
velocity.

22
00:01:51,280 --> 00:01:54,280
What we get to observe might only be the displacement and angle,

23
00:01:54,580 --> 00:02:00,850
since measuring velocity could be more difficult in practice. Because this system is based on the laws

24
00:02:00,850 --> 00:02:01,480
of physics,

25
00:02:01,840 --> 00:02:06,250
we can use physics equations to determine the matrices A, B, C and D.

26
00:02:11,000 --> 00:02:16,520
OK, so now let's recall the equations for a simple RNN. Suppose that we also include the output

27
00:02:16,520 --> 00:02:21,860
layer as well, and we assume that this RNN is many-to-many, meaning that we compute the output

28
00:02:21,860 --> 00:02:22,940
for every time step.

29
00:02:23,600 --> 00:02:26,480
Note that for simplicity, I've excluded bias terms.

30
00:02:27,560 --> 00:02:32,600
In this case, our hidden state vector is called h of t, which is dependent on the previous hidden state

31
00:02:32,600 --> 00:02:34,190
vector h of t minus one

32
00:02:34,640 --> 00:02:36,260
and the model input x of t.

33
00:02:36,890 --> 00:02:43,010
They are related by the weights W hidden and W input with an activation function sigma, which can be

34
00:02:43,010 --> 00:02:45,050
the ReLU, tanh, and so forth.

35
00:02:46,500 --> 00:02:52,680
Furthermore, the output y of T is related to the hidden state h of T using the weight matrix W output.
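
The simple RNN just described can be written as follows. This is a sketch based on the spoken description, with bias terms excluded as stated; the subscript names are assumptions about the slide's notation, and any activation on the output layer is omitted because the lecture does not mention one here.

```latex
\begin{align}
h(t) &= \sigma\big(W_{\text{hidden}}\,h(t-1) + W_{\text{input}}\,x(t)\big) \\
y(t) &= W_{\text{output}}\,h(t)
\end{align}
```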

36
00:02:57,500 --> 00:03:02,180
Of course, at this point, it should be obvious that the state space model and the RNN are pretty

37
00:03:02,180 --> 00:03:07,730
much exactly the same, except that we've renamed some variables and the RNN has a nonlinearity.

38
00:03:08,810 --> 00:03:14,930
As such, we can conclude that the RNN is just a non-linear generalization of the state space model.

39
00:03:15,620 --> 00:03:20,960
In addition, using the techniques of machine learning, this gives us another way to learn the matrices

40
00:03:20,960 --> 00:03:25,430
A, B, C and D without having to derive physics equations ourselves.
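
To make the correspondence concrete, here is a minimal NumPy sketch, not code from the lecture. It assumes random matrices rather than physics-derived ones, uses illustrative dimensions (4 state variables, 1 control input, 2 observations) that loosely echo the cart-pole example, and sets D to zero. The function names `simulate_state_space` and `simulate_rnn` are hypothetical. With an identity activation, the RNN reproduces the state space model's outputs.

```python
import numpy as np

def simulate_state_space(A, B, C, D, u_seq, x0):
    """Linear state space model: x(t) = A x(t-1) + B u(t), y(t) = C x(t) + D u(t)."""
    x = x0
    ys = []
    for u in u_seq:
        x = A @ x + B @ u
        ys.append(C @ x + D @ u)
    return np.stack(ys)

def simulate_rnn(W_hidden, W_input, W_output, x_seq, h0, activation=np.tanh):
    """Simple many-to-many RNN without bias terms."""
    h = h0
    ys = []
    for x in x_seq:
        h = activation(W_hidden @ h + W_input @ x)
        ys.append(W_output @ h)
    return np.stack(ys)

# Illustrative (assumed) sizes and random values, not derived from physics.
rng = np.random.default_rng(0)
state_dim, input_dim, output_dim, steps = 4, 1, 2, 10
A = rng.normal(scale=0.3, size=(state_dim, state_dim))
B = rng.normal(size=(state_dim, input_dim))
C = rng.normal(size=(output_dim, state_dim))
D = np.zeros((output_dim, input_dim))  # control input does not affect y directly
u_seq = rng.normal(size=(steps, input_dim))
x0 = np.zeros(state_dim)

y_ssm = simulate_state_space(A, B, C, D, u_seq, x0)
# Same weights, identity activation: the RNN collapses to the linear model.
y_rnn_linear = simulate_rnn(A, B, C, u_seq, x0, activation=lambda z: z)
# Same weights, tanh activation: the non-linear generalization.
y_rnn_tanh = simulate_rnn(A, B, C, u_seq, x0)

print(np.allclose(y_ssm, y_rnn_linear))
print(y_ssm.shape)
```

Under this mapping, W hidden plays the role of A, W input the role of B, and W output the role of C. Fitting these weights by gradient descent on observed sequences is the machine learning route to the matrices that the lecture mentions.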