In this video, we will start learning about the mathematical formulation of neural networks. In the previous video, we established that the shape of an artificial neuron looks like this in order to resemble a real biological neuron. For a network of neurons, we normally divide it into different layers: the first layer that feeds the input into the network is obviously called the input layer. The set of nodes that provide the output of the network is called the output layer. And any sets of nodes in between the input and the output layers are called the hidden layers. When working with neural networks, the three main topics that we deal with are: forward propagation, backpropagation, and activation functions. The rest of this video will be mainly about forward propagation, and I will explain it through examples with numbers. Forward propagation is the process through which data passes through layers of neurons in a neural network from the input layer all the way to the output layer. Let's use one neuron and mathematically formulate the way information flows through it. As shown here, the data flows through each neuron by connections or the dendrites. Every connection has a specific weight by which the flow of data is regulated. Here x1 and x2 are the two inputs, they could be an integer or float. When these inputs pass through the connections, they're adjusted depending on the connection weights, w1 and w2. The neuron then processes this information by outputting a weighted sum of these inputs. It also adds a constant to the sum which is referred to as the bias. So, z here is the linear combination of the inputs and weights along with the bias, and a is the output of the network. For consistency, we will stick to these letters throughout the course, so z will always represent the linear combination of the inputs, and a will always represent the output of a neuron. However, simply outputting a weighted sum of the inputs limits the tasks that can be performed by the neural network. Therefore, a better processing of the data would be to map the weighted sum to a nonlinear space. A popular function is the sigmoid function, where if the weighted sum is a very large positive number, then the output of the neuron is close to 1, and if the weighted sum is a very large negative number, then the output of the neuron is close to zero. Non-linear transformations like the sigmoid function are called activation functions. Activation functions are another extremely important feature of artificial neural networks. They basically decide whether a neuron should be activated or not. In other words, whether the information that the neuron is receiving is relevant or should be ignored. The takeaway message here is that a neural network without an activation function is essentially just a linear regression model. The activation function performs non-linear transformation to the input enabling the neural network of learning and performing more complex tasks, such as image classifications and language translations. For further simplification, I am going to proceed with a neural network of one neuron and one input. Let's go over an example of how to compute the output. Let's say that the value of x1 is 0.1, and we want to predict the output for this input. The network has optimized weight and bias where w1 is 0.15 and b1 is 0.4. The first step is to calculate z, which is the dot product of the inputs and the corresponding weights plus the bias. So we find that z is 0.415. The neuron then uses the sigmoid function to apply non-linear transformation to z. Therefore, the output of the neuron is 0.6023. For a network with two neurons, the output from the first neuron will be the input to the second neuron. The rest is then exactly the same. The second neuron takes the input and computes the dot product of the input, which is a1 in this case, and the weight which is w2, and adds the bias which is b2. Using a sigmoid function as the activation function, the output of the network would be 0.7153. And this would be the predicted value for the input 0.1. This is in essence how a neural network predicts the output for any given input. No matter how complicated the network gets, it is the same exact process. To summarize, given a neural network with a set of weights and biases, you should be able to compute the output of the network for any given input. In the next video, we will start learning how to train a neural network and optimize the weights and the biases.