Whether you implement a neural network yourself or use a built-in library for neural network learning, it is of vital importance to understand the significance of the sigmoid function. The sigmoid function is the key to understanding how a neural network learns complex problems. This function also served as a basis for discovering other functions that lead to efficient and good solutions for supervised learning in deep learning architectures.

In this tutorial, you will discover the sigmoid function and its role in learning from examples in neural networks.

After completing this tutorial, you will understand:

  • The sigmoid function
  • Linear vs. non-linear separability
  • Why a neural network can make complex decision boundaries if a sigmoid unit is used

Let’s get started.

A Gentle Introduction to sigmoid function. Photo by Mehreen Saeed, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

  1. The sigmoid function and its properties
  2. Linear vs. non-linearly separable problems
  3. Using a sigmoid as an activation function in neural networks

Sigmoid Function

The sigmoid function is a special form of the logistic function and is usually denoted by σ(x) or sig(x). It is given by:

σ(x) = 1/(1 + exp(−x))

Properties and Identities of the Sigmoid Function

The graph of the sigmoid function is an S-shaped curve, as shown by the green line in the graph below. The figure also shows the graph of the derivative in pink. The expression for the derivative, along with some important properties, is shown on the right.

Graph of the sigmoid function and its derivative. Some important properties are also shown.

A few other properties include:

  • Domain: (-∞, +∞)
  • Range: (0, +1)
  • σ(0) = 0.5
  • The function is monotonically increasing.
  • The function is continuous everywhere.
  • The function is differentiable everywhere in its domain.
  • Numerically, it suffices to compute this function’s value over a small range of numbers, e.g., [-10, +10]. For values less than -10, the function’s value is almost zero. For values greater than +10, the function’s value is almost one.
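The properties above can be checked numerically. Below is a minimal sketch in plain Python (not part of any library), using a numerically stable form of the formula, that verifies a few of them:

```python
import math

def sigmoid(x):
    """Sigmoid: sigma(x) = 1 / (1 + exp(-x)), in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For large negative x, exp(-x) overflows; rewrite using exp(x) instead.
    z = math.exp(x)
    return z / (1.0 + z)

# sigma(0) = 0.5
print(sigmoid(0))  # 0.5

# The function is monotonically increasing.
values = [sigmoid(x) for x in range(-10, 11)]
print(all(a < b for a, b in zip(values, values[1:])))  # True

# Outside [-10, +10] the output is numerically saturated.
print(round(sigmoid(-10), 4), round(sigmoid(10), 4))  # 0.0 1.0
```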

The Sigmoid As A Squashing Function

The sigmoid function is also called a squashing function, as its domain is the set of all real numbers and its range is (0, 1). Hence, if the input to the function is either a very large negative number or a very large positive number, the output is always between 0 and 1. The same goes for any number between -∞ and +∞.
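As a quick sketch of the squashing behavior (using a stable formulation so extreme inputs do not overflow):

```python
import math

def sigmoid(x):
    # Stable form: avoid overflow in exp for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Any real input, however large, is squashed into [0, 1].
# (Mathematically the range is the open interval (0, 1); in floating
# point, extreme inputs saturate to exactly 0.0 or 1.0.)
for x in (-1e6, -50.0, -1.0, 0.0, 1.0, 50.0, 1e6):
    print(x, sigmoid(x))
```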

Sigmoid As An Activation Function In Neural Networks

The sigmoid function is used as an activation function in neural networks. To review what an activation function is, the figure below shows the role of an activation function in one layer of a neural network. A weighted sum of inputs is passed through an activation function, and this output serves as an input to the next layer.
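As an illustration, a single sigmoid unit can be sketched as a weighted sum of inputs passed through the sigmoid; the weights, bias, and inputs below are made up for the example:

```python
import math

def sigmoid(z):
    # Stable sigmoid: sigma(z) = 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sigmoid_unit(inputs, weights, bias):
    """One neuron: weighted sum of inputs passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical inputs and weights, purely for illustration.
output = sigmoid_unit(inputs=[0.5, -1.0, 2.0], weights=[0.4, 0.3, -0.2], bias=0.1)
print(output)  # always strictly between 0 and 1
```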

A sigmoid unit in a neural network

When the activation function for a neuron is a sigmoid function, it is a guarantee that the output of this unit will always be between 0 and 1. Also, as the sigmoid is a non-linear function, the output of this unit would be a non-linear function of the weighted sum of inputs. Such a neuron that employs a sigmoid function as an activation function is termed a sigmoid unit.

Linear Vs. Non-Linear Separability

Suppose we have a typical classification problem, where we have a set of points in space and each point is assigned a class label. If a straight line (or a hyperplane in an n-dimensional space) can separate the two classes, then we have a linearly separable problem. On the other hand, if a straight line is not enough to separate the two classes, then we have a non-linearly separable problem. The figure below shows data in a two-dimensional space. Each point is assigned a red or blue class label. The left figure shows a linearly separable problem that requires a linear boundary to distinguish between the two classes. The right figure shows a non-linearly separable problem, where a non-linear decision boundary is required.

Linear vs. non-linearly separable problems

For a three-dimensional space, a linear decision boundary can be described via the equation of a plane. For an n-dimensional space, the linear decision boundary is described by the equation of a hyperplane.
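As a sketch, classifying with a linear decision boundary amounts to checking the sign of the hyperplane equation w · x + b; the weights below are hypothetical, chosen only for illustration:

```python
# A linear decision boundary in n dimensions: classify a point by the
# sign of w . x + b, i.e., which side of the hyperplane it falls on.
def linear_classify(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

w, b = [1.0, -1.0], 0.0  # boundary: the line x1 - x2 = 0
print(linear_classify([2.0, 1.0], w, b))  # 1 (one side of the line)
print(linear_classify([1.0, 2.0], w, b))  # 0 (the other side)
```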

Why Is The Sigmoid Function Important In Neural Networks?

If we use a linear activation function in a neural network, then this model can only learn linearly separable problems. However, with the addition of just one hidden layer and a sigmoid activation function in the hidden layer, the neural network can easily learn a non-linearly separable problem. Using a non-linear function produces non-linear boundaries, and hence the sigmoid function can be used in neural networks for learning complex decision functions.
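As a rough illustration, the sketch below trains a tiny network with one hidden layer of sigmoid units on XOR, the classic non-linearly separable problem, using plain NumPy backpropagation. The hyperparameters (hidden-layer size, learning rate, iteration count, random seed) are arbitrary choices for the example, not recommendations:

```python
import numpy as np

# XOR: the classic problem that no linear decision boundary can solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4))  # hidden layer: 4 sigmoid units
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))  # output layer: 1 sigmoid unit
b2 = np.zeros((1, 1))

lr = 1.0
for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass (squared-error loss; uses sigma' = sigma * (1 - sigma)).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
preds = (out > 0.5).astype(int)
print("predictions:", preds.ravel())          # compare with XOR targets [0, 1, 1, 0]
print("final MSE:", float(np.mean((out - y) ** 2)))  # well below 0.25 (chance level) if training succeeded
```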

The only non-linear function that can be used as an activation function in a neural network is one that is monotonically increasing. So, for example, sin(x) or cos(x) cannot be used as activation functions. Also, the activation function should be defined everywhere and should be continuous everywhere in the space of real numbers. The function is also required to be differentiable over the entire space of real numbers.

Typically, a backpropagation algorithm uses gradient descent to learn the weights of a neural network. To derive this algorithm, the derivative of the activation function is required.

The fact that the sigmoid function is monotonic, continuous, and differentiable everywhere, combined with the property that its derivative can be expressed in terms of itself, makes it easy to derive the update equations for learning the weights in a neural network when using the backpropagation algorithm.
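The identity referred to here, σ′(x) = σ(x)(1 − σ(x)), can be checked against a finite-difference approximation of the derivative; a small sketch:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # The derivative expressed in terms of the function itself:
    #   sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Compare the closed form against a central finite difference.
eps = 1e-6
for x in (-3.0, -1.0, 0.0, 2.0):
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    assert abs(numeric - sigmoid_derivative(x)) < 1e-8
print("identity sigma' = sigma * (1 - sigma) verified numerically")
```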


Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

If you explore any of these extensions, I'd love to know. Post your findings in the comments below.

Further Reading

This section provides more resources on the topic if you are looking to go deeper.





Summary

In this tutorial, you discovered what a sigmoid function is. Specifically, you learned:

  • The sigmoid function and its properties
  • Linear vs. non-linear decision boundaries
  • Why adding a sigmoid function at the hidden layer enables a neural network to learn complex non-linear boundaries

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.