RNN

Pradeep Dhote
4 min read · Jan 15, 2021

The idea behind RNNs is to make use of sequential information.

In a traditional neural network, we assume that all inputs and outputs are independent of each other. But for many tasks that's a bad idea. If you want to predict the next word in a sentence, you had better know which words came before it.

RNNs are called recurrent because they perform the same task for every element of a sequence, with the output being dependent on the previous computations.

Hence there is a need to remember the previous words. This is the problem RNNs solve, with the help of a hidden layer. The main and most important feature of an RNN is the hidden state, which remembers some information about the sequence.

The RNN hidden layer carries information about every element of the sequence forward through time.

An RNN has a "memory": the hidden layer acts as that memory, holding information about what has been computed so far. The network uses the same parameters at every time step, since it performs the same task on each input and hidden state to produce the output. This sharing reduces the number of parameters compared to other neural networks.

The working of an RNN can be understood by walking through what happens inside a single recurrent neuron. Let me summarize the steps for you (a minimal code sketch follows the list):

  1. A single time step of the input is supplied to the network, i.e. x_t is fed in
  2. We then calculate the current state using a combination of the current input and the previous state, i.e. we calculate h_t
  3. The current h_t becomes h_(t-1) for the next time step
  4. We can go through as many time steps as the problem demands, combining the information from all the previous states
  5. Once all the time steps are completed, the final current state is used to calculate the output y_t
  6. The output is then compared to the actual output, and the error is generated
  7. The error is then backpropagated through the network to update the weights (we shall go into the details of backpropagation in further sections), and the network is trained
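Here is a minimal sketch of this forward pass in plain NumPy. The sizes and the names (W_xh, W_hh, W_hy) are illustrative, not from any particular library. Note that the same three weight matrices are reused at every time step, which is exactly the parameter sharing described earlier.

```python
import numpy as np

np.random.seed(0)
input_size, hidden_size, output_size = 4, 8, 4

# The same three weight matrices are reused at every time step:
W_xh = np.random.randn(hidden_size, input_size) * 0.1   # input  -> hidden
W_hh = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden -> hidden
W_hy = np.random.randn(output_size, hidden_size) * 0.1  # hidden -> output

def rnn_forward(inputs):
    """inputs: list of (input_size,) vectors, one per time step."""
    h = np.zeros(hidden_size)              # initial hidden state h_0
    for x_t in inputs:
        # Step 2: combine the current input x_t with the previous state
        h = np.tanh(W_xh @ x_t + W_hh @ h) # h becomes h_t
    # Step 5: the final state is used to calculate the output y_t
    return W_hy @ h

sequence = [np.random.randn(input_size) for _ in range(5)]
print(rnn_forward(sequence))  # one output vector for the whole sequence
```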

But there is one disadvantage with RNNs: they suffer from short-term memory. If a sequence is long enough, they'll have a hard time carrying information from earlier time steps to later ones. So if you are trying to process a paragraph of text to make predictions, RNNs may leave out important information from the beginning.
During backpropagation, recurrent neural networks suffer from the vanishing gradient problem. Gradients are the values used to update a neural network's weights. The vanishing gradient problem occurs when the gradient shrinks as it is propagated back through time. If a gradient value becomes extremely small, it doesn't contribute much to learning.
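As a rough illustration (the weight matrix and sizes below are made up for demonstration), backpropagation through time repeatedly multiplies the gradient by the recurrent weight matrix, so when that matrix's values are small the gradient norm shrinks exponentially with the number of time steps:

```python
import numpy as np

np.random.seed(0)
hidden_size = 8
W_hh = np.random.randn(hidden_size, hidden_size) * 0.25  # recurrent weights

grad = np.ones(hidden_size)  # gradient arriving at the last time step
for t in range(50):
    # Simplified backward step: ignores the tanh derivative, which is
    # at most 1 and would only shrink the gradient further.
    grad = W_hh.T @ grad
    if t % 10 == 9:
        print(f"step {t + 1}: gradient norm = {np.linalg.norm(grad):.2e}")
```

The printed norms fall toward zero, so the earliest time steps receive almost no learning signal.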

Bidirectional RNN

In a bidirectional RNN, information from both ends of the sequence is used to estimate the output: we use information from both past and future observations to predict the current one. The outputs are generated by concatenating the hidden states of the forward and backward passes at each time step and learning weights accordingly.

Bidirectional RNNs work well for problems where the current word depends on both past and future words, for example tasks like filling in a blank in a text sequence. A small sketch of the idea follows.
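Here is a minimal sketch of a bidirectional recurrent layer, reusing the simple cell from the earlier example; the names and sizes are again illustrative. Two independent sets of weights process the sequence left-to-right and right-to-left, and the per-step hidden states are concatenated:

```python
import numpy as np

np.random.seed(1)
input_size, hidden_size = 4, 8

def make_cell():
    # Each direction gets its own input->hidden and hidden->hidden weights
    W_xh = np.random.randn(hidden_size, input_size) * 0.1
    W_hh = np.random.randn(hidden_size, hidden_size) * 0.1
    return W_xh, W_hh

fwd, bwd = make_cell(), make_cell()

def run(cell, inputs):
    W_xh, W_hh = cell
    h = np.zeros(hidden_size)
    states = []
    for x_t in inputs:
        h = np.tanh(W_xh @ x_t + W_hh @ h)
        states.append(h)
    return states

sequence = [np.random.randn(input_size) for _ in range(5)]
forward_states = run(fwd, sequence)
backward_states = run(bwd, sequence[::-1])[::-1]  # restore original order

# Each time step now sees context from both past and future elements
combined = [np.concatenate([f, b])
            for f, b in zip(forward_states, backward_states)]
print(combined[2].shape)  # (16,): double the hidden size at every step
```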

Bidirectional RNNs are also exceedingly slow. The main reasons for this are that the forward propagation requires both forward and backward recursions in bidirectional layers and that the backpropagation is dependent on the outcomes of the forward propagation. Hence, gradients will have a very long dependency chain.
