Suppose your text corpus contains 80000 different words. Which of the following would you do to reduce the dimensionality of the input vector to a neural classifier?
Randomly select 10% of the words and ignore the rest
Use a convolutional layer before the fully-connected classifier layer
Use an embedding layer before the fully-connected classifier layer
Select the 10% most frequently used words and ignore the rest
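To illustrate the embedding-layer option above: an embedding layer is a trainable lookup table that maps each word index to a short dense vector, so the classifier never sees an 80000-dimensional one-hot input. A minimal pure-Python sketch (the 64-dimensional embedding size is an assumed choice, and the zero weights stand in for values that would be learned during training):

```python
VOCAB_SIZE = 80_000  # words in the corpus, as in the question
EMBED_DIM = 64       # dense vector size (hypothetical choice)

# The embedding layer is a lookup table with one trainable row per word.
# Zeros stand in for the learned weights; only the shapes matter here.
embedding = [[0.0] * EMBED_DIM for _ in range(VOCAB_SIZE)]

def embed(word_index):
    """Map a word index to a 64-dim dense vector instead of an
    80000-dim one-hot vector."""
    return embedding[word_index]

print(len(embed(42)))  # 64
```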
We want to train a neural network to generate new funny words for a children's book. Which architecture can we use?
Word-level LSTM
Character-level LSTM
Word-level RNN
Character-level perceptron
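The character-level options above differ from the word-level ones in a key way: a word-level model can only emit words already in its vocabulary, while a character-level model builds words letter by letter and can therefore invent new ones. A toy sketch of the character-by-character generation loop, with a hand-written rule standing in for a trained LSTM's next-character prediction:

```python
import random

random.seed(1)

def sample_next_char(prefix):
    """Stand-in for a trained character-level LSTM: in a real model this
    would be a sample from the network's softmax output given the prefix."""
    vowels, consonants = "aeiou", "bcdfghjklmnpqrstvwxyz"
    if len(prefix) >= 6:
        return "$"  # '$' marks end of word
    # Toy rule: alternate vowels and consonants to stay pronounceable.
    pool = vowels if (not prefix or prefix[-1] not in vowels) else consonants
    return random.choice(pool)

def generate_word():
    """Generate one character at a time until the end-of-word symbol."""
    word = ""
    while True:
        c = sample_next_char(word)
        if c == "$":
            return word
        word += c

print(generate_word())  # a novel word not taken from any vocabulary
```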
A recurrent neural network is called recurrent because:
The network is applied to each input element, and the output from the previous application is passed to the next one
It is trained by a recurrent process
It consists of layers which include other subnetworks
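The recurrence the first option describes fits in a few lines: the same cell is applied to every input element, and the state returned by one application is fed into the next. A sketch with a trivial stand-in cell (a running sum, just to make the threaded state visible):

```python
def rnn(cell, inputs, initial_state):
    """Apply the same cell to each element, threading the state through.
    This repeated (recurrent) application is what the name refers to."""
    state = initial_state
    outputs = []
    for x in inputs:
        state = cell(x, state)
        outputs.append(state)
    return outputs, state

# Toy cell: a running sum, just to show the state being passed along.
outputs, final = rnn(lambda x, h: h + x, [1, 2, 3, 4], 0)
print(outputs, final)  # [1, 3, 6, 10] 10
```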
What is the main idea behind LSTM network architecture?
Fixed number of LSTM blocks for the whole dataset
It contains many layers of recurrent neural networks
Explicit state management with forgetting and state triggering
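The "explicit state management" option refers to the LSTM's gated cell state: a forget gate decides what to erase from the state, an input gate decides what to write into it, and an output gate decides what to expose as the hidden state. A scalar (one-unit) sketch with arbitrary illustrative weights:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One scalar LSTM step with explicitly managed cell state c."""
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])          # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])          # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])          # output gate
    c_tilde = math.tanh(w["wc"] * x + w["uc"] * h + w["bc"])  # candidate
    c = f * c + i * c_tilde  # forget part of the old state, add new info
    h = o * math.tanh(c)     # expose a gated view of the state
    return h, c

# Arbitrary weights, purely for illustration; real ones are learned.
w = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                      "wo", "uo", "bo", "wc", "uc", "bc")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, w)
print(round(h, 3), round(c, 3))
```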
What is the main idea of attention?
Attention assigns a weight coefficient to each word in the vocabulary to show how important it is
Attention is a network layer that uses an attention matrix to determine how much the input state from each step affects the final result
Attention builds a global correlation matrix between all words in the vocabulary, showing their co-occurrence
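The second option above describes attention as weighting the per-step input states by how much each should affect the result. A minimal sketch using toy dot-product scores and a softmax over the steps (the two-dimensional states and the query are made-up illustrative values):

```python
import math

def attention(states, query):
    """Weighted sum of per-step states: each softmax weight says how much
    that step's state should affect the final result."""
    # Toy dot-product score between each state and the query.
    scores = [sum(s_i * q_i for s_i, q_i in zip(s, query)) for s in states]
    m = max(scores)
    exps = [math.exp(sc - m) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over steps
    context = [sum(wt * s[d] for wt, s in zip(weights, states))
               for d in range(len(states[0]))]
    return weights, context

states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one hidden state per step
weights, context = attention(states, query=[1.0, 0.0])
print([round(wt, 2) for wt in weights])
```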