Suppose your text corpus contains 80000 different words. Which of the following would you do to reduce the dimensionality of the input vector to a neural classifier?
Randomly select 10% of the words and ignore the rest
Use a convolutional layer before the fully-connected classifier layer
Use an embedding layer before the fully-connected classifier layer
Select the 10% most frequently used words and ignore the rest
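To illustrate the embedding-layer option above: an embedding layer is a trainable lookup table that maps each word index to a short dense vector, so the classifier never sees an 80000-dimensional one-hot input. A minimal pure-Python sketch (the 64-dimensional embedding size is an assumed choice, and the zero weights stand in for values that would be learned during training):

```python
VOCAB_SIZE = 80_000  # words in the corpus, as in the question
EMBED_DIM = 64       # dense vector size (hypothetical choice)

# The embedding layer is a lookup table with one trainable row per word.
# Zeros stand in for the learned weights; only the shapes matter here.
embedding = [[0.0] * EMBED_DIM for _ in range(VOCAB_SIZE)]

def embed(word_index):
    """Map a word index to a 64-dim dense vector instead of an
    80000-dim one-hot vector."""
    return embedding[word_index]

print(len(embed(42)))  # 64
```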
We want to train a neural network to generate new funny words for a children's book. Which architecture can we use?
Word-level LSTM
Character-level LSTM
Word-level RNN
Character-level perceptron
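The character-level options above differ from the word-level ones in a key way: a word-level model can only emit words already in its vocabulary, while a character-level model builds words letter by letter and can therefore invent new ones. A toy sketch of the character-by-character generation loop, with a hand-written rule standing in for a trained LSTM's next-character prediction:

```python
import random

random.seed(1)

def sample_next_char(prefix):
    """Stand-in for a trained character-level LSTM: in a real model this
    would be a sample from the network's softmax output given the prefix."""
    vowels, consonants = "aeiou", "bcdfghjklmnpqrstvwxyz"
    if len(prefix) >= 6:
        return "$"  # '$' marks end of word
    # Toy rule: alternate vowels and consonants to stay pronounceable.
    pool = vowels if (not prefix or prefix[-1] not in vowels) else consonants
    return random.choice(pool)

def generate_word():
    """Generate one character at a time until the end-of-word symbol."""
    word = ""
    while True:
        c = sample_next_char(word)
        if c == "$":
            return word
        word += c

print(generate_word())  # a novel word not taken from any vocabulary
```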
A recurrent neural network is called recurrent because:
The network is applied to each input element, and the output from the previous application is passed to the next one
It is trained by a recurrent process
It consists of layers which include other subnetworks
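The recurrence the first option describes fits in a few lines: the same cell is applied to every input element, and the state returned by one application is fed into the next. A sketch with a trivial stand-in cell (a running sum, just to make the threaded state visible):

```python
def rnn(cell, inputs, initial_state):
    """Apply the same cell to each element, threading the state through.
    This repeated (recurrent) application is what the name refers to."""
    state = initial_state
    outputs = []
    for x in inputs:
        state = cell(x, state)
        outputs.append(state)
    return outputs, state

# Toy cell: a running sum, just to show the state being passed along.
outputs, final = rnn(lambda x, h: h + x, [1, 2, 3, 4], 0)
print(outputs, final)  # [1, 3, 6, 10] 10
```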
What is the main idea behind LSTM network architecture?
Fixed number of LSTM blocks for the whole dataset
It contains many layers of recurrent neural networks
Explicit state management with forgetting and state triggering
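The "explicit state management" option refers to the LSTM's gated cell state: a forget gate decides what to erase from the state, an input gate decides what to write into it, and an output gate decides what to expose as the hidden state. A scalar (one-unit) sketch with arbitrary illustrative weights:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One scalar LSTM step with explicitly managed cell state c."""
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])          # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])          # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])          # output gate
    c_tilde = math.tanh(w["wc"] * x + w["uc"] * h + w["bc"])  # candidate
    c = f * c + i * c_tilde  # forget part of the old state, add new info
    h = o * math.tanh(c)     # expose a gated view of the state
    return h, c

# Arbitrary weights, purely for illustration; real ones are learned.
w = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                      "wo", "uo", "bo", "wc", "uc", "bc")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, w)
print(round(h, 3), round(c, 3))
```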
What is the main idea of attention?
Attention assigns a weight coefficient to each word in the vocabulary to show how important it is
Attention is a network layer that uses an attention matrix to determine how much the input state from each step affects the final result
Attention builds a global correlation matrix between all words in the vocabulary, showing their co-occurrence
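The second option above describes attention as weighting the per-step input states by how much each should affect the result. A minimal sketch using toy dot-product scores and a softmax over the steps (the two-dimensional states and the query are made-up illustrative values):

```python
import math

def attention(states, query):
    """Weighted sum of per-step states: each softmax weight says how much
    that step's state should affect the final result."""
    # Toy dot-product score between each state and the query.
    scores = [sum(s_i * q_i for s_i, q_i in zip(s, query)) for s in states]
    m = max(scores)
    exps = [math.exp(sc - m) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over steps
    context = [sum(wt * s[d] for wt, s in zip(weights, states))
               for d in range(len(states[0]))]
    return weights, context

states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one hidden state per step
weights, context = attention(states, query=[1.0, 0.0])
print([round(wt, 2) for wt in weights])
```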