# Simple LSTM Experiment to Predict Pattern 010101… Understanding Hidden State

Hi,

I did a quick experiment to see if I could understand what the hidden state in an LSTM does…

I tried to make an LSTM predict a sequence of [1,0,1,0,1…] based off an input sequence of X with X[0] = 1 and the remainder as random noise.

X = [1, randFloat, randFloat, randFloat...]
label = [1, 0, 1, 0...]

In my head, the model would understand:

1. The inputs X mean nothing, or at least very little (as it’s noise), so it’d discard these values for the most part
2. Only the hidden state carried over from timestep n would be used to predict the output at timestep n+1… [1, 0, 1, 0…]
3. I also set X[0] = 1 in an attempt to guide the net to predict 1 on the first item (which it does)

So, this didn’t work. In theory, shouldn’t it? Can someone explain?
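On the "in theory" part, here is a minimal sketch of the idea in the list above. It is not an LSTM: it is a hand-built one-unit tanh recurrence with weights I picked by hand (`w_h`, `w_x`, `b` and the function name are all hypothetical, nothing here is learned). It only illustrates that a recurrent state which mostly ignores a noisy input can, in principle, flip between two values on every step.

```python
import numpy as np

def alternate_with_hidden_state(x, w_h=-5.0, w_x=0.1, b=0.5):
    """Run a one-unit tanh recurrence and threshold its hidden state at 0.

    w_h is strongly negative, so the state flips sign on every step.
    w_x is small, so inputs in [0, 1] cannot override that flip.
    """
    h = 0.0  # zero initial state, the same default nn.LSTM uses
    preds = []
    for x_t in x:
        h = np.tanh(w_h * h + w_x * x_t + b)
        preds.append(int(h > 0))  # class 1 if the state is positive, else 0
    return preds

rng = np.random.default_rng(0)
x = rng.uniform(size=10)  # noise inputs, as in the experiment
x[0] = 1.0                # first input fixed to 1
print(alternate_with_hidden_state(x))
```

With these hand-picked weights the predictions alternate starting from 1, so the pattern itself is representable by a recurrent state. Whether gradient descent actually finds such weights is a separate question.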

## Code
import numpy as np
import torch
import torch.nn.functional as F  # needed for F.log_softmax in forward()
import torch.optim as optim
from torch import nn

### Create some fake data

sequence_1 = torch.tensor(np.random.uniform(size=50)).float().detach()
sequence_1[0] = 1
sequence_2 = torch.tensor(np.random.uniform(size=50)).float().detach()
sequence_2[0] = 1

labels_1 = np.zeros(50)
labels_1[::2] = 1
labels_1 = torch.tensor(labels_1, dtype=torch.long)
labels_2 = labels_1.clone()

training_data = [sequence_1, sequence_2]
label_data = [labels_1, labels_2]

### Create simple LSTM Model

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(LSTM, self).__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, seq):
        # nn.LSTM expects (seq_len, batch, input_dim); batch size is 1 here
        lstm_out, _ = self.lstm(seq.view(len(seq), 1, -1))
        out = self.fc(lstm_out.view(len(seq), -1))
        out = F.log_softmax(out, dim=1)
        return out

### We try to overfit on the dataset

INPUT_DIM = 1
HIDDEN_DIM = 6
model = LSTM(INPUT_DIM, HIDDEN_DIM, 2)

loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

for epoch in range(500):
    for i, seq in enumerate(training_data):
        labels = label_data[i]
        scores = model(seq)
        loss = loss_function(scores, labels)
        loss.backward()
        print(loss)

        optimizer.step()
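
One thing that stands out in the loop above: the gradients are never reset. PyTorch accumulates into `.grad` on every `backward()` call, so each `optimizer.step()` uses the sum of all gradients computed so far. Below is a sketch of the same experiment with `optimizer.zero_grad()` added. The helper names `make_example` and `train` and the fixed seeds are my own additions for illustration, and I am not claiming this is the only thing affecting training.

```python
import numpy as np
import torch
import torch.nn.functional as F
from torch import nn, optim

torch.manual_seed(0)  # seeds are an assumption, only for repeatability
np.random.seed(0)

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, seq):
        # Reshape to (seq_len, batch=1, input_dim) as nn.LSTM expects
        lstm_out, _ = self.lstm(seq.view(len(seq), 1, -1))
        out = self.fc(lstm_out.view(len(seq), -1))
        return F.log_softmax(out, dim=1)

def make_example(length=50):
    """Noise sequence with seq[0] = 1 and alternating labels 1, 0, 1, 0..."""
    seq = torch.tensor(np.random.uniform(size=length)).float()
    seq[0] = 1
    labels = torch.zeros(length, dtype=torch.long)
    labels[::2] = 1
    return seq, labels

def train(model, data, epochs=500, lr=0.1):
    loss_function = nn.NLLLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)
    losses = []
    for epoch in range(epochs):
        for seq, labels in data:
            optimizer.zero_grad()  # clear gradients left over from the previous step
            loss = loss_function(model(seq), labels)
            loss.backward()
            optimizer.step()
            losses.append(loss.item())
    return losses

data = [make_example(), make_example()]
model = LSTM(1, 6, 2)
losses = train(model, data)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Comparing the first and last printed loss gives a quick check of whether the net is picking up the alternation. If it stalls near the starting value, a different optimizer or learning rate may be worth trying.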