Input error for LSTM neural network

In summary, the poster is classifying DNA sequences with an LSTM neural network and gets a shape error during training. Both the sequences and the classes are one-hot encoded, but each class label is built with shape (1, 3), so the label array has shape (N, 1, 3) while the model outputs (N, 3). The poster resolved the issue by using numpy's reshape function to convert the arrays into the correct shape.
  • #1
BRN
Hi everyone,

I have to classify DNA sequences with an LSTM neural network, but I have a problem with the input shape. Both the sequence and the class are one-hot encoded, and my code is this:

python code:
import pandas as pd
import numpy as np

data = pd.read_csv('splice.data', header = None)

data_shuffled = data.sample(frac = 1).reset_index(drop = True)

# space removing
for i in range (len(data)):
    data.loc[i][2] = data.loc[i][2].strip()

raw_data = np.array(data)

def one_hot_encoder(data):
    x = []
    y = []
    
    for i in range (len(data)):
        oh_class = np.zeros((1, 3))
        if data[i][0] == 'EI': oh_class[0][0] = 1
        elif data[i][0] == 'IE': oh_class[0][1] = 1
        else: oh_class[0][2] = 1
            
        y.append(oh_class)
            
        oh_seq = np.zeros((len(data[0][2]), 4))
        for j in range (len(data[0][2])):
            if data[i][2][j] == 'A': oh_seq[j][0] = 1
            elif data[i][2][j] == 'C': oh_seq[j][1] = 1
            elif data[i][2][j] == 'G': oh_seq[j][2] = 1
            else: oh_seq[j][3] = 1
            
        x.append(oh_seq)
                            
    return np.array(x), np.array(y)     

x_seq, y_class = one_hot_encoder(raw_data)

# Split into validation and training data
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x_seq, y_class, test_size = 0.2, random_state = 1)

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

#Initialize the RNN

model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', return_sequences = True, input_shape = (x_train.shape[1], x_train.shape[2])))
model.add(Dropout(0.2)) 
model.add(LSTM(units = 60, activation = 'relu', return_sequences = True))
model.add(Dropout(0.3)) 
model.add(LSTM(units = 80, activation = 'relu', return_sequences = True))
model.add(Dropout(0.4)) 
model.add(LSTM(units = 120, activation = 'relu'))
model.add(Dropout(0.5)) 
model.add(Dense(units = 3, activation='softmax'))
model.summary()

model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(x_train, y_train, epochs = 100, batch_size = 80, validation_split = 0.1)

The error I receive is this:

python error:
ValueError: Shapes (None, 1, 3) and (None, 3) are incompatible

Can anyone tell me how to solve this?

Thanks!
 

  • #2
Somewhere you are trying to equate or merge (or somehow deal with) two arrays of different shapes. Which line is giving the error you pasted? Please include the whole error message.
 
  • #3
Sorry for the late reply.

I resolved the issue: yes, I had two arrays with different shapes. I have now discovered the "reshape" function in numpy.

I'm a newbie with Python ...

Anyway, thanks for your help!
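
For readers hitting the same error, here is a minimal sketch of the kind of reshape that post #3 alludes to. The final code was not posted, so this is an assumption based on the original script: each label is created with np.zeros((1, 3)), which makes the stacked label array (N, 1, 3) instead of the (N, 3) that the softmax layer produces. The small y_class array below is a stand-in, not the real data.

```python
import numpy as np

# Stand-in for the y_class array returned by one_hot_encoder:
# N labels of shape (1, 3) stack into an array of shape (N, 1, 3).
y_class = np.zeros((5, 1, 3))
y_class[np.arange(5), 0, [0, 1, 2, 0, 1]] = 1

# Drop the extra axis so the labels match the (N, 3) softmax output.
y_fixed = y_class.reshape(-1, 3)

# Equivalent alternative: remove only the length-1 axis.
y_squeezed = np.squeeze(y_class, axis=1)

print(y_class.shape)  # (5, 1, 3)
print(y_fixed.shape)  # (5, 3)
```

Another option would be to build each label with np.zeros(3) inside the encoder, so that no reshape is needed afterwards.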
 

FAQ: Input error for LSTM neural network

What is an input error for an LSTM neural network?

An input error for an LSTM neural network refers to incorrect or invalid input that is fed into the network, such as arrays whose shape does not match what the layers expect. This can happen for various reasons, such as incorrect data preprocessing, feature selection, or model architecture.

How does an input error affect the performance of an LSTM neural network?

An input error can significantly impact the performance of an LSTM neural network. It can cause the network to produce incorrect predictions, leading to poor accuracy. In some cases, such as the shape mismatch in this thread, it stops the model from training at all.

What are some common causes of input errors in LSTM neural networks?

Some of the common causes of input errors in LSTM neural networks include incorrect data formatting, missing values, outliers, and improper feature scaling. It is essential to thoroughly preprocess the data and carefully select the features to avoid input errors.

How can input errors be detected in an LSTM neural network?

Some input errors, such as the shape mismatch in this thread, are reported directly as an exception when training starts. Others can be detected by examining the network's performance metrics such as accuracy, loss, and validation error. If accuracy stays consistently low, or the loss stays high or fluctuates, it may indicate the presence of input errors. Data visualization techniques can also help identify input errors.
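
As an illustration of catching a shape problem before training, the sketch below checks the arrays against the shapes an LSTM classifier with a softmax output expects. The helper name check_shapes and the array sizes are hypothetical and chosen only for this example.

```python
import numpy as np

def check_shapes(x, y, n_classes):
    """Hypothetical helper: raise ValueError on shapes the model would reject."""
    # LSTM layers expect input of shape (samples, timesteps, features).
    if x.ndim != 3:
        raise ValueError(f"x must be 3-D (samples, timesteps, features), got {x.shape}")
    # A softmax output trained with categorical cross-entropy expects
    # one-hot labels of shape (samples, n_classes).
    expected = (x.shape[0], n_classes)
    if y.shape != expected:
        raise ValueError(f"y must have shape {expected}, got {y.shape}")

# Illustrative sizes: 10 sequences, 60 positions, 4 one-hot channels.
x = np.zeros((10, 60, 4))
y_bad = np.zeros((10, 1, 3))   # labels with the extra axis from the thread
y_good = y_bad.reshape(-1, 3)

check_shapes(x, y_good, n_classes=3)  # passes silently
try:
    check_shapes(x, y_bad, n_classes=3)
except ValueError as err:
    print(err)
```

Printing x_train.shape and y_train.shape before calling model.fit gives the same information with less ceremony.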

How can input errors be prevented in an LSTM neural network?

To prevent input errors in an LSTM neural network, it is crucial to carefully preprocess the data and select appropriate features. It is also helpful to regularly monitor the model's performance and make adjustments as needed. Proper data validation and testing can also help detect and prevent input errors.
