
Padding Sequences in TensorFlow with tf.pad

I am trying to load the IMDB dataset in Python. I want to pad the sequences so that every sequence has the same length. I am currently doing it with NumPy. What is a good way to do it in

Solution 1:

If you want to use tf.pad, then as far as I know you have to iterate over the rows and pad each one separately, because each row needs a different amount of padding.

The code would look something like this:

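import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (uses sessions)

max_length = 250
number_of_samples = 5

# dataSet is assumed to hold the reviews as variable-length sequences of word IDs
padded_data = np.zeros(shape=[number_of_samples, max_length], dtype=np.int32)
sess = tf.InteractiveSession()

for i in range(number_of_samples):
    reviewToBePadded = dataSet[i][:max_length]  # truncate so the pad width is never negative
    # No padding on the row axis; pad on the right of the sequence axis
    paddings = [[0, 0], [0, max_length - len(reviewToBePadded)]]
    data_tf = tf.convert_to_tensor(reviewToBePadded, tf.int32)
    data_tf = tf.reshape(data_tf, [1, len(reviewToBePadded)])
    data_tf = tf.pad(data_tf, paddings, 'CONSTANT')
    padded_data[i] = data_tf.eval()[0]  # drop the leading axis of size 1
print(padded_data)
sess.close()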

I'm new to Python, so this is possibly not the best code, but I just want to explain the concept.
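To make the concept concrete without TensorFlow, here is a minimal plain-Python sketch of what the per-row padding amounts to: each review is right-padded with zeros up to max_length, which is the effect of the [0, max_length - len(review)] entry in paddings. The function name pad_reviews, its pad_value parameter, and the toy data are hypothetical, and truncating reviews longer than max_length is an extra assumption of this sketch.

```python
def pad_reviews(reviews, max_length, pad_value=0):
    """Right-pad (and truncate) each review to exactly max_length items."""
    padded = []
    for review in reviews:
        clipped = list(review[:max_length])    # assumption: drop anything past max_length
        n_missing = max_length - len(clipped)  # how many pad values go on the right
        padded.append(clipped + [pad_value] * n_missing)
    return padded


# Hypothetical toy data standing in for the encoded reviews.
reviews = [[5, 8, 2], [7], [1, 2, 3, 4, 5, 6]]
padded = pad_reviews(reviews, max_length=4)
print(padded)  # [[5, 8, 2, 0], [7, 0, 0, 0], [1, 2, 3, 4]]
```

Once every row has the same length, the result can be turned into a single rectangular array or tensor in one step.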
