Understanding tf.data.Dataset OutOfRangeError in TensorFlow

Consider the code below:

#-*- coding: UTF-8 -*- 
import tensorflow as tf
import numpy as np
import os

batch_size = 3
shuffle_buffer = 10

a = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])

dataset = tf.data.Dataset.from_tensor_slices(a)


#dataset = dataset.map(parse)

# shuffle and batch the dataset; both must be applied before the iterator is created
dataset = dataset.shuffle(shuffle_buffer).batch(batch_size)

iterator = dataset.make_one_shot_iterator()
one_element = iterator.get_next()

init = tf.global_variables_initializer() 
init_local = tf.local_variables_initializer()

with tf.Session() as sess:
    sess.run([init, init_local])
    #try:
    for _ in range(10):
        print(sess.run(one_element))
        print('--batch end---')
    #except tf.errors.OutOfRangeError:
        #pass

The result is:

From the result we can see:

  • When the script runs until OutOfRangeError is raised, the last elements of the dataset are still read, even though they cannot form a full batch. For example, with batch_size = 3 and 5 elements in the dataset, the last 2 elements are returned as a smaller final batch, and only the next read raises the error.
  • If you only use part of the dataset in training, you should shuffle it so that every element has a chance of being trained on.
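
The first point can be checked with a small sketch that reads the same 5-element dataset until OutOfRangeError and records how many rows each batch contains. It assumes a TensorFlow version that provides the tf.compat.v1 module. The helper name collect_batch_sizes and the use of drop_remainder are illustrative additions, not part of the script above.

```python
import numpy as np
import tensorflow as tf

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])

def collect_batch_sizes(batch_size, drop_remainder=False):
    """Hypothetical helper: read batches until OutOfRangeError, return their sizes."""
    sizes = []
    # Build everything in a fresh graph so the TF1-style session API can be used.
    with tf.Graph().as_default():
        dataset = tf.data.Dataset.from_tensor_slices(a)
        dataset = dataset.shuffle(10).batch(batch_size, drop_remainder=drop_remainder)
        iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
        one_element = iterator.get_next()
        with tf.compat.v1.Session() as sess:
            try:
                while True:
                    batch = sess.run(one_element)
                    sizes.append(batch.shape[0])
            except tf.errors.OutOfRangeError:
                # Raised only after the final (possibly partial) batch was returned.
                pass
    return sizes

partial_kept = collect_batch_sizes(3)
partial_dropped = collect_batch_sizes(3, drop_remainder=True)
print(partial_kept)     # sizes when the partial batch is kept
print(partial_dropped)  # sizes when the partial batch is dropped
```

With the default settings the final batch holds the 2 leftover rows. Passing drop_remainder=True is one way to discard that partial batch if every batch must have exactly batch_size rows.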
