im2sequence

Note: This API is only available in [Static Graph] mode

paddle.fluid.layers.im2sequence(input, filter_size=1, stride=1, padding=0, input_image_size=None, out_stride=1, name=None)

Extracts image patches from the input tensor to form a tensor of shape {input.batch_size * output_height * output_width, filter_size_height * filter_size_width * input.channels}. This op scans each image with a filter and converts the image into a sequence. After expansion, each image yields output_height * output_width time steps, where output_height and output_width are calculated by the following equations:

\[\begin{split}output\_height = 1 + (padding\_up + padding\_down + input\_height - filter\_size\_height + stride\_height - 1) / stride\_height \\ output\_width = 1 + (padding\_left + padding\_right + input\_width - filter\_size\_width + stride\_width - 1) / stride\_width\end{split}\]

Each time step has dimension filter_size_height * filter_size_width * input.channels.
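The output-size equations above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name `im2sequence_output_size` and its tuple arguments are hypothetical, and the `/` in the equations is assumed to be integer (floor) division.

```python
def im2sequence_output_size(input_size, filter_size, stride, padding):
    """Return (output_height, output_width) for one image.

    input_size  -- (input_height, input_width)
    filter_size -- (filter_size_height, filter_size_width)
    stride      -- (stride_height, stride_width)
    padding     -- (padding_up, padding_left, padding_down, padding_right)
    """
    in_h, in_w = input_size
    f_h, f_w = filter_size
    s_h, s_w = stride
    p_up, p_left, p_down, p_right = padding
    # Assumption: "/" in the documented equations is integer division.
    out_h = 1 + (p_up + p_down + in_h - f_h + s_h - 1) // s_h
    out_w = 1 + (p_left + p_right + in_w - f_w + s_w - 1) // s_w
    return out_h, out_w


# A 3x3 image scanned by a 2x2 filter with stride 1 and no padding
# gives a 2x2 grid of patches, i.e. 4 time steps per image.
out_h, out_w = im2sequence_output_size((3, 3), (2, 2), (1, 1), (0, 0, 0, 0))
print(out_h, out_w, out_h * out_w)  # 2 2 4
```

Under the same assumption, a 32x32 image with a 2x2 filter and stride 1 yields 31 * 31 = 961 time steps.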

Parameters
  • input (Variable) – The input should be a 4-D Tensor in \(NCHW\) format. The data type is float32.

  • filter_size (int32 | List[int32]) – The filter size. If filter_size is a List, it must contain two integers, \([filter\_size\_height, filter\_size\_width]\) . Otherwise, the filter size will be a square \([filter\_size, filter\_size]\) . Default is 1.

  • stride (int32 | List[int32]) – The stride size. If stride is a List, it must contain two integers, \([stride\_height, stride\_width]\) . Otherwise, the same stride is used in both directions, \([stride, stride]\) . Default is 1.

  • padding (int32 | List[int32]) – The padding size. If padding is a List, it can contain four integers, \([padding\_up, padding\_left, padding\_down, padding\_right]\) , giving the padding in each of the four directions, or two integers, \([padding\_height, padding\_width]\) , which means padding_up = padding_down = padding_height and padding_left = padding_right = padding_width. Otherwise, a scalar padding means padding_up = padding_down = padding_left = padding_right = padding. Default is 0.

  • input_image_size (Variable, optional) – A tensor containing the real size of each image in the input. Its shape is \([batchsize, 2]\) . It is used only for batch inference, when it is not None. Default is None.

  • out_stride (int32 | List[int32]) – The scaling of the image through the CNN. It is valid only when input_image_size is not None. If out_stride is a List, it must contain two integers, \([out\_stride\_height, out\_stride\_width]\) . Otherwise, out_stride_height = out_stride_width = out_stride. Default is 1.

  • name (str, optional) – The default value is None. Normally users do not need to set this property. For more information, please refer to Name.
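The accepted forms of filter_size, stride, out_stride, and padding described above can be illustrated with a small normalization sketch. The helper names `normalize_pair` and `normalize_padding` are hypothetical and are not part of the Paddle API; they only mirror the rules stated in the parameter list.

```python
def normalize_pair(value):
    """Expand a scalar to [value, value]; pass a two-element list through.

    Mirrors the rule documented for filter_size, stride and out_stride.
    """
    if isinstance(value, int):
        return [value, value]
    value = list(value)
    if len(value) != 2:
        raise ValueError("expected an int or a list of two ints")
    return value


def normalize_padding(padding):
    """Return [padding_up, padding_left, padding_down, padding_right]."""
    if isinstance(padding, int):
        # Scalar: the same padding on all four sides.
        return [padding] * 4
    padding = list(padding)
    if len(padding) == 2:
        # [padding_height, padding_width]: up = down, left = right.
        p_h, p_w = padding
        return [p_h, p_w, p_h, p_w]
    if len(padding) == 4:
        return padding
    raise ValueError("expected an int or a list of two or four ints")


print(normalize_pair(2))          # [2, 2]
print(normalize_padding(1))       # [1, 1, 1, 1]
print(normalize_padding([1, 2]))  # [1, 2, 1, 2]
```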

Returns

The output is a 2-D LoDTensor with shape {input.batch_size * output_height * output_width, filter_size_height * filter_size_width * input.channels}. The data type is float32.

Return Type: Variable

Examples

Given:

x = [[[[ 6.  2.  1.]
       [ 8.  3.  5.]
       [ 0.  2.  6.]]

      [[ 2.  4.  4.]
       [ 6.  3.  0.]
       [ 6.  4.  7.]]]

     [[[ 6.  7.  1.]
       [ 5.  7.  9.]
       [ 2.  4.  8.]]

      [[ 1.  2.  1.]
       [ 1.  3.  5.]
       [ 9.  0.  8.]]]]

x.dims = {2, 2, 3, 3}

And:

filter = [2, 2]
stride = [1, 1]
padding = [0, 0]

Then:

output.data = [[ 6.  2.  8.  3.  2.  4.  6.  3.]
               [ 2.  1.  3.  5.  4.  4.  3.  0.]
               [ 8.  3.  0.  2.  6.  3.  6.  4.]
               [ 3.  5.  2.  6.  3.  0.  4.  7.]
               [ 6.  7.  5.  7.  1.  2.  1.  3.]
               [ 7.  1.  7.  9.  2.  1.  3.  5.]
               [ 5.  7.  2.  4.  1.  3.  9.  0.]
               [ 7.  9.  4.  8.  3.  5.  0.  8.]]

output.dims = {8, 8}

output.lod = [[4, 4]]
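The worked example above can be reproduced with a plain-Python sketch. This is an illustrative reference under stated assumptions, not the op's actual implementation: the function name `im2sequence_reference` is hypothetical, padding is not handled, and the filter windows are assumed to tile the image exactly. Each output row concatenates the patch values channel by channel, which matches the layout of output.data above.

```python
def im2sequence_reference(x, filter_size, stride):
    """Patch extraction for a nested list x indexed [batch][channel][row][col].

    Returns (rows, lod), where lod holds the number of time steps per image.
    Assumes zero padding and windows that fit the image exactly.
    """
    f_h, f_w = filter_size
    s_h, s_w = stride
    rows, lengths = [], []
    for image in x:
        in_h, in_w = len(image[0]), len(image[0][0])
        if (in_h - f_h) % s_h or (in_w - f_w) % s_w:
            raise ValueError("sketch assumes windows tile the image exactly")
        out_h = 1 + (in_h - f_h) // s_h
        out_w = 1 + (in_w - f_w) // s_w
        for i in range(out_h):
            for j in range(out_w):
                top, left = i * s_h, j * s_w
                # One time step: all channels of the patch, flattened
                # in channel, row, column order.
                rows.append([
                    channel[r][c]
                    for channel in image
                    for r in range(top, top + f_h)
                    for c in range(left, left + f_w)
                ])
        lengths.append(out_h * out_w)
    return rows, [lengths]


x = [
    [[[6, 2, 1], [8, 3, 5], [0, 2, 6]],
     [[2, 4, 4], [6, 3, 0], [6, 4, 7]]],
    [[[6, 7, 1], [5, 7, 9], [2, 4, 8]],
     [[1, 2, 1], [1, 3, 5], [9, 0, 8]]],
]
rows, lod = im2sequence_reference(x, filter_size=[2, 2], stride=[1, 1])
print(rows[0])                     # [6, 2, 8, 3, 2, 4, 6, 3]
print(len(rows), len(rows[0]))     # 8 8
print(lod)                         # [[4, 4]]
```

The lod value [[4, 4]] records that the first four rows belong to the first image and the next four to the second.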

Examples

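import paddle.fluid as fluid

# Declare a static-graph input: a batch of 3-channel 32x32 images (NCHW).
data = fluid.data(name='data', shape=[None, 3, 32, 32], dtype='float32')
# Extract 2x2 patches with stride 1; each time step has 2 * 2 * 3 = 12 values.
output = fluid.layers.im2sequence(
    input=data, stride=[1, 1], filter_size=[2, 2])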