shard_index

paddle.fluid.layers.shard_index(input, index_num, nshards, shard_id, ignore_value=-1)

This operator recomputes the input indices according to the offset of the shard. The index range [0, index_num) is divided into nshards shards of size shard_size. If an input index falls inside the shard identified by shard_id, it is recomputed relative to that shard's offset; otherwise it is set to ignore_value. In detail:

shard_size = (index_num + nshards - 1) // nshards
y = x % shard_size if x // shard_size == shard_id else ignore_value

NOTE: If index_num cannot be evenly divided by nshards, the last shard will be smaller than the calculated shard_size.
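
To make the note concrete, the following pure-Python sketch lists the index range each shard covers. The helper name shard_ranges is hypothetical and is not part of Paddle; it only applies the shard_size formula given above and clamps each range to index_num.

```python
def shard_ranges(index_num, nshards):
    """Return the half-open [start, stop) index range covered by each shard.

    Hypothetical helper for illustration only; it is not part of Paddle.
    """
    shard_size = (index_num + nshards - 1) // nshards  # ceiling division
    ranges = []
    for shard_id in range(nshards):
        # Clamp to index_num so trailing shards never run past the index range.
        start = min(shard_id * shard_size, index_num)
        stop = min(start + shard_size, index_num)
        ranges.append((start, stop))
    return ranges


ranges = shard_ranges(index_num=10, nshards=3)
sizes = [stop - start for start, stop in ranges]
print(ranges)  # [(0, 4), (4, 8), (8, 10)]
print(sizes)   # [4, 4, 2]
```

With index_num = 10 and nshards = 3, shard_size is 4, so the last shard holds only the two remaining indices.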

Examples:

Input:
  X.shape = [4, 1]
  X.data = [[1], [6], [12], [19]]
  index_num = 20
  nshards = 2
  ignore_value = -1

if shard_id == 0, we get:
  Out.shape = [4, 1]
  Out.data = [[1], [6], [-1], [-1]]

if shard_id == 1, we get:
  Out.shape = [4, 1]
  Out.data = [[-1], [-1], [2], [9]]
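
The worked example above can be checked with a small pure-Python sketch of the same rule. The function shard_index_reference is a hypothetical stand-in written for this illustration, not Paddle's implementation; it takes a plain nested list in place of a Variable of shape [N, 1].

```python
def shard_index_reference(rows, index_num, nshards, shard_id, ignore_value=-1):
    """Pure-Python sketch of the shard_index rule (hypothetical, not Paddle code).

    `rows` is a list of single-element lists, mirroring the [N, 1] input shape.
    """
    shard_size = (index_num + nshards - 1) // nshards
    out = []
    for (x,) in rows:
        if x // shard_size == shard_id:
            # The index belongs to this shard: keep its offset within the shard.
            out.append([x % shard_size])
        else:
            # The index belongs to another shard: mark it as ignored.
            out.append([ignore_value])
    return out


x_data = [[1], [6], [12], [19]]
out_shard0 = shard_index_reference(x_data, index_num=20, nshards=2, shard_id=0)
out_shard1 = shard_index_reference(x_data, index_num=20, nshards=2, shard_id=1)
print(out_shard0)  # [[1], [6], [-1], [-1]]
print(out_shard1)  # [[-1], [-1], [2], [9]]
```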

Parameters
  • input (-) – Input indices; the last dimension must be 1.

  • index_num (-) – An integer defining the range of the index.

  • nshards (-) – The number of shards.

  • shard_id (-) – The index of the current shard.

  • ignore_value (-) – An integer value outside the sharded index range. Defaults to -1.

Returns

The sharded index of input.

Return type

Variable

Examples

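import paddle.fluid as fluid

batch_size = 32
# Label indices with shape [batch_size, 1]; the last dimension must be 1.
label = fluid.data(name="label", shape=[batch_size, 1], dtype="int64")
# Keep labels that fall in shard 0 of 2 and set all others to ignore_value (-1).
shard_label = fluid.layers.shard_index(input=label,
                                       index_num=20,
                                       nshards=2,
                                       shard_id=0)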