einsum

paddle.einsum ( equation, *operands ) [source]

The current version of this API should be used in dynamic graph mode only.

Einsum offers a tensor operation API based on the Einstein summation convention (Einstein notation). It takes one or more tensors as input and produces one tensor as output.

Einsum is able to perform a variety of tensor operations. The following lists a few:

  • for single operand
    • trace

    • diagonal

    • transpose

    • sum

  • for double operands
    • dot

    • outer

    • broadcasting and elementwise multiply

    • matrix multiply

    • batched matrix multiply

  • for many operands
    • broadcasting multiply

    • chained matrix multiply
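As a quick illustration of the many-operand case, the sketch below computes a chained matrix multiply with a single equation. It uses `numpy.einsum`, which follows the same Einstein-summation notation, purely for illustration; `paddle.einsum` accepts the same equation string.

```python
import numpy as np

# Chained matrix multiply over three operands: 'ij,jk,kl->il'.
# The shared labels j and k are dummy labels and are summed out.
A = np.arange(6, dtype=np.float64).reshape(2, 3)
B = np.arange(12, dtype=np.float64).reshape(3, 4)
C = np.arange(8, dtype=np.float64).reshape(4, 2)

chained = np.einsum('ij,jk,kl->il', A, B, C)

# Equivalent to ordinary chained matrix multiplication.
reference = A @ B @ C
assert np.allclose(chained, reference)
```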

The summation notation

  • The tensor dimensions are labeled using uncased (case-insensitive) English letters. E.g., ijk refers to a three-dimensional tensor whose dimensions are labeled i, j, and k.

  • The equation is comma-separated into terms, each being a distinct input’s dimension label string.

  • The ellipsis … enables broadcasting by automatically converting the unlabeled dimensions into broadcasting dimensions.

  • Singular labels are called free labels; duplicated labels are dummy labels. Dummy-labeled dimensions will be reduced and removed from the output.

  • Output labels can be explicitly specified on the right hand side of -> or omitted.
    In the latter case, the output labels will be inferred from the input labels.
    • Inference of output labels
      • The broadcasting label …, if present, is put in the leftmost position.

      • Free labels are reordered alphabetically and put after ….

    • On explicit output labels
      • If broadcasting is enabled, then … must be present.

      • The output labels can be empty, indicating that the output is a
        scalar: the sum over the original output.

      • Non-input labels are invalid.

      • Duplicate labels are invalid.

      • Any dummy label that is present in the output is promoted to a
        free label.

      • Any free label that is not present in the output is lowered to a
        dummy label.

  • Examples
    • ‘…ij, …jk’, where i and k are free labels, j is dummy. The output label string is ‘…ik’

    • ‘ij -> i’, where i is a free label and j is a dummy label.

    • ‘…ij, …jk -> …ijk’, where i, j and k are all free labels.

    • ‘…ij, …jk -> ij’, an invalid equation since … is not present in the output.
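The inference rules above can be checked directly. The sketch below uses `numpy.einsum`, which shares the same output-inference convention, purely for illustration; the shapes and labels are chosen arbitrarily.

```python
import numpy as np

# With '->' omitted, the output labels are inferred: free labels are
# sorted alphabetically, and dummy labels are summed out.
x = np.arange(6.0).reshape(2, 3)   # dimension labels 'ij'
y = np.arange(12.0).reshape(3, 4)  # dimension labels 'jk'

inferred = np.einsum('ij,jk', x, y)      # output inferred as 'ik'
explicit = np.einsum('ij,jk->ik', x, y)
assert np.allclose(inferred, explicit)

# 'ij -> i': j is a dummy label, so its dimension is reduced (summed).
row_sums = np.einsum('ij->i', x)
assert np.allclose(row_sums, x.sum(axis=1))
```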

The summation rule

The summation procedure can be outlined as follows, although the actual steps taken may vary significantly due to implementation-specific optimizations.

  • Step 1: preparation for broadcasting, that is, transposing and unsqueezing the input operands to have each resulting dimension identically labeled across all the input operands.

  • Step 2: broadcasting multiply all the resulting operands from step 1.

  • Step 3: reducing dummy labeled dimensions.

  • Step 4: transposing the result tensor to match the output labels.
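The four steps above can be walked through by hand for a simple equation such as 'ij,jk->ik'. The sketch below does so with plain NumPy operations for illustration; the actual Paddle implementation may fuse or reorder these steps.

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)    # labels 'ij'
y = np.arange(12.0).reshape(3, 4)   # labels 'jk'

# Step 1: transpose/unsqueeze so both operands carry every label (i, j, k)
# in the same order, ready for broadcasting.
x_aligned = x[:, :, None]           # shape (2, 3, 1), labels ijk
y_aligned = y[None, :, :]           # shape (1, 3, 4), labels ijk

# Step 2: broadcasting multiply of the aligned operands.
prod = x_aligned * y_aligned        # shape (2, 3, 4)

# Step 3: reduce the dummy-labeled dimension j (axis 1).
reduced = prod.sum(axis=1)          # shape (2, 4), labels ik

# Step 4: transpose to match the output labels ('ik' already matches here).
assert np.allclose(reduced, np.einsum('ij,jk->ik', x, y))
```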

On trace and diagonal

The trace and diagonal are planned yet unimplemented features.

Parameters
  • equation (str) – The summation terms using the Einstein summation notation.

  • operands (list|Tensor) – The input tensors over which to compute the Einstein summation. The number of operands should equal the number of input terms in the equation.

Returns

result (Tensor), the result tensor.

Examples

>>> import paddle
>>> paddle.seed(102)
>>> x = paddle.rand([4])
>>> y = paddle.rand([5])

>>> # sum
>>> print(paddle.einsum('i->', x))
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
1.81225157)

>>> # dot
>>> print(paddle.einsum('i,i->', x, x))
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
1.13530672)

>>> # outer
>>> print(paddle.einsum("i,j->ij", x, y))
Tensor(shape=[4, 5], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[0.26443148, 0.05962684, 0.25360870, 0.21900642, 0.56994802],
        [0.20955276, 0.04725220, 0.20097610, 0.17355499, 0.45166403],
        [0.35836059, 0.08080698, 0.34369346, 0.29680005, 0.77240014],
        [0.00484230, 0.00109189, 0.00464411, 0.00401047, 0.01043695]])

>>> A = paddle.rand([2, 3, 2])
>>> B = paddle.rand([2, 2, 3])

>>> # transpose
>>> print(paddle.einsum('ijk->kji', A))
Tensor(shape=[2, 3, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.50882483, 0.56067896],
         [0.84598064, 0.36310029],
         [0.55289471, 0.33273944]],
        [[0.04836850, 0.73811269],
         [0.29769155, 0.28137168],
         [0.84636718, 0.67521429]]])

>>> # batch matrix multiplication
>>> print(paddle.einsum('ijk, ikl->ijl', A,B))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.36321065, 0.42009076, 0.40849245],
         [0.74353045, 0.79189068, 0.81345987],
         [0.90488225, 0.79786193, 0.93451476]],
        [[0.12680580, 1.06945944, 0.79821426],
         [0.07774551, 0.55068684, 0.44512171],
         [0.08053084, 0.80583858, 0.56031936]]])

>>> # Ellipsis transpose
>>> print(paddle.einsum('...jk->...kj', A))
Tensor(shape=[2, 2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.50882483, 0.84598064, 0.55289471],
         [0.04836850, 0.29769155, 0.84636718]],
        [[0.56067896, 0.36310029, 0.33273944],
         [0.73811269, 0.28137168, 0.67521429]]])

>>> # Ellipsis batch matrix multiplication
>>> print(paddle.einsum('...jk, ...kl->...jl', A,B))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.36321065, 0.42009076, 0.40849245],
         [0.74353045, 0.79189068, 0.81345987],
         [0.90488225, 0.79786193, 0.93451476]],
        [[0.12680580, 1.06945944, 0.79821426],
         [0.07774551, 0.55068684, 0.44512171],
         [0.08053084, 0.80583858, 0.56031936]]])