reindex_heter_graph

paddle.geometric.reindex_heter_graph(x: Tensor, neighbors: Sequence[Tensor], count: Sequence[Tensor], value_buffer: Tensor | None = None, index_buffer: Tensor | None = None, name: str | None = None) → tuple[Tensor, Tensor, Tensor]

Reindex the node ids of a heterogeneous graph.

This API is mainly used in graph learning and is intended to be used together with paddle.geometric.sample_neighbors. It reindexes the ids of the input nodes and their neighbors, and returns the corresponding graph edges expressed in the new ids.

Take input nodes x = [0, 1, 2] as an example. For graph A, suppose we have neighbors = [8, 9, 0, 4, 7, 6, 7] and count = [2, 3, 2]. Then the neighbors of node 0 are [8, 9], the neighbors of node 1 are [0, 4, 7], and the neighbors of node 2 are [6, 7]. For graph B, suppose we have neighbors = [0, 2, 3, 5, 1] and count = [1, 3, 1]. Then the neighbors of node 0 are [0], the neighbors of node 1 are [2, 3, 5], and the neighbors of node 2 are [1]. In this example the input nodes take ids 0 to 2, and each neighbor not seen before receives the next free id in order of first appearance, with graphs A and B sharing one id mapping. We get the following outputs: reindex_src: [3, 4, 0, 5, 6, 7, 6, 0, 2, 8, 9, 1], reindex_dst: [0, 0, 1, 1, 1, 2, 2, 0, 1, 1, 1, 2] and out_nodes: [0, 1, 2, 8, 9, 4, 7, 6, 3, 5].
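The example above can be reproduced with a small pure-Python sketch. It is an illustration of the reindexing rule only, not Paddle's implementation; the function name reindex_heter_graph_reference is hypothetical, and it takes plain Python lists instead of tensors.

```python
def reindex_heter_graph_reference(x, neighbors, count):
    """Pure-Python sketch of the reindexing rule (hypothetical helper, not Paddle code)."""
    # Input nodes take the first ids in order: x[i] -> i (x is assumed unique).
    node_to_id = {node: i for i, node in enumerate(x)}
    out_nodes = list(x)
    reindex_src, reindex_dst = [], []
    # A single id mapping is shared by all graphs (edge types).
    for graph_neighbors, graph_count in zip(neighbors, count):
        offset = 0
        for dst_id, n in enumerate(graph_count):
            # The next n entries are the neighbors of input node x[dst_id].
            for node in graph_neighbors[offset:offset + n]:
                if node not in node_to_id:
                    # Unseen neighbor: assign the next free id.
                    node_to_id[node] = len(out_nodes)
                    out_nodes.append(node)
                reindex_src.append(node_to_id[node])
                reindex_dst.append(dst_id)
            offset += n
    return reindex_src, reindex_dst, out_nodes


src, dst, nodes = reindex_heter_graph_reference(
    [0, 1, 2],
    [[8, 9, 0, 4, 7, 6, 7], [0, 2, 3, 5, 1]],  # graph A, graph B
    [[2, 3, 2], [1, 3, 1]],
)
print(src)    # [3, 4, 0, 5, 6, 7, 6, 0, 2, 8, 9, 1]
print(dst)    # [0, 0, 1, 1, 1, 2, 2, 0, 1, 1, 1, 2]
print(nodes)  # [0, 1, 2, 8, 9, 4, 7, 6, 3, 5]
```

The edges of graph A occupy the first 7 positions of the outputs and the edges of graph B the remaining 5, matching the order of the neighbors list.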

Note

The values in x should be unique; duplicates may cause errors. reindex_heter_graph supports reindexing neighbors from multiple edge types. All nodes are reindexed starting from 0.

Parameters

x (Tensor) – The input nodes for which neighbors were sampled. Supported data types are int32 and int64.
neighbors (list|tuple) – The neighbors of the input nodes x, one tensor per graph. The data type should be the same as that of x.
count (list|tuple) – The neighbor counts of the input nodes x, one tensor per graph. The data type should be int32.
value_buffer (Tensor, optional) – Value buffer for the hashtable. The data type should be int32, and the buffer should be filled with -1. Only useful for the GPU version. Default is None.
index_buffer (Tensor, optional) – Index buffer for the hashtable. The data type should be int32, and the buffer should be filled with -1. Only useful for the GPU version. To speed up reindexing with the hashtable buffers, both value_buffer and index_buffer must be provided. Default is None.
name (str, optional) – Name for the operation. Default is None. For more information, please refer to api_guide_Name.

Returns

reindex_src (Tensor), the source node index of graph edges after reindexing.
reindex_dst (Tensor), the destination node index of graph edges after reindexing.
out_nodes (Tensor), the unique input nodes and neighbors, given by their ids before reindexing. The input nodes x come first, followed by the neighbor nodes.
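As a sketch of how the three outputs relate, the original node ids can be recovered by indexing out_nodes with the reindexed edges. The values below are copied from the example on this page and written as plain Python lists rather than tensors; the names original_src and original_dst are illustrative only.

```python
# Outputs from the example on this page, as plain lists.
out_nodes = [0, 1, 2, 8, 9, 4, 7, 6, 3, 5]
reindex_src = [3, 4, 0, 5, 6, 7, 6, 0, 2, 8, 9, 1]
reindex_dst = [0, 0, 1, 1, 1, 2, 2, 0, 1, 1, 1, 2]

# out_nodes[new_id] is the id the node had before reindexing.
original_src = [out_nodes[i] for i in reindex_src]
original_dst = [out_nodes[i] for i in reindex_dst]

print(original_src)  # [8, 9, 0, 4, 7, 6, 7, 0, 2, 3, 5, 1]
print(original_dst)  # [0, 0, 1, 1, 1, 2, 2, 0, 1, 1, 1, 2]
```

Here original_src is the concatenation of the neighbors of graph A and graph B, and original_dst repeats each input node once per sampled neighbor.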

Examples

>>> import paddle

>>> x = [0, 1, 2]
>>> neighbors_a = [8, 9, 0, 4, 7, 6, 7]
>>> count_a = [2, 3, 2]
>>> x = paddle.to_tensor(x, dtype="int64")
>>> neighbors_a = paddle.to_tensor(neighbors_a, dtype="int64")
>>> count_a = paddle.to_tensor(count_a, dtype="int32")
>>> neighbors_b = [0, 2, 3, 5, 1]
>>> count_b = [1, 3, 1]
>>> neighbors_b = paddle.to_tensor(neighbors_b, dtype="int64")
>>> count_b = paddle.to_tensor(count_b, dtype="int32")
>>> neighbors = [neighbors_a, neighbors_b]
>>> count = [count_a, count_b]
>>> reindex_src, reindex_dst, out_nodes = paddle.geometric.reindex_heter_graph(x, neighbors, count)
>>> print(reindex_src.numpy())
[3 4 0 5 6 7 6 0 2 8 9 1]
>>> print(reindex_dst.numpy())
[0 0 1 1 1 2 2 0 1 1 1 2]
>>> print(out_nodes.numpy())
[0 1 2 8 9 4 7 6 3 5]