sample_neighbors

paddle.geometric.sample_neighbors(row, colptr, input_nodes, sample_size=-1, eids=None, return_eids=False, perm_buffer=None, name=None)

Graph Sample Neighbors API.

This API is mainly used in graph learning and provides a high-performance graph sampling method. The input graph edges are given in CSC (Compressed Sparse Column) format as row and colptr, a layout that is well suited to sampling. input_nodes are the nodes whose neighbors are sampled, and sample_size is the number of neighbors to sample for each of them.

Fisher-Yates sampling is also supported in the GPU version.
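As an illustration of the CSC layout described above, the following is a minimal pure-Python sketch that builds row and colptr from a list of (source, destination) edges. The helper name edges_to_csc is hypothetical and not part of Paddle; it assumes that colptr is indexed by destination node and that row stores the source node of each edge, which matches the edge list used in the example on this page.

```python
def edges_to_csc(edges, num_nodes):
    """Hypothetical helper: convert (src, dst) edges to CSC lists (row, colptr)."""
    # Count how many edges point at each destination node.
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1

    # colptr[i]..colptr[i + 1] delimits the slice of `row` holding
    # the neighbors (sources) of node i.
    colptr = [0]
    for c in counts:
        colptr.append(colptr[-1] + c)

    # sorted() is stable, so edges keep their input order within each destination.
    row = [src for src, _ in sorted(edges, key=lambda e: e[1])]
    return row, colptr


edges = [(3, 0), (7, 0), (0, 1), (9, 1), (1, 2), (4, 3), (2, 4),
         (9, 5), (3, 5), (9, 6), (1, 6), (9, 8), (7, 8)]
row, colptr = edges_to_csc(edges, num_nodes=10)
```

The resulting lists can then be converted with paddle.to_tensor before being passed to sample_neighbors.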

Parameters
  • row (Tensor) – One of the components of the CSC format of the input graph, with shape [num_edges, 1] or [num_edges]. The available data types are int32 and int64.

  • colptr (Tensor) – One of the components of the CSC format of the input graph, with shape [num_nodes + 1, 1] or [num_nodes + 1]. The data type should be the same as row.

  • input_nodes (Tensor) – The input nodes to sample neighbors for. The data type should be the same as row.

  • sample_size (int, optional) – The number of neighbors we need to sample. Default value is -1, which means returning all the neighbors of the input nodes.

  • eids (Tensor, optional) – The eid (edge ID) information of the input graph. If return_eids is True, then eids should not be None. The data type should be the same as row. Default is None.

  • return_eids (bool, optional) – Whether to return eid information of sample edges. Default is False.

  • perm_buffer (Tensor, optional) – Permutation buffer for Fisher-Yates sampling. If not None, Fisher-Yates sampling is used to speed up sampling. The data type should be the same as row. Only useful in the GPU version. Default is None.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
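The Fisher-Yates sampling mentioned for perm_buffer can be illustrated with a partial shuffle: only the first k positions of a buffer are shuffled, which yields k distinct items without shuffling the whole buffer. The sketch below is a generic pure-Python illustration, not Paddle's GPU kernel; the function name fisher_yates_sample and its arguments are hypothetical.

```python
import random


def fisher_yates_sample(items, k, rng=random):
    """Hypothetical helper: pick k distinct items with a partial Fisher-Yates shuffle."""
    buf = list(items)  # working copy, playing the role of a permutation buffer
    n = len(buf)
    k = min(k, n)      # cannot sample more items than exist
    for i in range(k):
        # Swap position i with a random position from the not-yet-fixed tail.
        j = rng.randrange(i, n)
        buf[i], buf[j] = buf[j], buf[i]
    return buf[:k]


picked = fisher_yates_sample([9, 3, 1, 7, 4], k=2, rng=random.Random(0))
```

Only k swaps are performed, so the cost depends on the sample size rather than on the number of candidates once the buffer exists.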

Returns

  • out_neighbors (Tensor), the sampled neighbors of the input nodes.

  • out_count (Tensor), the number of sampled neighbors of each input node, with the same shape as input_nodes.

  • out_eids (Tensor), the eid information of the sampled edges. Returned only if return_eids is True.
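To make the relationship between the inputs and the returned values concrete, here is a pure-Python reference sketch of the sampling semantics. It is not Paddle's implementation: the function name sample_neighbors_reference and the seed argument are hypothetical, and it assumes that a node with at most sample_size neighbors returns all of them and that sampling is done without replacement.

```python
import random


def sample_neighbors_reference(row, colptr, input_nodes, sample_size=-1,
                               eids=None, return_eids=False, seed=None):
    """Hypothetical list-based model of neighbor sampling on a CSC graph."""
    if return_eids and eids is None:
        raise ValueError("eids must be given when return_eids is True")
    rng = random.Random(seed)
    out_neighbors, out_count, out_eids = [], [], []
    for node in input_nodes:
        # Positions in `row` that hold the neighbors of `node`.
        positions = list(range(colptr[node], colptr[node + 1]))
        # Subsample only when the node has more neighbors than requested;
        # sample_size=-1 keeps every neighbor.
        if sample_size >= 0 and len(positions) > sample_size:
            positions = sorted(rng.sample(positions, sample_size))
        out_neighbors.extend(row[p] for p in positions)
        out_count.append(len(positions))
        if return_eids:
            out_eids.extend(eids[p] for p in positions)
    if return_eids:
        return out_neighbors, out_count, out_eids
    return out_neighbors, out_count


row = [3, 7, 0, 9, 1, 4, 2, 9, 3, 9, 1, 9, 7]
colptr = [0, 2, 4, 5, 6, 7, 9, 11, 11, 13, 13]
nodes = [0, 8, 1, 2]
neighbors, count = sample_neighbors_reference(row, colptr, nodes, sample_size=2)
# Every node here has at most 2 neighbors, so nothing is dropped:
# neighbors == [3, 7, 9, 7, 0, 9, 1] and count == [2, 2, 2, 1]
```

out_neighbors is a flat concatenation of the per-node results, and out_count gives the length of each node's segment, so the neighbors of the i-th input node can be recovered by slicing with a running sum of out_count.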

Examples

>>> import paddle

>>> # edges: (3, 0), (7, 0), (0, 1), (9, 1), (1, 2), (4, 3), (2, 4),
>>> #        (9, 5), (3, 5), (9, 6), (1, 6), (9, 8), (7, 8)
>>> row = [3, 7, 0, 9, 1, 4, 2, 9, 3, 9, 1, 9, 7]
>>> colptr = [0, 2, 4, 5, 6, 7, 9, 11, 11, 13, 13]
>>> nodes = [0, 8, 1, 2]
>>> sample_size = 2
>>> row = paddle.to_tensor(row, dtype="int64")
>>> colptr = paddle.to_tensor(colptr, dtype="int64")
>>> nodes = paddle.to_tensor(nodes, dtype="int64")
>>> out_neighbors, out_count = paddle.geometric.sample_neighbors(row, colptr, nodes, sample_size=sample_size)