graph_khop_sampler

paddle.incubate.graph_khop_sampler(row, colptr, input_nodes, sample_sizes, sorted_eids=None, return_eids=False, name=None)

Graph Khop Sampler API.

This API is mainly used in graph learning. It provides a high-performance k-hop neighbor sampling method that includes a subgraph reindex step. The input graph is given in CSC (Compressed Sparse Column) format as row and colptr, which puts the graph data into a format suitable for sampling. input_nodes holds the nodes whose neighbors are sampled, and sample_sizes gives the number of neighbors to sample at each layer; its length is the number of layers.
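As an illustration of the CSC layout described above, the following is a minimal pure-Python sketch that builds row and colptr from a list of edges. The helper edges_to_csc is hypothetical and not part of Paddle, and it assumes that each column stores the source nodes of the edges pointing to that node.

```python
def edges_to_csc(edges, num_nodes):
    """Convert (src, dst) edge pairs into CSC arrays (row, colptr).

    Hypothetical helper for illustration only; not a Paddle API.
    Assumes column i lists the source nodes of edges ending at node i.
    """
    # Count the incoming edges of every destination node.
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1

    # colptr[i]:colptr[i + 1] delimits the neighbors of node i inside row,
    # so colptr has num_nodes + 1 entries.
    colptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        colptr[i + 1] = colptr[i] + counts[i]

    # Fill row, keeping the original edge order within each column.
    row = [0] * len(edges)
    cursor = colptr[:-1].copy()
    for src, dst in edges:
        row[cursor[dst]] = src
        cursor[dst] += 1
    return row, colptr


edges = [(3, 0), (7, 0), (0, 1), (9, 1), (1, 2), (4, 3), (2, 4),
         (9, 5), (3, 5), (9, 6), (1, 6), (9, 8), (7, 8)]
row, colptr = edges_to_csc(edges, num_nodes=10)
print(row)
print(colptr)
```

With this layout, the neighbors of node i can be read as row[colptr[i]:colptr[i + 1]] without scanning the whole edge list.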

Parameters
  • row (Tensor) – One of the components of the CSC format of the input graph. The shape should be [num_edges, 1] or [num_edges]. Supported data types are int32 and int64.

  • colptr (Tensor) – One of the components of the CSC format of the input graph. The shape should be [num_nodes + 1, 1] or [num_nodes + 1]. The data type should be the same as row.

  • input_nodes (Tensor) – The input nodes whose neighbors are sampled. The data type should be the same as row.

  • sample_sizes (list|tuple) – The number of neighbors to sample at each layer; its length is the number of layers. The elements should be of type int, and the list should be one-dimensional.

  • sorted_eids (Tensor, optional) – The sorted edge ids. Must not be None when return_eids is True. The shape should be [num_edges, 1], and the data type should be the same as row. Default is None.

  • return_eids (bool, optional) – Whether to return the ids of the sampled edges. Default is False.

  • name (str, optional) – Name for the operation. Default is None. For more information, please refer to Name.

Returns

  • edge_src (Tensor) – The source index of the output edges, i.e. the first column of the edges. The shape is currently [num_sample_edges, 1].

  • edge_dst (Tensor) – The destination index of the output edges, i.e. the second column of the edges. The shape is currently [num_sample_edges, 1].

  • sample_index (Tensor) – The original ids of the input nodes and the sampled neighbor nodes.

  • reindex_nodes (Tensor) – The reindexed ids of the input nodes.

  • edge_eids (Tensor) – The ids of the sampled edges. Returned only if return_eids is True.
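To make the relationship between these outputs more concrete, here is a rough pure-Python sketch of k-hop sampling with reindexing. It is not Paddle's implementation. The function khop_sample_sketch is hypothetical, and it assumes that input nodes receive the first reindexed ids, that newly sampled neighbors are numbered in order of appearance, that output edges use reindexed ids, and that only newly discovered nodes are expanded in the next layer.

```python
import random


def khop_sample_sketch(row, colptr, input_nodes, sample_sizes, seed=0):
    """Illustrative k-hop sampling with reindexing (not Paddle's kernel)."""
    rng = random.Random(seed)

    # Map original node id -> reindexed id; input nodes come first (assumed).
    index_of = {}
    sample_index = []
    for node in input_nodes:
        if node not in index_of:
            index_of[node] = len(sample_index)
            sample_index.append(node)

    edge_src, edge_dst = [], []
    frontier = list(dict.fromkeys(input_nodes))
    for size in sample_sizes:
        next_frontier = []
        for dst in frontier:
            neighbors = row[colptr[dst]:colptr[dst + 1]]
            # Keep every neighbor when there are at most `size` of them.
            if len(neighbors) <= size:
                picked = neighbors
            else:
                picked = rng.sample(neighbors, size)
            for src in picked:
                if src not in index_of:
                    index_of[src] = len(sample_index)
                    sample_index.append(src)
                    next_frontier.append(src)
                edge_src.append(index_of[src])
                edge_dst.append(index_of[dst])
        # Simplification: only newly discovered nodes are expanded next.
        frontier = next_frontier

    reindex_nodes = [index_of[node] for node in input_nodes]
    return edge_src, edge_dst, sample_index, reindex_nodes


row = [3, 7, 0, 9, 1, 4, 2, 9, 3, 9, 1, 9, 7]
colptr = [0, 2, 4, 5, 6, 7, 9, 11, 11, 13, 13]
edge_src, edge_dst, sample_index, reindex_nodes = khop_sample_sketch(
    row, colptr, input_nodes=[0, 8, 1, 2], sample_sizes=[2, 2])
print(sample_index)
print(reindex_nodes)
```

In this sketch, sample_index[k] recovers the original id of the node that appears as k in edge_src and edge_dst, which is how the reindexed subgraph can be mapped back to the input graph.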

Examples

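>>> import paddle

>>> # CSC graph with 10 nodes and 13 edges: the neighbors of node i are
>>> # row[colptr[i]:colptr[i + 1]].
>>> row = [3, 7, 0, 9, 1, 4, 2, 9, 3, 9, 1, 9, 7]
>>> colptr = [0, 2, 4, 5, 6, 7, 9, 11, 11, 13, 13]
>>> nodes = [0, 8, 1, 2]
>>> # Two layers, sampling up to 2 neighbors per node in each layer.
>>> sample_sizes = [2, 2]
>>> row = paddle.to_tensor(row, dtype="int64")
>>> colptr = paddle.to_tensor(colptr, dtype="int64")
>>> nodes = paddle.to_tensor(nodes, dtype="int64")

>>> # Pass return_eids by keyword: the fifth positional parameter is sorted_eids.
>>> edge_src, edge_dst, sample_index, reindex_nodes = paddle.incubate.graph_khop_sampler(row, colptr, nodes, sample_sizes, return_eids=False)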