Heterogeneous Graph Learning on IPUs: Tutorials


Many real-world graphs are heterogeneous, meaning single node and edge types are insufficient to capture all the information in the graph, leading to graphs which have several different node types and several different edge types. This raises a few questions, for example: how do we construct a model suitable for training with heterogeneous graph data, and how do we create mini-batches from this data? We will answer both of those questions, focussing on using Graphcore IPUs to accelerate heterogeneous graph learning workloads. You will look at the three approaches PyTorch Geometric (PyG) takes to heterogeneous graph learning and learn how to run each on the IPU; one of them is sketched below.
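As a first taste of these approaches, the sketch below uses PyG's to_hetero() transformation, which converts a homogeneous GNN into a heterogeneous one by replicating its message-passing layers for every edge type. The toy graph, feature sizes and output dimension are illustrative assumptions, not values taken from the tutorial:

```python
import torch
from torch_geometric.data import HeteroData
from torch_geometric.nn import SAGEConv, to_hetero

# Build a tiny toy heterogeneous graph (node counts and feature sizes are arbitrary).
data = HeteroData()
data['author'].x = torch.randn(4, 16)
data['paper'].x = torch.randn(6, 32)
data['author', 'writes', 'paper'].edge_index = torch.tensor([[0, 1, 2, 3],
                                                             [0, 2, 3, 5]])
data['paper', 'written_by', 'author'].edge_index = torch.tensor([[0, 2, 3, 5],
                                                                 [0, 1, 2, 3]])

# A small homogeneous GNN; to_hetero() replicates it for every edge type.
class GNN(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels):
        super().__init__()
        # (-1, -1) lets the input sizes be inferred lazily per node type.
        self.conv1 = SAGEConv((-1, -1), hidden_channels)
        self.conv2 = SAGEConv((-1, -1), out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)

model = GNN(hidden_channels=64, out_channels=8)
model = to_hetero(model, data.metadata(), aggr='sum')

# Returns a dict of outputs, one tensor per node type.
out = model(data.x_dict, data.edge_index_dict)
```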

You will also understand how to sample heterogeneous graphs with a fixed size suitable for the IPU. While this tutorial will cover enough of the basics of GNNs, PyTorch Geometric and PopTorch for you to start developing and porting your GNN applications to the IPU, the following resources can be used... A large number of real-world datasets are stored as heterogeneous graphs, motivating the introduction of specialized functionality for them in PyG. For example, most graphs in the area of recommendation, such as social graphs, are heterogeneous, as they store information about different types of entities and their different types of relations. This tutorial introduces how heterogeneous graphs are mapped to PyG and how they can be used as input to Graph Neural Network models. Heterogeneous graphs come with different types of information attached to nodes and edges.

Thus, a single node or edge feature tensor cannot hold all node or edge features of the whole graph, due to differences in type and dimensionality. Instead, a set of types needs to be specified for nodes and edges, respectively, each having its own data tensors. As a consequence of the different data structure, the message passing formulation changes accordingly, allowing the message and update functions to be conditioned on node or edge type. As a guiding example, we take a look at the heterogeneous ogbn-mag network from the OGB dataset suite. The given heterogeneous graph has 1,939,743 nodes, split between the four node types author, paper, institution and field of study. It further has 21,111,007 edges, which are of one of four types:

writes: An author writes a specific paper
affiliated with: An author is affiliated with a specific institution
cites: A paper cites another paper
has topic: A paper has a topic of a specific field of study
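To make this structure concrete, here is a minimal sketch of loading ogbn-mag through PyG's OGB_MAG dataset and inspecting its typed node and edge stores; the root directory is a placeholder, and the preprocess option simply generates features for the node types that have none:

```python
from torch_geometric.datasets import OGB_MAG

# Download and prepare ogbn-mag as a HeteroData object (root path is a placeholder).
dataset = OGB_MAG(root='data/ogbn-mag', preprocess='metapath2vec')
data = dataset[0]

print(data.node_types)  # e.g. ['paper', 'author', 'institution', 'field_of_study']
print(data.edge_types)  # e.g. ('author', 'writes', 'paper'), ...

# Every node type and edge type holds its own feature and index tensors.
print(data['paper'].x.shape)                               # [num_papers, num_features]
print(data['author', 'writes', 'paper'].edge_index.shape)  # [2, num_writes_edges]
```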

Large heterogeneous graphs are common in many real-world datasets. In this context, large means single graphs with thousands of nodes and edges. Heterogeneous means the nodes in the graph represent different types of entities, while the edges represent diverse relations between those entity types. For example, a social network can be modelled in graph form, with nodes representing users and the edges between them indicating friendship connections. A heterogeneous graph can accommodate other node types such as posts users have made, groups they belong to, and events they are attending. An entire social network can become extremely large, with many users, posts and so on, and all of the relations between them.

With the latest release of Graphcore’s Poplar SDK 3.3, we have extended our PyTorch Geometric IPU support to enable this class of application to be accelerated using Graphcore IPUs. In this blog we will briefly show the latest features that have enabled using large heterogeneous graphs with IPUs. We also have a number of tutorials on these topics that can be run on Paperspace Gradient Notebooks, as well as an example that demonstrates using GNNs for fraud detection on a large heterogeneous graph. As graph size increases, there comes a point where full-batch training requires more memory than accelerators can provide – even the IPU with its industry-leading on-chip SRAM. Full-batch training means the input to your model is the entire graph, so each iteration involves all nodes and edges.

The solution is to sample from the large graph, forming smaller mini-batches which are used as inputs to the model. A common approach is GraphSAGE neighbour sampling, where a mini-batch is formed from the nodes for which we want to compute a representation, then randomly selected neighbours of those nodes, then neighbours of those neighbours, and so on. In this way a good representation of the target nodes can be formed while remaining scalable. Using PyTorch Geometric, this approach is straightforward: the NeighborLoader class provides a data loader which produces mini-batches of such samples, as sketched below.
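A minimal sketch of heterogeneous neighbour sampling with NeighborLoader, continuing with the ogbn-mag data loaded earlier; the fan-out sizes and batch size are arbitrary choices, not values from the tutorial:

```python
from torch_geometric.datasets import OGB_MAG
from torch_geometric.loader import NeighborLoader

data = OGB_MAG(root='data/ogbn-mag', preprocess='metapath2vec')[0]

# Sample up to 15 neighbours per edge type in the first hop and 10 in the
# second, seeded from the training 'paper' nodes.
train_loader = NeighborLoader(
    data,
    num_neighbors=[15, 10],
    batch_size=128,
    input_nodes=('paper', data['paper'].train_mask),
)

batch = next(iter(train_loader))  # each mini-batch is itself a HeteroData object
print(batch['paper'].batch_size)  # number of seed nodes in this mini-batch
```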

The IPU uses ahead-of-time compilation, meaning the entire computational graph must be static, including its inputs. This enables an efficient layout of memory and communication, as well as certain optimisations to be made during compilation. It also means that each sampled mini-batch must be padded up to a fixed size before it can be passed to the model.
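Hand-rolled, that padding idea looks roughly like the sketch below for a single node type: node features and edge indices are padded up to chosen maxima before being fed to the compiled model. The function, the maxima and the dummy padding node are illustrative assumptions; the tutorials referenced above cover the fixed-size sampling support shipped with the SDK, which also handles this for heterogeneous graphs and produces masks so padded elements can be ignored in the loss.

```python
import torch

def pad_batch(x, edge_index, max_nodes, max_edges):
    """Pad node features and edge indices to fixed sizes (illustrative only).

    A real fixed-size loader would typically also return masks marking which
    nodes/edges are padding; that part is omitted here for brevity.
    """
    num_nodes, num_feats = x.shape
    num_edges = edge_index.shape[1]
    assert num_nodes <= max_nodes and num_edges <= max_edges

    # Pad node features with zero rows up to max_nodes.
    x_pad = torch.zeros(max_nodes, num_feats, dtype=x.dtype)
    x_pad[:num_nodes] = x

    # Pad edges with self-loops on a dummy node (the last padded node).
    dummy = max_nodes - 1
    edge_pad = torch.full((2, max_edges), dummy, dtype=edge_index.dtype)
    edge_pad[:, :num_edges] = edge_index

    return x_pad, edge_pad

# Example: pad a tiny sampled batch to 8 nodes and 16 edges.
x = torch.randn(5, 32)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
x_pad, edge_pad = pad_batch(x, edge_index, max_nodes=8, max_edges=16)
print(x_pad.shape, edge_pad.shape)  # torch.Size([8, 32]) torch.Size([2, 16])
```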
