Heterogeneous Graph Learning: PyTorch Geometric Documentation


A large set of real-world datasets is stored as heterogeneous graphs, motivating the introduction of specialized functionality for them in PyG. For example, most graphs in the area of recommendation, such as social graphs, are heterogeneous, as they store information about different types of entities and their different types of relations. This tutorial introduces how heterogeneous graphs are mapped to PyG and how they can be used as input to Graph Neural Network models. Heterogeneous graphs come with different types of information attached to nodes and edges. Thus, a single node or edge feature tensor cannot hold all node or edge features of the whole graph, due to differences in type and dimensionality. Instead, a set of types needs to be specified for nodes and edges, respectively, each type having its own data tensors.

As a consequence of the different data structure, the message passing formulation changes accordingly, allowing the computation of messages and update functions conditioned on node or edge type. As a guiding example, we take a look at the heterogeneous ogbn-mag network from the OGB dataset suite: the given heterogeneous graph has 1,939,743 nodes, split between the four node types author, paper, institution and field of study. It further has 21,111,007 edges, which also are of one of four types: writes (an author writes a specific paper), affiliated with (an author is affiliated with a specific institution), cites (a paper cites another paper), and has topic (a paper has a topic from a specific field of study).
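As a quick way to inspect this structure, the sketch below loads ogbn-mag through PyG's OGB_MAG dataset wrapper and prints the node and edge types. The root directory and the metapath2vec preprocessing option (which attaches features to the otherwise featureless node types) are illustrative choices, not requirements.

```python
import torch_geometric.transforms as T
from torch_geometric.datasets import OGB_MAG

# Download/load ogbn-mag as a single HeteroData object (root path is an assumption).
dataset = OGB_MAG(root='data/OGB_MAG', preprocess='metapath2vec',
                  transform=T.ToUndirected())
data = dataset[0]

print(data.node_types)  # e.g. ['paper', 'author', 'institution', 'field_of_study']
print(data.edge_types)  # e.g. [('author', 'writes', 'paper'), ('paper', 'cites', 'paper'), ...]
```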

Graphs are a powerful data structure used to represent relationships between entities. In many real-world scenarios, these relationships are complex, and the entities themselves can have different types. This is where heterogeneous graphs come into play. Heterogeneous graphs contain multiple types of nodes and edges, which allows for a more accurate representation of complex systems such as social networks, biological networks, and knowledge graphs. PyTorch Geometric (PyG) is a deep learning library that provides a convenient way to work with graph data in PyTorch. It offers a wide range of tools and functions to handle heterogeneous graphs, making it easier for researchers and practitioners to develop graph-based machine learning models. In this blog post, we will explore the fundamental concepts of heterogeneous graphs in PyTorch Geometric, learn how to use them, and discuss common usage patterns and best practices.

A heterogeneous graph $G=(V, E)$ consists of a set of nodes $V$ and a set of edges $E$, where the nodes and edges can be partitioned into different types. For example, in a social network graph, nodes could represent users, pages, and groups, while edges could represent friendships, likes, and memberships. In PyTorch Geometric, node and edge types are represented as strings. Each node type can have its own set of node features, and each edge type can have its own set of edge features. For example, a node of type "user" might have features such as age, gender, and location, while an edge of type "friendship" might have a feature indicating the duration of the friendship.
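A minimal sketch of how such a graph could be put together with PyG's HeteroData class follows; the node counts, feature dimensions and relation names ('user', 'page', 'friendship', 'likes') are made-up values for illustration only.

```python
import torch
from torch_geometric.data import HeteroData

data = HeteroData()

# One feature tensor per node type (sizes are hypothetical).
data['user'].x = torch.randn(100, 16)   # 100 users with 16 features each
data['page'].x = torch.randn(50, 32)    # 50 pages with 32 features each

# One edge_index of shape [2, num_edges] per (source, relation, destination) type.
data['user', 'friendship', 'user'].edge_index = torch.randint(0, 100, (2, 500))
data['user', 'friendship', 'user'].edge_attr = torch.rand(500, 1)  # e.g. friendship duration

data['user', 'likes', 'page'].edge_index = torch.stack([
    torch.randint(0, 100, (300,)),  # source user indices
    torch.randint(0, 50, (300,)),   # destination page indices
])

print(data)
```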

As shown in the previous example, we can create a heterogeneous graph in PyG using the HeteroData class, adding node features and edge indices for different node and edge types. PyG also provides several ways to load and preprocess heterogeneous graph data. For example, we can use the DataLoader class to batch many small heterogeneous graphs, or a sampling loader such as NeighborLoader to draw mini-batches from a single large graph.
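For a single large heterogeneous graph, neighbor sampling is the more common batching strategy. The sketch below assumes the ogbn-mag HeteroData object loaded earlier (including the train_mask on its paper nodes) and uses illustrative fan-outs and batch size.

```python
from torch_geometric.loader import NeighborLoader

# Sample 15 neighbors in the first hop and 10 in the second, for every edge type.
train_loader = NeighborLoader(
    data,
    num_neighbors=[15, 10],
    batch_size=128,
    input_nodes=('paper', data['paper'].train_mask),  # seed nodes to predict on
)

batch = next(iter(train_loader))
print(batch['paper'].batch_size)  # number of seed 'paper' nodes in this mini-batch
```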

Many real-world graphs are heterogeneous, meaning a single node type and a single edge type are insufficient to capture all the information in the graph; instead, the graph has multiple node types and multiple edge types. This comes with a few considerations, for example how do we construct a model suitable for training with heterogeneous graph data, and how do we create mini-batches from this data? We will answer both of those questions, focusing on using Graphcore IPUs to enable accelerating heterogeneous graph learning workloads: we will look at the three approaches PyTorch Geometric (PyG) takes to heterogeneous graph learning, learn how to run each on the IPU, and understand how to sample heterogeneous graphs with a fixed size suitable for the IPU. While this tutorial covers enough of the basics of GNNs, PyTorch Geometric and PopTorch for you to start developing and porting your GNN applications to the IPU, the following resources can be used...
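One of those approaches is the automatic conversion of a homogeneous GNN into a heterogeneous one with to_hetero(). The sketch below shows the idea on the ogbn-mag data object from earlier; the architecture, hidden size and aggregation are illustrative choices rather than a prescribed recipe (ogbn-mag's paper venue prediction task has 349 classes).

```python
import torch
from torch_geometric.nn import SAGEConv, to_hetero

class GNN(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels):
        super().__init__()
        # Lazy input sizes (-1) let every node type keep its own feature dimensionality.
        self.conv1 = SAGEConv((-1, -1), hidden_channels)
        self.conv2 = SAGEConv((-1, -1), out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)

model = GNN(hidden_channels=64, out_channels=349)
# Duplicate the message passing functions per edge type and sum the results per node type.
model = to_hetero(model, data.metadata(), aggr='sum')

out = model(data.x_dict, data.edge_index_dict)  # dict mapping node types to output tensors
print(out['paper'].shape)
```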

Documentation | PyG 1.0 Paper | PyG 2.0 Paper | Colab Notebooks | External Resources | OGB Examples. PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data. It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it provides easy-to-use mini-batch loaders for operating on many small and single giant graphs, multi-GPU support, torch.compile support, DataPipe support, and a large number of common benchmark datasets (based on simple interfaces)... Whether you are a machine learning researcher or a first-time user of machine learning toolkits, here are some reasons to try out PyG for machine learning on graph-structured data. In this quick tour, we highlight the ease of creating and training a GNN model with only a few lines of code.
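To give a flavor of that quick tour, here is a minimal sketch of defining and training a two-layer GCN on the Planetoid Cora citation dataset. The dataset root, hidden size and training schedule are illustrative, and this is a plain homogeneous example rather than the heterogeneous setting discussed above.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

dataset = Planetoid(root='data/Planetoid', name='Cora')  # root path is an assumption
data = dataset[0]

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)

model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
```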

ℋ²GB (Heterophilic and Heterogeneous Graph Benchmark) is a library built upon PyTorch, PyTorch Geometric and GraphGym. It is a collection of graph benchmark datasets, data loaders, a modular graph transformer framework (UnifiedGT) and evaluators designed to systematically evaluate graph learning methods in both heterophilic and heterogeneous settings. ℋ²GB encompasses 9 diverse real-world datasets across 5 domains. We developed UnifiedGT, a modular graph transformer (GT) framework, to summarize and systematically compare the performance of existing graph neural networks (GNNs) on this new benchmark. UnifiedGT is implemented as a Python library and is user-friendly. It includes a unified data loader and evaluator, making it easy to access datasets, evaluate methods, and compare performance.

Its data loaders are fully compatible with the popular graph deep learning framework PyTorch Geometric. They provide automatic dataset downloading, standardized dataset splits, and unified performance evaluation. We established a standard workflow supporting both model selection and development, with two workflows through which users can interact with ℋ²GB. ℋ²GB includes heterophilic and heterogeneous graph datasets from 5 domains, categorized by domain.


Following the tutorial on Heterogeneous Graph Learning, I created my own dataset from NetworkX graphs. After successfully loading the created dataset using data = torch.load("../Scripts/pytorch_dataset/processed/dataset_pytorch_obj.pt"), I cannot seem to understand how I can get the metadata from the loaded dataset.
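In case it helps: a HeteroData object exposes exactly this information through its metadata() method, which returns the list of node types and the list of edge types. A minimal sketch, assuming the saved file above indeed contains a HeteroData instance:

```python
import torch

# Assumes the saved object is a torch_geometric.data.HeteroData instance.
data = torch.load("../Scripts/pytorch_dataset/processed/dataset_pytorch_obj.pt")

node_types, edge_types = data.metadata()
print(node_types)  # e.g. ['user', 'page']
print(edge_types)  # e.g. [('user', 'friendship', 'user'), ...]

# The same tuple is what to_hetero() and heterogeneous convolutions expect:
# model = to_hetero(model, data.metadata(), aggr='sum')
```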
