[RFC] Graph Neural Network Library for LibTorch - Seeking Community Input

Hi PyTorch Community! :waving_hand:

## The Problem

I’ve noticed a significant gap in the PyTorch ecosystem: **there’s no comprehensive Graph Neural Network (GNN) library for LibTorch/C++**.

While Python has excellent libraries like PyTorch Geometric and DGL, C++ developers are left to implement GNN operations from scratch. This creates barriers for:

- Production deployments requiring low-latency inference

- Integration with existing C++ ML pipelines

- Mobile and embedded applications

- High-performance training on large graphs

## My Proposal

I’m proposing to develop **LibTorch-Geometric** - a comprehensive C++ GNN library that would provide:

### Core Features

- **Graph data structures** with efficient batching for variable-sized graphs

- **Message passing framework** similar to PyG’s MessagePassing class

- **Standard GNN layers**: GCN, GraphSAGE, GAT, GIN

- **Graph operations**: Optimized sparse operations, pooling, sampling

- **CUDA acceleration** for performance-critical operations
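
To make the batching bullet concrete, here is a minimal sketch of one common approach (the one PyG uses in Python): merge a mini-batch into a single disconnected graph by offsetting each graph's node indices, so variable-sized graphs need no padding. The names `Graph`, `Batch`, and `collate` are hypothetical, and plain `std::vector` stands in for `torch::Tensor` so the snippet compiles without LibTorch. Treat it as an illustration of the idea, not as the proposed API.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a single graph; a real implementation would
// presumably hold torch::Tensor members instead of std::vector.
struct Graph {
    std::int64_t num_nodes = 0;
    // Each edge is a (source, target) pair of node indices local to this graph.
    std::vector<std::pair<std::int64_t, std::int64_t>> edges;
};

// Several graphs merged into one large disconnected graph.
struct Batch {
    std::int64_t num_nodes = 0;
    std::vector<std::pair<std::int64_t, std::int64_t>> edges;  // global node indices
    std::vector<std::int64_t> batch;  // batch[i] = index of the graph that owns node i
};

// Shift each graph's node indices by the number of nodes that precede it.
Batch collate(const std::vector<Graph>& graphs) {
    Batch out;
    for (std::size_t g = 0; g < graphs.size(); ++g) {
        const std::int64_t offset = out.num_nodes;
        for (const auto& e : graphs[g].edges) {
            out.edges.emplace_back(e.first + offset, e.second + offset);
        }
        // Record which graph every newly added node belongs to.
        out.batch.insert(out.batch.end(),
                         static_cast<std::size_t>(graphs[g].num_nodes),
                         static_cast<std::int64_t>(g));
        out.num_nodes += graphs[g].num_nodes;
    }
    return out;
}
```

The `batch` vector is what lets a later pooling step reduce node features per graph, which is why it is kept alongside the merged edge list.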

### Example API (Draft)

```cpp
#include <torch/torch.h>
#include <libtorch_geometric/libtorch_geometric.h>  // proposed header, does not exist yet

// Simple two-layer GCN model
class GCN : public torch::nn::Module {
 public:
  GCN(int64_t num_features, int64_t hidden_dim, int64_t num_classes) {
    // register_module makes the layers' parameters visible to optimizers and serialization
    conv1 = register_module("conv1",
        ltg::GCNConv(ltg::GCNConvOptions(num_features, hidden_dim)));
    conv2 = register_module("conv2",
        ltg::GCNConv(ltg::GCNConvOptions(hidden_dim, num_classes)));
  }

  // x: node features; edge_index: graph connectivity
  torch::Tensor forward(torch::Tensor x, torch::Tensor edge_index) {
    x = conv1->forward(x, edge_index);
    x = torch::relu(x);
    x = conv2->forward(x, edge_index);
    return torch::log_softmax(x, /*dim=*/1);
  }

 private:
  // Module holders start empty and are assigned in the constructor
  ltg::GCNConv conv1{nullptr}, conv2{nullptr};
};
```
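
For anyone weighing in on the API, it may help to see what a layer like the draft `ltg::GCNConv` would have to compute internally. The sketch below shows one GCN propagation step written as message passing over an edge list, leaving out the learned weight matrix and bias. It assumes the usual symmetric normalization with self-loops; `Matrix`, `EdgeList`, and `gcn_propagate` are hypothetical names, and `std::vector` again stands in for tensors so the code builds without LibTorch.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // [num_nodes][num_features]
using EdgeList = std::vector<std::pair<std::int64_t, std::int64_t>>;  // (source, target)

// One GCN propagation step without the learned weights:
// out[t] = sum over incoming edges (s -> t), plus a self-loop, of
//          x[s] / sqrt(deg[s] * deg[t]),
// where deg counts incoming edges including the self-loop.
Matrix gcn_propagate(const Matrix& x, const EdgeList& edges) {
    const std::size_t n = x.size();
    const std::size_t f = (n == 0) ? 0 : x[0].size();

    // Degree with self-loops: every node starts at 1.
    std::vector<double> deg(n, 1.0);
    for (const auto& e : edges) {
        deg[static_cast<std::size_t>(e.second)] += 1.0;
    }

    Matrix out(n, std::vector<double>(f, 0.0));

    // "Message" + "aggregate": scatter-add a normalized copy of x[s] into out[t].
    auto send = [&](std::size_t s, std::size_t t) {
        const double norm = 1.0 / std::sqrt(deg[s] * deg[t]);
        for (std::size_t k = 0; k < f; ++k) {
            out[t][k] += norm * x[s][k];
        }
    };

    for (std::size_t i = 0; i < n; ++i) {
        send(i, i);  // self-loops
    }
    for (const auto& e : edges) {
        send(static_cast<std::size_t>(e.first), static_cast<std::size_t>(e.second));
    }
    return out;
}
```

An undirected graph is represented here by storing each edge in both directions. A generic message passing base class would factor the `send` step out into overridable message and aggregation hooks, which is the part of the design where feedback would be most useful.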

## Why This Matters

- **Performance**: Native C++ speed without Python overhead

- **Production Ready**: Deploy GNNs without Python dependencies

- **Ecosystem Growth**: Brings graph deep learning to more use cases

- **Research Impact**: Enables high-performance GNN research

## My Background

I’m an MTech student with experience in C++, CUDA, and deep learning. I’m committed to seeing this project through to completion and to maintaining it long term.

## Questions for the Community

1. **Interest Level**: Would this be valuable to the PyTorch ecosystem?

2. **API Design**: Does the proposed C++ API feel natural? Any suggestions for improvement?

3. **Priority Features**: Which GNN layers and operations should I prioritize first?

   - Basic layers: GCN, GraphSAGE, GAT?

   - Graph pooling operations?

   - Large graph sampling algorithms?

4. **Integration**: How should this integrate with existing PyTorch tooling?

   - Should it follow the same conventions as other LibTorch extensions?

   - Any specific build system preferences?

5. **Performance Requirements**: What are the key bottlenecks you’ve experienced with Python GNN libraries?

6. **Contribution Path**: Would this be better as:

   - Independent library in the PyTorch ecosystem (like PyG for Python)?

   - Eventually proposed for inclusion in PyTorch core?

   - Hybrid approach - start independent, propose inclusion if successful?

## Next Steps

Based on community feedback, I plan to:

1. Start with a prototype implementing basic GCN and message passing

2. Create a benchmarking framework to compare against Python implementations

3. Iterate based on real-world usage and community input

4. Open source everything and build a contributor community
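
As a starting point for the benchmarking step, the C++ side could be timed with something as small as the helper below. This is only a sketch: `median_ms` is a hypothetical name, and it reports the median wall-clock time of repeated calls after a few untimed warm-up runs.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Run `fn` `warmup` times untimed, then `iters` timed runs, and return the
// median wall-clock time per run in milliseconds (0.0 if iters == 0).
double median_ms(const std::function<void()>& fn, std::size_t warmup, std::size_t iters) {
    for (std::size_t i = 0; i < warmup; ++i) {
        fn();
    }

    std::vector<double> samples;
    samples.reserve(iters);
    for (std::size_t i = 0; i < iters; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    if (samples.empty()) {
        return 0.0;
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];  // upper median for even sample counts
}
```

For GPU code, the callable would need to synchronize the device (for example with `torch::cuda::synchronize()`) before returning, because CUDA kernel launches are asynchronous and the clock would otherwise stop before the work finishes.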

## Timeline

- **Months 1-2**: Core infrastructure and basic layers

- **Months 3-4**: Standard GNN implementations

- **Months 5-6**: Performance optimization and CUDA kernels

- **Months 7-8**: Documentation, examples, and community feedback

---

**TL;DR**: I want to build a comprehensive GNN library for LibTorch to fill the C++ ecosystem gap. Looking for community input on design, priorities, and contribution approach.

**Your thoughts?** Would love to hear from both potential users and PyTorch maintainers! :rocket:

---

*Cross-posting this to PyTorch Forums as well to reach a broader audience*