@Conwyn Thanks for inspiring me to make this topic.
I would like to share some valuable resources about Graph Neural Networks (GNNs). So let's go:
- “A Gentle Introduction to Graph Neural Networks”
A completely free article that uses simple examples and interactive simulations to explain what graph-based networks are and what their practical applications look like.
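To make the idea concrete, here is a framework-free sketch of one round of message passing, the core operation a GNN layer performs; the toy graph and node features below are invented for illustration:

```python
import numpy as np

# A toy graph: 4 nodes, each with a 2-dimensional feature vector.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.5, 0.5]])

# Adjacency matrix for undirected edges 0-1, 1-2, 2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# One round of mean-aggregation message passing:
# each node's new feature is the average of its neighbours' features.
deg = A.sum(axis=1, keepdims=True)
H = (A @ X) / deg

print(H)
```

A real GNN layer would follow this aggregation with a learned linear transform and a nonlinearity, and stack several such layers so information propagates over longer paths in the graph.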
- PyTorch Geometric: “Colab Notebooks and Video Tutorials”
To be clear up front: PyTorch is not the only framework to choose from. Content-wise, though, this is a very good collection of courses from MIT and Stanford, plus lessons developed by the “PyTorch Geometric Tutorial Project” itself.
I especially recommend it for people who want to get straight to the substance right away, without having to set up an SDK first.
- Hardware and distributed usage.
One of the most interesting aspects of PyTorch Geometric is that there are a number of high-level APIs for compiling a network’s tensor operations to run on accelerator architectures dedicated to GNNs. I’m thinking of distributed training on Intelligence Processing Units (IPUs), as well as using several GPUs simultaneously. For TF fans, there are also APIs that emulate training or inference for XLA-based frameworks.
How-to videos with links to code examples from the Graphcore team:
- Ready-to-go images with tutorials are also available at paperspace.com, showing what inference looks like with classical models compiled into a graph. An IPU-POD of size 4 is freely available,
but compiling the graph takes about 15 minutes - take a break for tea or coffee, read something, or take a walk instead of waiting.
I was thinking (which is often dangerous), but in NLP we have attention, so there is a relation between each source-target combination, and the strongest relation picks the best word. “J’ai un chat” - “I have a cat” - “Ich habe eine Katze” - “Mae cath gyda fi”. Graphs have nodes with attributes and edges with strengths. So my crazy thought was whether GNNs could be used in NLP.
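That analogy can be made literal: an attention matrix is exactly a set of weighted edges between target and source tokens, i.e. a bipartite graph. A small NumPy sketch, where every embedding vector is made up purely for illustration:

```python
import numpy as np

# Made-up 4-dim embeddings for 3 source tokens and 3 target tokens.
src = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])
tgt = np.array([[0.9, 0.1, 0.0, 0.0],
                [0.1, 0.8, 0.1, 0.0],
                [0.0, 0.1, 0.9, 0.0]])

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Scaled dot-product attention: one weight per (target, source) pair,
# which we can read as the weight of an edge in a bipartite graph.
d = src.shape[1]
attn = softmax(tgt @ src.T / np.sqrt(d))

# The strongest edge from each target token points at the
# source token it relates to most.
strongest = attn.argmax(axis=1)
print(strongest)
```

With these toy vectors each target token aligns with its counterpart; in a trained translation model, the same argmax over attention weights is a crude word-alignment.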
I feel responsible for exposing you to the factor that triggered this group of neurons and these peculiar thoughts, and I don’t want you to stray too far, so I’ll share my intuition.

Karpathy can’t help but laugh at the idea that NLP models are just boring text-generating machines without the capability for backward reflection. That might be true in the case of GPTx. BERT models, however, stand out because they recognize context from both the left and the right, as they are trained on the entire length of the sentence. During training, random tokens (words) throughout the sentence are masked, and BERT’s task is to predict the most likely word at each masked index. This approach is what I most associate with the concept of graphs. Of course, to really talk about a GNN, it would require a dataset in the form of a graph and a customized implementation of the model!
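A minimal sketch of that masking step, assuming a plain whitespace tokenizer instead of BERT’s WordPiece (real BERT also applies an 80/10/10 replacement rule to the selected tokens, which is omitted here):

```python
import random

# Toy illustration of BERT-style masked language modelling:
# select ~15% of tokens, replace them with [MASK], and remember
# the originals the model must learn to predict.
random.seed(1)  # fixed seed so the example is reproducible

tokens = "the cat sat on the mat because it was tired".split()
masked = list(tokens)
targets = {}  # index -> original token to recover

for i, tok in enumerate(tokens):
    if random.random() < 0.15:
        masked[i] = "[MASK]"
        targets[i] = tok

print(masked)
print(targets)
```

The model sees only `masked` and is scored on how well it predicts the entries of `targets` - and because the unmasked context lies on both sides of each gap, the prediction is inherently bidirectional.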
- BERT paper: Understanding BERT - arXiv
- Visualization in the form of PDF slides: BERT Slides - GitHub
- A video explaining the differences between the GPTx series and BERT: GPTx vs. BERT - YouTube