Considerations

Not all relationships generalize to graphs

Determine where graphs are appropriate

Some relationships are easier to represent as a graph than others

Recent developments have demonstrated the potential of graph-based ML, but from a practical standpoint the approach still needs to mature. Enterprises should take this into account when deciding whether and how to use it.

Many types of data naturally form graph-structured networks. Modeling the relational structure in these domains can enable much more accurate models and representations. Some examples include:^xix

Figure 9 Examples of real-world networks^xx

These considerations could affect how suitable a graph is for a task:

Scalability: Traditional approaches to machine learning depend on samples being statistically independent. Because the samples are independent, data scientists can isolate the individual contribution of each subset and thereby optimize training. In a graph data structure, by contrast, the nodes are interconnected and therefore not statistically independent. This interrelatedness creates challenges in sampling: the sampled subgraphs need to preserve a representative structure, but the interconnectedness can introduce bias into the training sample that distorts representativeness (e.g., some nodes and edges appear more often than others in the training set).^xxi

Dynamic domains: Graphs are useful for modeling a broad range of problems because many problems naturally fit into a network structure of nodes and edges. The issue is that many of these “natural networks” are dynamic and change over time. Frequently, these temporal dynamics are informative about the structure and behavior of the modeled system, but research on how to capture them is still in its infancy.^xxii

Reliability: The theory underpinning GNNs is still developing and is not yet well understood. As a result, determining whether GNNs are effective in a specific area is largely experimental, and the results tend to be bimodal: some applications see dramatic improvements in performance while others are completely unaffected.^xxiii
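
The sampling bias described under Scalability can be illustrated with a small sketch. The toy graph and the function name `node_share_under_edge_sampling` below are hypothetical and chosen only for illustration; they do not come from any particular library or from the sources cited here. The sketch computes how often each node would appear as an endpoint if training examples were drawn by picking edges uniformly at random, and compares that with the share each node would get if nodes themselves were sampled uniformly.

```python
from collections import Counter

# Hypothetical toy graph (an assumption for illustration):
# node 0 is a hub linked to nodes 1-5, and node 5 has one extra neighbor, node 6.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (5, 6)]


def node_share_under_edge_sampling(edge_list):
    """Return the fraction of sampled endpoints each node accounts for
    when edges are drawn uniformly at random.

    Each edge contributes two endpoints, so a node's share is
    degree / (2 * number_of_edges).
    """
    degree = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    total_endpoints = sum(degree.values())  # equals 2 * len(edge_list)
    return {node: d / total_endpoints for node, d in degree.items()}


shares = node_share_under_edge_sampling(edges)

# Share each node would receive if nodes were sampled uniformly instead.
uniform_share = 1 / len(shares)

for node in sorted(shares):
    print(f"node {node}: edge-sampling share={shares[node]:.3f}, "
          f"uniform share={uniform_share:.3f}")
```

In this toy graph the hub accounts for 5 of the 12 sampled endpoints, roughly three times its share under uniform node sampling, while each leaf node is under-represented. Real sampling schemes are more involved, but the same degree-driven skew is one way that some nodes and edges come to appear more often than others in a training set.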