Project Description: Transformative research that addresses the grand challenges of our time, such as understanding how pandemics propagate, harnessing generations of knowledge, and uncovering disastrous software vulnerabilities, increasingly depends on making sense of the graphs that model the underlying relationships. Consequently, there is great interest in computing, mining, and learning on graphs, collectively called graph analytics. Real-world graphs, however, can be so large that analyzing them takes a long time. Such long processing times can, unfortunately, have grave consequences: in computational epidemiology, for example, longer processing times could cost lives.
The good news is that Graph Sampling and Random Walk (GSRW) can dramatically reduce the size of the original graphs while capturing the properties that downstream graph analytics tasks need. For instance, GraphZoom can learn from the sampled graphs and arrive at vertex embeddings similar to those obtained by learning directly on the original graphs. Comparable effectiveness has also been achieved in graph mining, graph computing, and linear algebra. The goal of this research is to contribute to a distributed, GPU-based framework that supports a plethora of GSRW applications on trillion-edge graphs and is easy to use for domain experts from various disciplines.
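To make the idea of random-walk-based sampling concrete, the following is a minimal single-machine sketch in Python. It is only an illustration, not the distributed GPU framework proposed here: the function name random_walk_sample, its parameters, and the toy graph are all hypothetical, and the graph is assumed to be stored as a plain adjacency-list dictionary.

```python
import random


def random_walk_sample(adj, start, walk_length, num_walks, seed=0):
    """Return the subgraph induced by vertices visited during random walks.

    adj is assumed to map each vertex to a list of its neighbors.
    All names here are illustrative and not part of any framework.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    visited = {start}
    for _ in range(num_walks):
        current = start
        for _ in range(walk_length):
            neighbors = adj.get(current, [])
            if not neighbors:  # dead end: stop this walk early
                break
            current = rng.choice(neighbors)  # uniform step to a neighbor
            visited.add(current)
    # Keep only the edges whose endpoints were both visited.
    return {v: [u for u in adj.get(v, []) if u in visited] for v in visited}


# Toy undirected graph, assumed purely for illustration.
graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3, 5],
    5: [4],
}

sample = random_walk_sample(graph, start=0, walk_length=3, num_walks=2)
print(sorted(sample))
```

The sampled subgraph never contains more vertices than the walks could reach (here at most 1 + walk_length * num_walks), which is what makes this family of techniques attractive for very large graphs. Which structural properties the sample preserves depends on the walk strategy and its parameters.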