GraphFrames
GraphFrames is a package for Apache Spark that provides DataFrame-based graphs, with high-level APIs in Java, Python, and Scala. It aims to provide both the functionality of GraphX and extended functionality that takes advantage of Spark DataFrames. This extended functionality includes motif finding, DataFrame-based serialization, and highly expressive graph queries.
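To make "DataFrame-based graph" and "motif finding" concrete, here is a minimal sketch in plain Python. It is a stand-in for the idea, not the GraphFrames API. It follows the GraphFrames convention that vertex rows carry an `id` column and edge rows carry `src` and `dst` columns. The helper `find_mutual_pairs` and the sample data are hypothetical. In GraphFrames itself, the same pattern would be written as a motif string such as `g.find("(a)-[e]->(b); (b)-[e2]->(a)")`.

```python
# Plain-Python stand-in for a DataFrame-based graph (not the GraphFrames API).
# Vertices are rows keyed by "id"; edges are rows with "src" and "dst".
vertices = [
    {"id": "a", "name": "Alice"},
    {"id": "b", "name": "Bob"},
    {"id": "c", "name": "Charlie"},
]
edges = [
    {"src": "a", "dst": "b", "relationship": "friend"},
    {"src": "b", "dst": "a", "relationship": "friend"},
    {"src": "b", "dst": "c", "relationship": "follow"},
]


def find_mutual_pairs(edges):
    """Return (e, e2) edge pairs matching the motif (a)-[e]->(b); (b)-[e2]->(a).

    Hypothetical helper; assumes at most one edge per (src, dst) pair.
    """
    # Index edges by endpoints so the reverse-edge lookup is a dictionary hit.
    # This plays the role of the self-join a DataFrame engine would run.
    by_endpoints = {(e["src"], e["dst"]): e for e in edges}
    matches = []
    for e in edges:
        reverse = by_endpoints.get((e["dst"], e["src"]))
        if reverse is not None:
            matches.append((e, reverse))
    return matches


mutual = find_mutual_pairs(edges)
pairs = [(e["src"], e["dst"]) for e, _ in mutual]
print(pairs)
```

Each mutual relationship appears once per orientation, because the pattern binds `a` and `b` to either endpoint. Because vertices and edges are ordinary tables, the matches can be filtered or joined like any other rows.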
This article includes three example notebooks: a tutorial notebook available in Python and in Scala, and a Python user guide. For additional examples using GraphFrames with Scala, see GraphFrames user guide - Scala.
Databricks recommends using a cluster running Databricks Runtime for Machine Learning, as it includes an optimized installation of GraphFrames.
If you are not using a cluster running Databricks Runtime ML, download the JAR file from the GraphFrames library, upload it to a volume, and install it on your cluster.
GraphFrames tutorial
The following notebooks show you how to use GraphFrames to perform graph analysis.
Graph Analysis with GraphFrames (Python)
Graph Analysis with GraphFrames (Scala)
GraphFrames user guide (Python)
The following notebook includes Python code examples of how to use GraphFrames.
GraphFrames Python notebook
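As a rough preview of the kind of code such a notebook contains, the following sketch builds a small graph and runs two queries. It assumes the `graphframes` Python package is installed on the cluster and that `spark` is an existing SparkSession. The function name `graph_analysis_sketch` and the sample data are hypothetical and are not taken from the notebooks.

```python
def graph_analysis_sketch(spark):
    """Build a small GraphFrame and run a few queries.

    `spark` is assumed to be an existing SparkSession
    (in a Databricks notebook, the predefined `spark` variable).
    """
    # Imported inside the function so this module still loads
    # in environments where graphframes is not installed.
    from graphframes import GraphFrame

    # Vertex DataFrame: GraphFrames expects a column named "id".
    vertices = spark.createDataFrame(
        [("a", "Alice", 34), ("b", "Bob", 36), ("c", "Charlie", 30)],
        ["id", "name", "age"],
    )
    # Edge DataFrame: GraphFrames expects columns named "src" and "dst".
    edges = spark.createDataFrame(
        [("a", "b", "friend"), ("b", "c", "follow"), ("c", "b", "follow")],
        ["src", "dst", "relationship"],
    )

    g = GraphFrame(vertices, edges)

    # Number of incoming edges per vertex, returned as a DataFrame.
    in_degrees = g.inDegrees

    # PageRank returns a new GraphFrame whose vertices carry a "pagerank" column.
    ranks = (
        g.pageRank(resetProbability=0.15, maxIter=10)
        .vertices.select("id", "pagerank")
    )
    return g, in_degrees, ranks
```

In a notebook, calling `graph_analysis_sketch(spark)` and then `.show()` on the returned DataFrames would display the results. The query results are ordinary DataFrames, so they can be filtered, joined, or saved with the usual DataFrame operations.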