Presentation about GraphFrames and how we handle graphs with more than 2 billion nodes at Hybrid Theory

rberenguel/identity-graphs

Keeping identity graphs in sync with Apache Spark

Presentation I (@berenguel) gave at the Data Love Conference in April 2021 and at the Data+AI Summit in May 2021, explaining how we manage a 2 billion node graph at Hybrid Theory. You can find the slides here (some images might look slightly blurry). I recommend checking the version with presenter notes, which is only available here. You can also head over to the releases tab in case I have a more recent version and forgot to update this README.

If you want additional information about Spark in general, I gave an introductory talk on Spark with Carlos Peña, which you can find here.


The video from Data Love is available here. Don't miss the full playlist of videos from the conference.

You can watch the recording from Data+AI Summit by registering for it and selecting "On Demand" here.


This presentation is written in Markdown and prepared for use with Deckset. The drawings were done on an iPad Mini 5 using Procreate.


Live at Data+AI Summit (2021, May)

Live at Data Love ❤️ (2021, April)


