G-Store: A Storage Manager for Graph Data

 

Created from a master’s thesis supervised by Dan Olteanu and submitted to Oxford University Computing Laboratory in September 2010

Author: Robin Steinhaus

 

Note: These pages describe G-Store version 0.1. The changelog contains a summay of changes.

 

Abstract

This thesis develops a storage manager for vertex-labeled graphs, called G-Store. G-Store exploits the structure of the graph to derive a data placement on disk that is optimized for access patterns found in graph queries. The placement strategy is based on a multilevel algorithm that partitions the graph into blocks and arranges these blocks on disk to minimize the distance on disk between adjacent vertices. G-Store has a built-in query engine that supports depth-first traversal, reachability testing, shortest path search, and shortest path tree search. Initial experiments show that G-Store is functional, is fast, and solves queries that traditional database management systems struggle with.

 

Contents

1  Introduction

 

2  Preliminaries

  2.1  Graphs: Notation and Definitions

  2.2  Memory Hierarchy

  2.3  Disk Drives

 

3  Working with Graphs in Main Memory

  3.1  Data Structures

    3.1.1  Unordered Edge List

    3.1.2  Compressed Storage Format

    3.1.3  Adjacency Lists

    3.1.4  Adjacency Matrix

  3.2  Algorithms

    3.2.1  Breadth-First Search

    3.2.2  Depth-First Search

 

4  Working with Graphs in a Relational Database

   4.1  Data Structures

     4.1.1  Edge List

     4.1.2  Relational Adjacency Matrix

   4.2  Queries

     4.2.1  Breadth-First with Joins

     4.2.2  Breadth-First with Recursive CTEs

     4.2.3  Depth-First with PL/pgSQL

     4.2.4  Depth-First with CONNECT BY

 

5  A Storage Manager for Graph Data

  5.1  Purpose

  5.2  Storing Vertices and Edges

  5.3  Storing Adjacent Vertices

  5.4  Exploiting Disk Blocks

  5.5  Preliminary Result

 

6  G-Store Storage Algorithm

  6.1  Objective and Overview

  6.2  Memory and Graph Representation

  6.3  Coarsening

  6.4  Turn-Around

  6.5  Uncoarsening

  6.6  Finalization

  6.7  Block Structure

 

7  Accessing G-Store

  7.1  Menu

  7.2  Schema Definition

  7.3  Query Definition

  7.4  Conditions

  7.5  Graph Traversal

  7.6  Path Search

 

8  First Experiments with G-Store

 

9  Outlook

 

References