8  First Experiments with G-Store

Figures 13 to 16 show the results of first experiments with G-Store. The underlying graph is the road network of California, USA. The graph is undirected, connected, and consists of 1,965,206 vertices and 2,766,607 edges (5,533,214 directed edges). The longest shortest path in the graph has length 850 [17]. We ran several scenarios:

 

•    A disk representation created with G-Store’s multilevel algorithm (‘G-Store ML’) and a disk representation created by randomly mapping vertices to blocks (‘RND’). The latter was created with G-Store’s FLAT RANDOM.

•    Query buffers of 16 MB and 80 MB for G-Store ML, and a query buffer of 80 MB for RND.

•    Three datasets: In dataset ‘S’, each vertex label holds a 4-byte identifier (oid). In dataset ‘M’, each vertex label holds the identifier and between 40 and 60 bytes additional data. In dataset ‘L’, each vertex label holds the identifier and between 400 and 600 bytes additional data.

•    Three different block sizes: 4096 bytes, 16,384 bytes, and 65,536 bytes.

•    22 queries for randomly chosen oids, N and M:

(a)     Five vertex selection queries:

         SELECT oid WHERE oid = N

         SPOOL TO "out.g";

(b)    Five traversal queries with a maximum path length of 12:

         SELECT oid WHERE LEVEL <= 12

         START WITH oid = N

         SPOOL TO "out.g";

(c)     Five block-first reachability queries:

         SELECT oid IN PATH

         START WITH oid = N END WITH oid = M

         SPOOL TO "out.g";

(d)    Five shortest path queries with the same oids as for (c):

         SELECT oid IN SHORTEST PATH

         START WITH oid = N END WITH oid = M

         SPOOL TO "out.g";

(e)     Two shortest path tree queries:

         SELECT oid IN SHORTEST PATH TREE

         START WITH oid = N

         SPOOL TO "out.g";

 

The experiments were run on a machine with an Intel Core Duo 2x2.53 GHz processor, 3 GB memory, and a serial ATA hard drive with a speed of 90 rotations per second.

Overall, the results are satisfactory. Most importantly, the number of output rows, the shortest path lengths, and the longest shortest path lengths were consistent in all scenarios. The comparably better performance of RND in vertex selection and traversal queries is explained by the fact that RND used fewer blocks to store the graph. We leave the interpretation of the results to the reader.

We had planned to benchmark G-Store’s query performance against PostgreSQL using the queries presented in Section 4.2. However, in several hours, PostgreSQL returned without error only for vertex selection queries without traversal. Specifically, we tried:

 

•    To match (b): The PL/pgSQL procedures shown in Section 4.2.3, extended with an argument to limit the path length to 12. We aborted the experiment after the procedures had not returned within several hours.

•    To match (d): A recursive common table expression similar to the one shown in Section 4.2.2. We limited the search to the length of the shortest path. The queries returned either with ERROR: write failed or with ERROR: could not write block N of temporary file: No space left on device, where N varied around 490,000. There were at least 10 GB free space on the disk.

•    To match (e): A recursive common table expression with GROUP BY to select the shortest paths (see Section 4.2.2). Same errors as above.