Cache

NetworkDisk comes with a rudimentary cache that supports read-only access only.

The cache works at the level of networkdisk.tupledict.ReadOnlyTupleDictView: the tree is stored in a dict up to a certain depth, called cache_level.
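The idea of caching a tree only up to a given depth can be pictured with a small toy model. This is a minimal sketch, not NetworkDisk's implementation: CountingBackend and DepthLimitedCache are hypothetical names, and the rule that a lookup is memoized when its key path has at most cache_level keys is an assumption made for illustration.

```python
class CountingBackend:
    """Stand-in for the database: resolves key paths in a nested dict and counts lookups."""

    def __init__(self, tree):
        self._tree = tree
        self.querycount = 0

    def fetch(self, path):
        self.querycount += 1
        node = self._tree
        for key in path:
            node = node[key]
        # Inner nodes are returned as a tuple of child keys, leaves as plain values.
        return tuple(node) if isinstance(node, dict) else node


class DepthLimitedCache:
    """Read-only cache memoizing lookups whose key path has at most `cache_level` keys."""

    def __init__(self, backend, cache_level):
        self._backend = backend
        self._cache_level = cache_level
        self._store = {}

    def get(self, *path):
        if path in self._store:
            return self._store[path]
        result = self._backend.fetch(path)
        # Assumed convention: only sufficiently shallow paths are kept in memory.
        if len(path) <= self._cache_level:
            self._store[path] = result
        return result


# Toy edge tree: source -> target -> data key -> data value.
edges = {10: {11: {"color": "red"}, 110: {"color": "blue"}}}

backend = CountingBackend(edges)
cache = DepthLimitedCache(backend, cache_level=2)

cache.get(10)
cache.get(10)  # served from the cache, no backend lookup
shallow_queries = backend.querycount

cache.get(10, 11, "color")
cache.get(10, 11, "color")  # too deep for cache_level=2, looked up again
deep_queries = backend.querycount - shallow_queries

print(shallow_queries, deep_queries)  # prints "1 2"
```

Because the view is read-only, the toy cache never needs to invalidate an entry, which is why such a scheme only suits data that cannot change.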

Because the current implementation does not handle any kind of write operation, its use is restricted to (Di)Graphs declared as static. In particular, since the schema is read-only, all write operations are disabled.

This limitation may be lifted in the future by implementing actual cache policy control, although the main developers do not consider this a high-priority feature.

Example of usage

We first create a big graph:

>>> import networkdisk as nd
>>> G = nd.sqlite.DiGraph(db=":tempfile:", name="big_graph", insert_schema=True)
>>> G.add_edges_from(list(enumerate(range(1, 100))), color="red")
>>> G.add_edges_from(list(enumerate(range(100, 200))), color="blue")

We can then load it as a static graph by passing the static parameter, and enable the cache with the edge_cache_level parameter.

>>> StaticG = nd.sqlite.DiGraph(db=G.helper, name="big_graph", static=True, edge_cache_level=2)

We store the current query count in initial_query_count to monitor the number of queries executed:

>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = list(StaticG[10])
>>> StaticG.helper.sql_logger.querycount - initial_query_count
2

Here, the graph was accessed for the first time, so two queries were necessary. If we repeat the same access, the query count is not incremented, which shows that no database access is performed.

>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = list(StaticG[10])
>>> StaticG.helper.sql_logger.querycount - initial_query_count
0
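The measure-before, act, measure-after pattern used in these examples can be wrapped in a small helper. The following is a sketch only: count_queries and FakeLogger are hypothetical names that are not part of NetworkDisk, and the helper merely assumes that the logger exposes an integer querycount attribute, as StaticG.helper.sql_logger does above.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def count_queries(logger):
    """Yield a namespace whose `count` is set, on exit, to the queries issued in the block."""
    result = SimpleNamespace(count=None)
    before = logger.querycount
    try:
        yield result
    finally:
        result.count = logger.querycount - before


class FakeLogger:
    """Stand-in for a query logger; only the `querycount` attribute matters here."""

    def __init__(self):
        self.querycount = 0

    def execute(self):
        self.querycount += 1


logger = FakeLogger()
with count_queries(logger) as counted:
    logger.execute()
    logger.execute()

print(counted.count)  # prints 2
```

With a real graph, one would presumably pass StaticG.helper.sql_logger in place of the stand-in and perform the graph access inside the with block.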

However, since edge_cache_level has been set to 2, edge data values are not cached: every lookup of a data value triggers a query.

>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> n = next(iter(StaticG[10]))
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
1
>>> # do it again
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
1

Setting the edge_cache_level to 3 (or higher) will enable caching of edge data values.

>>> StaticG = nd.sqlite.DiGraph(db=G.helper, name="big_graph", static=True, edge_cache_level=3)
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
1
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
0

It is also possible to enable node caching with the node_cache_level parameter.

>>> StaticG = nd.sqlite.DiGraph(db=G.helper, name="big_graph", static=True, node_cache_level=3)
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = n in StaticG
>>> StaticG.helper.sql_logger.querycount - initial_query_count
1
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = n in StaticG
>>> StaticG.helper.sql_logger.querycount - initial_query_count
0

To set both edge_cache_level and node_cache_level, it is simpler to set the cache_level parameter, which defines the default value for both.

>>> StaticG = nd.sqlite.DiGraph(db=G.helper, name="big_graph", static=True, cache_level=3)
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = n in StaticG
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
2
>>> initial_query_count = StaticG.helper.sql_logger.querycount
>>> _ = n in StaticG
>>> _ = StaticG[10][n]["color"]
>>> StaticG.helper.sql_logger.querycount - initial_query_count
0
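The fallback from cache_level to the two specific parameters can be pictured as follows. This is an illustrative sketch under assumptions: resolve_cache_levels is a hypothetical function, not a NetworkDisk API, and it assumes that an explicitly given edge_cache_level or node_cache_level takes precedence over cache_level.

```python
def resolve_cache_levels(cache_level, edge_cache_level=None, node_cache_level=None):
    """Return (edge level, node level), using `cache_level` wherever no specific value is given."""
    # `is None` rather than `or`, so that an explicit level of 0 is respected.
    edge = cache_level if edge_cache_level is None else edge_cache_level
    node = cache_level if node_cache_level is None else node_cache_level
    return edge, node


print(resolve_cache_levels(3))                      # prints (3, 3)
print(resolve_cache_levels(3, edge_cache_level=2))  # prints (2, 3)
```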