Understanding RDF Data Representations in Triplestores

Matteo Lissandrini, Tomer Sagi, Torben Bach Pedersen, Katja Hose

Abstract:

Because of the flexibility and expressiveness of their model, Knowledge Graphs (KGs) have attracted increasing interest. These resources are usually represented in RDF and stored in specialized data management systems called triplestores. Yet, while a multitude of such systems exists, exploiting varying data representation and indexing schemes, it is unclear which of the many design choices are the most effective for a given database and query workload. We therefore first introduce a set of 20 access patterns, grouped into 6 categories, which we use to analyze the needs of a given query workload. We then identify a novel three-dimensional design space for RDF data representations, built on the dimensions of subdivision, redundancy, and compression of data. This design space maps the trade-offs between the different data representations a triplestore can employ to store RDF data. Each required access pattern is then checked for compatibility with a given data representation. As we show, this approach allows identifying both the most effective RDF data representation for a given query workload and unexplored design solutions.

The SCR Space:

Figure: The SCR system design space for RDF stores: Subdivision, Compression, and Redundancy.

Cite:

Matteo Lissandrini, Tomer Sagi, Torben Bach Pedersen, and Katja Hose.
“Understanding RDF Data Representations in Triplestores.” Proceedings of the 30th Italian Symposium on Advanced Database Systems, SEBD 2022.

@inproceedings{sebd/Lissandrini22,
  author    = {Matteo Lissandrini and
               Tomer Sagi and
               Torben Bach Pedersen and
               Katja Hose},
  title     = {Understanding RDF Data Representations in Triplestores},
  booktitle = {30th Italian Symposium on Advanced Database Systems, {SEBD} 2022, Online Proceedings},
  publisher = {CEUR-WS.org},
  year      = {2022}
}