When is the Peak Performance Reached? An Analysis of RDF Triple Stores

Research & Innovation

With the significant growth in RDF datasets, application developers demand that these datasets be available online to meet end users' expectations. Various interfaces are available for querying RDF data using the SPARQL query language. Studies show that SPARQL endpoints may provide high query runtime performance at the cost of low availability. For example, it has been observed that only 32.2% of public endpoints have a monthly uptime of 99-100%. One possible reason for this low availability is the high workload experienced by these SPARQL endpoints. Because query execution is performed entirely on the server side (i.e., at the SPARQL endpoint), a high query processing workload may result in performance degradation or even a service shutdown. We performed extensive experiments to measure the query processing capabilities of well-known triple stores through their SPARQL endpoints. In particular, we stressed these triple stores with multiple parallel requests from different querying agents. Our experiments revealed the maximum query processing capacity of these triple stores, beyond which additional load leads to service shutdowns. We hope this analysis will help triple store developers design workload-aware RDF engines that improve the availability of their public endpoints while maintaining high throughput.
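
To make the idea of stressing an endpoint with parallel querying agents concrete, here is a minimal Python sketch. It is not the harness used in the experiments. The function names `run_sparql_query` and `stress_test`, the thread-based agents, and the reported metrics are all assumptions made for illustration. It relies only on the standard library and on the SPARQL protocol convention of passing the query text in a `query` URL parameter.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib import parse, request


def run_sparql_query(endpoint_url, query, timeout=30.0):
    """Send one SPARQL query over HTTP GET and return the raw response body.

    Hypothetical helper: the endpoint URL is supplied by the caller.
    """
    url = endpoint_url + "?" + parse.urlencode({"query": query})
    req = request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def stress_test(run_query, queries, num_agents):
    """Run all queries with num_agents parallel workers and report throughput.

    run_query is any callable taking a query string; a raised exception
    (timeout, refused connection, HTTP error) is counted as a failure.
    """
    def attempt(query):
        try:
            run_query(query)
            return True
        except Exception:
            return False

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_agents) as pool:
        outcomes = list(pool.map(attempt, queries))
    elapsed = time.perf_counter() - start

    succeeded = sum(outcomes)
    return {
        "agents": num_agents,
        "succeeded": succeeded,
        "failed": len(outcomes) - succeeded,
        "queries_per_second": succeeded / elapsed if elapsed > 0 else 0.0,
    }
```

One way to use the sketch is to call `stress_test(lambda q: run_sparql_query(endpoint_url, q), queries, n)` for increasing values of `n` against a test endpoint you control, and to watch for the point where `queries_per_second` stops growing or `failed` starts to rise. That point is a rough indication of where peak performance is reached under these assumptions.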

Available material for this talk: Recording