
ObdalibQuestBenchmarksBsbm

Ugur Donmez edited this page Feb 22, 2016 · 16 revisions


1. Introduction

The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.

This document presents the results of running the Berlin SPARQL Benchmark against:

* Virtuoso Opensource 6.1.6 Triple Store
* D2RQ Server 0.8.1
* Virtuoso Opensource 6.1.6 RDF Views
* Quest 1.7alpha

The SQL version of the benchmark was run against MySQL 5.5.24.

2. Benchmark dataset

The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification. Five datasets were used, with sizes of 250k, 1M, 5M, 25M and 100M triples (where 1M means 1,000,000 triples).

3. Benchmark Machine

The benchmarks were run on a machine with the following specification:

Hardware:

 Intel Core i5 CPU 650 @ 3.20GHz × 4
 16 GB RAM
 320 GB (7200rpm) HDD
 1 TB (7200rpm) HDD


Software:

 Operating System: Ubuntu 12.04 64bit, Linux kernel 3.2.0-30-generic
 Java version "1.7.0_07"
 Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01)
 

General changes to software:

MySQL 5.5 changes to my.cnf:

 key_buffer		= 819M
 innodb_buffer_pool_size = 11192M
 max_allowed_packet	= 16M
 thread_stack		= 1M
 thread_cache_size       = 64
 thread_concurrency     = 20
 
 sort_buffer_size = 16M
 read_buffer_size = 8M
 read_rnd_buffer_size = 16M
 bulk_insert_buffer_size = 128M
 thread_cache = 64
 

All datasets and database data files were stored on the 1 TB disk.

4. Benchmark test procedure

The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the Virtuoso triple store, and by loading the relational representation, in the form of MySQL dumps, into MySQL behind D2RQ Server. Virtuoso-specific SQL dumps were generated and loaded into the Virtuoso RDBMS for Virtuoso RDF Views.

We applied the following test procedure to each store:

# Load data into the store
# Restart store
# Run rampup phase (-rampup option of the test driver)
# Run single-client test run (50 mixes warm-up, 500 mixes performance measurement, randomizer seed: 808080)
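The two measurement steps above can be sketched as command lines for the BSBM test driver. This is a minimal sketch, not the exact invocation used for these runs: the `./testdriver` path and the endpoint URL are hypothetical, and the `-w`, `-runs` and `-seed` flag names are assumptions that should be checked against the test driver version in use. Only the `-rampup` option, the mix counts and the seed come from the procedure above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunConfig:
    """Parameters of one single-client test run, taken from the procedure above."""
    endpoint: str                 # SPARQL endpoint URL (hypothetical in this sketch)
    warmup_mixes: int = 50        # warm-up query mixes
    measured_mixes: int = 500     # query mixes used for the measurement
    seed: int = 808080            # randomizer seed


def rampup_command(cfg: RunConfig, driver: str = "./testdriver") -> List[str]:
    # "-rampup" is the option named in the procedure; the driver path is an assumption.
    return [driver, "-rampup", cfg.endpoint]


def measurement_command(cfg: RunConfig, driver: str = "./testdriver") -> List[str]:
    # Flag names "-w", "-runs" and "-seed" are assumptions; verify them locally.
    return [
        driver,
        "-w", str(cfg.warmup_mixes),
        "-runs", str(cfg.measured_mixes),
        "-seed", str(cfg.seed),
        cfg.endpoint,
    ]


if __name__ == "__main__":
    cfg = RunConfig(endpoint="http://localhost:8890/sparql")  # hypothetical endpoint
    print(" ".join(rampup_command(cfg)))
    print(" ".join(measurement_command(cfg)))
```

Building the argument lists as data keeps the seed and mix counts in one place, so the same configuration can be reused for every store under test.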

5. Benchmark results

This section presents detailed results of running the Berlin SPARQL Benchmark against:

* Virtuoso Opensource 6.1.6 Triple Store
* D2RQ Server 0.8.1
* Virtuoso Opensource 6.1.6 RDF Views
* Quest 1.7alpha
* MySQL 5.5

5.1. Virtuoso Opensource 6.1.6 Triple Store

Configuration:

Changes to virtuoso.ini:

 NumberOfBuffers = 680000
 MaxDirtyBuffers = 500000
Triples were loaded with the Bulk Loader. For datasets larger than 1M triples, multiple files of 1M triples each were generated.

Load Time:

The table below summarizes the load times of the Turtle files (in mm:ss):

250k 1M 5M 25M 100M
00:07 00:32 03:07 21:36 89:00
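To compare load performance across dataset sizes, the mm:ss values can be converted into an approximate load rate. The sketch below uses the Virtuoso Triple Store load times from the table above and assumes the nominal dataset sizes (250k, 1M and so on) as triple counts; the exact number of generated triples may differ slightly, so the rates are rough figures only.

```python
from typing import Dict


def mmss_to_seconds(text: str) -> int:
    """Convert a 'mm:ss' string to seconds; minutes may exceed 59 (e.g. '89:00')."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


# Nominal dataset sizes in triples (assumption: the labels are exact counts).
SIZES: Dict[str, int] = {
    "250k": 250_000,
    "1M": 1_000_000,
    "5M": 5_000_000,
    "25M": 25_000_000,
    "100M": 100_000_000,
}

# Virtuoso Triple Store load times, copied from the table above.
LOAD_TIMES: Dict[str, str] = {
    "250k": "00:07",
    "1M": "00:32",
    "5M": "03:07",
    "25M": "21:36",
    "100M": "89:00",
}


def load_rates(sizes: Dict[str, int], times: Dict[str, str]) -> Dict[str, float]:
    """Return the approximate load rate in triples per second for each dataset."""
    return {name: sizes[name] / mmss_to_seconds(times[name]) for name in sizes}


if __name__ == "__main__":
    for name, rate in load_rates(SIZES, LOAD_TIMES).items():
        print(f"{name:>5}: {rate:,.0f} triples/s")
```

The same helper can be applied to the load-time tables of the other systems by swapping in their mm:ss values.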

Benchmark Query results: QpS (Queries per Second)

Query      250k    1M    5M   25M  100M
Query 1     143   132   145   124    64
Query 2      35    46    48    46    41
Query 3      93    94   110    91    58
Query 4      47    59    61    57    35
Query 5      30    34    33    20     8
Query 6       -     -     -     -     -
Query 7      15    11    11    10     6
Query 8      41    40    41    41    13
Query 9     377   366   396   326    44
Query 10     45    43    47    40    16
Query 11    285   292   330   283    32
Query 12     50    55    45    59    25

Benchmark results: QMpH (Query Mixes per Hour)

250k 1M 5M 25M 100M
5,330 5,120 5,030 4,520 2,310
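Both metrics are rates, so it can help to read them as times. The sketch below converts QMpH into the average duration of one query mix, and QpS into the average time per query execution. It assumes that QpS is the reciprocal of the mean execution time of a query; the input values are the Virtuoso Triple Store figures from the tables above.

```python
def avg_mix_seconds(qmph: float) -> float:
    """Average duration of one query mix in seconds, given Query Mixes per Hour."""
    return 3600.0 / qmph


def avg_query_ms(qps: float) -> float:
    """Average duration of one query execution in milliseconds, given Queries per Second."""
    return 1000.0 / qps


# Virtuoso Triple Store QMpH, copied from the table above.
VIRTUOSO_TS_QMPH = {"250k": 5330, "1M": 5120, "5M": 5030, "25M": 4520, "100M": 2310}

# Virtuoso Triple Store QpS for Query 1, copied from the QpS table above.
VIRTUOSO_TS_Q1_QPS = {"250k": 143, "1M": 132, "5M": 145, "25M": 124, "100M": 64}

if __name__ == "__main__":
    for size, qmph in VIRTUOSO_TS_QMPH.items():
        q1_ms = avg_query_ms(VIRTUOSO_TS_Q1_QPS[size])
        print(f"{size:>5}: {avg_mix_seconds(qmph):.2f} s per mix, Query 1 about {q1_ms:.1f} ms")
```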

Result summaries:

Virtuoso 250k: txt xml Detailed run log: zipped log

Virtuoso 1M: txt xml Detailed run log: zipped log

Virtuoso 5M: txt xml Detailed run log: zipped log

Virtuoso 25M: txt xml Detailed run log: zipped log

Virtuoso 100M: txt xml Detailed run log: zipped log

5.2. D2RQ Server 0.8.1

Configuration:

 Java max heap size set to 8G
 --fast option set to enable optimizations

The mapping file provided by the BSBM website was used.

Load Time:

The table below summarizes the load times of the SQL files into MySQL 5.5.24 (in mm:ss):

250k 1M 5M 25M
00:04 00:17 01:40 12:33

Benchmark Query results: QpS (Queries per Second)

Query      250k    1M    5M
Query 1      51    11   0.5
Query 2      33    30    22
Query 3      46     6   0.3
Query 4      47     6   0.3
Query 5      54    11   0.5
Query 6       -     -     -
Query 7      10     3   0.7
Query 8      16     5   0.9
Query 9      60    51    38
Query 10    102    94    78
Query 11    128   118   100
Query 12     87    80    67

Benchmark results: QMpH (Query Mixes per Hour)

250k 1M 5M 25M
3,970 1,400 170 -

Note: We could not complete the test run for the 25M dataset because the run always terminated with an HTTP 500 error from the D2RQ SPARQL endpoint.

Result summaries:

D2RQ Server 250k: txt xml Detailed run log: zipped log

D2RQ Server 1M: txt xml Detailed run log: zipped log

D2RQ Server 5M: txt xml Detailed run log: zipped log

5.3. Virtuoso Opensource 6.1.6 RDF Views

Configuration:

Changes to virtuoso.ini:

 NumberOfBuffers              = 680000
 MaxDirtyBuffers              = 500000
 StopCompilerWhenXOverRunTime = 1

The following script provided by the BSBM website was used to create tables and views: http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/results/store_config_files/create_tables_and_rdf_view.sql

Virtuoso SQL dumps were generated and then imported into the Virtuoso RDBMS via isql-vt.

Load Time:

The table below summarizes the load times of the SQL files (in mm:ss):

250k 1M 5M 25M 100M
00:16 01:02 04:42 27:55 126:06

Benchmark Query results: QpS (Queries per Second)

Query      250k    1M    5M   25M  100M
Query 1      75    75    72    67    66
Query 2      87    85    87    86    86
Query 3      83    84    75    72    68
Query 4      44    45    43    41    42
Query 5     113    80    52    18     8
Query 6       -     -     -     -     -
Query 7       -     -     -     -     -
Query 8       -     -     -     -     -
Query 9       -     -     -     -     -
Query 10    218   220   216   205   161
Query 11     94    98    98    95    88
Query 12    136   140   144   141   130

Benchmark results: QMpH (Query Mixes per Hour)

250k 1M 5M 25M 100M
22,200 21,300 19,500 13,800 8,580

Note: All runs were executed without queries 7, 8 and 9 because they produced Virtuoso query translation errors.

Result summaries:

Virtuoso Views 250k: txt xml Detailed run log: zipped log

Virtuoso Views 1M: txt xml Detailed run log: zipped log

Virtuoso Views 5M: txt xml Detailed run log: zipped log

Virtuoso Views 25M: txt xml Detailed run log: zipped log

Virtuoso Views 100M: txt xml Detailed run log: zipped log

5.4. Quest 1.7alpha

Mapping Files:

These files were produced and used by our team for the benchmark:

OWL File OBDA File

The following script was executed to build indexes: mysqlinx.sql

Load Time:

The table below summarizes the load times of the SQL files into MySQL 5.5 (in mm:ss):

250k 1M 5M 25M 100M
00:04 00:17 01:40 12:33 94:12

Benchmark Query results: QpS (Queries per Second)

Query      250k    1M    5M   25M  100M
Query 1       -   222   224   193   167
Query 2       -   110   119   109   111
Query 3       -   219   149   196   158
Query 4       -   198   198   133   138
Query 5       -   133    60    23    10
Query 6       -     -     -     -     -
Query 7       -   198   166   173   158
Query 8       -   171   176   180   171
Query 9       -   215   245   220   211
Query 10      -   243   251   241   198
Query 11      -   242   249   234   232
Query 12      -    96   142   134   128

Benchmark results: QMpH (Query Mixes per Hour)

250k 1M 5M 25M 100M
- 23,000 21,100 15,600 10,300

Result summaries:

Quest 1M: txt Detailed run log: zipped log

Quest 5M: txt Detailed run log: zipped log

Quest 25M: txt Detailed run log: zipped log

Quest 100M: txt Detailed run log: zipped log

5.5. MySQL 5.5.24

The following script was executed to build indexes: mysqlinx.sql

Load Time:

The table below summarizes the load times of the SQL files into MySQL 5.5 (in mm:ss):

250k 1M 5M 25M 100M
00:04 00:17 01:40 12:33 94:12

Benchmark Query results: QpS (Queries per Second)

Query      250k    1M    5M   25M  100M
Query 1    1356  1193  1060   844   246
Query 2    1968  1731  1776  1715  1456
Query 3    1409  1248  1141   930   442
Query 4    1181  1029   883   662   195
Query 5     674   329   129    45    17
Query 6       -     -     -     -     -
Query 7    1490  1172   969   948    76
Query 8     509   191   233   266    77
Query 9    1646  1558  1640  1612   574
Query 10   1818  1555  1564  1504   131
Query 11   2697  2589  2683  2522  1251
Query 12   2262  2242  2343  2259  1214

Benchmark results: QMpH (Query Mixes per Hour)

250k 1M 5M 25M 100M
186,900 117,100 92,100 53,000 15,000

Result summaries:

MySQL 250k: txt xml Detailed run log: zipped log

MySQL 1M: txt xml Detailed run log: zipped log

MySQL 5M: txt xml Detailed run log: zipped log

MySQL 25M: txt xml Detailed run log: zipped log

MySQL 100M: txt xml Detailed run log: zipped log

5.6. Overall results

Benchmark Overall results: QMpH (Query Mixes per Hour)

System                       250k       1M       5M      25M     100M
Virtuoso TS                 5,330    5,120    5,030    4,520    2,310
D2RQ Server                 3,970    1,400      170        -        -
Virtuoso RDF Views (1)     22,200   21,300   19,500   13,800    8,580
Quest                           -   23,000   21,100   15,600   10,300
MySQL (2)                 186,900  117,100   92,100   53,000   15,000

(1) The Virtuoso RDF Views test was run without queries 7, 8 and 9. Excluding queries from the query mix increases the QMpH number, so these results are not directly comparable with the others. We will update the results when the issues are resolved.

(2) Although the corresponding MySQL queries give similar results, they are semantically not as complex as the SPARQL queries. The MySQL results should therefore be used for general orientation only.
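The overall QMpH figures are easier to compare as ratios between systems at the same dataset size. The sketch below copies the values from the overall table and computes such ratios, returning `None` where either system has no result. The helper name `ratio` and the data layout are illustrative choices, and the caveats in notes (1) and (2) still apply to any ratio involving Virtuoso RDF Views or MySQL.

```python
from typing import Dict, List, Optional

SIZES: List[str] = ["250k", "1M", "5M", "25M", "100M"]

# QMpH per system, copied from the overall results table; None marks a missing result.
QMPH: Dict[str, List[Optional[int]]] = {
    "Virtuoso TS":        [5330, 5120, 5030, 4520, 2310],
    "D2RQ Server":        [3970, 1400, 170, None, None],
    "Virtuoso RDF Views": [22200, 21300, 19500, 13800, 8580],
    "Quest":              [None, 23000, 21100, 15600, 10300],
    "MySQL":              [186900, 117100, 92100, 53000, 15000],
}


def ratio(system: str, baseline: str, size: str) -> Optional[float]:
    """Return QMpH(system) / QMpH(baseline) for one dataset size, or None if unavailable."""
    index = SIZES.index(size)
    value, base = QMPH[system][index], QMPH[baseline][index]
    if value is None or base is None:
        return None
    return value / base


if __name__ == "__main__":
    for size in SIZES:
        r = ratio("Quest", "Virtuoso TS", size)
        text = "n/a" if r is None else f"{r:.2f}x"
        print(f"Quest vs Virtuoso TS at {size}: {text}")
```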
