How to build scalable SPARQL engines for Big RDF data
Prof. Panos Kalnis
Computer, Electrical and Mathematical
Science and Engineering Division
King Abdullah University of Science
and Technology (KAUST)
The RDF model is rapidly gaining popularity for knowledge representation and data sharing in the semantic web. As RDF datasets grow larger, building efficient SPARQL query engines becomes challenging. In this talk I will present three of our SPARQL systems that tackle the problem in radically different ways: (i) AdPart is a distributed RDF store that starts with a random partitioning of the RDF graph and dynamically rearranges the partitions to achieve good load balancing and optimize the execution of SPARQL queries. (ii) Spartex is a system based on the vertex-centric computation model. It implements a SPARQL engine as a vertex-centric program and allows declarative SPARQL queries to be combined with procedural, generic graph analytics. (iii) MAGiQ is a novel framework that treats SPARQL queries as a series of matrix operations and utilizes existing matrix algebra libraries to achieve scalability and portability across a variety of architectures, such as multicore CPUs, GPUs and supercomputers. I will present the tradeoffs of these systems and explain how they can handle complex queries on multi-billion edge RDF graphs.
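To give a flavor of the matrix view of SPARQL mentioned in (iii), here is a minimal Python sketch. It is a toy illustration under stated assumptions, not MAGiQ's actual implementation: each predicate is assumed to be stored as a sparse boolean adjacency matrix, and a two-pattern path query of the form SELECT ?x ?z WHERE { ?x p1 ?y . ?y p2 ?z } is answered with one boolean matrix product. The function names, the dict-of-sets matrix representation, and the sample triples are all hypothetical; a real system would hand the product to an existing matrix algebra library.

```python
# Toy illustration only: NOT MAGiQ's actual implementation.
# Assumption: one sparse boolean adjacency matrix per predicate,
# stored as {row_id: set(col_ids)}.
from collections import defaultdict


def build_predicate_matrices(triples):
    """Map nodes to integer ids and build one sparse matrix per predicate."""
    index = {}
    matrices = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        for node in (s, o):
            index.setdefault(node, len(index))
        matrices[p][index[s]].add(index[o])
    return index, matrices


def bool_matmul(a, b):
    """Boolean sparse product: k is in result[i] iff some j has a[i][j] and b[j][k]."""
    result = {}
    for i, cols in a.items():
        reached = set()
        for j in cols:
            reached |= b.get(j, set())
        if reached:
            result[i] = reached
    return result


def two_hop(triples, p1, p2):
    """Answer SELECT ?x ?z WHERE { ?x p1 ?y . ?y p2 ?z } via a matrix product."""
    index, matrices = build_predicate_matrices(triples)
    names = {i: node for node, i in index.items()}
    product = bool_matmul(matrices[p1], matrices[p2])
    return sorted((names[i], names[k]) for i, ks in product.items() for k in ks)


if __name__ == "__main__":
    # Hypothetical sample data, made up for this sketch.
    sample = [
        ("alice", "knows", "bob"),
        ("alice", "knows", "carol"),
        ("bob", "worksAt", "kaust"),
        ("carol", "worksAt", "nus"),
    ]
    for row in two_hop(sample, "knows", "worksAt"):
        print(row)
```

Note that the product projects away the join variable ?y; retrieving the intermediate bindings or handling more general query shapes would need additional operations that this sketch does not attempt.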
Panos Kalnis is Professor and Chair of the Computer Science program at the King Abdullah Univ. of Science and Technology (KAUST). In 2009 he was a visiting assistant professor at Stanford University. Before that, he was an assistant professor at the National University of Singapore (NUS). Earlier in his career he was involved in the design and testing of VLSI chips and worked at several companies on database design, e-commerce projects and web applications. He served as associate editor for the IEEE Transactions on Knowledge and Data Engineering (TKDE) from 2013 to 2015, and on the editorial board of the VLDB Journal from 2013 to 2017. He received his Diploma from the Computer Engineering and Informatics Dept., Univ. of Patras, Greece in 1998 and his PhD from the Computer Science Dept., Hong Kong Univ. of Science and Technology (HKUST) in 2002. His research interests include Big Data, Cloud Computing, Parallel and Distributed Systems, Large Graphs and Long Sequences.