Lena Wiese Advanced Data Management De Gruyter - Dr. Lena Wiese


Lena Wiese

Advanced Data Management

for SQL, NoSQL, Cloud and Distributed Databases

Author
Dr. Lena Wiese
Georg-August-Universität Göttingen
Fakultät für Mathematik und Informatik
Institut für Informatik
Goldschmidtstraße 7
37077 Göttingen
Germany
[email protected]

ISBN 978-3-11-044140-6
e-ISBN (PDF) 978-3-11-044141-3

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2015 Walter de Gruyter GmbH, Berlin/Munich/Boston
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com


To my family

Preface

During the last two decades, the landscape of database management systems has changed immensely. Because data are nowadays stored and managed in networks of distributed servers (“clusters”) and these servers consist of cheap hardware (“commodity hardware”), data of previously unthinkable magnitude (“big data”) are produced, transferred, stored, modified, transformed, and in the end possibly deleted. This form of continuous change calls for flexible data structures and efficient distributed storage systems with both high read and high write throughput. In many novel applications, the conventional table-like (“relational”) data format may not be the data structure of choice – for example, when easy exchange of data or fast retrieval become vital requirements. For historical reasons, conventional database management systems are not explicitly geared toward distribution and continuous change, as most implementations of database management systems date back to a time when distributed storage was not a major requirement. These deficiencies might as well be attributed to the fact that conventional database management systems try to incorporate several database standards and to provide high safety guarantees (for example, regarding concurrent user accesses or correctness and consistency of data).

Several kinds of database systems have emerged and evolved over the last years that depart from the established tracks of data management and data formats in different ways. Development of these emergent systems started from scratch and gave rise to new data models, new query engines and languages, and new storage organizations.
Two features of these systems are particularly remarkable: on the one hand, a wide range of open source products are available (though some systems are supported by or even originated from large international companies) and development can be observed or even be influenced by the public; on the other hand, several results and approaches achieved by long-standing database research (with roots at least as early as the 1960s) have been put into practice in these database systems, and these research results now show their merits for novel applications in modern data management. On the downside, there are basically no standards (with respect to data formats or query languages) in this novel area and hence portability of application code or long-term support can usually not be guaranteed. Moreover, these emerging systems are not as mature (and probably not as reliable) as conventional established systems.

The term NOSQL has been used as an umbrella term for several emerging database systems without an exact formal definition. Starting with the notion of NoSQL (which can be interpreted as saying no to SQL as a query language) it has evolved to mean “not only SQL” (and hence written as NOSQL with a capital O). The actual origin of the term is ascribed to the 2009 “NOSQL meetup”: a meeting with presentations of six database systems (Voldemort, Cassandra, Dynomite, HBase, Hypertable, and CouchDB). Still, the question of what exactly a NOSQL database system is cannot be answered unanimously; nevertheless, some structure slowly becomes visible in the NOSQL field and has led to a broad categorization of NOSQL database systems. Main categories of NOSQL systems are key-value stores, document stores, extensible record stores (also known as column family stores) and graph databases. Yet, other creatures live out there in the database jungle: object databases and XML databases espouse neither the relational data model nor SQL as a query language – but they would typically not be considered NOSQL database systems (probably because they predate the NOSQL systems). Moreover, column stores are an interesting variant of relational database systems.

This book is meant as a textbook for computer science lectures. It is based on Master-level database lectures and seminars held at the universities of Hildesheim and Göttingen. As such it provides a formal analysis of alternative, non-relational data models and storage mechanisms and gives a decent overview of non-SQL query languages. However, it does not put much focus on installing or setting up database systems and hence complements other books that concentrate on more technical aspects. This book also surveys storage internals and implementation details from an abstract point of view and describes common notions as well as possible design choices (rather than singling out one particular database system and specializing in its technical features).

This book intends to give students a perspective beyond SQL and relational database management systems and thus covers the theoretical background of modern data management. Nevertheless, this book is also aimed at database practitioners: it aims to help developers and database administrators come to an informed decision about which database systems are most beneficial for their data management requirements.

Overview

This book consists of four parts.

Part I Introduction commences the book with a general introduction to the basics of data management and data modeling.

Chapter 1 Background (page 3) provides a justification for why we need databases in modern society. Desired properties of modern database systems like scalability and reliability are defined. Technical internals of database management systems (DBMSs) are explained with a focus on memory management. Central components of a DBMS (like buffer manager or recovery manager) are explored. Next, database design is discussed; a brief review of Entity-Relationship Models (ERM) and the Unified Modeling Language (UML) rounds this chapter off.

Chapter 2 Relational Database Management Systems (page 17) contains a review of the relational data model by defining relation schemas, database schemas and database constraints. It continues with an example of how to transform an ERM into a relational database schema. Next, it illustrates the core concepts of relational database theory such as normalization to avoid anomalies, referential integrity, relational query languages (relational calculus, relational algebra and SQL), concurrency management and transactions (including the ACID properties, concurrency control and scheduling).

Part II NOSQL And Non-Relational Databases comprises the main part of this book. In its seven chapters it gives an in-depth discussion of data models and database systems that depart from the conventional relational data model.

Chapter 3 New Requirements, “Not only SQL” and the Cloud (page 33) acknowledges that relational database management systems (RDBMSs) have their strengths and merits but then contrasts them with cases where the relational data model might be inadequate and touches on weaknesses that current implementations of relational DBMSs might have. The chapter concludes with a description of current challenges in data management and a definition of NOSQL databases.

Chapter 4 Graph Databases (page 41) begins by explaining some basics of graph theory. Having presented several choices for graph data structures (from adjacency matrix to incidence list), it describes the predominant data model for graph databases: the property graph model. After a brief digression on how to map graphs to an RDBMS, two advanced types of graphs are introduced: hypergraphs and nested graphs.

Chapter 5 XML Databases (page 69) expounds the basics of XML (like XML documents and schemas, and numbering schemes) and surveys XML query languages. Then, the chapter shifts to the issue of storing XML in an RDBMS. Finally, the chapter describes the core concepts of native XML storage (like indexing, storage management and concurrency control).

Chapter 6 Key-value Stores and Document Databases (page 105) puts forward the simple data structure of key-value pairs and introduces the map-reduce concept as a pattern for parallelized processing of key-value pairs. Next, as a form of nested key-value pairs, the Java Script Object Notation (JSON) is introduced. JSON Schema and Representational State Transfer are further topics of this chapter.

Chapter 7 Column Stores (page 143) outlines the column-wise storage of tabular data (in contrast to row-wise storage). Next, the chapter delineates several ways to store data in compressed form, achieving a more compact representation based on the fact that data in a column is usually more uniform than data in a row. Lastly, column striping is introduced as a recent methodology to convert nested records into a columnar representation.

Chapter 8 Extensible Record Stores (page 161) describes a flexible multidimensional data model based on column families. The surveyed database technologies also include ordered storage and versioning. After defining the logical model, the chapter explains the core concepts of the storage structures used on disk and the ways to handle writes, reads and deletes with immutable data files. This also includes optimizations like indexing, compaction and Bloom filters.

Chapter 9 Object Databases (page 193) starts with a review of object-oriented notions and concepts; this review gives particular focus to object identifiers, object normalization and referential integrity. Next, several options for object-relational mapping (ORM) – that is, how to store objects in an RDBMS – are discussed; the ORM approach is exemplified with the Java Persistence API (JPA). The chapter moves on to object-relational databases that offer object-oriented extensions in addition to their basic RDBMS functionalities.
Lastly, several issues of storing objects natively with an Object Database Management System (ODBMS) – such as object persistence and reference management – are attended to.

Part III Distributed Data Management treats the core concepts of data management when data are scaled out – that is, data are distributed in a network of database servers.

Chapter 10 Distributed Database Systems (page 235) looks at the basics of data distribution. Failures in distributed systems and requirements for distributed database management systems are addressed.

Chapter 11 Data Fragmentation (page 245) targets ways to split data across a set of servers, also known as partitioning or sharding. Several fragmentation strategies for each of the different data models are discussed. Special focus is given to consistent hashing.

Chapter 12 Replication And Synchronization (page 261) elucidates the background on replication for the sake of increased availability and reliability of the database systems. Afterwards, replication-related issues like distributed concurrency control and consensus protocols as well as hinted handoff and Merkle trees are discussed.


Chapter 13 Consistency (page 295) touches upon the topic of relaxing strong consistency requirements known from RDBMSs into weaker forms of consistency.

Part IV Conclusion is the final part of this book.

Chapter 14 Further Database Technologies (page 311) gives a cursory overview of related database topics that are out of the scope of this book. Among other topics, it glances at data stream processing, in-memory databases and NewSQL databases.

Chapter 15 Concluding Remarks (page 317) summarizes the main points of this book and discusses approaches for database reengineering and data migration. Lastly, it advocates the idea of polyglot architectures: for each of the different data storage and processing tasks in an enterprise, users are free to choose a database system that is most appropriate for one task while using different database systems for other tasks and finally integrating these systems into a common storage and processing architecture.

Contents

Preface | VII
Overview | IX
List of Figures | XIX
List of Tables | XXII

Part I: Introduction

1 Background | 3
1.1 Database Properties | 3
1.2 Database Components | 5
1.3 Database Design | 7
1.3.1 Entity-Relationship Model | 8
1.3.2 Unified Modeling Language | 11
1.4 Bibliographic Notes | 14

2 Relational Database Management Systems | 17
2.1 Relational Data Model | 17
2.1.1 Database and Relation Schemas | 17
2.1.2 Mapping ER Models to Schemas | 18
2.2 Normalization | 19
2.3 Referential Integrity | 20
2.4 Relational Query Languages | 22
2.5 Concurrency Management | 24
2.5.1 Transactions | 24
2.5.2 Concurrency Control | 26
2.6 Bibliographic Notes | 28

Part II: NOSQL And Non-Relational Databases

3 New Requirements, “Not only SQL” and the Cloud | 33
3.1 Weaknesses of the Relational Data Model | 33
3.1.1 Inadequate Representation of Data | 33
3.1.2 Semantic Overloading | 34
3.1.3 Weak Support for Recursion | 34
3.1.4 Homogeneity | 35
3.2 Weaknesses of RDBMSs | 36
3.3 New Data Management Challenges | 37
3.4 Bibliographic Notes | 39

4 Graph Databases | 41
4.1 Graphs and Graph Structures | 41
4.1.1 A Glimpse on Graph Theory | 42
4.1.2 Graph Traversal and Graph Problems | 44
4.2 Graph Data Structures | 45
4.2.1 Edge List | 46
4.2.2 Adjacency Matrix | 46
4.2.3 Incidence Matrix | 48
4.2.4 Adjacency List | 50
4.2.5 Incidence List | 51
4.3 The Property Graph Model | 53
4.4 Storing Property Graphs in Relational Tables | 56
4.5 Advanced Graph Models | 58
4.6 Implementations and Systems | 62
4.6.1 Apache TinkerPop | 62
4.6.2 Neo4J | 65
4.6.3 HyperGraphDB | 66
4.7 Bibliographic Notes | 68

5 XML Databases | 69
5.1 XML Background | 69
5.1.1 XML Documents | 69
5.1.2 Document Type Definition (DTD) | 71
5.1.3 XML Schema Definition (XSD) | 73
5.1.4 XML Parsers | 75
5.1.5 Tree Model of XML Documents | 76
5.1.6 Numbering Schemes | 78
5.2 XML Query Languages | 81
5.2.1 XPath | 81
5.2.2 XQuery | 82
5.2.3 XSLT | 83
5.3 Storing XML in Relational Databases | 84
5.3.1 SQL/XML | 84
5.3.2 Schema-Based Mapping | 86
5.3.3 Schemaless Mapping | 89
5.4 Native XML Storage | 90
5.4.1 XML Indexes | 90
5.4.2 Storage Management | 92
5.4.3 XML Concurrency Control | 97
5.5 Implementations and Systems | 100
5.5.1 eXistDB | 100
5.5.2 BaseX | 102
5.6 Bibliographic Notes | 104

6 Key-value Stores and Document Databases | 105
6.1 Key-Value Storage | 105
6.1.1 Map-Reduce | 106
6.2 Document Databases | 109
6.2.1 Java Script Object Notation | 110
6.2.2 JSON Schema | 112
6.2.3 Representational State Transfer | 116
6.3 Implementations and Systems | 118
6.3.1 Apache Hadoop MapReduce | 118
6.3.2 Apache Pig | 121
6.3.3 Apache Hive | 127
6.3.4 Apache Sqoop | 128
6.3.5 Riak | 129
6.3.6 Redis | 132
6.3.7 MongoDB | 133
6.3.8 CouchDB | 136
6.3.9 Couchbase | 139
6.4 Bibliographic Notes | 140

7 Column Stores | 143
7.1 Column-Wise Storage | 143
7.1.1 Column Compression | 144
7.1.2 Null Suppression | 149
7.2 Column striping | 151
7.3 Implementations and Systems | 158
7.3.1 MonetDB | 158
7.3.2 Apache Parquet | 158
7.4 Bibliographic Notes | 159

8 Extensible Record Stores | 161
8.1 Logical Data Model | 161
8.2 Physical storage | 166
8.2.1 Memtables and immutable sorted data files | 166
8.2.2 File format | 169
8.2.3 Redo logging | 171
8.2.4 Compaction | 173
8.2.5 Bloom filters | 175
8.3 Implementations and Systems | 181
8.3.1 Apache Cassandra | 181
8.3.2 Apache HBase | 185
8.3.3 Hypertable | 187
8.3.4 Apache Accumulo | 189
8.4 Bibliographic Notes | 191

9 Object Databases | 193
9.1 Object Orientation | 193
9.1.1 Object Identifiers | 194
9.1.2 Normalization for Objects | 196
9.1.3 Referential Integrity for Objects | 200
9.1.4 Object-Oriented Standards and Persistence Patterns | 200
9.2 Object-Relational Mapping | 202
9.2.1 Mapping Collection Attributes to Relations | 203
9.2.2 Mapping Reference Attributes to Relations | 204
9.2.3 Mapping Class Hierarchies to Relations | 204
9.2.4 Two-Level Storage | 208
9.3 Object Mapping APIs | 209
9.3.1 Java Persistence API (JPA) | 209
9.3.2 Apache Java Data Objects (JDO) | 215
9.4 Object-Relational Databases | 217
9.5 Object Databases | 222
9.5.1 Object Persistence | 223
9.5.2 Single-Level Storage | 224
9.5.3 Reference Management | 226
9.5.4 Pointer Swizzling | 226
9.6 Implementations and Systems | 229
9.6.1 DataNucleus | 229
9.6.2 ZooDB | 230
9.7 Bibliographic Notes | 232

Part III: Distributed Data Management

10 Distributed Database Systems | 235
10.1 Scaling horizontally | 235
10.2 Distribution Transparency | 236
10.3 Failures in Distributed Systems | 237
10.4 Epidemic Protocols and Gossip Communication | 239
10.4.1 Hash Trees | 241
10.4.2 Death Certificates | 243
10.5 Bibliographic Notes | 244

11 Data Fragmentation | 245
11.1 Properties and Types of Fragmentation | 245
11.2 Fragmentation Approaches | 249
11.2.1 Fragmentation for Relational Tables | 249
11.2.2 XML Fragmentation | 250
11.2.3 Graph Partitioning | 252
11.2.4 Sharding for Key-Based Stores | 253
11.2.5 Object Fragmentation | 254
11.3 Data Allocation | 255
11.3.1 Cost-based allocation | 256
11.3.2 Consistent Hashing | 257
11.4 Bibliographic Notes | 259

12 Replication And Synchronization | 261
12.1 Replication Models | 261
12.1.1 Master-Slave Replication | 262
12.1.2 Multi-Master Replication | 263
12.1.3 Replication Factor and the Data Replication Problem | 263
12.1.4 Hinted Handoff and Read Repair | 265
12.2 Distributed Concurrency Control | 266
12.2.1 Two-Phase Commit | 266
12.2.2 Paxos Algorithm | 268
12.2.3 Multiversion Concurrency Control | 276
12.3 Ordering of Events and Vector Clocks | 276
12.3.1 Scalar Clocks | 277
12.3.2 Concurrency and Clock Properties | 280
12.3.3 Vector Clocks | 281
12.3.4 Version Vectors | 284
12.3.5 Optimizations of Vector Clocks | 289
12.4 Bibliographic Notes | 293

13 Consistency | 295
13.1 Strong Consistency | 295
13.1.1 Write and Read Quorums | 298
13.1.2 Snapshot Isolation | 300
13.2 Weak Consistency | 302
13.2.1 Data-Centric Consistency Models | 303
13.2.2 Client-Centric Consistency Models | 305
13.3 Consistency Trade-offs | 306
13.4 Bibliographic Notes | 307

Part IV: Conclusion

14 Further Database Technologies | 311
14.1 Linked Data and RDF Data Management | 311
14.2 Data Stream Management | 312
14.3 Array Databases | 313
14.4 Geographic Information Systems | 314
14.5 In-Memory Databases | 315
14.6 NewSQL Databases | 315
14.7 Bibliographic Notes | 316

15 Concluding Remarks | 317
15.1 Database Reengineering | 317
15.2 Database Requirements | 318
15.3 Polyglot Database Architectures | 320
15.3.1 Polyglot Persistence | 320
15.3.2 Lambda Architecture | 322
15.3.3 Multi-Model Databases | 322
15.4 Implementations and Systems | 324
15.4.1 Apache Drill | 324
15.4.2 Apache Druid | 326
15.4.3 OrientDB | 327
15.4.4 ArangoDB | 330
15.5 Bibliographic Notes | 331

Bibliography | 333
Index | 347

List of Figures

1.1 Database management system and interacting components | 5
1.2 ER diagram | 11
1.3 UML diagram | 15

2.1 An algebra tree (left) and its optimization (right) | 24

3.1 Example for semantic overloading | 34

4.1 A social network as a graph | 41
4.2 Geographical data as a graph | 42
4.3 A property graph for a social network | 55
4.4 Violation of uniqueness of edge labels | 56
4.5 Two undirected hyperedges | 58
4.6 A directed hyperedge | 59
4.7 An oriented hyperedge | 60
4.8 A hypergraph with generalized hyperedge “Citizens” | 60
4.9 A nested graph | 62

5.1 Navigation in an XML tree | 77
5.2 XML tree | 78
5.3 XML tree with preorder numbering | 79
5.4 Pre/post numbering and pre/post plane | 79
5.5 DeweyID numbering | 80
5.6 Chained memory pages | 93
5.7 Chained memory pages with text extraction | 94
5.8 B-tree structure for node IDs in pages | 95
5.9 Page split due to node insertion | 96
5.10 Conflicting accesses in an XML tree | 98
5.11 Locks in an XML tree | 99

6.1 A map-reduce example | 107
6.2 A map-reduce-combine example | 109

7.1 Finite state machine for record assembly | 157

8.1 Writing to memory tables and data files | 167
8.2 Reading from memory tables and data files | 168
8.3 File format of data files | 170
8.4 Multilevel index in data files | 171
8.5 Write-ahead log on disk | 172
8.6 Compaction on disk | 173
8.7 Leveled compaction | 175
8.8 Bloom filter for a data file | 176
8.9 A Bloom filter of length m = �� with three hash functions | 178
8.10 A partitioned Bloom filter with k = � and partition length m′ = � | 181

9.1 Generalization (left) versus abstraction (right) | 195
9.2 Unnormalized objects | 197
9.3 First object normal form | 198
9.4 Second object normal form | 198
9.5 Third object normal form | 199
9.6 Fourth object normal form | 200
9.7 Simple class hierarchy | 205
9.8 Resident Object Table (grey: resident, white: non-resident) | 227
9.9 Edge Marking (grey: resident, white: non-resident) | 228
9.10 Node Marking (grey: resident, white: non-resident) | 228

10.1 A hash tree for four messages | 242

11.1 XML fragmentation with shadow nodes | 252
11.2 Graph partitioning with shadow nodes and shadow edges | 253
11.3 Data allocation with consistent hashing | 257
11.4 Server removal with consistent hashing | 258
11.5 Server addition with consistent hashing | 259

12.1 Master-slave replication | 262
12.2 Master-slave replication with multiple records | 263
12.3 Multi-master replication | 263
12.4 Failure and recovery of a server | 264
12.5 Failure and recovery of two servers | 264
12.6 Two-phase commit: commit case | 267
12.7 Two-phase commit: abort case | 268
12.8 A basic Paxos run without failures | 270
12.9 A basic Paxos run with a failing leader | 272
12.10 A basic Paxos run with dueling proposers | 273
12.11 A basic Paxos run with a minority of failing acceptors | 274
12.12 A basic Paxos run with a majority of failing acceptors | 275
12.13 Lamport clock with two processes | 279
12.14 Lamport clock with three processes | 279
12.15 Lamport clock totally ordered by process identifiers | 280
12.16 Lamport clock with independent events | 281
12.17 Vector clock | 283
12.18 Vector clock with independent events | 284
12.19 Version vector synchronization with union merge | 287
12.20 Version vector synchronization with siblings | 288
12.21 Version vector with replica IDs and stale context | 291
12.22 Version vector with replica IDs and concurrent write | 292

13.1 Interfering operations at three replicas | 296
13.2 Serial execution at three replicas | 297
13.3 Read-one write-all quorum (left) and majority quorum (right) | 298

15.1 Polyglot persistence with integration layer | 321
15.2 Lambda architecture | 323
15.3 A multi-model database | 324

List of Tables

2.1 A relational table | 17
2.2 Unnormalized relational table | 20
2.3 Normalized relational table | 21

3.1 Base table for recursive query | 35
3.2 Result table for recursive query | 35

4.1 Node table and attribute table for a node type | 56
4.2 Edge table | 57
4.3 Attribute table for an edge type | 57
4.4 General attribute table | 57

5.1 Schema-based mapping | 88
5.2 Schemaless mapping | 89

7.1 Run-length encoding | 145
7.2 Bit-vector encoding | 145
7.3 Dictionary encoding | 146
7.4 Dictionary encoding for sequences | 146
7.5 Frame of reference encoding | 147
7.6 Frame of reference encoding with exception | 147
7.7 Differential encoding | 148
7.8 Differential encoding with exception | 148
7.9 Position list encoding | 150
7.10 Position bit-string encoding | 150
7.11 Position range encoding | 151
7.12 Column striping example | 157

8.1 Library tables revisited | 161
8.2 False positive probability for m = � · n | 180
8.3 False positive probability for m = � · n | 180

9.1 Unnormalized representation of collection attributes | 203
9.2 Normalized representation of collection attributes | 204
9.3 Collection attributes as sets | 219

11.1 Vertical fragmentation | 249
11.2 Horizontal fragmentation | 250

Index

2PC see two-phase commit
2PL see two-phase locking
acceptor 269, 272
ACID properties 26, 39, 295, 316
adjacency 45, 60
adjacency list 50–51
adjacency matrix 46–48
Aerospike 315
affinity 249
agent 266
all-or-nothing principle 3, 26
AllegroGraph 311
allocation see data allocation
Ambari 120
anomaly 19, 20, 33, 161, 196, 301
anti-entropy 240
ArangoDB 330
array database 313
association 13, 194, 197, 198, 204
association class 13, 198, 204
atomicity 26
attribute 9, 11–14, 17, 19, 22, 34, 53–56, 71, 72, 77, 85, 100, 103, 193, 213, 217–219
– composite 9, 11, 18, 19, 217
– key 18, 19, 207
– multi-valued 9, 11, 13, 18, 203, 204, 218
attribute table 56
Avro 120, 325
axis 81
B-tree 92, 94, 96, 171
backward traversal 52
big data 38
bit-vector encoding 145
Bloom filter 175–181
breadth-first search 44
bucket 108
Byzantine failure 239
candidate key 21
CAP principle 306
causal consistency 304
causality 277, 280
– effective 304

class 12, 193
client-centric consistency 305–306
clock 276–292
cloud database 39
clustering 249
collision 176
column family 161, 163
column name 163
column qualifier 163, 167–169, 176
column store 143
column striping 151
combine 108
commission failure 238
compaction 173–175, 187, 189, 191, 243
compatibility matrix 99
complete graph 42, 43
composite attribute 9, 11, 18, 19, 217
compression 144
concurrency 24, 25, 131, 237, 261, 263, 280, 282, 283, 288, 290, 319
concurrency control 26–28, 97–100, 139, 266–276, 308
concurrent events 280
consensus problem 266
consistency 4, 26, 237, 261, 263, 271, 295–307, 320, 321, 323
– eventual see eventual consistency
– trade-offs 306
– weak see weak consistency
consistent hashing 257
convergent replicated data types 130
coordinator 266
Couchbase 139
CouchDB 136
counter column 170
crash failure 238
dangling references 200
DAO see Data Access Object
Data Access Object 202
data allocation 255–259
data distribution problem 256
data locality 108, 144, 163
data replication problem 265
data stream 312

data-centric consistency 303
database-as-a-service 39
DataNucleus 229
decision phase 266
definition level 152, 156
depth-first search 44
derived fragmentation 250
DeweyID 80, 96
dictionary encoding 146
difference 22
differential encoding 148
directed graph 43
directed hyperedge 58
directed multigraph 43
distribution transparency 236
Document Object Model 76
document order 76
Document Type Definition 71–73
dotted version vectors 131, 291
Dremel 151
Drill 324
Druid 326
DTD see Document Type Definition
durability 26
Dynamo 257
edge 41, 42
edge cut 248, 254
edge label 53, 54
edge list 46
edge marking 227
edge table 56
end tag 69
entity 8, 17–19, 33, 71, 161, 163, 254
entity lifecycle 211, 215
entity-relationship model 8–11
epidemic protocol 239–241, 265
ERM see entity-relationship model
Eulerian Cycle 45
Eulerian Path 45
eventual consistency 304
eXistDB 100
Extensible Markup Language 69–71
extensible record store 161
fail-recover 239
fail-stop 239
failure 171, 237–239, 262, 264, 265, 267, 268, 271, 272
finite state machine 156
Flink 312
FLOWR expression 82
Flume 120
foreign key 18–20, 33, 84, 86, 87, 89, 111, 163, 203–205, 212, 213
forward traversal 52
fragmentation 245–254
frame of reference encoding 146
generalized hyperedge 59
Geode 315
geographic information system 314
GeoJSON 314
GeoServer 314
GIS see geographic information system
gossip 239–241
graph 41–45
– complete 42, 43
– directed 43
– multi-relational 53
– oriented 43
– simple 42, 43
– single-relational 53
– undirected 42
– weighted 44
graph partitioning 252
graph problems 45
graph traversal 44
GRASS GIS 314
Hadoop 118
Hamilton Cycle 45
Hamilton Path 45
happened before 277
happened-before relation 277
hash function 176, 252, 255, 257
hash tree 241–243
Hazelcast 315
HDFS 118
head set 59
hinted handoff 265
history 300
Hive 127
homogeneity 35, 143
horizontal fragmentation 249
hybrid fragmentation 250
hyperedge 58
– directed 58
– generalized 59
– oriented 59
– undirected 58
hypergraph 58
HyperGraphDB 66
hypernode 61
idempotent 117
identifier overflow 96
identifier stability 96
immutable data files 166
in-memory database 315
incidence 45
– negative 53
– positive 53
incidence list 51–53, 61
incidence matrix 48–49, 61
inconsistency window 303, 306
index 90, 100, 103, 171
inlining 86
integrity 4, 24, 25, 118
interface 194, 201
interrelational constraints 18
intersection 22
intrarelational constraints 18
inverse attributes 200
isolation 26, 97, 306
Java Data Objects 215–217
Java Persistence API 209–214
Java Persistence Query Language 209
Java Script Object Notation 101, 110–112, 116, 229, 314, 319, 325, 326, 328
JDO see Java Data Objects
Jena 311
join 23, 34, 35, 89, 91, 92, 124, 162, 203, 206, 207, 209, 214, 247, 249, 250, 253, 327, 331
JPQL see Java Persistence Query Language
JSON see Java Script Object Notation
JSON object 110
JSON Schema 112–116
key attribute 18, 19, 207
key-value pair 63, 101, 103, 105, 106, 109, 110, 113, 119, 155, 162, 169, 173
labeling scheme see numbering scheme
lambda architecture 322
Lamport clock see scalar clock




leader 269, 271
lean and mean 317
learner 269, 271
linked data 311
lock escalation 99
locking 7, 27, 98, 319
Log-Structured Merge Tree 171
logical clock 277
lost update 282, 290, 291, 295, 298
main memory 4–7, 83, 92, 143, 162, 163, 170, 171, 208, 209, 223–228, 315
main memory address 6, 226, 227
main memory table 167
map 106, 107
map-reduce 106–109, 118
master-slave replication 262
memtable 167, 172
Merkle tree see hash tree
method 12, 193
method chaining 64
migration 237, 317, 318
MonetDB 158
MongoDB 133
multi-level index 171
multi-master replication 263
multi-model database 322, 324, 327, 330
multi-relational graph 53
multi-valued attribute 9, 11, 13, 18, 203, 204, 218
multiedge 43
multigraph 42, 43
– directed 43
– undirected 43
multiplicities 13, 151
multiversion concurrency control 276
MVCC see multiversion concurrency control
natural join 22
negative incidence 53
Neo4J 65
nested graph 61
network partition 238
NewSQL 315
node 41
node label 53, 54, 56
node marking 228
node table 56
node test 81

non-blocking reads 276
non-redundancy 4, 246, 318
non-resident 225
normalization 19, 20, 33, 161, 196–199, 218
Not only SQL 38
null suppression 149
nullipotent 117
numbering scheme 78–81
Object Data Management Group 201
object identifier 194–196
Object Management Group 201
object normal form 196–199
object-relational databases 217–222
object-relational impedance mismatch 194
object-relational mapping 202–217
omission failure 238
one-copy serializability 296, 297, 301
operator tree 23
optimistic concurrency control 26, 98
OrdPath 80, 97
OrientDB 327
oriented graph 43
oriented hyperedge 59
page buffer 5, 167, 208, 209, 225, 226
page split 95
Parquet 158, 325
partial quorum 299
partition tolerance 302, 306, 307
partitioning 108, 245
Paxos 268–274
peer-to-peer replication 263
persistence 4, 200, 202, 213, 215, 216, 223, 224
pessimistic concurrency control 26, 98
Pig 121
point query 168
pointer swizzling 226–228
polyglot persistence 320
position bit-string 150
position list 149
position range 150
positive incidence 53
PostGIS 314
postorder numbering 78
Pre/Dist/Size encoding 102
pre/post diagram 79
pre/post numbering 78
predicate 81
prefix numbering 80
preorder numbering 78
primary key 19–21, 86, 88, 182, 196, 211, 215
projection 22, 214, 249
property 54
property graph 53–55
proposer 269, 271
QGIS 314
quorum 265, 298–299
range query 168
Rasdaman 313
RDF see resource description framework
reachability 213, 215, 224
read phase 269, 276
read repair 265
Read-one write-all 298
recovery 172, 264
recursion 34, 35
Redis 132
redo logging 171
reduce 106, 107
redundancy 4, 8, 19, 33, 203, 206, 218, 262
reengineering 317
referential integrity 20, 200
relation schema 17, 18, 34, 37
relational algebra 22
relational calculus 22, 23
relational query language 22
relationship 9, 19, 34, 41, 58, 65, 194, 200, 213, 224
reliability 4, 236, 261, 264, 291, 323
renaming 22
renumbering 80
repetition level 152, 156
replication 4, 237, 261–266, 301–303, 315
replication factor 261, 263, 298
Representational State Transfer 116–117
resident 225
resident object table 226
resilient distributed datasets 121
resource description framework 311
REST see Representational State Transfer
Riak 129
round-tripping 90
row key 162, 163, 167–170, 176
ROWA see Read-one write-all
rumor spreading 240


run-length encoding 144
Samza 312
scalability 3, 38, 235, 236, 302, 320, 324
scalar clock 277–281
Scalaris 315
schedule 7, 27
schema evolution 8, 37–39, 174, 223, 319
schema independence 37
schema-based mapping 84, 86
schemaless 37, 38, 105, 253, 318, 319, 325
schemaless mapping 84, 89
SciDB 313
selection 22, 248, 250, 251, 253
semantic overloading 34, 41
semi-structured 3, 69, 109, 320
sequential consistency 295, 296
serializability 27, 296, 297, 299, 301
service level agreement 39
Sesame 311
session guarantees 305
shadow node 251, 253
sharding 245, 253
shared-nothing architecture 235
shuffle 106, 107
sibling version 287
Simple API for XML 76
simple graph 42, 43
single-level storage 224
single-relational graph 53
sliding window 312
snapshot 26, 118, 120, 223, 315
snapshot isolation 300–301
– non-monotonic 304
– parallel 304
source node 43, 46, 54
source set 58
spanning tree 45
Spark 120
SPARQL 311
specialization 14, 194, 196, 202, 204
split 106, 107
SQL see Structured Query Language
SQL object 217
SQL/XML 84
Sqoop 128
start tag 69
Storm 312
strong clock property 281




Structured Query Language 22, 23
subclass 12, 14, 194, 199, 204–208, 212
superclass 14, 194, 199, 204, 205, 207, 208, 211, 212
synchronization 263, 285, 286, 288, 290
tail set 58
target node 43, 46, 54
target set 59
Tez 120
three-phase commit 268
time-to-live value 164, 166, 167, 169, 173, 187, 244, 291
timestamp 166
timestamp scheduler 27
TinkerPop 62, 327
TokuDB 316
tombstone 167, 172
trailer 171
transaction 24–28, 33, 36, 65, 97, 99, 133, 172, 211, 216, 246, 250, 252, 266, 276, 296, 297, 300–303, 319, 322, 328, 329
transitive closure 34
transparency 236
traversal 44
– backward 52
– forward 52
triple store 311
TTL see time-to-live value
tuple reconstruction 144, 148
two-level storage 208
two-phase commit 266
two-phase locking 27
typed table 217, 221, 222
UML see Unified Modeling Language
undirected graph 42
undirected hyperedge 58
undirected multigraph 43
Unified Modeling Language 11, 201
union 22, 34, 35, 207, 247, 250
upsert 117, 166, 172
validation phase 276
vector clock 281–284, 289–292
vector clock bounding 289
vector clock comparison 283
version vector 284–289
versioning 37, 166, 174, 223, 319

vertex 41, 42
vertical fragmentation 249
virtual heap 225, 254
Virtuoso 311
visibility 12, 191, 193, 306
VoltDB 315
voting phase 266
weak clock property 281
weak consistency 39, 299, 302–303
weighted graph 44
wide column store 161
write phase 269, 276
write-ahead logging 172
XML see Extensible Markup Language
XML Parser 75
XML Schema 73–75
XML tree 76
XPath 81–82
XQuery 82–83
XSD see XML Schema
XSLT 83–84
YARN 120
ZooDB 230
ZooKeeper 120