In this paper, we examine a number of SQL and so-called “NoSQL” data stores designed to scale simple OLTP-style application loads over many servers. Originally motivated by Web 2.0 applications, these systems are designed to scale to thousands or millions of users doing updates as well as reads, in contrast to traditional DBMSs and data warehouses. We contrast the new systems on their data model, consistency mechanisms, storage mechanisms, durability guarantees, availability, query support, and other dimensions. These systems typically sacrifice some of these dimensions, e.g. database-wide transaction consistency, in order to achieve others, e.g. higher availability and scalability. Note: Bibliographic references for systems are not listed, but URLs for more information can be found in the System References table at the end of this paper. Caveat: Statements in this paper are based on sources and documentation that may not be reliable, and the systems described are “moving targets,” so some statements may be incorrect. Verify through other sources before depending on information here. Nevertheless, we hope this comprehensive survey is useful! Check for future corrections on the author’s web site. Disclosure: The author is on the technical advisory board of Schooner Technologies and has a consulting business advising on scalable databases.

Oracle NoSQL Database and MongoDB server are both licensed under AGPL, while MongoDB has certain client drivers under the Apache 2.0 license. As a NoSQL database implementation that uses BerkeleyDB in its storage layer, Oracle NoSQL Database is in many respects a commercialization of the early NoSQL implementations that led to the adoption of this category of technology. Several of the earliest NoSQL solutions were based on BerkeleyDB, and some still are, e.g. LinkedIn’s Voldemort. Oracle NoSQL Database is a Java-based key-value store implementation that supports a value abstraction layer currently implementing Binary and JSON types. Its key structure is designed to facilitate large-scale distribution and storage locality with range-based search and retrieval. The implementation uniquely supports built-in cluster load balancing and a full range of transaction semantics, from ACID to relaxed eventual consistency. In addition, the technology is integrated with important open source technologies like Hadoop / MapReduce and with an increasing number of Oracle software solutions and tools, and it can be found on Oracle Engineered Systems.
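To illustrate how a key structure can give both wide distribution and storage locality, the following is a minimal in-memory sketch and not Oracle NoSQL Database's actual API. It assumes a two-part key: a "major" component that is hashed to pick a shard, and a "minor" component that is kept in sorted order within that shard. The class `ToyShardedStore`, its methods, and the example keys are all hypothetical names introduced for illustration.

```python
import bisect
import hashlib


class ToyShardedStore:
    """Toy in-memory model of a two-part key layout (hypothetical, not a real API)."""

    def __init__(self, num_shards=4):
        # Each shard keeps a sorted list of (major, minor) keys plus a value map.
        self.shards = [{"keys": [], "values": {}} for _ in range(num_shards)]

    def _shard_for(self, major):
        # Only the major component is hashed, so records sharing it are co-located.
        digest = hashlib.sha256(major.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, major, minor, value):
        shard = self._shard_for(major)
        key = (major, minor)
        if key not in shard["values"]:
            bisect.insort(shard["keys"], key)  # keep keys sorted for range scans
        shard["values"][key] = value

    def range_scan(self, major, start_minor, end_minor):
        """Return (minor, value) pairs with start_minor <= minor < end_minor."""
        shard = self._shard_for(major)  # a single shard answers the whole scan
        lo = bisect.bisect_left(shard["keys"], (major, start_minor))
        hi = bisect.bisect_left(shard["keys"], (major, end_minor))
        return [(k[1], shard["values"][k]) for k in shard["keys"][lo:hi]]


store = ToyShardedStore()
store.put("user/42", "order/001", b"book")
store.put("user/42", "order/002", b"lamp")
store.put("user/42", "profile", b"{}")
store.put("user/7", "order/001", b"pen")

# "0" is the character after "/", so this range covers every minor key
# that starts with "order/".
orders = store.range_scan("user/42", "order/", "order0")
```

In this sketch, different major keys spread across shards, while all records for one major key sit together in sorted order, so a range query over minor keys touches only one shard.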

CUSTOMER TESTIMONIAL: FRONT PORCH DIGITAL’S RAPID APPLICATION DEVELOPMENT Front Porch Digital, Inc. is a world leader in digital asset workflow management serving global leaders in the entertainment industry. Front Porch Digital recently integrated Stretchr into its application development processes. “Stretchr has fundamentally changed the way we approach data systems development. Today’s data comes in so many shapes and sizes and is always changing, requiring you to spend a huge amount of time designing and editing schemas in traditional databases or developing expertise in NoSQL technology. With Stretchr all of that time and complexity goes away. You simply acquire the data, from any source and in any form. Stretchr then organizes the data for you based on how your users consume it – it couldn’t be simpler. Our first integration with Stretchr took an afternoon, and was effectively the insertion of one line of code into our existing application. So happy are we with the way Stretchr works and performs that we are tightly integrating our newest products with Stretchr, cutting development times significantly”.

In this paper, I describe some of the recent developments in the database management area, in particular the NoSQL phenomenon and the hoopla associated with it. The goal of the paper is not to do an exhaustive survey of NoSQL systems. The aim is to do a broad-brush analysis of what these developments mean, both the good and the bad aspects! Based on my more than three decades of database systems work in the research and product arenas, I will outline many of the pitfalls to avoid, since there is currently a mad rush to develop and adopt a plethora of NoSQL systems in a segment of the IT population, including the research community. In rushing to develop these systems to overcome some of the shortcomings of the relational systems, many good principles of the latter, which go beyond the relational model and the SQL language, have been left by the wayside. Now many of the features that were initially discarded as unnecessary in the NoSQL systems are being brought in, but unfortunately in ad hoc ways. Hopefully, the lessons learnt over three decades with relational and other systems will not go to waste, and we will not let history repeat itself, with simple-minded approaches leading to enormous pain later on for developers as well as users of the NoSQL systems! Caveat: What I express in this paper are my personal opinions and they do not necessarily reflect the opinions of my employer.

