Introducing Azure DocumentDB – Microsoft’s fully managed NoSQL document database service

2014-08-21

Today is an extremely exciting day as we release Microsoft Azure DocumentDB, a fully managed, JSON document database service.

DocumentDB was built from the ground up in response to the increasing demands of applications being developed here at Microsoft and by Microsoft Azure customers. We heard from customers that they need a database that can keep pace with their rapidly evolving applications – something fast, flexible and scalable. NoSQL databases are increasingly becoming the tool of choice for many developers, but running and managing these databases can be costly, especially at scale. We also heard that customers wanted more of the capabilities inherent to relational database systems – rich queries and transactional processing are still important. Most data stores force developers into extreme choices – strong or eventual consistency, schema-free with limited query capabilities or schematized with rich query capabilities, transactions or scale, and so on. The fact is that numerous real-world scenarios exist between these extremes, and we want to address them.

So we considered what it would take to build a massively scalable, schema-free database with rich query and transaction processing using the most ubiquitous programming language (JavaScript), data model (JSON) and transport protocol (HTTP) – that is DocumentDB.

We decided to build a database engine which makes a deep commitment to the JSON data model and the JavaScript language. This singular design choice, in turn, enabled a set of distinctive capabilities, including the ability to automatically index documents without requiring any schema or secondary indices, the ability to issue SQL-based relational and hierarchical queries over heterogeneous JSON values, the ability to integrate database transactions with JavaScript exceptions, and the ability to seamlessly operate over JSON documents. As a multi-tenant database service, we have built each component of the stack with robust resource governance to ensure tenant isolation and the elastic scale of throughput and storage. As engineers, we obsess relentlessly over site reliability, high availability, performance, and scale. Finally, we believe that databases should be blazingly fast and yet safe by default.

Meeting the promise of schema-free

We wanted DocumentDB to support SQL queries over arbitrary documents without forcing the developer to create explicit schema or secondary indices or views. We wanted to give developers the freedom to rapidly iterate on application schema while preserving the ability to execute ad hoc queries. We also felt that queries should yield consistent results even when write rates are high.

Through the deep commitment to the JSON data model, DocumentDB is able to efficiently index, query and process heterogeneous documents. We designed the DocumentDB SQL language to be based on the JavaScript type system, expression semantics and ability to invoke JavaScript UDFs. DocumentDB’s query grammar adds document semantics, hierarchical and relational projections through a familiar SQL dialect for developers. This creates an efficient and natural way for you to query over JSON documents. The .NET SDK also includes a LINQ provider and we are considering native JavaScript mapping to our SQL query language.
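To make the idea of filtering and projecting over heterogeneous documents more concrete, here is a minimal Python sketch. It is not DocumentDB's query engine or SDK: the helpers `get_path` and `select` and the sample documents are hypothetical, and the sketch only mimics the flavor of a filter on a nested property combined with a projection.

```python
# Illustrative only: a toy filter-and-project over schema-free JSON documents.
# This is NOT the DocumentDB engine or SDK; all names here are hypothetical.

MISSING = object()  # sentinel meaning "this document has no such property"


def get_path(doc, path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def select(docs, where_path, equals, project_path):
    """Project project_path from each document whose where_path equals `equals`."""
    results = []
    for doc in docs:
        if get_path(doc, where_path) == equals:
            value = get_path(doc, project_path)
            if value is not MISSING:
                results.append(value)
    return results


# Documents with different shapes live side by side; no schema is declared.
documents = [
    {"name": "Ada", "address": {"city": "Seattle", "zip": "98101"}},
    {"name": "Lin", "address": {"city": "Oslo"}},
    {"title": "Invoice 7", "total": 42.5},                  # no address at all
    {"name": "Sam", "address": "Seattle (unstructured)"},   # address is a string
]

seattle_names = select(documents, "address.city", "Seattle", "name")
print(seattle_names)  # ['Ada']
```

Documents that lack the filtered property, or that store it with a different shape, simply do not match; nothing fails and no schema has to be declared up front.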

We have designed the storage and indexing subsystem to serve consistent queries in the face of sustained high volumes of writes. This is accomplished using novel log-structured storage techniques for index maintenance and indexing algorithms that fully exploit SSDs. By default, all document properties are indexed and can be queried through the DocumentDB SQL query language.
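As a rough illustration of what indexing every property by default can mean, the following Python sketch builds an in-memory inverted index keyed by property path and value. It is a toy under stated assumptions: the class `AutoIndex`, the helper `iter_paths`, and the `tags[]` path notation are invented for this example, and the sketch says nothing about the log-structured, SSD-oriented techniques the service actually uses.

```python
# Illustrative only: index every property path of every document, no schema needed.
# A toy in-memory structure, not DocumentDB's actual index; names are hypothetical.
from collections import defaultdict


def iter_paths(value, prefix=""):
    """Yield (path, scalar) pairs for every leaf in a JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from iter_paths(child, f"{prefix}[]")
    else:
        yield prefix, value


class AutoIndex:
    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)  # (path, value) -> set of document ids

    def insert(self, doc_id, doc):
        """Store the document and index all of its leaf properties."""
        self.docs[doc_id] = doc
        for path, leaf in iter_paths(doc):
            self.index[(path, leaf)].add(doc_id)

    def lookup(self, path, value):
        """Return the sorted ids of documents whose `path` holds `value`."""
        return sorted(self.index.get((path, value), set()))


idx = AutoIndex()
idx.insert("d1", {"name": "Ada", "tags": ["admin", "dev"], "address": {"city": "Seattle"}})
idx.insert("d2", {"title": "Invoice 7", "address": {"city": "Seattle"}, "total": 42.5})
idx.insert("d3", {"name": "Lin", "tags": ["dev"]})

print(idx.lookup("address.city", "Seattle"))  # ['d1', 'd2']
print(idx.lookup("tags[]", "dev"))            # ['d1', 'd3']
```

Because every leaf is indexed as it is written, an equality lookup on any property works without the developer having declared an index for it first.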

Comments

Anonymous
August 21, 2014
Two questions come to mind that need answering before I can even consider this.

Seeing as I will store ALL my documents in Azure for this to work, how much will it cost me (ALL costs: bandwidth, storage, etc.)?
How do I get my documents out (could be terabytes) if I need to? Or, if I build an archiving solution that needs to pull documents out, how will this work? I love this technology you've built, BUT the biggest hurdles for me are the ones I mentioned above.

Anonymous
August 21, 2014
Looks like what I've been waiting for. Can't wait to try to replace my SQL Server/Table Storage mix data architecture with this one. Personally I don't like to use JavaScript more than absolutely necessary, but as far as I can see, LINQ queries are also supported against DocumentDB in .NET languages, so that's fine for me. Drawbacks are the costs (the preview price is OK, but given this will be 100% more expensive when out of Beta, it could be a hurdle for people needing lots of storage for historical data, like me) and the document query limit of 2.000/Sec, if I'm right that a single query returning 2.000 docs already puts me at the limit if I don't throw more money at you :) However, it looks like it could be the solution I've been waiting for in Azure ...
Anonymous
August 21, 2014
Do you have document encryption on the road map?
Anonymous
August 21, 2014
My current system stores millions of documents quite efficiently using Couchbase Server (CS, hereafter). The downside in Azure is that I have to maintain the VMs that run CS myself: issuing patches, etc. It seems to me that this could be a reasonable replacement for CS and save me some work. That said, there are features I use quite heavily that Azure DocumentDB does not seem to have yet. The first one is the concept of a view. It is essentially an index of documents that is built on the fly as data comes in. I can quickly pull large lists of data using it. The second one is an in-memory layer (memcached) of the most popular documents/views. This allows me very quick access for both pulling and pushing data. The speed of these features is absolutely vital to the performance of my very data-intensive system. Do you expect Azure DocumentDB to have these features in the future? Thanks, Corey
Anonymous
August 21, 2014
Most of the applications that were going with Table storage (NoSQL storage) were facing transactional problems, which were a bottleneck for the transition to the cloud. Azure DocumentDB will remove all those obstacles! The future is cloud! Thanks for the awesome alternative!
Anonymous
August 22, 2014
To second Corey's question: Lists/Views/Paging seems to be missing. There is an example in the code samples that shows how to somewhat do this in a stored proc, but it seems as if it can't handle millions of items, and requires you to pull back all data to sort. Ideally this could be controlled by a View or a special index of some kind. Also, is there a story on partial document updates? If I have a large document but just want to change one property or maybe add something to a collection, can I just do that? Lastly, the guidance on CUs and Collections is a little confusing. I can't really tell if you are recommending a sharing approach (as it is mentioned that Collections are how data is partitioned for scale). So, if I have 10 CUs, do I need to have 10 collections, or can I just have 1 collection that gets all of the resources of the 10 CUs?
Anonymous
August 22, 2014
Hey - just wanted to check whether you have a TypeScript definition file for the JavaScript SDK?
Anonymous
August 22, 2014
Is Azure DocumentDB covered under Microsoft Trust / HIPAA compliant (as it appears that robust governance is set up within a multi-tenant environment)? Is there a BA agreement on this product? Thank you for your time and assistance on this question.
Anonymous
August 23, 2014
It's great to see the Azure team innovating in this space, and it would be great if their effort with DocumentDB could be released as a standalone db with a community (free) version and a self-hosting (use your own nodes at an affordable price point) version. This would allow different projects and businesses to pick the deployment (community, self-hosting, Azure cloud) that works for them.
Anonymous
August 25, 2014
@jose fajardo, to find out how much DocumentDB will cost, please take a look at this page azure.microsoft.com/.../documentdb. In order to export documents in bulk from DocumentDB, you can use the ReadFeed method from any of the client SDKs. The response from the method can then be streamed to the local file system or e.g., Azure Blob storage for archival. If you’d like to see archiving capabilities in the service please post your suggestions to feedback.azure.com/.../263030-documentdb
Anonymous
August 25, 2014
@Jeff, we do have encryption on the list of future feature work. Please help us prioritize the timing by voting for it on feedback.azure.com/.../263030-documentdb.
Anonymous
August 25, 2014
@Corey, these are great suggestions. Regarding views, note that views that are based on filters and projections can be accomplished by simple SQL queries since DocumentDB supports automatic indexing. We do understand that there are scenarios where views based on aggregates are quite useful. Please post your feedback at feedback.azure.com/.../263030-documentdb. Common documents will be read from memory in DocumentDB. You can also implement a caching layer in your application by using DocumentDB with Azure Cache. We will add more documentation and tooling on how to do this.
Anonymous
August 25, 2014
@Ryan LM, for paging, please take a look at the QueryWithPaging method in the MSDN samples in this file (code.msdn.microsoft.com/.../sourcecode). Sorting (order by) and partial document updates are planned for future updates. Please vote for these features at feedback.azure.com/.../263030-documentdb. In the preview offers, the maximum size of a collection is 10 GB. To fully utilize 10 CUs, you should create at least 10 collections. With 10 CUs:
• If you create 10 collections, they will be allocated 2,000 request units each = total of 20,000 request units.
• If you create 30 collections, they will be allocated about 667 request units each = also a total of 20,000 request units.
Hope that helps.
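The allocation arithmetic in this reply can be sketched in a few lines of Python. The figure of 2,000 request units per CU is inferred from the reply's own numbers (10 CUs giving 20,000 request units). The even split across collections and the function name are assumptions for illustration, not a documented API, and the sketch deliberately covers only the case the reply describes (at least one collection per CU).

```python
# Assumption inferred from the reply above: 10 CUs -> 20,000 request units,
# i.e. 2,000 request units per capacity unit (CU), split evenly over collections.
REQUEST_UNITS_PER_CU = 2000


def request_units_per_collection(capacity_units, collections):
    """Hypothetical helper: even share of request units for each collection."""
    if collections < capacity_units:
        # The reply only describes the case of at least one collection per CU.
        raise ValueError("expected at least one collection per capacity unit")
    total = capacity_units * REQUEST_UNITS_PER_CU
    return total / collections


for n in (10, 30):
    print(n, "collections ->", round(request_units_per_collection(10, n)), "request units each")
# 10 collections -> 2000 request units each
# 30 collections -> 667 request units each
```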
Anonymous
August 25, 2014
@Emmaneul Buah, thanks for the feedback. Please vote for standalone installations of DocumentDB at feedback.azure.com/.../263030-documentdb. Be sure to distinguish whether you want a stand-alone deployment option vs. a local emulator.
Anonymous
August 25, 2014
@Sushant, we don't have a TypeScript definition file for the JavaScript SDK. It's a great suggestion - please propose it at feedback.azure.com/.../263030-documentdb
Anonymous
August 26, 2014
Will there be an on-premise variant as well?
Anonymous
August 27, 2014
@LeRenard242 If you would like to see this as a feature, please go vote for it on feedback.azure.com/.../6352899-on-premise-instance
Anonymous
August 29, 2014
Can I use DocumentDB on my local server instead of Azure? If the answer is NO, then sorry, but I'm not interested in the db anymore.
Anonymous
September 02, 2014
"This allows developers to write application logic which can be shipped using HTTP POST and executed directly on the storage partition within a transaction boundary." I'm very surprised to read this. After all these years of education around SQL injection, we now encourage "procedure injection"? In principle, anonymous T-SQL blocks can already be sent to the database from clients today, so this isn't even novel...
Anonymous
September 04, 2014
MSDN (msdn.microsoft.com/.../microsoft.azure.documents.client.connectionmode.aspx) says: "Direct and Gateway connectivity modes are supported. Direct is the default." But Gateway is now the default.
Anonymous
October 17, 2014
What is the roadmap for the Java SDK for DocumentDB?
Anonymous
October 21, 2014
@Mallikarjun, we are entering the final stages of work on the Java SDK. You can expect the first version to be published within weeks. Follow us on Twitter at @DocumentDB or check back on the blog for the announcement when it happens.
Anonymous
January 07, 2015
Is there a date for general availability of DocumentDB? If not, can you give any kind of indication as to when this will happen?
Anonymous
January 20, 2015
@Paul we're hard at work to make this happen. All I can say here is expect something by the end of 2015 Q1.
