Tuesday 6 December 2011

Testing your document structure for inconsistencies within MongoDB

One of the advantages with schema-less design is that it works well for prototyping; you can have a collection of documents with each of the documents of variable structure. You can modify the document structure for one, some or all documents within the collection all without requiring a schema for the collection or each and every document.
However, this is also a disadvantage during prototyping; there are no constraints to stop documents within the same collection having variable structure. Deliberate updates to a document succeed silently as do accidental updates; ie when you update a document with a subdocument hanging off the wrong node. So when you assume you have consistency across all the documents, within a collection, but don't, you will run into some issues. You could also argue here that you're not coding defensively enough if you're not checking consistency at the time of execution; I'm not going to go into that right now though.

That exact structure inconsistency happened to me, and I ended up going down a rabbit hole. The smart thing to do was to blat the DB and recreate it during each test, but there were reasons that I didn't do that, which again I'm not going to go into here. Additionally the error that was coming back from performing an operation on the inconsistent structure wasn't obvious and didn't indicate to me that there was document structure inconsistencies, but that's another story.
Anyhoo, I didn't want this to happen again, so to verify the structural consistency of a collection I now pull in a json template; an example of the structure of the document I'm expecting to find within the collection I'm working with, and compare it to the collection in the DB. You can define your template in a json/txt file or you can manually create the DBObject. Simply put, I perform a symmetrical diff on the documents that are contained in the collection(s) I'm working with, and report on any additional field not defined in the template, I also report on any field that is defined in the template but not the document.

This example creates a DBObject with a few NVPs at the root, a couple as subdocument NVPs and finally an array.



We use this 'template' to compare against the documents within the tests collection.

This is all very lightweight but as a method to verify crude consistency it is very handy, for me anyway.


1 comment:

  1. https://twitter.com/#!/al3xandru @al3xandru Thanks for the posting on myNoSQL

    http://nosql.mypopescu.com/post/13918817814/nosql-document-databases-testing-your-document

    ReplyDelete