The Non-Relational DB Strikes Back!

When I started at Jot, one of the things I fell most in love with about the platform was the way that application developers on the system never, ever had to think about “the database”. You just had nodes (JS objects which could serialize themselves to XML) and nodes had properties. Setting a property on the node persisted it and created a new version of the node.
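To make that model concrete, here is a minimal in-memory sketch in Python of a node whose property writes are each recorded as a new version. The `Node` class, its `version` property, and its `rollback` method are hypothetical names made up for illustration. This is not Jot's actual API, and a real system would write to durable storage rather than to a list.

```python
import copy


class Node:
    """Hypothetical versioned node: every property write snapshots a new version."""

    def __init__(self):
        # Bypass our own __setattr__ for internal bookkeeping.
        object.__setattr__(self, "_versions", [{}])  # version 0 is empty

    def __setattr__(self, key, value):
        # Copy the latest version, apply the change, append it as a new version.
        latest = copy.deepcopy(self._versions[-1])
        latest[key] = value
        self._versions.append(latest)

    def __getattr__(self, key):
        # Only called when normal lookup fails, i.e. for stored properties.
        if key.startswith("_"):
            raise AttributeError(key)
        try:
            return self._versions[-1][key]
        except KeyError:
            raise AttributeError(key) from None

    @property
    def version(self):
        # Number of the most recent version.
        return len(self._versions) - 1

    def rollback(self, version):
        # Rolling back is itself a new version, so no history is lost.
        self._versions.append(copy.deepcopy(self._versions[version]))


page = Node()
page.title = "Draft"   # creates version 1
page.title = "Final"   # creates version 2
page.rollback(1)       # creates version 3 with the contents of version 1
```

Recording the rollback as one more version, rather than truncating history, is a design choice of this sketch: it keeps every earlier state reachable.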

Instead of thinking about how you were going to represent your problem in objects and then figuring out some way to map them to an RDBMS, you just started working out your problem in code. The normal development cycle of iterating on what a “thing” is could continue without stopping after every change to babysit the database or perform some sort of lame (or brittle) “migration”. Recently I’ve been excited to see this kind of work start to evolve in the Django world, while aspects of the non-relational data store have been finding more mindshare through projects like CouchDB and Erlang’s built-in Mnesia persistence layer, although they all still feel relatively primitive in comparison to the “experimentation is free” environment that Jot offered. Sometimes folks ask me why I don’t get into RoR, and every time I look into it again I’m always struck by how… backward it is. Hopefully the rumored GemStone port to Ruby will plug up some of the remaining conceptual leaks that the RDBMS addiction has tortured the RoR and Django app development process with.

Adding to the non-RDBMS data storage action is the announcement by Amazon of their SimpleDB service. It shares some of the best features of the Jot model (easy key/value setting, no schema, query anything) but doesn’t yet seem to have the ability to version individual records. Even if SimpleDB doesn’t do it, I expect it to pop up in another form somewhere else soon.
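The three features named above (easy key/value setting, no schema, query anything) can be sketched in a few lines of Python. This is an in-memory toy under assumed names (`SchemalessStore`, `put`, `query`); it neither uses nor imitates the real SimpleDB API.

```python
class SchemalessStore:
    """Hypothetical in-memory store: items are free-form attribute dicts."""

    def __init__(self):
        self._items = {}

    def put(self, item_id, **attributes):
        # No schema: any item may carry any attributes; new ones merge into old.
        self._items.setdefault(item_id, {}).update(attributes)

    def query(self, **criteria):
        # "Query anything": match on whichever attributes the caller names.
        return sorted(
            item_id
            for item_id, attrs in self._items.items()
            if all(k in attrs and attrs[k] == v for k, v in criteria.items())
        )


store = SchemalessStore()
store.put("post-1", author="alex", status="draft")
store.put("post-2", author="alex", status="published", tags="db")
store.put("post-1", status="published")  # change an item without a migration
```

Note that `post-2` gained a `tags` attribute that `post-1` never had, and nothing had to be declared up front for that to work.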

I’m tremendously excited about these sorts of services and data stores. It’s been clear for some time that most data storage tasks for even departmental applications are main-memory tasks. It’ll be interesting to see how the language environments respond to these changes. Microsoft’s LINQ integration into .NET languages is the first major stab in this direction, and I expect the next up-and-coming language will probably develop something similar in order to one-up Java and Ruby by making “schema evolution” look more like adding properties to an object or a class prototype (in JS parlance).

Hopefully all of this work will soon yield a web framework for general consumption that will show the rest of the world what Jot got dead right: that when your data and your program can evolve in harmony and without friction or risk, you are truly liberated. When storage is free (and it nearly is), “screwing up” should mean starting a fire in your data center. Everything else is just a version rollback.

15 Comments

  1. Ryan
    Posted December 15, 2007 at 3:15 am | Permalink

    So why hasn’t Jot open-sourced their hot framework?

  2. Posted December 15, 2007 at 6:28 am | Permalink

    I am surprised that you didn’t mention Persevere since it does almost exactly what you seem to be describing, for JavaScript (short of versioning, but that is just a matter of time).

  3. Helge
    Posted December 15, 2007 at 7:15 am | Permalink

    To me this sounds like Zope. OO database which transparently stores your Python objects, transparent mapping of the objects to the web, indexing etc etc. IMHO conceptually the best framework around.

    And my personal opinion is that Zope partly failed due to this, the same issue Smalltalk always had :-) People just do not want to put their data (and their code!) in some binary blob.

  4. Posted December 15, 2007 at 10:22 am | Permalink

    Ryan: Jot was bought by Google. Haven’t heard anything from them since.

    Kris: Persevere is really pretty close, but doesn’t have a query language. Pure orthogonal persistence gets you some of the way there, but you need more than that to really make apps sing.

    Helge: ZODB suffered, primarily, from difficulty integrating (and a weird build process that hampered adoption). It was also brutally slow and required “rebuilding” pretty frequently. I had great hopes for it at one point, but it appears to be an artifact of history now.

  5. Posted December 15, 2007 at 11:50 am | Permalink

    Very good points! You should read through Google’s BigTable paper. It’s exactly what you describe – except that it’s implemented in a tremendously scalable manner. An open source implementation of BigTable (I have heard of something called ‘Haboob’ or the like) would kick ass!

  6. Posted December 15, 2007 at 1:54 pm | Permalink

    Thanks for the great insights, Alex. …I also can’t wait for some of these APIs to mature. Looking forward to it materializing in an existing or future framework…

  7. Martijn Faassen
    Posted December 15, 2007 at 2:51 pm | Permalink

    Alex, I’m confused about when the ZODB was difficult to integrate. I used it in a non-Zope setting in 2003 or so, and I don’t recall having a lot of problems. The build process is setuptools-driven now.

    I also don’t know what you mean by it being “brutally slow”. In what context?

    Finally, I don’t know at all what you mean by rebuilding it frequently. Do you mean packing it to remove older transactions? The equivalent operation, I understand, needs to be done by the typical RDB as well to clean up space on the filesystem.

    That’s not to say the ZODB doesn’t have its problems, I just have a hard time actually recognizing the problems you describe. One of the problems is indeed one described by Helge: often people don’t want to commit their whole “world” into a single database (especially not code, though that hasn’t been the primary use of the ZODB for a long time). And having a powerful query language like SQL around is sometimes nice.

    One difference between the ZODB and something like CouchDB is that CouchDB focuses on usage over the network, out-of-process. The ZODB is in-process (though one can spread it over multiple processes using ZEO). Both approaches have separate advantages and drawbacks.

  8. Posted December 15, 2007 at 4:14 pm | Permalink

    Alex, you write:

    “when your data and your program can evolve in harmony and without friction or risk, you are truly liberated”

    sounds like an OODB to me.

  9. Posted December 15, 2007 at 5:06 pm | Permalink

    Hi Alex,

    I read your post with great interest. My company, AppJet (appjet.com), has a hosted server-side JavaScript framework for building web apps from your browser. We share your philosophy about storage frameworks; sometimes people ask us why we didn’t just use RoR and/or SQL, and the reasons in your post are why.

    Our persistent storage system is based on JavaScript objects — you can persist an arbitrary graph of JavaScript objects that you create to be “storable”. You can also have “StorableCollections” of objects, which support filtered and sorted views.

    We just launched a few days ago to let people start playing around with our system, so things are pretty basic, but we’d be interested in your feedback. We want AppJet to be a dream environment for quickly putting web apps together, and we’re willing to write our own software (framework, storage system, execution engine) when we feel existing components won’t cut it.

    — David

  10. Posted December 16, 2007 at 6:06 am | Permalink

    Hey, good piece, and I tend to agree that the non-relational DBs are on the rise for various reasons.

    However, I do think ZODB is closer to your description of world dominating non-relational data stores than may be apparent at first blush. From what I remember, and it was way way back around the turn of the century in fact, ZODB was plenty fast and I didn’t have to rebuild it on a daily basis. You have to purge old transactions once in a while but that could be automated.

    Even in record serialization to XML, you’d still need to “compact” the store once in a while, or at least one should look into it. ZODB just provides that out of the box.

    I’m not advocating ZODB, but I think it may be more advanced than the process you describe here. Something closer to the “jot” process may be the old “pickling” paradigm that Pythons of yore used to do. Not sure if anyone still uses pickling, but that was the object serialization that they also used about half a decade ago.

  11. Posted December 16, 2007 at 11:22 am | Permalink

    btw …also nice to see that SimpleDB was built on top of Erlang…

  12. Posted December 16, 2007 at 2:17 pm | Permalink

    Hey Amir, Martijn:

    So I really, really wanted ZODB to be “the one”, but when I last looked at it deeply it just wasn’t. When I last used it there was some serious difficulty in integrating it into non-Zope environments and ensuring that its performance was acceptable. I gave up on it after following the project for nearly a year. It seems to have moved forward significantly in the interim (thanks for the poke on that, Martijn). I will look into it more deeply.

    I like Python a lot (although the lack of fixed lambdas and reasonable closure syntax has really soured me in the last couple of years). Hopefully someone can convince Guido that these kinds of features deserve some support deeper in the system.

    Regards

  13. Posted December 16, 2007 at 2:43 pm | Permalink

    Amir:

    One more point on Jot: I’m not free to discuss its serialization strategy at more length than the old API documentation (http://developer.jot.com/WikiHome/DevDocToc) used to describe, beyond saying that the system exposed serialization through endpoints to multiple formats. The platform never exposed compacting or any other type of cleanup or indexing to users. It’s hard to see how ZODB has evolved in this direction since its wiki appears to be down. It’s not clear where the evolutions which had at one point been slated for ZODB4 ever went.

    Lastly, the query language turns out to be important. Jot used a modified version of XPath with the ability to easily call into JavaScript functions to perform higher-order matching, and this worked very well. I don’t like XPath per se, but having a compact way to say “get me stuff that matches this rule from here” is useful. Since the data’s hierarchy usually represents the relationships between things in the system, there wasn’t much of the ad-hoc tree walking stuff that ZODB kinda backed you into (at least the last time I spent serious time with it, which was years ago).

    Regards

  14. Posted December 17, 2007 at 10:10 am | Permalink

    Query language is important, and Persevere does have query capabilities. Persevere uses JSONPath syntax, which is certainly gaining traction as an important query language due to its simple and intuitive syntax, and according to Dustin Machi, JSONPath will actually even appear in Dojo soon.
    Persevere also bridges the gap between local and remote access to persisted objects (the contrast Martijn made between ZODB and CouchDB): client- and server-side JS can access the persisted data consistently.

  15. Bob Haugen
    Posted December 17, 2007 at 1:40 pm | Permalink

    Hi Alex,

    Since you left Jot, I developed several serious apps using their API, including a mini-ERP. (Finally got reasonably good at Dojo on that last one…)

    It ain’t perfect, but I still haven’t seen anything better. The query features of the API are good, but the fully integrated server-side JavaScript means you can do almost anything.

    And having a very nice Wiki surrounding your business app makes it even better.

    I’m hoping it all comes back to life again. But if you ever see anything open-source that comes close, please post something on your blog.

5 Trackbacks

  1. By SimpleDB: un’appendice - ReFactor.it on December 15, 2007 at 4:47 am

    […] Comments like this one, “The Non-Relational DB Strikes Back!”, or sentences like this one: “Many developers simply want to store, process, and query their data without worrying about managing schemas, maintaining indexes, tuning performance or scaling access to their data.” (Emphasis mine.) […]

  2. By My daily readings 12/16/2007 « Strange Kite on December 16, 2007 at 4:31 am

    […] Continuing Intermittent Incoherency » The Non-Relational DB Strikes Back! […]

  3. By test 12/16/2007 « Strange Kite on December 16, 2007 at 10:43 am

    […] Continuing Intermittent Incoherency » The Non-Relational DB Strikes Back! […]

  4. By links for 2007-12-17 « Mike Does Tech on December 16, 2007 at 5:28 pm

    […] Continuing Intermittent Incoherency » The Non-Relational DB Strikes Back! […]

  5. By JSON.Com on December 17, 2007 at 3:24 pm

    Non-Relational Web DB Interoperability

    Alex Russell recently wrote about the advantages of non-relational (dynamic object) databases for application development. The flexibility of using dynamic persisted objects that behave like dynamic OOP objects provides a great level of agi…