On Browser WYSIWYG

The state of in-browser WYSIWYG is somewhere between pitiful and mind-numbingly painful. Opera and Safari have pulled themselves up by the bootstraps and soon all the major browsers will be at the same level of awful, more or less. This area of DHTML doesn’t get much love from browser vendors, in part because only the hardiest souls ever venture into these deep, shark-infested waters, so there aren’t many people clamoring for fixes to the underlying capabilities. Everyone sane just picks up the Dojo Editor, TinyMCE, or one of the other good editing packages that are available.

Since recently delving back into the Dojo editor for the 0.9 release I’ve been chewing on the problem some more, and I think the solution is fairly simple in terms of the APIs which toolkit authors should expect of browser vendors. The goals of editing generally boil down to:

  • Allow users to apply formatting to stuff they have
  • Let users add new stuff
  • Allow users to undo stuff they did
  • Serialize the stuff users have done and, optionally, the undo stack

The current contentEditable/designMode systems fail in the undo case because (particularly on IE) it’s not possible to denote what is and isn’t an “action” that the user is doing, nor can you be informed by the browser when it pickles off a new state to the undo stack. This means that the undo stack captures things which aren’t changes in your editing area and may appear to be “broken” by UI feedback that you provide to users in other ways.

Further, the existing system’s dependence on pseudo-magical “commands” makes nearly zero sense. Every editing component worth its salt today has to build its own ways of executing DOM manipulation and then rolling back from change sets. Browsers half-coddle editing system authors when it would be better if they just got out of the way and gave us APIs which are suited to the “build the entire UI in javascript” path which everyone already takes anyway.
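To make "executing DOM manipulation and then rolling back from change sets" concrete, here is a minimal sketch of the kind of bookkeeping an editing toolkit ends up writing for itself. It is an illustration only: `ChangeSet` and its methods are hypothetical names, not taken from Dojo or any other editor, and a plain object stands in for a DOM node so the snippet runs outside a browser.

```javascript
// Hypothetical sketch: a plain object stands in for a DOM node so the
// example runs outside a browser. None of these names come from a real toolkit.
class ChangeSet {
  constructor() {
    this.changes = [];
  }

  // Apply a property change and remember the value it replaced.
  set(node, prop, value) {
    this.changes.push({ node, prop, before: node[prop], after: value });
    node[prop] = value;
  }

  // Revert every recorded change, newest first.
  rollback() {
    for (let i = this.changes.length - 1; i >= 0; i--) {
      const { node, prop, before } = this.changes[i];
      node[prop] = before;
    }
  }
}

const paragraph = { tagName: "P", textContent: "hello", fontWeight: "normal" };

const boldAndShout = new ChangeSet();
boldAndShout.set(paragraph, "fontWeight", "bold");
boldAndShout.set(paragraph, "textContent", "HELLO");

boldAndShout.rollback(); // paragraph is back to "hello" / "normal"
```

Even this toy version has to track every mutation by hand, which is exactly the work a browser-level undo stack ought to be able to take over.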

Since it’s not really reasonable to expect that browsers will remove contentEditable, here are my proposed API additions to it. They would allow any sane editing UI to ditch the entire command structure, which can then slowly fade into the background over time.

  • editableElement.openUndoTransaction(callbackHandler): starts an undo transaction, implicitly closing any previously opened transactions. All subsequent DOM manipulation to elements which are children of this element will be considered part of the transaction and normal browser-delimited undo transaction creation is suspended until the transaction is closed. The optional callback handler is fired when the user cycles back this far in the undo stack from some future state.
  • editableElement.closeUndoTransaction(): ends a manual undo transaction. Implicitly called by openUndoTransaction. Closing the transaction has the effect of pushing the current DOM state (or whatever differential representation the browser may use internally) onto the browser’s undo stack for this editable element. When an undo transaction is closed, browsers may resume automated generation of undo states on the stack intermingled with the manually added states.
  • Support for non-standard DOM positioning properties of range objects as outlined in MSDN

These APIs added to elements with contentEditable set will allow us to use regular-old DOM methods of manipulating user selections and adding complex content from user input without fighting for control of the undo stack or inventing our own (which has so many problems that I don’t want to begin to address them). Additionally, this method of manipulation will allow toolkit authors to deliver editors which operate on the semantics of the markup more easily.
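The semantics proposed above can be sketched as a small simulation. To be clear, this is a model and nothing more: `openUndoTransaction` and `closeUndoTransaction` are the proposed methods from this post and exist in no browser, `EditableModel` and its `undo()` method are hypothetical stand-ins, and a string takes the place of the editable element's DOM subtree. Closing an open transaction when undo is invoked is my own assumption, since the proposal doesn't cover that case.

```javascript
// Simulation only: these methods model the proposal in this post and are
// not implemented by any browser. A string stands in for the DOM subtree.
class EditableModel {
  constructor(html) {
    this.html = html;
    this.undoStack = [{ html, onReturn: null }]; // initial state
    this.pending = null;                         // the open transaction, if any
  }

  // Start a transaction, implicitly closing any previously opened one.
  openUndoTransaction(callbackHandler = null) {
    this.closeUndoTransaction();
    this.pending = { onReturn: callbackHandler };
  }

  // Push the current state onto the undo stack as a single undo step.
  closeUndoTransaction() {
    if (this.pending === null) return;
    this.undoStack.push({ html: this.html, onReturn: this.pending.onReturn });
    this.pending = null;
  }

  // Step back one state; fire the callback of the state we land on.
  // Assumption: an open transaction is closed before undoing.
  undo() {
    this.closeUndoTransaction();
    if (this.undoStack.length < 2) return false;
    this.undoStack.pop();
    const target = this.undoStack[this.undoStack.length - 1];
    this.html = target.html;
    if (target.onReturn) target.onReturn();
    return true;
  }
}

const editor = new EditableModel("<p>one</p>");
const log = [];

editor.openUndoTransaction(() => log.push("back at bold state"));
editor.html = "<p><b>one</b></p>";
editor.html = "<p><b>one!</b></p>";   // several edits, a single undo step
editor.openUndoTransaction();          // implicitly closes the first
editor.html = "<p><b>one!</b></p><p>two</p>";
editor.closeUndoTransaction();

editor.undo(); // html is "<p><b>one!</b></p>", log is ["back at bold state"]
editor.undo(); // html is "<p>one</p>"
```

The point of the model is that the toolkit, not the browser, decides where one "action" ends and the next begins, while the stack itself stays in the browser's hands.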

Note that we suppose the current uneven level of Range and DOM APIs will persist over time, and some things may get easier over time in conjunction with these APIs as those problems are slowly alleviated. Additionally, interaction with the global undo stack for the browser is as-yet unspecified. I’m inclined to suggest that unless the editable element has focus, undo should not affect it but my unfamiliarity with the implementation of the global undo stacks in browsers may nix that and require a broader solution. There may also need to be methods for ignoring a particular set of DOM operations (say, from event handlers) to prevent browsers from taking snapshots at bad times, but I think we can ignore that for now.

Lastly, there is probably room for an API to register interest in *any* undo operation and to push things onto the browser’s undo stack for non-editing elements, but this API solves the problem where it is most acute today.


  1. Posted September 6, 2007 at 5:21 am | Permalink

    Over here at Xopus, we have written a WYSIWYG editor in JavaScript for editing XML (so XHTML too). During the development of this editor, we had to deal with many, many browser quirks, especially those that concern contenteditable and designmode. At the moment, the only things we still use from the browsers’ contenteditable implementations are the caret and caret movement. And in the near future we will even stop using that caret, because we concluded that we spent more time compensating for contenteditable quirks than it would cost us to write our own caret. :)

    So, from my own experience, I can tell that developing a WYSIWYG editor in-browser will eventually not involve contenteditable anymore. Hardly funny anymore, right?

  2. F.Baube
    Posted September 9, 2007 at 4:09 am | Permalink

    You equate in-browser editing with HTML, but there’s a whole ‘nother world of schema-driven XML editing that’s out there waiting for a motivated developer to conquer. This could range from microformats on up to DITA and DocBook. AFAIK this kind of editing capability simply does not exist yet.

  3. Posted September 9, 2007 at 11:29 am | Permalink

     Hi F.Baube. You’re right, I do equate browser editing with HTML, because HTML is what the browser will render. If you want to do something like Xopus’s XML editor, you will still need to render something out to the user during the editing experience and having that part of the experience work correctly greatly increases the likelihood that the editor will be useable. It will get easier to edit all kinds of content in a browser when HTML editing gets better.

  4. Posted September 10, 2007 at 8:09 pm | Permalink

    have you seen any key-grabbing implementations which avoid content-editable / design-mode entirely? any worth pointing out? i’ve been playing around with writing an editor that simply watches key events and artifices the expected screen output. the main barrier i’ve encountered is that when watching key events, i haven’t figured out how to prevent the browser (personally i use opera) from digesting the keystrokes. since 1/2 the keyboard does something in opera, i’ve had rather poor luck in my playing around. windows start flying across the screen and things start going haywire as the text i write fires opera commands to resize, move, rescale and shift things around. like F.Baube, i am less interested in HTML editing specifically than i am in creating editor environments on the web.

One Trackback

  1. By BB’s blog » Blog Archive » HTML 5.0 on September 9, 2007 at 12:16 am

    […] Update 9/9: HTML 5.0 also supports a really nice feature, the “contenteditable” attribute, which amounts to a browser WYSIWYG editor (implemented right in the browser rather than in JavaScript). That’s great :-X, now you only need to build the format commands to have a decent editor […]