XMLHTTP notes: cloning nodes, inserting forms, and caching

In the past two weeks I've again created an Ajax-driven interface, and as usual I discovered quite a few interesting XMLHTTP bugs and problems. This entry contains three Explorer bugs and one Safari bug, and it talks about cloning nodes from HTML to XML, from XML to XML, appending HTML that contains a form, and extremely aggressive caching.

Explorer and Safari - Cloning nodes from HTML files

The project I worked on was quite complicated, and for that reason I first researched the importing of HTML pages instead of XML documents. My original idea was the following:

Unfortunately this nice scheme of things didn't entirely work. First of all, as I discovered earlier, responseXML is only available if the XMLHTTP request retrieves a text/xml page. No worries, I thought, I'll just use a .htaccess file to set the content type of all .html pages to text/xml. This trick worked fine: all browsers granted access to the responseXML.

req holds the XMLHttpRequest object. To be entirely on the safe side I read out req.getResponseHeader('Content-Type'). All browsers agreed that the content type was in fact text/xml. So far so good.
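
The two checks described so far, the Content-Type header and the presence of responseXML, could be combined into one small guard. This is only a sketch: the helper name hasXmlResponse is made up for illustration, and it assumes req is a request that has already completed.

```javascript
// Hypothetical helper (not from the project): returns true when a finished
// request was served as text/xml and the browser parsed it into responseXML.
function hasXmlResponse(req) {
  var type = req.getResponseHeader('Content-Type') || '';
  // The header may carry a charset suffix, e.g. "text/xml; charset=utf-8".
  var isXml = type.toLowerCase().indexOf('text/xml') === 0;
  var doc = req.responseXML;
  return isXml && !!doc && !!doc.documentElement;
}
```

A caller would test hasXmlResponse(req) once req.readyState is 4, before touching req.responseXML.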

Then I took the responseXML.firstChild, a <div> tag in my case. Later on I would have to rewrite this line to search for the <div id="content">, not an easy task since document.getElementById doesn't work in XML files. I decided to worry about that part of the script later.
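
One way to look up the <div id="content"> without document.getElementById would be to walk the tree by hand. The following is a minimal sketch under that assumption; findById is a hypothetical name, and it relies only on nodeType, childNodes and getAttribute.

```javascript
// Hypothetical helper: depth-first search for an element whose id attribute
// matches, for XML documents where document.getElementById finds nothing.
function findById(node, id) {
  // nodeType 1 is an element node; only elements carry attributes.
  if (node.nodeType === 1 && node.getAttribute('id') === id) {
    return node;
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var found = findById(children[i], id);
    if (found) return found;
  }
  return null; // nothing in this subtree matched
}
```

It would be called as findById(req.responseXML, 'content'). For large documents a manual walk like this is slow, so it is best kept to small snippets.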

Then the time came to actually append the data to an HTML element. In my innocence I thought that cloning the firstChild of the data and appending it to the document would be enough. Not so, of course.

If you try to append a node from an XMLHTTP-requested HTML page served as text/xml to another HTML page, Explorer and Opera give error messages, while Safari crashes.

I guess the problem is that I try to append an XML node to an HTML document. In any case, this approach worked only in Mozilla. It appends the new node, but refuses to interpret its contents as HTML. The very dirty last line document.getElementById(id).innerHTML = document.getElementById(id).innerHTML turned out to work: magically the XML was transformed to HTML, and the page showed up exactly as I wanted it.
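
Put together, the Mozilla-only approach amounts to something like the sketch below. The function name and parameters are assumptions for illustration; target stands for the HTML element and xmlNode for the node taken from responseXML.

```javascript
// Sketch of the approach that worked only in Mozilla: append a clone of the
// XML node, then reassign innerHTML so the markup is re-parsed as HTML.
function appendAndReparse(target, xmlNode) {
  target.appendChild(xmlNode.cloneNode(true));
  // Reading innerHTML serializes the subtree; writing it back re-parses it.
  target.innerHTML = target.innerHTML;
  return target;
}
```

Because the subtree is re-created, any event handlers or references to the old child nodes are lost, which is part of why the trick is dirty.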

Unfortunately, due to the other browsers' refusal to import the node, I had to change my plans.

Explorer - responseText and forms

I decided to forget about normal HTML pages imported as text/xml. My next plan was to use HTML snippets, i.e. small HTML files that hold only the necessary code, again without <head> and <body> tags, navigation, footers etc. I wanted to copy the responseText to an element's innerHTML.

This worked fine, except in Explorer, which threw a very weird error. After studying for a while, and encountering this page, I found out what was wrong.

When you try to import responseText into the innerHTML of an element, and the responseText contains forms, or the element is a form, Explorer throws an error.

Fortunately I was able to change my plans so that I only had to import HTML snippets that did not contain any forms.
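
If a snippet might contain a form after all, a defensive check before touching innerHTML could look like this. It is a rough sketch: canUseInnerHTML is a hypothetical helper, and the regular expression is a crude test for a <form> tag, not a real HTML parser.

```javascript
// Hypothetical guard: only copy responseText into innerHTML when neither the
// snippet nor the target element involves a form.
function canUseInnerHTML(element, html) {
  var targetIsForm = String(element.nodeName).toLowerCase() === 'form';
  // Matches "<form>" or "<form ..." in any letter case.
  var snippetHasForm = /<form[\s>]/i.test(html);
  return !targetIsForm && !snippetHasForm;
}
```

A caller would assign element.innerHTML = req.responseText only when the guard returns true, and take another route otherwise.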

Safari - cloning nodes from XML to XML

When appending a cloned node from another XML file to an XML file, the node is cloned but its nodeName becomes null in Safari.

After a while I found out that Safari required the use of importNode(), which doesn't exist in Explorer. The correct code turned out to be:
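
In outline, such a branch might look like the following sketch. The helper name copyInto and its parameters are assumptions for illustration, not the project's code; it prefers importNode() and falls back to cloneNode(true) where importNode() is missing.

```javascript
// Hypothetical helper: copy a node so it can be appended to targetDoc.
// Uses importNode() where it exists and falls back to cloneNode(true).
function copyInto(targetDoc, node) {
  if (targetDoc.importNode) {
    // The second argument asks for a deep copy, children included.
    return targetDoc.importNode(node, true);
  }
  return node.cloneNode(true);
}
```

The result would then be appended as usual, for instance parent.appendChild(copyInto(parent.ownerDocument, node)).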

Explorer - Aggressive caching

Finally, while working on this project I discovered that Explorer is quite aggressive in its caching of XML files: it never seems to be aware that an XML file has changed. I read somewhere that using a POST request instead of a GET solves this problem, but I haven't yet tested it.

If you re-request an already cached XML file in Explorer, it runs the onreadystatechange event handler even before the request is officially sent.

This caused quite odd disturbances in my application. Setting the onreadystatechange event handler after the req.send() is not an option; as I discovered earlier the event handler doesn't work if it's set after the send().
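
One possible way around the early call is to keep the handler assigned before send(), but let it postpone its work until send() has returned. The sketch below is untested against the Explorer bug itself; sendGuarded and its flags are hypothetical names.

```javascript
// Hypothetical wrapper: the handler is set before send(), but a readyState 4
// that arrives before send() has returned is postponed instead of processed.
function sendGuarded(req, url, callback) {
  var sendReturned = false; // becomes true once req.send() has returned
  var firedEarly = false;   // handler saw readyState 4 before that point
  var done = false;         // makes sure callback runs only once

  function finish() {
    if (done) return;
    done = true;
    callback(req);
  }

  req.open('GET', url, true);
  req.onreadystatechange = function () {
    if (req.readyState !== 4) return;
    if (!sendReturned) {
      firedEarly = true; // too early: remember it and handle it below
      return;
    }
    finish();
  };
  req.send(null);
  sendReturned = true;
  if (firedEarly) finish();
}
```

The rest of the application then only ever sees the response after the request has been sent, whether or not it came from the cache.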

Comments

1 Posted by Alex Lein on 6 December 2005 | Permalink

Very odd bugs indeed. May I suggest a possible solution to the XML caching problem? When you put in the path/name of the file, append a random number in the QueryString. So instead of "/path/file.xml" use "/path/file.xml?". This should fool IE into thinking that different content is being loaded.
Interestingly enough I also ran into this problem, but with Opera and not IE.

2 Posted by Alex Lein on 6 December 2005 | Permalink

That should have read "/path/file.xml?{random number/time of day}"

3 Posted by Vincenzo on 6 December 2005 | Permalink

Moving nodes from one document to a different one is a Microsoft extension. The proper method is to use importNode().
When you cloneNode() the resulting node belongs to the same document of the source node.
I advise you to take a look at the Sarissa library: http://sourceforge.net/projects/sarissa

4 Posted by blatimer on 6 December 2005 | Permalink

importNode should help with some errors.

I've also had good success by sending files as application/xhtml+xml instead of just text/xml. Of course this won't help with explorer, but it does convince mozilla and opera that the nodes I'm giving it are HTML nodes and can be dropped right into the page without using innerHTML.

5 Posted by brett on 6 December 2005 | Permalink

Why try importing markup via responseXML or responseText? The point of these methods is to import data - not markup. Keep the presentation, data, and logic as separate as possible.

6 Posted by d_b on 6 December 2005 | Permalink

"Why try importing markup via responseXML or responseText?"

because it's orders of magnitude faster than trying to create complex DOM nodes at the browser.

7 Posted by molily on 6 December 2005 | Permalink

»It appends the new node, but refuses to interpret its contents as HTML.«
AFAIK: Opera and Firefox interpret the element node as HTML when you set the XHTML namespace for them in your XML file. Example:
<p xmlns="http://www.w3.org/1999/xhtml"><strong>important!</strong></p>
If you copy this »p« element node in your document, it will be recognized as HTML.

8 Posted by Angus Turnbull on 7 December 2005 | Permalink

Sounds like you're doing something very similar to me :). I wrote an Ajax library earlier this year that was geared towards importing fragments of HTML documents:

http://www.twinhelix.com/javascript/htmlhttprequest/

I got so fed up with MSIE's bugs that I eventually forced it to use a hidden IFRAME as a transport instead of XMLHTTP, which worked wonders (it wouldn't construct a DOM tree with non-"text/xml" documents, for instance, but IFRAMEs allow accessing the standard DOM with no problems). Also, as others have mentioned (and you note in the post), document.importNode() cures most issues with the other browsers using XMLHttpRequest.

9 Posted by Joshua Richardson on 7 December 2005 | Permalink

If you want to get rid of Explorer's caching, you could try modifying some of the headers sent with the pages (e.g. via .htaccess), like the Cache-Control and Pragma headers.

10 Posted by Michiel on 7 December 2005 | Permalink

You can use this to modify the headers:

xmlHttp.setRequestHeader("If-Modified-Since", "Wed, 15 Nov 1995 04:58:08 GMT")

This will prevent IE from caching.

11 Posted by ppk on 7 December 2005 | Permalink

I'd prefer not to add a date-based query string, since I want the XML to be cached as long as the XML file on the server doesn't change.

I tried cache control and pragma and it didn't help a bit.

I didn't try the if-modified-since header. I'll take another look when this application enters the test phase.

12 Posted by Maian on 7 December 2005 | Permalink

IE has a proprietary .xml property that is the XML equivalent of innerHTML. If you are willing to branch code, you can use that.

Moz, Opera, and Safari all support importNode.

Also, even if both documents were XML, the DOM specification disallows transfer between two different documents. Mozilla's ability to transfer between two documents is a bug (https://bugzilla.mozilla.org/show_bug.cgi?id=47903) and other browsers may or may not be following Mozilla's example. This is what importNode is specifically for.

13 Posted by Tino Zijdel on 7 December 2005 | Permalink

If you want a browser to check back on the server for a newer version you actually need to send something like:

header('Expires: '.gmdate('D, d M Y H:i:s', time()+86400).' GMT');
header('Last-Modified: '.gmdate('D, d M Y H:i:s', time()).' GMT');
header('Cache-Control: public, max-age=86400');
header('Pragma: !invalid');

Now when a browser makes a new request you will need to look at the If-Modified-Since header; either when that's expired (for instance when the 86400 seconds have passed since the first request) or when the XML file has changed you can resend the data with new expiration headers, otherwise you only need to send an HTTP/1.1 304 Not Modified header together with the original headers (without data).

Note that some versions of Opera will also report the status of the xmlHTTPRequest object as 304 in that case - so only checking status 200 is not a good idea.

Also check for Cache-Control/Pragma: no-cache; if you find that header, always resend your data with new headers.

There is no need to set additional request headers clientside or fiddle with random suffixes to your querystring; HTTP provides everything you need to dictate caching on the clientside and from my experience most browsers deal very well with it.

14 Posted by Ben on 9 December 2005 | Permalink

Why not pass valid javascript as the responseText and then eval it? Then dynamically append to the DOM as you walk through your data structure. I've heard that google does this, and I've used it with great success.

If that's unacceptable, did you try making a document fragment based on the responseXML?

15 Posted by sdesnoo on 16 December 2005 | Permalink

Set "Cache-Control: must-revalidate" in the server response. This also works perfectly for refreshing JavaScript files included in HTML.

16 Posted by Adrian Geissel on 13 January 2006 | Permalink

There is a way to insert forms into IE - and also to preserve registered event handlers, etc. The additional benefit is that this code behaves as an alternative to importNode(). Try

var dummy = document.createElement("div");
dummy.insertAdjacentHTML("AfterBegin", req.responseText);
var nEl = dummy.firstChild;

And then place the nEl element within the primary document DOM, eg.

oEl.appendChild(nEl);

One question for your blog - any idea why importNode() on responseXML should not register event handlers on Safari 1.3? Any work-around ideas?