The CSSOM View Module

A week ago W3C published the first working draft of the W3C CSSOM View specification (written by Anne van Kesteren), and I must say I'm very happy with it. Since I was testing stuff anyway I created a new compatibility table for most of the methods and properties specified in this document, and browser compatibility is already excellent.

That's no coincidence. This specification contains definitions for many properties (and a few methods) that browsers have already been supporting for ages (such as offsetWidth), and W3C has paid scrupulous attention to the current implementation. No more theorizing into the blue — just check what browsers do and describe it in the specification. Excellent idea.

Almost all of these properties were originally invented by Microsoft and were copied by the other browser vendors — not only because IE's market share forced them to, but also because these properties were just good ideas.

elementFromPoint

One method deserves special attention: elementFromPoint(). This method expects two coordinates and then reports which HTML element is located at these coordinates. This is a godsend for drag-and-drop scripts. If the user drops an element, get the mouse coordinates and use this method to find out which HTML element is located at these coordinates.

One catch: you first have to temporarily hide the dragged element, because otherwise elementFromPoint() would always report the dragged element. After all, it is itself the topmost element under the mouse.

I'm going to add this functionality to my Drag and Drop script, but for the moment this seems to be the idea:
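A minimal sketch of the hide, look up, restore sequence described above. The function name findDropTarget and the optional doc parameter are illustrative assumptions, not part of any existing script or of the specification. The sketch also assumes the dragged element is absolutely positioned, so that hiding it briefly does not shift the rest of the page.

```javascript
// Sketch: find the element underneath a dragged element.
// dragged: the element currently being dragged (assumed absolutely positioned)
// x, y:    the mouse coordinates to test
// doc:     hypothetical optional parameter; defaults to the page's document
function findDropTarget(dragged, x, y, doc) {
  doc = doc || document;

  // Remember the current inline display value so it can be restored.
  var previousDisplay = dragged.style.display;

  // Hide the dragged element, otherwise it would always be the result.
  dragged.style.display = 'none';

  // Ask the browser which element is now topmost at these coordinates.
  var target = doc.elementFromPoint(x, y);

  // Undo the style switch before returning.
  dragged.style.display = previousDisplay;

  return target;
}
```

On drop you would call findDropTarget(draggedElement, x, y) with whichever mouse coordinates the browser expects.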

Unfortunately the browsers do not entirely agree which mouse coordinates this method needs. IE and FF3 need clientX/Y (relative to the viewport), while Opera and Safari need pageX/Y (relative to the document). I expect Opera and Safari to change their implementation, though; market share considerations leave them no other choice. And the brand-new working draft in fact specifies clientX/Y.
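The two coordinate pairs differ only by the document's scroll offset: pageX equals clientX plus the horizontal scroll offset, and likewise for Y. The sketch below shows that conversion. The helper names and the needsPageCoordinates flag are assumptions made for illustration; how you decide which pair a particular browser wants is left open here.

```javascript
// Convert viewport-relative (client) coordinates to document-relative (page)
// coordinates, given the document's current scroll offset.
function clientToPage(point, scroll) {
  return { x: point.x + scroll.left, y: point.y + scroll.top };
}

// The reverse conversion: document-relative back to viewport-relative.
function pageToClient(point, scroll) {
  return { x: point.x - scroll.left, y: point.y - scroll.top };
}

// Hypothetical helper: return the pair to hand to elementFromPoint().
// needsPageCoordinates is an assumption; it stands for "this browser wants
// pageX/Y instead of clientX/Y", however you choose to determine that.
function coordinatesForElementFromPoint(clientPoint, scroll, needsPageCoordinates) {
  return needsPageCoordinates ? clientToPage(clientPoint, scroll) : clientPoint;
}
```

Once all browsers follow the working draft, the flag can simply be dropped and clientX/Y passed straight through.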

Critique of the specification

Despite this specification being an excellent piece of work, I have several minor points of criticism. None of them is show-stopping, but the specification needs just a little bit more work to move from excellent to outstanding.

WindowView

I have doubts about the WindowView interface, which contains ancient properties such as innerWidth.

The problem is that innerWidth/Height and pageXOffset/pageYOffset are essentially duplicates: they report the same information as document.documentElement.clientWidth/Height and document.documentElement.scrollLeft/Top, namely the inner width and height of the viewport (browser window) and the scrolling offset of the document.

Since we already have that information available, why repeat it? The only reason would be that there might be situations where the documentElement does not span the entire viewport, but as far as I know these situations don't exist nowadays, and frankly I wonder if they'll ever exist.
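To compare the two sources side by side, something like the following sketch could be used. The function name readViewport and its optional win and doc parameters are assumptions for illustration. Whether the numbers actually match in a given browser (for instance when a vertical scrollbar is present) is something to test, not something this sketch guarantees.

```javascript
// Read viewport size and scroll offset from both places that expose them.
// win and doc are hypothetical optional parameters; they default to the
// page's own window and document.
function readViewport(win, doc) {
  win = win || window;
  doc = doc || document;
  var root = doc.documentElement;

  return {
    // The WindowView properties.
    fromWindow: {
      width: win.innerWidth,
      height: win.innerHeight,
      scrollX: win.pageXOffset,
      scrollY: win.pageYOffset
    },
    // The equivalent properties on the root element.
    fromRoot: {
      width: root.clientWidth,
      height: root.clientHeight,
      scrollX: root.scrollLeft,
      scrollY: root.scrollTop
    }
  };
}
```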

I created a quick test that gives the root <html> element a wide margin and a border. Clicking outside the border still reports the <html> element as target, and document.documentElement.clientWidth and window.innerWidth report the same number of pixels in all browsers.

So even though the root element may appear to cover only part of the viewport, JavaScript still acts as if it covers the entire viewport. That makes sense: there is no block-level element that contains the root element (or the root element wouldn't be the root element).

The other properties of the window view, outerWidth/Height and screenX/Y, are mostly useless. They've been around since Netscape 3, and in the ten years I've been writing scripts I've never needed to use any of them.

For all these reasons I'm wondering whether the WindowView interface shouldn't be scrapped outright. It just serves no purpose.

pixelDepth

From the Browser Wars on, we've had two properties that give the color depth of the screen: colorDepth and pixelDepth. The only difference between the two is that IE doesn't support pixelDepth. As far as I'm concerned we do not need two properties that contain exactly the same information, so pixelDepth should be removed from the specification.

The getClientRects() and getBoundingClientRect() methods

I don't understand these methods; or rather, the TextRectangle objects they return. They contain information about the position of an element's boxes relative to the viewport.

First of all we don't need this information; finding the position of an element is already possible.
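The usual way of doing that is a loop over the offsetParent chain. The sketch below is a minimal version; the name findPosition is an illustrative assumption. It yields the position relative to the document, and it ignores borders and scrolled containers, so treat it as a starting point only.

```javascript
// Walk up the offsetParent chain and add up the offsets along the way.
// The result is the element's position relative to the document.
function findPosition(element) {
  var left = 0;
  var top = 0;

  for (var node = element; node; node = node.offsetParent) {
    left += node.offsetLeft;
    top += node.offsetTop;
  }

  return { left: left, top: top };
}
```

To get the position relative to the viewport instead, subtract the document's current scroll offset from the result.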

I admit that there's no single property that holds this information; then why not create it? Something like element.viewportOffsetX/Y could be useful, and making this a property instead of a method would be more in line with the rest of the specification. (Should W3C go this way, it should also create an element.documentOffsetX/Y property pair.)

On the other hand, that would mean introducing a new property, and part of the point of this specification seems to be that it describes only properties that have already been implemented.

Finally, it seems that one element can contain several boxes, but I have not been able to find out why or how. In my tests only IE sometimes reports more than one box, anyway.

I feel this part of the specification is not yet ready. At the very least, the relation of TextRectangle boxes to actual elements should be defined in the case there's more than one TextRectangle box, because I don't understand what to expect (and I suspect browser vendors don't, either, because Firefox and Opera never report more than one box, anyway).

Furthermore, I think that the inclusion of these methods should be critically reviewed, since I'm not sure they are useful enough to implement, especially not if their actual meaning is vague.

x and y

The specification also contains all mouse pointer property pairs, an area of JavaScript that featured truly horrible browser incompatibilities even at the time I wrote the book. Fortunately, browsers have cleaned up their act, and again the specification pays scrupulous attention to the actual implementation of these property pairs.

The single problem is the x/y property pair. The specification states they must return pageX/Y, but currently they return clientX/Y in all browsers, save Firefox, which doesn't implement x/y at all.

Again, why do we need a second property pair to hold information that's already available? Besides, in this single instance the specification departs from current browser behaviour.

Conclusion

Despite these minor points, the current working draft is an excellent piece of work that, I hope, will quickly grow to a fully-fledged recommendation. Browser vendors have to do very little in order to comply with this specification, and we badly needed these definitions.

Comments

1 Posted by Barney on 29 February 2008 | Permalink

Excellent news, and an excellent practical summary. That table is going to be very useful to me. There's a lot of sense here.

Thanks!

2 Posted by Jonathan Snook on 29 February 2008 | Permalink

I haven't mentioned this to Anne yet, but I think an elementsFromPoint function would be most helpful returning all elements that exist at a single point. In this way, I'm only going through a list of maybe 4 or 5 elements to determine if a dragged element is over its location instead of potentially 30-40+ (depending on how complex of a page I have).

3 Posted by ppk on 29 February 2008 | Permalink

But wouldn't the user intuitively expect the dragged element to be dropped on the topmost visible element? Which is exactly the element elementFromPoint returns.

4 Posted by Jonathan Snook on 29 February 2008 | Permalink

But the problem is that the topmost element is usually the dragged element, not the drop element. Showing and hiding the element can cause flicker. Which leaves elementFromPoint no better off than event.target (for drag and drop detection, anyway). Plus, with elementsFromPoint, there may be a desire to track multiple points within a container such as "if I'm over a box that allows insertion, drop above this box but if I'm not and I'm just in the column container, drop it in the column, (or, I'm in a box I'm not allowed to insert into, check if I'm in the container)". It might also help where fixed elements are possibly obscuring drop points (not sure why you might need this...).

5 Posted by ppk on 29 February 2008 | Permalink

In general, switching a style, doing a few calculations and undoing the style switch does not cause a flicker, since browsers only apply the styles after the script has finished or encounters a natural breaking point such as an alert.

That said, it could be there are situations in which going through the entire stack might be useful; and having elementsFromPoint (plural) certainly won't hurt.

So maybe we should try this after all.

Thanks for the idea.

6 Posted by Anne van Kesteren on 29 February 2008 | Permalink

Hi guys. Feedback is welcome on [email protected] though I'll take notes of what's being said here too. Also, the latest editor's draft of the specification is located here: http://dev.w3.org/csswg/cssom-view/

7 Posted by Anne van Kesteren on 1 March 2008 | Permalink

I'm not really sure what to do with the stuff on WindowView. If browsers could actually remove support for the attributes you mention that would be great and the specification should drop them too, but it seems kind of unlikely.

getClientRects() should return multiple boxes for inline elements that generate multiple boxes (when wrapped on several lines for instance). Since both Firefox and Opera have been adding support for these methods (originally IE-only) it made sense to me to define them so everyone can converge.

x/y, not sure what to do with those. I left them in for now given the broad support and fixed the mistake you pointed out. clientX/clientY it is.

8 Posted by Masklinn on 1 March 2008 | Permalink

Anne, why would browsers need to remove support of the attributes? As long as they're not in the spec, can't they be considered as implementation details of browsers?

9 Posted by J. on 3 March 2008 | Permalink

"In my tests only IE sometimes reports more than one box, anyway"

in what version of IE?

10 Posted by Mark Wubben on 3 March 2008 | Permalink

getClientRects() and getBoundingClientRect() are tremendously useful, but unfortunately rather broken in IE. There's a bunch of workarounds in the Xopus codebase dealing with it, perhaps I oughta write it down in a blog post.

Basically, you can get phantom boxes. Or boxes with incorrect coordinates.

11 Posted by ppk on 3 March 2008 | Permalink

Yes, I already suspected phantom boxes. More information would be greatly appreciated.

Thanks,

12 Posted by Stefan Ledent on 13 March 2008 | Permalink

Remark to comment 3:
I don't think that you can assume that the element that is visible is the element onto which the object should be dropped. I suspect that most of the time you would drop onto one of its parent containers and not the element itself!

13 Posted by Stefan Ledent on 13 March 2008 | Permalink

The mouse event properties clientX/clientY are not scaled correctly in IE7 when page zoom is active, unlike in Firefox 3, which also supports page zooming.
This results in broken drag-and-drop behaviour in IE7 whenever the page zoom is not 100%!

14 Posted by Robert O'Callahan on 14 March 2008 | Permalink

Finding the position using getBoundingClientRect or getClientRects is a lot faster than using a loop over the offsetParent chain. (Interesting note: advanced AJAX apps had started to use a Gecko-only trick, "boxObject", to get the viewport offset without using a loop, just for speed. We adopted getBoundingClientRect as a cross-browser solution instead.)

Inline elements often generate multiple boxes, and in Gecko block elements in multicolumn layouts can also generate multiple boxes.

Introducing viewportOffsetX/Y attributes would have the problem that they're not supported at all in IE.

15 Posted by ppk on 20 March 2008 | Permalink

I think I was wrong in dissing the getBoundingClientRect() method; it seems that it makes more sense than I initially thought.

I'm going to re-do all CSS OM tests (also in IE8b1, Saf 3.1 and FF3b4) and when I've done that I'm going to re-judge getBoundingClientRect().

16 Posted by James Vickers on 20 March 2008 | Permalink

We should definitely have a way to get element(s) behind the element being dragged. getElementsByPoint would be one way, but one possibility would be to have a parameter to exclude element(s). Either a single element, or an array of them would be passed in, and these would be excluded from the result.

excluded: a single element, or an array of them

getElementByPoint(excluded)

getElementsByPoint(excluded)