A bad idea: GUIDs as DOM element IDs

Apr 21, 2016  

I am using GUIDs as ids for HTML elements. Sometimes my code works, sometimes it does not.

<span id="b61efa7a-a7a4-4cc1-bc3c-9dffc724d541">Hello</span>
<span id="441c901f-3b33-4c8b-829f-c8ba297b0f14">world</span>

In this example, I can select Hello, but not world using the JavaScript document API.

Format of an HTML5 id

The MDN documentation for the id global attribute states that the ID should start with a letter for compatibility, but that this restriction has been lifted in HTML 5.

Indeed, checking the W3C HTML5 specification for the id attribute, I see this:

Note: There are no other restrictions on what form an ID can take; in particular, IDs can consist of just digits, start with a digit, start with an underscore, consist of just punctuation, etc.

In my example, world is identified by an ID starting with a digit, and that seems to be the problem.

Numeric IDs should work, but they don’t

When working with IDs in Chrome, I systematically get an error when the ID selector starts with a digit, as some manual testing proves:

document.querySelector ('#foo'); // returns null
document.querySelector ('#123'); // throws a DOMException, not a valid selector

So I went on and double checked the CSS3 ID selectors specification of the W3C. It does not define what a valid ID is, but simply points to the CSS 2.1 documentation on CSS identifiers.

In CSS, identifiers […] cannot start with a digit, two hyphens, or a hyphen followed by a digit.

This contradicts the HTML5 specification: the id itself is valid, but querySelector parses its argument as a CSS selector and therefore applies the CSS identifier rules.
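To make the rule concrete, here is a minimal sketch of a check for it. The function name startsLikeCssIdentifier is made up for this illustration, and it only looks at the start of the string as described in the CSS 2.1 quote. That is enough for GUIDs, which contain nothing but hex digits and hyphens, but it is not a full identifier validator.

```javascript
// Hypothetical helper: true when the id can follow '#' in a selector as-is.
// It encodes only the quoted rule: no leading digit, no leading "--",
// and no leading hyphen followed by a digit.
function startsLikeCssIdentifier(id) {
  return !/^(?:[0-9]|--|-[0-9])/.test(id);
}

// The two ids from the example at the top of this post.
console.log(startsLikeCssIdentifier("b61efa7a-a7a4-4cc1-bc3c-9dffc724d541")); // true
console.log(startsLikeCssIdentifier("441c901f-3b33-4c8b-829f-c8ba297b0f14")); // false
```

The second id fails only because of its leading 4, which matches what I observed in Chrome.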

Workaround

Thankfully, there is a workaround (see also this StackOverflow question), which is to escape the first digit:

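document.querySelector ('#\\31 23'); // selects id 123

The \31 escape is the CSS escape for U+0031, the digit 1. The backslash is doubled because it sits inside a JavaScript string literal; the selector the CSS parser sees is #\31 23.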

Note the space after the \31. It terminates the escape: CSS escapes are hexadecimal and can take up to six hex digits, so without the space the string would be parsed as \3123, which maps to U+3123 ㄣ BOPOMOFO LETTER EN.
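The escaping can be wrapped in a small function. What follows is only a sketch: idSelector is a name I made up, and it handles nothing but a leading digit, which is the only problem a GUID can have. Browsers also ship a built-in CSS.escape() that covers the general case.

```javascript
// Hypothetical helper: builds an ID selector, escaping a leading digit.
function idSelector(id) {
  const first = id.charCodeAt(0);
  if (first >= 0x30 && first <= 0x39) {
    // Emit the digit as a hex escape (\30 to \39) followed by a space,
    // so the hex digits that come next are not read as part of the escape.
    return "#\\" + first.toString(16) + " " + id.slice(1);
  }
  return "#" + id;
}

console.log(idSelector("123")); // #\31 23
console.log(idSelector("441c901f-3b33-4c8b-829f-c8ba297b0f14"));
// #\34 41c901f-3b33-4c8b-829f-c8ba297b0f14
```

The result can be passed straight to document.querySelector, since the backslash is already part of the string value.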

Fix

I don’t like workarounds. If I can live without them, I feel more comfortable. So what are my other options?

  • Do not use GUIDs ⇒ that would require quite a bit of work in my libraries; it is too much effort.
  • Prefix the GUIDs when using them as HTML5 ids ⇒ that would require changes in my parsing code to ensure that I strip the prefix when reading an id attribute.
  • Filter the GUIDs to discard those which start with a digit ⇒ this is easy, but may require multiple attempts before a suitable GUID is generated.
  • Patch the GUIDs to ensure they start with a letter ⇒ since most of the bits in a GUID are random, I could set bits 7 and 6 of byte 3, which would guarantee that my GUIDs always start with c, d, e or f.
  • Write a GUID generator which guarantees that the first character is a letter ⇒ it is probably not worth the effort.

I could be even less subtle and simply replace the first character of the generated GUID with an f.
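The arithmetic behind the patching option is easy to check on the string form of a GUID. This is a sketch under assumptions: ensureStartsWithLetter is a name chosen for the illustration, it expects the canonical lowercase hyphenated form, and it relies on the first two hex characters of that form being the byte referred to as byte 3 above.

```javascript
// Hypothetical helper: forces the first hex digit of a GUID string to c-f.
function ensureStartsWithLetter(guid) {
  if (!/^[0-9]/.test(guid)) {
    return guid; // already starts with a letter, leave it alone
  }
  const firstByte = parseInt(guid.slice(0, 2), 16);
  // OR-ing with 0xc0 sets bits 7 and 6, so the result lies in 0xc0..0xff
  // and its hex form always has two characters starting with c, d, e or f.
  const patched = (firstByte | 0xc0).toString(16);
  return patched + guid.slice(2);
}

console.log(ensureStartsWithLetter("441c901f-3b33-4c8b-829f-c8ba297b0f14"));
// c41c901f-3b33-4c8b-829f-c8ba297b0f14
```

Here 0x44 becomes 0xc4. The price is that two bits of the first byte are no longer random for the patched GUIDs.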

Implemented solution

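public static System.Guid EnsureStartsWithLetter(System.Guid guid)
{
    // ToByteArray() stores the first 32-bit group in little-endian order,
    // so bytes[3] holds the first two hex digits of the string form.
    var bytes = guid.ToByteArray ();

    // High nibble below 0xa: the string form starts with a digit 0-9.
    if ((bytes[3] & 0xf0) < 0xa0)
    {
        // Setting bits 7 and 6 forces the first hex digit to c, d, e or f.
        bytes[3] |= 0xc0;
        return new System.Guid (bytes);
    }
    return guid;
}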