Why doesn't an incorrect HTML tag cause an error?




The HTML specification has a concept called custom elements, and it defines a production that the names of these elements must match. However, in the browser's editor (or in a simple test page) we can freely write elements whose names do not follow these rules, for example <redcar> </redcar>. Why is this allowed, and why does it not cause any errors? By contrast, if we write something like <~hello> </~hello>, the opening tag is treated as text and the closing tag is turned into a comment. In either case, I am looking for specific links that explain this behavior.





A valid custom element name is a sequence of characters name that meets all of the following requirements:

name must match the PotentialCustomElementName production:

PotentialCustomElementName ::= [a-z] (PCENChar)* '-' (PCENChar)*

PCENChar ::= "-" | "." | [0-9] | "_" | [a-z] | #xB7 | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x203F-#x2040] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

This uses the EBNF notation from the XML specification. [XML]

name must not be any of the following:

annotation-xml
color-profile
font-face
font-face-src
font-face-uri
font-face-format
font-face-name
missing-glyph
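
To make the quoted grammar concrete, here is a minimal JavaScript sketch that checks a name against it. The function name isValidCustomElementName and the constant names are made up for this illustration; the character ranges are transcribed from the PCENChar production quoted above, and the reserved names are the eight hyphenated names listed after it. It is a sketch of the quoted rules, not the code a browser actually runs.

```javascript
// PCENChar as a regex character class, transcribed from the quoted production.
const PCEN_CHAR = "[" + [
  String.raw`-.0-9_a-z\u{B7}\u{C0}-\u{D6}\u{D8}-\u{F6}\u{F8}-\u{37D}`,
  String.raw`\u{37F}-\u{1FFF}\u{200C}-\u{200D}\u{203F}-\u{2040}`,
  String.raw`\u{2070}-\u{218F}\u{2C00}-\u{2FEF}\u{3001}-\u{D7FF}`,
  String.raw`\u{F900}-\u{FDCF}\u{FDF0}-\u{FFFD}\u{10000}-\u{EFFFF}`,
].join("") + "]";

// PotentialCustomElementName ::= [a-z] (PCENChar)* '-' (PCENChar)*
const POTENTIAL_NAME = new RegExp(
  "^[a-z]" + PCEN_CHAR + "*-" + PCEN_CHAR + "*$",
  "u"
);

// Hyphenated names that are reserved and therefore never valid custom element names.
const RESERVED_NAMES = new Set([
  "annotation-xml", "color-profile", "font-face", "font-face-src",
  "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
]);

// Hypothetical helper: true only if the name matches the production
// and is not one of the reserved names.
function isValidCustomElementName(name) {
  return POTENTIAL_NAME.test(name) && !RESERVED_NAMES.has(name);
}

console.log(isValidCustomElementName("red-car"));   // has a hyphen, starts with [a-z]
console.log(isValidCustomElementName("redcar"));    // no hyphen
console.log(isValidCustomElementName("font-face")); // reserved
```

Under these rules redcar fails only because it has no hyphen, which is why it is not a valid custom element name even though the browser still accepts the tag.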





Could you please edit this to make it clearer exactly what question you are asking?
– EmandM
Jul 5 at 4:10





It's very helpful that unknown tags are consumed the way they are. It makes it easy to add new elements to HTML, and browsers that don't yet understand the new elements can just accept them and continue. Because the tags form elements rather than just being discarded, it's possible to add polyfills to replicate the newly expected behaviour.
– Alohci
Jul 5 at 5:34




3 Answers



It's unclear what you'd consider an error.



HTML parsing follows a "never throw" principle: the parser tries to convert whatever it is given into something valid.



In your specific case, what you created is an HTMLUnknownElement, and this follows the spec:



The element interface for an element with name name in the HTML namespace is determined as follows:

1. If name is applet, bgsound, blink, isindex, keygen, multicol, nextid, or spacer, then return HTMLUnknownElement.

2. If name is acronym, basefont, big, center, nobr, noembed, noframes, plaintext, rb, rtc, strike, or tt, then return HTMLElement.

3. If name is listing or xmp, then return HTMLPreElement.

4. Otherwise, if this specification defines an interface appropriate for the element type corresponding to the local name name, then return that interface.

5. If other applicable specifications define an appropriate interface for name, then return the interface they define.

6. If name is a valid custom element name, then return HTMLElement.

7. Return HTMLUnknownElement.



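With <redcar></redcar>, you went all the way down to the last rule, #7: redcar contains no hyphen, so it is not a valid custom element name, and the final rule returns HTMLUnknownElement.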
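
The lookup quoted above can be sketched in plain JavaScript. Everything named here is an assumption made for illustration: elementInterfaceName is a hypothetical function that returns the interface name as a string, KNOWN_INTERFACES is a deliberately tiny stand-in for rules 4 and 5 (the real tables cover every standard element), and the custom element name check is simplified to ASCII characters only.

```javascript
// Rule 1: obsolete names that map to HTMLUnknownElement.
const UNKNOWN_NAMES = new Set([
  "applet", "bgsound", "blink", "isindex", "keygen", "multicol", "nextid", "spacer",
]);

// Rule 2: obsolete names that map to plain HTMLElement.
const PLAIN_NAMES = new Set([
  "acronym", "basefont", "big", "center", "nobr", "noembed", "noframes",
  "plaintext", "rb", "rtc", "strike", "tt",
]);

// Rule 3: names that map to HTMLPreElement.
const PRE_NAMES = new Set(["listing", "xmp"]);

// Rules 4 and 5: hypothetical, heavily truncated table of known elements.
const KNOWN_INTERFACES = new Map([
  ["div", "HTMLDivElement"],
  ["p", "HTMLParagraphElement"],
  ["a", "HTMLAnchorElement"],
]);

// Rule 6, simplified: ASCII-only version of the custom element name grammar.
const RESERVED_NAMES = new Set([
  "annotation-xml", "color-profile", "font-face", "font-face-src",
  "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
]);
function looksLikeCustomElementName(name) {
  return /^[a-z][-.0-9_a-z]*-[-.0-9_a-z]*$/.test(name) && !RESERVED_NAMES.has(name);
}

// Walks the seven rules in order and returns the interface name as a string.
function elementInterfaceName(name) {
  if (UNKNOWN_NAMES.has(name)) return "HTMLUnknownElement";      // rule 1
  if (PLAIN_NAMES.has(name)) return "HTMLElement";               // rule 2
  if (PRE_NAMES.has(name)) return "HTMLPreElement";              // rule 3
  if (KNOWN_INTERFACES.has(name)) return KNOWN_INTERFACES.get(name); // rules 4-5
  if (looksLikeCustomElementName(name)) return "HTMLElement";    // rule 6
  return "HTMLUnknownElement";                                   // rule 7
}

console.log(elementInterfaceName("redcar"));
console.log(elementInterfaceName("red-car"));
```

In a browser console the same distinction can be observed directly: document.createElement('redcar') instanceof HTMLUnknownElement is true, while the same check for 'red-car' is false.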





@MaximPro this comes from an earlier part of the spec, namely tag name parsing. The ~ character is not allowed in an HTML tag name (it might be in XML, though I don't remember exactly, but in any case not as a starting character). So you never even enter this algorithm, because it is parsed as text. Now, I'm not entirely sure why </[invalid-char] gets converted to a comment... But this might be somewhere in between the SGML and HTML specs.
– Kaiido
Jul 5 at 6:06







@MaximPro - The "analysis stage" is called "Tokenization" and is described in full in the HTML5 spec.
– Alohci
Jul 5 at 11:04





@Alohci Thanks, I didn't have the time to look it up. So what OP created with their tilde is html.spec.whatwg.org/multipage/…
– Kaiido
Jul 5 at 13:46







@MaximPro check the beginning of the document you linked to: "This section only applies to documents, authoring tools, and markup generators. In particular, it does not apply to conformance checkers; conformance checkers must use the requirements given in the next section ("parsing HTML documents")." That document describes writing HTML; what you are dealing with is parsing. Parsing is more flexible (because of the "never throw" rule I talked about), and will thus allow some characters outside the ASCII alphanumeric range inside a tag name (probably because XML does allow them in its syntax).
– Kaiido
Jul 6 at 6:03





Writing, on the other hand, can be less flexible; e.g. from the DOM, it will throw. So to make things clearer: at parsing, <f~oo> will work, but document.createElement('f~oo') will throw. And the rules for writing a tag name in HTML are clear: only ASCII alphanumerics.
– Kaiido
Jul 6 at 6:05







I agree with @Kaiido.
I would just explain why <~hello> </~hello> throws no error. A parser consumes its input character by character or token by token (depending on how it is implemented). Either way, after a < the browser expects a set of valid characters followed by a > to declare an opening tag. If the characters after the < are not valid for a tag name, it simply isn't a tag, so the browser parses it as a text node.

The closing tag is handled differently. After a < and a /, the tokenizer again expects a valid set of characters followed by a >. If it finds invalid characters instead, it does not fall back to a text node: the </ together with everything up to the next > is turned into a comment, since it doesn't close any tag. This behavior comes from the tokenization rules in the HTML spec rather than from a particular browser.
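
The two cases described above can be sketched as a tiny classifier in JavaScript. This is only an illustration of the first decision the tokenizer makes after a <, based on the "tag open" and "end tag open" states of the HTML spec; classifyLessThan and its return labels are names invented for this sketch, and it does not tokenize the rest of the tag.

```javascript
// True for a single ASCII letter, the only kind of character that can start a tag name.
const isAsciiAlpha = (ch) => /^[A-Za-z]$/.test(ch);

// Given text that starts with "<", reports how the tokenizer treats it.
function classifyLessThan(input) {
  if (input[0] !== "<") throw new Error("input must start with '<'");
  const next = input[1];

  if (next === undefined) return "text";          // lone "<" at end of input
  if (isAsciiAlpha(next)) return "start tag";     // e.g. <redcar>
  if (next === "!") return "markup declaration";  // comment, doctype, CDATA
  if (next === "?") return "bogus comment";       // e.g. <?xml ...>

  if (next === "/") {
    const afterSlash = input[2];
    if (afterSlash === undefined) return "text";  // "</" at end of input
    if (isAsciiAlpha(afterSlash)) return "end tag";
    if (afterSlash === ">") return "ignored";     // "</>" produces nothing
    return "bogus comment";                       // e.g. </~hello>
  }

  return "text";                                  // e.g. <~hello>
}

console.log(classifyLessThan("<~hello>"));
console.log(classifyLessThan("</~hello>"));
console.log(classifyLessThan("<f~oo>"));
```

Note that only the first character after < or </ decides the outcome, which is why <f~oo> is still tokenized as a start tag while <~hello> is not.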





Of course errors are caused. You can find them here:



https://validator.w3.org/



If the errors were fatal, i.e. if the browser stopped rendering the page after the first error, Google would be out of business, because so few websites would be online that one could easily memorize them.





