[[#text-html|text/html]] MIME type, then it will be processed as an
HTML document by Web browsers. This specification defines the latest version of the HTML syntax,
known simply as "HTML".
The second concrete syntax is the XHTML syntax, which is an application of XML. When a document
is transmitted with an XML MIME type, such as
application/xhtml+xml, then it is treated as an
XML document by Web browsers, to be parsed by an XML processor. Authors are reminded that the
processing for XML and HTML differs; in particular, even minor syntax errors will prevent a
document labeled as XML from being rendered fully, whereas they would be ignored in the HTML
syntax. This specification defines the latest version of the XHTML syntax, known simply as
"XHTML".
The DOM, the HTML syntax, and the XHTML syntax cannot all represent the same content. For
example, namespaces cannot be represented using the HTML syntax, but they are supported in the
DOM and in the XHTML syntax. Similarly, documents that use the <{noscript}> feature can
be represented using the HTML syntax, but cannot be represented with the DOM or in the XHTML
syntax. Comments that contain the string "-->" can only be represented in the
DOM, not in the HTML and XHTML syntaxes.
foo attribute's value must be a valid integer" is a
requirement on producers, as it lays out the allowed values; in contrast, the requirement "the
foo attribute's value must be parsed using the rules for parsing integers"
is a requirement on consumers, as it describes how to process the content.
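The difference between the two kinds of requirement can be sketched in script. The following is only an illustration: the function names `isValidInteger` and `parseInteger` are hypothetical, and the parser is a simplified reading of the rules for parsing integers rather than a complete implementation.

```javascript
// Producer-side check: a "valid integer" is an optional "-" followed by
// one or more ASCII digits, with nothing else in the string.
function isValidInteger(value) {
  return /^-?[0-9]+$/.test(value);
}

// Consumer-side processing: a simplified take on the "rules for parsing
// integers". Leading space characters are skipped, a single "+" or "-" is
// accepted, and anything after the digits is ignored.
function parseInteger(input) {
  const match = /^[ \t\n\f\r]*([+-]?)([0-9]+)/.exec(input);
  if (match === null) {
    return null; // stands in for the "error" result
  }
  const magnitude = parseInt(match[2], 10);
  return match[1] === '-' ? -magnitude : magnitude;
}
```

Under these assumptions a value such as `" +12px"` is not something a producer may write, yet a consumer still extracts the number 12 from it.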
This is a note.
This is an example.
This is an open issue.
This is a warning.
interface Example {
// this is an IDL definition
};
method( [ optionalArgument ] )
/* this is a CSS fragment */
The defining instance of a term is marked up like this. Uses of that term are marked up like [=this=] or like this.
The defining instance of an element, attribute, or API is marked up like this. References to that element, attribute, or API are marked up like <{this}>.
Other code fragments are marked up like this.
Byte sequences with bytes in the range 0x00 to 0x7F, inclusive, are marked up like `this`.
Variables are marked up like this.
In an algorithm, steps in synchronous sections are marked with ⌛.
In some cases, requirements are given in the form of lists with conditions and corresponding
requirements. In such cases, the requirements that apply to a condition are always the first
set of requirements that follow the condition, even when several conditions share the same set
of requirements. Such cases are presented as follows:
This is a paragraph of text.
=" character. The attribute value can remain unquoted if it doesn't contain
[=space characters=] or any of " ' ` =
< or >. Otherwise, it has to be quoted using either single or
double quotes. The value, along with the "=" character, can be omitted altogether if
the value is the empty string.
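The quoting rule above can be sketched as a small serializer. This is an illustration under stated assumptions: `serializeAttribute` is a hypothetical helper, and it additionally quotes any value containing "&", which is a conservative choice made here rather than something the rule above requires.

```javascript
// Characters that force quoting: space characters (space, tab, LF, FF, CR)
// and any of " ' ` = < >. "&" is added here as a conservative assumption
// so that the value can never be misread as a character reference.
const NEEDS_QUOTES = /[ \t\n\f\r"'`=<>&]/;

function serializeAttribute(name, value) {
  if (value === '') {
    return name; // empty value: the "=" and the value can be omitted
  }
  if (!NEEDS_QUOTES.test(value)) {
    return `${name}=${value}`; // safe to leave unquoted
  }
  // Quote with double quotes, escaping "&" and the quote character itself.
  const escaped = value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return `${name}="${escaped}"`;
}
```

Always emitting double quotes would also be correct; the branches exist only to mirror the three cases described in the text.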
html
#text: ⏎␣␣
#text: Document title
#text: ⏎␣
#text: ⏎␣
#text: ⏎␣␣
#text: Document heading
#text: ⏎␣␣
#text: This is a paragraph of text.
#text: ⏎␣␣
#text: ⏎␣␣
another-html-document.html"
#text: Link text for a link to another-html-document.html
#text: ⏎␣␣
#text: ⏎␣␣
#comment: this is a comment 
#text: ⏎␣⏎
var a = document.links[0]; // obtain the first link in the document
a.href = 'sample.html'; // change the destination URL of the link
a.protocol = 'https'; // change just the scheme part of the URL
a.setAttribute('href', 'http://example.com/'); // change the content attribute directly
Since DOM trees are used as the way to represent HTML documents when they are processed and
presented by implementations (especially interactive implementations like Web browsers), this
specification is mostly phrased in terms of DOM trees, instead of the markup described above.
The document has yellow text and a blue background.
http://example.com/message.cgi?say=%3Cscript%3Ealert%28%27Oh%20no%21%27%29%3C/script%3E
If the attacker then convinced a victim user to visit this page, a script of the attacker's choosing would run on the page. Such a script could do any number of hostile actions, limited only by what the site offers: if the site is an e-commerce shop, for instance, such a script could cause the user to unknowingly make many unwanted purchases. This is called a cross-site scripting attack.
onload attribute
to run arbitrary script.
* When allowing URLs to be provided (e.g., for links), the scheme of each URL also needs to
be explicitly safelisted, as there are many schemes that can be abused. The most prominent
example is "javascript:", but user agents can implement (and indeed, have
historically implemented) others.
* Allowing a <{base}> element to be inserted means any <{script}> elements in the page with
relative links can be hijacked, and similarly that any form submissions can get redirected
to a hostile site.
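As a minimal sketch of scheme safelisting for user-provided URLs, the check below parses the input with the standard `URL` constructor and accepts it only when the resulting scheme is on an explicit list. The function name `isAllowedLinkUrl`, the `baseUrl` parameter, and the particular set of schemes are assumptions made for illustration; each site has to decide on its own list.

```javascript
// Schemes this hypothetical site chooses to allow in user-supplied links.
const ALLOWED_SCHEMES = new Set(['http:', 'https:', 'mailto:']);

// Return true only when the URL parses and its scheme is on the safelist.
// baseUrl is a hypothetical page address used to resolve relative links.
function isAllowedLinkUrl(userUrl, baseUrl) {
  let parsed;
  try {
    parsed = new URL(userUrl, baseUrl);
  } catch (err) {
    return false; // unparseable input is rejected outright
  }
  return ALLOWED_SCHEMES.has(parsed.protocol);
}
```

Checking the parsed scheme, rather than searching the raw string for "javascript:", avoids being fooled by leading whitespace or mixed case, since the URL parser normalizes both before the comparison.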
: Cross-site request forgery (CSRF)
:: If a site allows a user to make form submissions with user-specific side-effects, for example
posting messages on a forum under the user's name, making purchases, or applying for a
passport, it is important to verify that the request was made by the user intentionally,
rather than by another site tricking the user into making the request unknowingly.
This problem exists because HTML forms can be submitted to other origins.
Sites can prevent such attacks by populating forms with user-specific tokens that are
not predictable.
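A user-specific, unpredictable token could be handled along the following lines. This is a sketch for a Node.js server, not a complete defense: the in-memory `tokensBySession` store and the function names are hypothetical, and a real deployment would tie tokens to its own session handling.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory store mapping a session ID to its CSRF token.
const tokensBySession = new Map();

// Create an unpredictable token for the session; the server would embed it
// in a hidden form field when rendering the form.
function issueCsrfToken(sessionId) {
  const token = crypto.randomBytes(32).toString('hex');
  tokensBySession.set(sessionId, token);
  return token;
}

// Accept a submission only if it carries the token issued to this session.
function isValidCsrfToken(sessionId, submittedToken) {
  const expected = tokensBySession.get(sessionId);
  if (typeof expected !== 'string' || typeof submittedToken !== 'string') {
    return false;
  }
  const a = Buffer.from(expected);
  const b = Buffer.from(submittedToken);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

A form submitted from another origin cannot read the token out of the victim site's page, so its request fails the check.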
: Clickjacking
:: A page that provides users with an interface to perform actions that the user might not
wish to perform needs to be designed so as to avoid the possibility that users can be
tricked into activating the interface.
One way that a user could be so tricked is if a hostile site places the victim site in a
small <{iframe}> and then convinces the user to click, for instance by having the user play
a reaction game. Once the user is playing the game, the hostile site can quickly position
the <{iframe}> under the mouse cursor just as the user is about to click, thus tricking the
user into clicking the victim site's interface.
To avoid this, sites that do not expect to be used in frames are encouraged to restrict
their usage within frames with a mechanism such as Content Security Policy's
frame-ancestors directive [[CSP3]], or the HTTP "X-Frame-Options" header
defined in [[rfc7034]].
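For illustration, a server might build both headers as shown below. The helper name `antiFramingHeaders` and its `allowSameOrigin` option are hypothetical; the header names and values are those defined by the two mechanisms mentioned above.

```javascript
// Build response headers asking browsers not to render the page in a frame.
// allowSameOrigin is a hypothetical option for sites that frame themselves.
function antiFramingHeaders(allowSameOrigin = false) {
  return {
    'Content-Security-Policy': allowSameOrigin
      ? "frame-ancestors 'self'"
      : "frame-ancestors 'none'",
    'X-Frame-Options': allowSameOrigin ? 'SAMEORIGIN' : 'DENY',
  };
}
```

Each entry would then be set on the HTTP response by whatever server framework is in use. Sending both headers is a common choice so that user agents without support for the `frame-ancestors` directive still receive a restriction.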
A different method of compromising the user's security involves interacting with their
physical environment. For example, the <{audio}> element could be used to play audio
that interacts with a user's speech-enabled devices.
This could be done in such a way that the user is unaware that it is happening, as in the
dolphin attack. Alternatively, malicious content might target users suspected to have
a limited hearing range or to be relying on an audio interface such as a screen reader,
as determined by fingerprinting users.
load event. The event could fire as soon as the element
has been parsed, especially if the image has already been cached (which is common).
Here, the author uses the onload
handler on an <{img}> element to catch the load event:

load event would be fired in
between, leading it to be missed:
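The ordering problem can be sketched outside a browser. In the following illustration a plain `EventTarget` stands in for the <{img}> element, and the hypothetical `simulate` function dispatches a load event either after or before the listener is attached; it is a model of the race, not the markup an author would write.

```javascript
// Stand-in for an <img> element; EventTarget is used so the sketch runs
// outside a browser.
function simulate(attachBeforeLoad) {
  const img = new EventTarget();
  let handled = false;
  const onLoad = () => { handled = true; };

  if (attachBeforeLoad) {
    img.addEventListener('load', onLoad);
  }
  // The image finishes loading (for example, from cache) at this point.
  img.dispatchEvent(new Event('load'));
  if (!attachBeforeLoad) {
    img.addEventListener('load', onLoad); // too late: the event has fired
  }
  return handled;
}
```

Calling `simulate(true)` reports that the handler ran, while `simulate(false)` reports that it did not, because an event that has already been dispatched is not replayed for listeners added afterwards.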
<font color=""> throughout requires changes across the entire site,
whereas a similar change to a site based on CSS can be done by changing a single file.
: Larger document sizes
:: Presentational markup tends to be much more redundant, and thus results in larger document
sizes.
For those reasons, presentational markup has been removed from HTML in this version. This
change should not come as a surprise; HTML 4.0 deprecated presentational markup many years ago
and provided a mode (HTML Transitional) to help authors move away from presentational markup;
later, XHTML 1.1 went further and obsoleted those features altogether.
The only remaining presentational markup features in HTML are the <{global/style}> attribute
and the <{style}> element. Use of the <{global/style}> attribute is somewhat discouraged in
production environments, but it can be useful for rapid prototyping (where its rules can be
directly moved into a separate style sheet later) and for providing specific styles in unusual
cases where a separate style sheet would be inconvenient. Similarly, the <{style}> element can
be useful in syndication or for page-specific styles, but in general an external style sheet
is likely to be more convenient when the styles apply to multiple pages.
It is also worth noting that some elements that were previously presentational have been
redefined in this specification to be media-independent: <{b}>, <{i}>, <{hr}>, <{s}>,
<{small}>, and <{u}>.