Question 1

Is HTML entity encoding enough to prevent XSS?

Accepted Answer

For text content placed between HTML tags, yes. For attribute values, always quote the attribute and encode the quote character that delimits it; an unquoted attribute value can be broken out of with a space alone. For JavaScript event handlers or style attributes, HTML entity encoding alone is not sufficient and dedicated context-aware escaping is required. For modern web frameworks (React, Vue, Angular), the framework handles encoding automatically when you bind a variable to a text node; manual encoding is rarely needed.
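As a rough illustration of the difference between contexts, here is a minimal Python sketch using the standard library's `html.escape`. The helper names `escape_text` and `escape_attr` and the sample strings are made up for this example; it is not how this tool or any framework is implemented.

```python
import html

def escape_text(value: str) -> str:
    """Escape for text between tags: encodes &, < and > only."""
    return html.escape(value, quote=False)

def escape_attr(value: str) -> str:
    """Escape for a quoted attribute value: also encodes " and '."""
    return html.escape(value, quote=True)

payload = '"><script>alert(1)</script>'
print(f"<p>{escape_text(payload)}</p>")
print(f'<input value="{escape_attr(payload)}">')

# Event-handler context: the browser decodes entities in the attribute
# value BEFORE the JavaScript runs, so the script sees the raw string again.
hostile = "'); alert(1); //"
encoded = escape_attr(hostile)
print(f"<button onclick=\"greet('{encoded}')\">Hi</button>")
print(html.unescape(encoded) == hostile)  # what the JS engine ends up seeing
```

The last line is the reason entity encoding alone does not protect an `onclick` handler: the value has to be escaped for JavaScript first and for HTML second.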

Question 2

Why use named entities vs numeric entities?

Accepted Answer

Named entities (&amp;, &lt;, &copy;) are more readable in the source. Numeric entities (&#38;, &#60;, &#169;) work for any character and avoid the need to remember entity names. For the five reserved characters, named entities are more common. For other characters (typographic, mathematical, currency), either works; numeric entities are more portable because every decoder understands them, while support for a named entity depends on the decoder's entity table (XML, for example, predefines only five names).
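To check that the two spellings are interchangeable, a short Python sketch can look up each name in the standard library's `html.entities.name2codepoint` table and compare the decoded results. The helper `to_numeric` is a hypothetical name used only for this example.

```python
import html
from html.entities import name2codepoint

def to_numeric(name: str) -> str:
    """Return the decimal numeric entity for an entity name such as 'copy'."""
    return f"&#{name2codepoint[name]};"

for name in ("amp", "lt", "copy"):
    named = f"&{name};"
    numeric = to_numeric(name)
    # Both spellings decode to the same single character.
    assert html.unescape(named) == html.unescape(numeric)
    print(named, numeric, html.unescape(named))
```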

Question 3

What characters need encoding outside the basic five?

Accepted Answer

Only the five reserved characters strictly need encoding in HTML content: & < > " '. Other characters (em dashes, smart quotes, accented letters) can appear directly in HTML if the page declares UTF-8 (which nearly all modern pages do via the meta charset declaration). Encoding them anyway works and is sometimes done for compatibility with old systems, but it is not required on the modern web.
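A small Python sketch of both cases, assuming the standard library's `html.escape` as the encoder (this tool may produce different but equivalent spellings, for example `&#39;` instead of `&#x27;`). The variable names are illustrative only.

```python
import html

reserved = "&<>\"'"
escaped_reserved = html.escape(reserved, quote=True)
print(escaped_reserved)  # &amp;&lt;&gt;&quot;&#x27;

# Non-ASCII text passes through untouched; UTF-8 carries it as-is.
word = "caf\u00e9"
assert html.escape(word) == word

# For an ASCII-only legacy target, Python's "xmlcharrefreplace" error
# handler turns every non-ASCII character into a numeric entity.
ascii_safe = word.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)  # caf&#233;
```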

Question 4

Does this handle Unicode code points above U+FFFF?

Accepted Answer

Yes. Code points in the supplementary planes (emoji, ancient scripts) mostly have no named entity, so they are written as numeric entities, and this encoder and decoder handle them correctly. The encoder outputs them as numeric entities; the decoder accepts both decimal (&#128512;) and hexadecimal (&#x1F600;) forms.
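The decimal and hexadecimal forms can be reproduced with a few lines of Python. This is only a sketch of the idea, not this tool's implementation; `encode_non_ascii` is a hypothetical helper, and it deliberately leaves the five reserved characters alone.

```python
import html

def encode_non_ascii(text: str, hex_form: bool = False) -> str:
    """Replace every non-ASCII code point with a numeric entity."""
    out = []
    for ch in text:  # Python iterates by code point, not by UTF-16 unit
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif hex_form:
            out.append(f"&#x{cp:X};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)

smiley = "\U0001F600"  # a code point above U+FFFF
print(encode_non_ascii(smiley))                 # &#128512;
print(encode_non_ascii(smiley, hex_form=True))  # &#x1F600;

# Both forms decode back to the same single code point.
assert html.unescape("&#128512;") == html.unescape("&#x1F600;") == smiley
```

In a language whose strings are UTF-16 (JavaScript, Java), the same loop has to iterate by code point rather than by code unit, otherwise the emoji is emitted as two surrogate halves.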

HTML Entity Encoder and Decoder (Named + Numeric)
