HTML Decode - Entity Decoder

Convert HTML entities back to readable characters

100% client-side · your data never leaves your browser
ℹ️ Common HTML Entities
& → &amp;
< → &lt;
> → &gt;
" → &quot;
' → &#39;
/ → &#x2F;

📖 How to Use

  1. Paste HTML with entities in the input field
  2. The HTML entities are automatically decoded
  3. Entity codes (&lt;, &gt;, &amp;, etc.) are converted back to characters
  4. Click "Copy" to copy the decoded HTML to your clipboard
  5. Use "Load Example" to see how HTML decoding works

About the HTML Decoder

The HTML Decoder is a free tool that instantly converts HTML entities back to their original characters, making encoded content human-readable again. Whether you're processing web-scraped data, parsing API responses, debugging stored content, or working with RSS feeds, this tool provides real-time HTML decoding with complete client-side processing. All decoding happens locally in your browser - no data is transmitted to servers, ensuring complete privacy. Essential for developers working with encoded data, content managers processing imported content, and anyone needing to view the actual text behind HTML entities. Use with caution on untrusted content.

Key Features:

  • Real-time decoding as you paste with instant results
  • 100% client-side processing - your data never leaves your browser
  • No registration, login, or installation required
  • Decodes all entity types: named (&lt;), decimal (&#60;), hexadecimal (&#x3C;)
  • Handles special characters, symbols, and international text
  • Switch between decode and encode modes seamlessly
  • Load examples to understand entity formats
  • Mobile-friendly responsive interface for decoding on any device

Common Use Cases:

  • Web Scraping: Decode scraped HTML content to extract plain text and see actual characters
  • API Response Processing: Decode HTML entities from API responses to display content properly
  • Database Content: Decode stored encoded data to view or export in plain text format (also try our HTML Encoder)
  • RSS/XML Feed Parsing: Decode entities in feed content for display in non-HTML contexts
  • Content Migration: Decode legacy content during migrations or platform changes

HTML Decoding Examples & Implementation

Decoding Scraped Web Content

// Scraped HTML with entities:
&lt;h1&gt;Welcome to Our Site&lt;/h1&gt;
&lt;p&gt;Special offer: 50&#37; off!&lt;/p&gt;

// Decoded output:
<h1>Welcome to Our Site</h1>
<p>Special offer: 50% off!</p>

// JavaScript decoding:
const encoded = '&lt;h1&gt;Title&lt;/h1&gt;';
const textarea = document.createElement('textarea');
textarea.innerHTML = encoded;
const decoded = textarea.value;
console.log(decoded); // "<h1>Title</h1>"

Use case: Decode scraped HTML to extract readable text content

Processing API Responses with Entities

// API returns encoded content:
{
  "title": "Best Pizza &amp; Pasta",
  "description": "&quot;Authentic Italian&quot; cuisine",
  "special": "Buy 2 &amp; get 1 free!"
}

// Decoded for display:
{
  "title": "Best Pizza & Pasta",
  "description": "\"Authentic Italian\" cuisine",
  "special": "Buy 2 & get 1 free!"
}

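// Decode all fields (decodeHTML is any decoding helper, e.g. the textarea approach above; apiData is the parsed response):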
const decoded = Object.fromEntries(
  Object.entries(apiData).map(([k, v]) => [k, decodeHTML(v)])
);

Use case: Decode API responses before displaying to users
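
The snippet above calls decodeHTML without defining it. One way such a helper might look in an environment without a DOM is sketched below. The names decodeHTML, decodeFields, and NAMED, along with the sample apiData object, are assumptions for illustration; the lookup table covers only a handful of entities, so real code would need the full HTML5 entity list or a library such as he.

```javascript
// Hypothetical lookup table: only a few named entities, not the full HTML5 list.
const NAMED = new Map([
  ['amp', '&'], ['lt', '<'], ['gt', '>'],
  ['quot', '"'], ['apos', "'"], ['nbsp', '\u00A0'],
]);

// Hypothetical helper: decodes named, decimal, and hexadecimal references in one pass.
function decodeHTML(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Out-of-range references are left untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Names missing from the table are left as-is.
    return NAMED.has(body) ? NAMED.get(body) : match;
  });
}

// Decode every string field of a flat object; other value types pass through.
function decodeFields(obj) {
  return Object.fromEntries(
    Object.entries(obj).map(([key, value]) =>
      [key, typeof value === 'string' ? decodeHTML(value) : value])
  );
}

const apiData = {
  title: 'Best Pizza &amp; Pasta',
  description: '&quot;Authentic Italian&quot; cuisine',
  rating: 5,
};
console.log(decodeFields(apiData));
```

Because the replacement runs in a single pass, a double-encoded value such as &amp;lt; comes out as &lt; rather than <.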

Python HTML Entity Decoding

# Python decoding:
from html import unescape

encoded = "&lt;p&gt;Hello &amp; Welcome!&lt;/p&gt;"
decoded = unescape(encoded)
print(decoded)  # <p>Hello & Welcome!</p>

# Decode in BeautifulSoup (web scraping):
from bs4 import BeautifulSoup

markup = '&lt;div&gt;Content&lt;/div&gt;'  # named "markup" to avoid shadowing the html module
soup = BeautifulSoup(markup, 'html.parser')
text = soup.get_text()  # Auto-decodes entities

# Safe decoding and sanitization:
import html
from bleach import clean

# encoded_data stands for any entity-encoded input string
encoded_data = "&lt;p&gt;Hi&lt;/p&gt;&lt;script&gt;alert(1)&lt;/script&gt;"
decoded = html.unescape(encoded_data)
safe_html = clean(decoded, tags=['p', 'br'], strip=True)

Use case: Decode HTML entities in Python for data processing or web scraping

Decoding Special Characters and Symbols

// Encoded symbols:
Copyright &copy; 2024 | Price: &euro;50 | Rating: &#9733;&#9733;&#9733;&#9733;&#9734;

// Decoded output:
Copyright © 2024 | Price: €50 | Rating: ★★★★☆

// Common entity mappings:
&copy;   → ©  (copyright)
&reg;    → ®  (registered)
&trade;  → ™  (trademark)
&euro;   → €  (euro)
&pound;  → £  (pound)
&nbsp;   →    (non-breaking space)
&#9733;  → ★  (star decimal)
&#x2605; → ★  (star hexadecimal)

Use case: Decode special symbols and international currency characters
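
The decimal and hexadecimal rows in the table above are two spellings of the same Unicode code point. A small sketch of how both forms can be derived from a character follows; numericForms is a hypothetical helper name, not part of this tool.

```javascript
// Hypothetical helper: returns both numeric reference forms for the first
// code point of the input string.
function numericForms(char) {
  const codePoint = char.codePointAt(0);
  return {
    decimal: `&#${codePoint};`,
    hex: `&#x${codePoint.toString(16).toUpperCase()};`,
  };
}

console.log(numericForms('\u2605')); // star:  { decimal: '&#9733;', hex: '&#x2605;' }
console.log(numericForms('\u20AC')); // euro:  { decimal: '&#8364;', hex: '&#x20AC;' }
```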

Node.js HTML Entity Decoding

// Using 'he' library (recommended):
const he = require('he');

const encoded = '&lt;p&gt;Hello &amp; goodbye&lt;/p&gt;';
const decoded = he.decode(encoded);
console.log(decoded); // "<p>Hello & goodbye</p>"

// Using 'html-entities':
const {decode} = require('html-entities');
const result = decode(encoded);

// Express.js middleware for decoding (assumes a body parser such as express.json() runs first):
app.use((req, res, next) => {
  if (req.body && typeof req.body.content === 'string') {
    req.body.content = he.decode(req.body.content);
  }
  next();
});

Use case: Decode HTML entities in Node.js applications

Safe Decoding with Content Sanitization

// decodeHTML() stands for any entity-decoding helper, such as the textarea approach shown earlier.
import DOMPurify from 'dompurify';

// UNSAFE - Never do this with untrusted content:
const unsafeDecoded = decodeHTML(userInput);
document.body.innerHTML = unsafeDecoded; // XSS vulnerability!

// SAFE - Decode and sanitize:
const encoded = '&lt;script&gt;alert("XSS")&lt;/script&gt;';
const decoded = decodeHTML(encoded); // '<script>alert("XSS")</script>'
const safe = DOMPurify.sanitize(decoded); // Script removed
document.body.innerHTML = safe; // Safe to use

// Or display as plain text:
document.getElementById('output').textContent = decoded; // Safe

Use case: Safely decode and display untrusted content without XSS risks

❓ Frequently Asked Questions

What is HTML decoding and how does it work?

HTML decoding is the reverse process of HTML encoding - it converts HTML entities back to their original characters. Browsers automatically decode entities like &lt;, &gt;, &amp;, &quot;, and &#39; when rendering HTML, displaying them as <, >, &, ", and ' respectively. HTML decoding tools or functions read entity codes (both named entities like &nbsp; and numeric entities like &#60;), look up their corresponding characters, and replace the entity with the actual character. This process makes encoded text human-readable and restores the original content format. Decoding is necessary when working with stored encoded data, scraped content, or API responses that return HTML entities.

When should I decode HTML entities?

Decode HTML entities when working with web-scraped data that contains entities, processing API responses that return encoded HTML, reading data from databases where content was stored in encoded format, parsing RSS/XML feeds with encoded content, converting stored encoded text for display in non-HTML contexts (like plain text or PDF), or when debugging to see the actual characters behind the entities. Always decode when you need to see the original text, but be cautious about where you display decoded content - never insert decoded untrusted content directly into HTML without re-encoding or sanitization.

Is it safe to decode HTML from untrusted sources?

Decoding HTML from untrusted sources is potentially dangerous! The decoded content may contain malicious scripts, event handlers, or other XSS attack vectors. For example, decoding &lt;script&gt;alert("XSS")&lt;/script&gt; produces executable code. If you must decode untrusted content: decode it safely, never render it directly in HTML, re-encode it before display, use a content sanitization library (like DOMPurify), display it as plain text only, or validate against a whitelist of allowed characters. The decoding process itself is safe (it happens in your browser), but what you do with the decoded result determines security. Only decode content from trusted sources if you plan to display it as HTML.
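
The "re-encode it before display" option can be sketched as follows. The helper name escapeHTML and the ESCAPES table are assumptions for illustration; they cover the five characters that matter in HTML text and quoted attribute values, which is a common minimal set rather than a complete sanitizer.

```javascript
// Hypothetical helper: re-encodes the characters that are significant in HTML.
const ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHTML(text) {
  return text.replace(/[&<>"']/g, (char) => ESCAPES[char]);
}

// Decoded, untrusted text is escaped again before being placed in markup.
const decoded = '<script>alert("XSS")</script>';
const markup = `<p>${escapeHTML(decoded)}</p>`;
console.log(markup);
// <p>&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;</p>
```

Escaping suits content that should appear as literal text. When some tags must survive, a sanitizer such as DOMPurify is the better fit.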

What is the difference between HTML decoding and unescaping?

HTML decoding and HTML unescaping are the same process - the terms are used interchangeably. Both refer to converting HTML entities back to their original characters. Some languages and libraries use "unescape" (like Python's html.unescape()), while others use "decode" (like the he and html-entities packages for JavaScript), but they perform the same function. The process reverses HTML encoding/escaping by converting entity codes to actual characters. Whether you call it decoding or unescaping depends on the framework or library you're using, but the result is identical: &amp; becomes &, &lt; becomes <, and so on.

Can I decode all types of HTML entities?

Yes! This tool decodes all standard HTML entity types: named entities (like &lt;, &copy;, &nbsp;), decimal numeric entities (like &#60;, &#169;), and hexadecimal numeric entities (like &#x3C;, &#xA9;). HTML 4 defined about 250 named entities, and HTML5 extends the list to more than 2,000. Numeric entities can represent any Unicode character using its code point. The tool automatically detects the entity format and decodes appropriately. However, malformed entities (incomplete patterns, invalid codes, or typos like &ltt; instead of &lt;) won't decode and will remain as-is. Modern browsers are forgiving with entities, but this tool follows strict HTML5 entity decoding standards.

Why do some websites store HTML entities in databases?

Websites store HTML entities in databases for several reasons: preventing XSS attacks (storing user input as entities prevents script injection), maintaining data integrity (entities preserve special characters without breaking database queries), ensuring safe display (content can be safely inserted into HTML without additional encoding), cross-platform compatibility (entities work consistently across different character encodings), and legacy system support (older systems that couldn't handle UTF-8 relied on entities). However, modern best practice is to store raw UTF-8 text in databases and encode only when rendering to HTML. This makes data more portable and easier to use in non-HTML contexts like JSON APIs or plain text exports.

How do I decode HTML entities in JavaScript?

JavaScript doesn't have a built-in decodeHTML() function, but you can decode entities using the browser's DOM. Create a temporary element, set its innerHTML to the encoded string, then read its text back (textContent, or value for a textarea). Example: const decoded = new DOMParser().parseFromString(encoded, "text/html").documentElement.textContent; Or use a simpler approach: const textarea = document.createElement("textarea"); textarea.innerHTML = encoded; const decoded = textarea.value; Both methods leverage the browser's built-in entity decoding. For Node.js, which has no DOM, use a third-party package such as he or html-entities. Never use this to decode untrusted content that will be inserted back into HTML (also try our HTML Encoder).

What happens if I decode entities twice?

Double-decoding HTML entities can cause issues, especially with ampersands. For example: &amp;lt; (encoded) → &lt; (first decode) → < (second decode). If you decode twice, double-encoded entities become their literal characters, which might break HTML structure or create security vulnerabilities. This is particularly problematic when working with user input that might already contain entities. Always track whether content is encoded or decoded, decode only once, and avoid round-trip encoding/decoding cycles. If you're unsure whether content is encoded, check for entity patterns (&...; format) before decoding. Modern frameworks typically handle this automatically to prevent double-encoding/decoding issues.
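
The effect of a second decode, and the entity-pattern check mentioned above, can be illustrated with a short sketch. decodeOnce and looksEncoded are hypothetical names, and the decoder is deliberately limited to four entities, just enough to show the behaviour.

```javascript
// Hypothetical single-pass decoder limited to a few entities.
const BASIC = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"' };

function decodeOnce(text) {
  return text.replace(/&(?:amp|lt|gt|quot);/g, (entity) => BASIC[entity]);
}

// Rough heuristic: true when the text contains something shaped like an entity.
function looksEncoded(text) {
  return /&(?:#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/i.test(text);
}

const stored = '&amp;lt;b&amp;gt;';
const first = decodeOnce(stored);  // "&lt;b&gt;" : the text as originally typed
const second = decodeOnce(first);  // "<b>"       : now live markup
console.log(first, second);
console.log(looksEncoded(first), looksEncoded(second)); // true false
```

The check is only a heuristic: correctly decoded text can itself contain a literal &lt; (as `first` does here), so tracking the encoding state of each value is more reliable than guessing from its content.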

🔗 Related Tools