Engineering Publication · Syed Omar Ibn Feroj
Engineering Notes · Open Knowledge Repository
HTML5 — Engineering
Notes for the Document
A working reference for HTML as a platform — semantic structure that machines understand, forms that validate themselves, the browser APIs worth knowing, and the accessibility and loading details that separate a page from a product.
28 Chapters · Living Standard · Living Document · Free License
Ch. 01
Document & Doctype
The five lines every HTML document needs, and why each one is there.
1.1 — The minimum valid document
HTML
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Page</title>
</head>
<body>…</body>
</html>
A missing or malformed <!doctype html> drops the browser into "quirks mode", where layout and many CSS behaviours silently change. Keep it as the very first thing in the file, since some older engines fell back to quirks mode when anything preceded it.
Ch. 02
The Head
Metadata the user never sees but every machine reads.
2.1 — The non-negotiable head
HTML
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Specific, unique page title</title>
<meta name="description" content="…">
<link rel="canonical" href="https://example.com/page">
Without the viewport meta, mobile browsers lay the page out at a virtual width of roughly 980px and scale it down, so your narrow-screen media queries never match. It's the single most important line for mobile.
Ch. 03
Text Content
Headings, paragraphs, and using the right element for meaning.
3.1 — Structure is semantics, not size
HTML
<h1>One per page: the document's title</h1>
<h2>Section</h2>
<h3>Sub-section</h3>
<p>Prose. <strong>Importance</strong>, <em>emphasis</em>.</p>
<code>inline code</code>
<pre>block, whitespace kept</pre>
<blockquote cite="…">quoted</blockquote>
Don't skip heading levels for visual size (h1 → h4) — assistive tech builds a document outline from them. Style with CSS; choose the element by meaning.
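The "no skipped levels" rule is easy to check mechanically. The following is a minimal sketch, not a full outline algorithm: `findSkippedHeadings` is a hypothetical helper that takes the heading levels in document order and reports every place where the level jumps more than one step deeper.

```javascript
// Hypothetical helper: reports places where a heading jumps more than one level deeper.
// `levels` is the sequence of heading levels in document order, e.g. [1, 2, 3, 2].
function findSkippedHeadings(levels) {
  const skips = [];
  let previous = 0; // 0 means "no heading seen yet", so the first heading should be an h1
  levels.forEach((level, index) => {
    if (level > previous + 1) {
      skips.push({ index, from: previous, to: level });
    }
    previous = level; // going back up (h3 -> h2) is always fine
  });
  return skips;
}

console.log(findSkippedHeadings([1, 2, 3, 2, 3])); // []
console.log(findSkippedHeadings([1, 4]));          // [ { index: 1, from: 1, to: 4 } ]
```

In a browser, the input could be collected with `[...document.querySelectorAll("h1,h2,h3,h4,h5,h6")].map((h) => Number(h.tagName[1]))`.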
Ch. 04
Links
The anchor — and the security attribute people forget.
4.1 — Targets, rels, and safe new tabs
HTML
<a href="/about">internal</a>
<a href="#section">in-page</a>
<a href="https://x.com" target="_blank" rel="noopener noreferrer">external</a>
<a href="mailto:x@y.com">email</a>
<a href="tel:+880…">call</a>
target="_blank" without rel="noopener" gives the opened page access to window.opener, a tab-napping vector. Modern browsers imply noopener, but set it explicitly for older engines.
Ch. 05
Lists
Ordered, unordered, and the description list nobody uses.
5.1 — Three kinds, each with a job
HTML
<ul><li>unordered</li></ul>
<ol start="3"><li>ordered, starts at 3</li></ol>
<dl><dt>Term</dt><dd>Definition</dd></dl>
Navigation menus should be a <ul> inside <nav>: screen readers announce "list, N items", so users know how long the menu is, and the nav landmark gives them a way to skip it.
Ch. 06
Tables
For tabular data only — never for layout.
6.1 — A correct, accessible table
HTML
<table>
<caption>Q1 revenue</caption>
<thead><tr><th scope="col">Month</th><th scope="col">USD</th></tr></thead>
<tbody><tr><th scope="row">Jan</th><td>1200</td></tr></tbody>
</table>
scope on header cells is what lets a screen reader say "Jan, USD, 1200". A table without th/scope is an unreadable grid to non-visual users. Never use tables to position content; that's CSS's job.
Ch. 07
Semantic Sectioning
header, nav, main, article, section, aside, footer.
7.1 — Landmarks the browser and AT understand
HTML
<body>
<header>…</header>
<nav aria-label="Primary">…</nav>
<main>
<article><h2>Post</h2>…</article>
</main>
<footer>…</footer>
</body>
There must be at most one visible <main> per page, and it must not be nested inside article, aside, header, nav or footer. section needs a heading to be meaningful; otherwise use a plain div.
Ch. 08
Images & srcset
Responsive images and the attribute that prevents layout shift.
8.1 — width/height, srcset, lazy
HTML
<img src="a.jpg" alt="Meaningful description"
width="800" height="600" loading="lazy"
srcset="a-400.jpg 400w, a-800.jpg 800w"
sizes="(max-width: 600px) 400px, 800px">
Always set width and height (or aspect-ratio CSS). Without them the browser cannot reserve space for the image before it loads, and the resulting jump counts against Cumulative Layout Shift in Core Web Vitals. Use alt="" (empty) for purely decorative images; never omit the attribute.
Ch. 09
Audio & Video
Native media, captions, and the autoplay reality.
9.1 — Sources, tracks, controls
HTML
<video controls preload="metadata" poster="p.jpg">
<source src="v.webm" type="video/webm">
<source src="v.mp4" type="video/mp4">
<track kind="captions" src="en.vtt" srclang="en" default>
</video>
Browsers block autoplay with sound. Autoplay works reliably only if the video is muted (and ideally playsinline on iOS). Captions via <track> are an accessibility and SEO requirement, not an extra.
Ch. 10
SVG & Picture
Vector graphics inline, and art-directed responsive images.
10.1 — Inline SVG and <picture>
HTML
<svg viewBox="0 0 24 24" width="24" aria-hidden="true">
<path d="M4 12h16"/>
</svg>
<picture>
<source media="(min-width: 800px)" srcset="wide.jpg">
<img src="narrow.jpg" alt="…">
</picture>
Inline SVG can be styled and animated with CSS and scripted via the DOM; an SVG loaded through <img> cannot. Use inline for icons you theme, <img> for static illustrations.
Ch. 11
Embedding
iframe sandboxing — the security control most pages skip.
11.1 — Sandbox third-party frames
HTML
<iframe src="https://widget.example" sandbox="allow-scripts allow-same-origin" loading="lazy" title="Pricing widget"></iframe>
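An unsandboxed third-party iframe can run scripts, navigate your top window, submit forms and trigger downloads. Start from sandbox="" (everything off) and add the minimum allow-* tokens. Combining allow-scripts with allow-same-origin is only safe when the frame is served from a different origin than your page; same-origin content with both tokens can remove its own sandbox. Always give frames a title for AT.
Ch. 12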
Inputs
The right input type does validation, UX and mobile keyboards for free.
12.1 — Type carries behaviour
HTML
<label for="e">Email</label>
<input id="e" type="email" autocomplete="email" required>
<input type="number" min="1" max="10" step="1">
<input type="date">
<input type="search">
<input type="tel">
Every input needs an associated <label> (via for/id or wrapping). A placeholder is not a label: it disappears as soon as the user types and is not reliably announced by screen readers. Correct type + autocomplete also fixes the mobile keyboard and autofill.
Ch. 13
Validation
Constraint validation — the browser validates before your JS runs.
13.1 — Built-in constraints + the API
HTML
<input required minlength="3" pattern="[a-z0-9]+"
title="lowercase letters and digits only">
JavaScript
if (!input.checkValidity()) {
  msg.textContent = input.validationMessage;
}
input.setCustomValidity("already taken"); // custom rule; call with "" to clear it
Client validation is UX, never security — it's trivially bypassed. Re-validate every field on the server.
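As an illustration of that server-side re-check, here is a minimal sketch that mirrors the constraints from the markup above (`required`, `minlength="3"`, `pattern="[a-z0-9]+"`). The function name `validateUsername` and the error strings are assumptions made for this example; the field is not named in the original. Note that the `pattern` attribute must match the whole value, so the regular expression is anchored explicitly.

```javascript
// Hypothetical server-side check mirroring: required minlength="3" pattern="[a-z0-9]+"
// The pattern attribute matches the entire value, hence the ^...$ anchors here.
const USERNAME_PATTERN = /^[a-z0-9]+$/;

function validateUsername(value) {
  // required: reject missing, non-string and empty values
  if (typeof value !== "string" || value.length === 0) {
    return { ok: false, error: "required" };
  }
  // minlength="3"
  if (value.length < 3) {
    return { ok: false, error: "too short" };
  }
  // pattern="[a-z0-9]+"
  if (!USERNAME_PATTERN.test(value)) {
    return { ok: false, error: "lowercase letters and digits only" };
  }
  return { ok: true };
}

console.log(validateUsername("ab").ok);     // false
console.log(validateUsername("omar42").ok); // true
```

Keeping the rules in one place on the server means a request that bypasses the form entirely is still rejected.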
novalidate on the form disables the native bubbles when you render your own messages.
Ch. 14
Submission
method, enctype, and FormData.
14.1 — The enctype that breaks file uploads
HTML
<form method="post" action="/submit"
enctype="multipart/form-data">
<input type="file" name="doc">
</form>
A file input only uploads if the form's enctype is multipart/form-data. With the default URL-encoding the server receives just the filename, not the bytes: a classic "uploads silently empty" bug.
Ch. 15
The DOM
The live tree — query it efficiently.
15.1 — Select, create, mutate
JavaScript
const el = document.querySelector(".card");
const all = document.querySelectorAll("li"); // static NodeList
const n = document.createElement("p");
n.textContent = "safe: no HTML parsing";
el.append(n);
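When a string has to end up inside markup rather than in textContent (for example in a server-rendered template), escaping is the minimum safeguard. The sketch below is a hypothetical `escapeHtml` helper covering HTML text and quoted attribute values only. It is not a sanitiser: it makes the whole string inert instead of allowing a safe subset of tags, and it does not cover script, style or URL contexts.

```javascript
// Escapes the five characters that are significant in HTML text and quoted attribute values.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```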
innerHTML with any untrusted string is an XSS hole. Use textContent for text; build elements with createElement; sanitise if you genuinely must inject HTML.
Ch. 16
Events
Delegation, and the listener leak.
16.1 — Delegate instead of N listeners
JavaScript
list.addEventListener("click", (e) => {
  const item = e.target.closest("li");
  if (item) select(item.dataset.id);
});
// one listener handles all current + future <li>
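A removed element stays in memory for as long as anything still references it, and listeners are a common source of such references: a handler registered on a long-lived target (window, document) whose closure captures the element keeps it alive until the listener is removed. Delegation on a stable ancestor avoids both the leak and re-binding after DOM updates.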
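The delegation pattern can be packaged as a small reusable wrapper. This is a sketch under assumptions: `delegate` is a hypothetical helper name, and it relies only on `event.target.closest()`, `event.currentTarget` and `Node.contains()`. The `contains` check guards against `closest()` matching an ancestor outside the element the listener is attached to.

```javascript
// Hypothetical helper: wraps a handler so it only fires for clicks inside `selector`.
function delegate(selector, handler) {
  return (event) => {
    const match = event.target.closest(selector);
    // closest() walks all the way up the tree, so ignore matches
    // that lie outside the element the listener is bound to.
    if (match && event.currentTarget.contains(match)) {
      handler(match, event);
    }
  };
}

// Browser usage (assumes a <ul id="list"> and a select() function exist):
// document.querySelector("#list")
//   .addEventListener("click", delegate("li", (li) => select(li.dataset.id)));
```

Because the listener lives on the stable ancestor, adding or removing list items needs no listener bookkeeping at all.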
Ch. 17
Web Storage
localStorage / sessionStorage — synchronous, string-only, small.
17.1 — Set, get, the quota trap
JavaScript
localStorage.setItem("theme", "dark");
const t = localStorage.getItem("theme") ?? "light";
localStorage.setItem("u", JSON.stringify(obj)); // objects must be serialised
Web Storage is synchronous (blocks the main thread), ~5 MB, string-only, and readable by any script on the origin — never store tokens or PII.
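A minimal sketch of a wrapper that treats storage as unreliable and serialises every value. `safeSet` and `safeGet` are hypothetical names, and `storage` is any object with the Web Storage interface (`window.localStorage` in a browser). Because everything is stored as JSON, values written elsewhere with a plain `setItem("theme", "dark")` would not parse and would come back as the fallback.

```javascript
// Hypothetical wrappers; `storage` is anything with getItem/setItem
// (window.localStorage or window.sessionStorage in a browser).
function safeSet(storage, key, value) {
  try {
    storage.setItem(key, JSON.stringify(value));
    return true;
  } catch (err) {
    // Typically a quota error, or storage being unavailable.
    return false;
  }
}

function safeGet(storage, key, fallback) {
  try {
    const raw = storage.getItem(key);
    return raw === null ? fallback : JSON.parse(raw);
  } catch (err) {
    return fallback; // unreadable storage or malformed JSON
  }
}

// Browser usage:
// if (!safeSet(localStorage, "prefs", { theme: "dark" })) { /* keep state in memory instead */ }
// const prefs = safeGet(localStorage, "prefs", { theme: "light" });
```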
setItem throws when the quota is exceeded or in some private-mode browsers; wrap it in try/catch.
Ch. 18
History & Routing
pushState and the SPA navigation contract.
18.1 — Update the URL without a reload
JavaScript
history.pushState({ page: 2 }, "", "/list?page=2");
window.addEventListener("popstate", (e) => render(e.state));
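One way to keep the two render paths from drifting apart is to route every navigation through a single object. The sketch below assumes a `render(state)` function like the one in the snippet above; `createNavigator` is a hypothetical name, and `history` is injected so the logic can be exercised outside a browser.

```javascript
// Hypothetical router glue: `history` and `render` are passed in
// (window.history and your own render function in a browser).
function createNavigator(history, render) {
  return {
    // Call this instead of history.pushState directly.
    go(state, url) {
      history.pushState(state, "", url);
      render(state); // pushState fires no event, so render explicitly
    },
    // Register this as the popstate listener: back/forward land here.
    onPopState(event) {
      render(event.state);
    },
  };
}

// Browser usage:
// const nav = createNavigator(window.history, render);
// window.addEventListener("popstate", nav.onPopState);
// nav.go({ page: 2 }, "/list?page=2");
```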
pushState does not fire popstate; only back/forward navigation and calls such as history.back() or history.go() do. You must render after both the push and the popstate, or the back button shows a stale view.
Ch. 19
Fetch
The modern request API — and the two things it doesn't do.
19.1 — Check ok; abort on timeout
JavaScript
const c = new AbortController();
const t = setTimeout(() => c.abort(), 5000);
const r = await fetch(url, { signal: c.signal });
clearTimeout(t); // response arrived; cancel the pending abort
if (!r.ok) throw new Error(`HTTP ${r.status}`);
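The same pattern can be wrapped once and reused. This is a sketch, not a complete client: `fetchOk` is a hypothetical name, the `timeoutMs` and `fetchImpl` options are assumptions added for illustration (the latter so the helper can be tested without a network), and the timer only bounds the wait for the response headers, not the reading of the body.

```javascript
// Hypothetical helper: rejects on network failure, on timeout, and on non-2xx status.
// `fetchImpl` is injectable for testing; it defaults to the global fetch.
async function fetchOk(url, { timeoutMs = 5000, fetchImpl = fetch, ...init } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { ...init, signal: controller.signal });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    return response;
  } finally {
    clearTimeout(timer); // stop the timer whether the request succeeded or failed
  }
}

// Usage:
// const data = await (await fetchOk("/api/items", { timeoutMs: 3000 })).json();
```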
fetch only rejects on network failure; a 404 or 500 still resolves. You must check response.ok yourself. There is also no timeout option: an abort signal, from an AbortController or from AbortSignal.timeout() where supported, is the way to bound a request.
Ch. 20
Templates & Components
<template>, custom elements, the shadow DOM.
20.1 — Inert markup + a web component
JavaScript
class HelloBox extends HTMLElement {
  connectedCallback() {
    // runs on every (re)insertion; attaching a second shadow root would throw
    if (!this.shadowRoot) {
      this.attachShadow({ mode: "open" }).innerHTML = "<p>hi</p>";
    }
  }
}
customElements.define("hello-box", HelloBox);
<template> content is parsed but inert (not rendered, scripts don't run, images don't load) until you clone it into the live DOM. Custom element names must contain a hyphen.
Ch. 21
Device APIs
Geolocation, clipboard, share — all gated by permission and HTTPS.
21.1 — Permission-gated, secure-context only
JavaScript
await navigator.clipboard.writeText("copied");
if (navigator.share) await navigator.share({ url });
navigator.geolocation.getCurrentPosition(ok, err);
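Feature detection plus a fallback chain might look like the sketch below. The names `pickShareStrategy` and `shareUrl`, the strategy labels, and the `showManually` callback are all hypothetical; `nav` stands for the `navigator` object and is passed in so the decision logic can run outside a browser.

```javascript
// Hypothetical helper: picks the best available way to hand a URL to the user.
function pickShareStrategy(nav) {
  if (nav && typeof nav.share === "function") return "share";
  if (nav && nav.clipboard && typeof nav.clipboard.writeText === "function") return "clipboard";
  return "manual"; // e.g. show the URL in a selectable text field
}

async function shareUrl(nav, url, showManually) {
  const strategy = pickShareStrategy(nav);
  try {
    if (strategy === "share") return await nav.share({ url });
    if (strategy === "clipboard") return await nav.clipboard.writeText(url);
  } catch (err) {
    // Permission denied or the share sheet was dismissed: fall through to the manual path.
  }
  showManually(url);
}

// Browser usage, from inside a click handler (user gesture):
// button.addEventListener("click", () => shareUrl(navigator, location.href, showCopyField));
```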
These APIs require a secure context (HTTPS or localhost) and most require a user gesture and an explicit permission grant. Feature-detect (if (navigator.share)) and always provide a fallback; never assume availability.
Ch. 22
Accessibility
Semantics first; ARIA only to fill gaps.
22.1 — The first rule of ARIA
HTML
<button>Save</button> <!-- focusable, keyboard, role: free -->
<div role="button" tabindex="0">Save</div> <!-- now reimplement all of it -->
The first rule of ARIA is: don't use ARIA if a native element does the job. A <div role=button> needs manual tabindex, Enter/Space handling and focus styles, all of which <button> gives you free. Reach for ARIA only for genuinely custom widgets.
Ch. 23
data-* & Attributes
Custom data without invalid markup.
23.1 — dataset round-trips
HTML
<li data-id="42" data-user-role="admin">…</li>
JavaScript
li.dataset.id; // "42" (always a string)
li.dataset.userRole; // "admin" (kebab → camel)
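The two behaviours shown above, the name mapping and the string-only values, can be illustrated without a DOM. In this sketch `toDatasetKey` and `readNumber` are hypothetical helpers: the first mirrors how a `data-*` attribute name becomes a `dataset` key, the second converts a value explicitly where a number is expected. `dataset` can be a real `element.dataset` or any plain object of strings.

```javascript
// Mirrors the attribute-name -> dataset-key rule: drop "data-", then camel-case each "-x".
function toDatasetKey(attributeName) {
  return attributeName
    .replace(/^data-/, "")
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// dataset values are strings, so convert explicitly where a number is expected.
function readNumber(dataset, key, fallback = NaN) {
  const raw = dataset[key];
  if (raw === undefined || raw.trim() === "") return fallback; // Number("") would be 0
  const n = Number(raw);
  return Number.isNaN(n) ? fallback : n;
}

console.log(toDatasetKey("data-user-role"));   // userRole
console.log(readNumber({ id: "42" }, "id"));   // 42
```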
data-* values are always strings: data-id comes back as "42", not 42. Don't store large JSON blobs in attributes; it bloats the DOM and HTML-escaping bites you.
Ch. 24
Loading Performance
The script attributes that decide first paint.
24.1 — defer, async, preload, fetchpriority
HTML
<script src="app.js" defer></script> <!-- parse-blocking? no -->
<script src="analytics.js" async></script>
<link rel="preload" href="hero.webp" as="image" fetchpriority="high">
A plain <script> in <head> blocks HTML parsing until it downloads and executes: the classic slow first paint. Use defer (runs in order, after parse) for app code, async for independent third-party scripts.
Ch. 25
SEO & Metadata
What crawlers and social platforms actually read.
25.1 — Title, description, OG, structured data
HTML
<meta property="og:title" content="…">
<meta property="og:image" content="https://…/card.png">
<meta name="twitter:card" content="summary_large_image">
<script type="application/ld+json">{ "@context": "https://schema.org", "@type": "Article" }</script>
One unique, descriptive <title> and meta description per page does more for SEO than any tag soup. OG/Twitter tags control the share-card preview; absolute image URLs only.
Ch. 26
Security
CSP, referrer policy, and the headers that matter.
26.1 — Defense in the document
HTML
<meta http-equiv="Content-Security-Policy" content="default-src 'self'; script-src 'self'">
<meta name="referrer" content="strict-origin-when-cross-origin">
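When the policy is sent as an HTTP header, building the string from structured data keeps it readable as it grows. A minimal sketch, assuming a hypothetical `buildCsp` helper; the directives shown are the same two as in the meta tag above.

```javascript
// Hypothetical helper: turns { directive: [sources] } into a policy string.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => [name, ...sources].join(" "))
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'"],
});

console.log(policy); // default-src 'self'; script-src 'self'
```

On a Node.js server the result could be sent with `res.setHeader("Content-Security-Policy", policy)`.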
A real Content-Security-Policy (ideally an HTTP header, not just a meta tag) is the strongest single defence against XSS: it stops injected inline scripts from executing even if markup escaping fails. Avoid 'unsafe-inline'; it defeats the purpose.
Ch. 27
Common Pitfalls
The recurring HTML mistakes.
27.1 — The list
- Missing doctype → quirks mode
- No viewport meta → broken mobile
- img without width/height → layout shift
- Placeholder used as a label → inaccessible
- div role=button instead of button
- target=_blank without rel=noopener
- Blocking <script> in head, no defer
- File input form without multipart/form-data
Ch. 28
Best Practices
The defaults that make HTML a product, not a draft.
28.1 — The short list
- Semantic element by meaning; style with CSS
- One h1, no skipped heading levels, one main
- Every input labelled; every image has alt
- Set image/embed dimensions; lazy-load below the fold
- defer app scripts; preload the LCP asset
- Validate on the server regardless of client constraints
- Ship a CSP; sandbox third-party iframes
- Test with a keyboard only and a screen reader once
REF
HTML Cheatsheet
The head boilerplate you should be able to type from memory.
Document skeleton
HTML
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Unique title</title>
<meta name="description" content="…">
<link rel="canonical" href="https://example.com/">
<link rel="icon" href="/favicon.svg">
</head>
<body>
…
<script src="app.js" defer></script>
</body>
</html>