Base64 Encoding Explained: How It Works and When to Use It
Base64 is one of those things every developer encounters but few truly understand. You see it in JWT tokens, data URIs, API headers, and email attachments. This guide explains what Base64 encoding actually does, how the algorithm works step by step, and when you should (and should not) use it.
What Base64 Is (and What It Is Not)
Base64 is a binary-to-text encoding scheme defined in RFC 4648. It converts binary data into a string built from 64 printable ASCII characters (A–Z, a–z, 0–9, +, and /), plus = for padding.
Base64 is encoding, not encryption. Anyone can decode a Base64 string instantly without a key. Never use Base64 to protect sensitive data. If you need to protect data, use proper encryption (AES-256, RSA) as recommended by OWASP.
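To make the point concrete, here is a minimal sketch showing that decoding needs no key. The sample string is a made-up value chosen for illustration; atob() is the standard decoder available in browsers and recent Node.js versions.

```javascript
// A Base64 string that might look "scrambled" at first glance.
// (Hypothetical sample value, chosen only for illustration.)
const looksSecret = "cGFzc3dvcmQxMjM=";

// No key, no secret: one built-in call reverses the encoding.
const revealed = atob(looksSecret);

console.log(revealed); // "password123"
```

Anything that can read the encoded string can read the original, which is why Base64 should never stand in for encryption.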
Why Base64 Exists
Many protocols and formats, such as email (SMTP), JSON, XML, and URLs, are designed to carry text, not raw binary data. If you try to embed a JPEG image directly in a JSON string, the raw bytes will produce invalid or corrupted text. Base64 solves this by representing binary data using only printable ASCII characters that survive almost any text channel.
The original use case was MIME email encoding (RFC 2045), which needed to transmit binary attachments through text-only email servers. The same principle applies everywhere binary data needs to travel through a text channel.
How the Algorithm Works
Base64 processes input in groups of 3 bytes (24 bits). Each group is split into four 6-bit values, and each 6-bit value maps to one character from the Base64 alphabet (Table 1 in RFC 4648).
Here is a concrete example. Encoding the text Hi!:
- Convert to bytes: H=72, i=105, !=33
- Binary: 01001000 01101001 00100001 (24 bits total)
- Split into 6-bit groups: 010010 000110 100100 100001
- Decimal values: 18, 6, 36, 33
- Map to Base64 alphabet: S, G, k, h
- Result: SGkh
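The steps above can be sketched in a few lines of JavaScript. This is an illustrative sketch for a single full 3-byte group, not a complete encoder; encodeGroup and ALPHABET are hypothetical names introduced here.

```javascript
// The 64-character alphabet from RFC 4648, Table 1.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode exactly three bytes into four Base64 characters.
// encodeGroup is a hypothetical helper name, not a built-in.
function encodeGroup(b0, b1, b2) {
  // Pack the three bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;
  // Slice off four 6-bit values, most significant first.
  const indices = [
    (bits >> 18) & 0x3f,
    (bits >> 12) & 0x3f,
    (bits >> 6) & 0x3f,
    bits & 0x3f,
  ];
  return { indices, text: indices.map((i) => ALPHABET[i]).join("") };
}

const { indices, text } = encodeGroup(72, 105, 33); // "H", "i", "!"
console.log(indices); // [18, 6, 36, 33]
console.log(text);    // "SGkh"
```

The result matches what the built-in btoa("Hi!") returns.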
Padding
When the input length is not divisible by 3, padding is required:
- 1 remaining byte → 2 Base64 characters + ==
- 2 remaining bytes → 3 Base64 characters + =
- 0 remaining bytes → no padding
This is why you see = or == at the end of Base64 strings: the padding characters carry no data and only fill out the final 4-character group.
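A quick way to see the three padding cases is to encode inputs of length 1, 2, and 3. The sketch below uses the built-in btoa(); paddingLength is a hypothetical helper name added for illustration.

```javascript
// Number of "=" characters for a given input length in bytes.
// paddingLength is a hypothetical helper name.
function paddingLength(byteCount) {
  return (3 - (byteCount % 3)) % 3;
}

// Same prefix, three different lengths (all ASCII, so 1 char = 1 byte).
const samples = ["H", "Hi", "Hi!"].map((input) => ({
  input,
  encoded: btoa(input),
  padding: paddingLength(input.length),
}));

console.log(samples);
// "H"   -> "SA==" (padding 2)
// "Hi"  -> "SGk=" (padding 1)
// "Hi!" -> "SGkh" (padding 0)
```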
Size Overhead
Padded Base64 output is always 4 × ceil(n/3) characters for n input bytes, approximately 33% larger than the input. A 100KB image becomes roughly 133KB when Base64-encoded in a data URI. For MIME-encoded email, line breaks are added every 76 characters, increasing the overhead slightly further.
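The arithmetic can be checked with a short sketch. Both function names are hypothetical, and the MIME figure assumes every line, including the last, ends with a 2-byte CRLF.

```javascript
// Characters of padded Base64 produced for n input bytes.
// encodedLength is a hypothetical helper name.
function encodedLength(n) {
  return Math.ceil(n / 3) * 4;
}

// MIME adds a CRLF (2 bytes) after each line of up to 76 characters.
// Assumption: the final line is also terminated with CRLF.
function mimeEncodedLength(n) {
  const chars = encodedLength(n);
  return chars + 2 * Math.ceil(chars / 76);
}

const inputBytes = 100 * 1024; // a 100KB file
console.log(encodedLength(inputBytes));     // 136536 (about 133KB)
console.log(mimeEncodedLength(inputBytes)); // 140130 (about 137KB)
```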
Base64 vs Base64URL
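Standard Base64 uses + and /, which have special meaning in URLs. RFC 4648 Section 5 defines a URL-safe variant that replaces + with - and / with _. Padding = characters are often omitted in this variant, because = would otherwise need percent-encoding in a URL; JWT, for example, always omits them.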
| Feature | Base64 Standard | Base64URL |
|---|---|---|
| Specification | RFC 4648 §4 | RFC 4648 §5 |
| Special characters | + and / | - and _ |
| Padding | = required | Often omitted (always in JWT) |
| Use case | MIME email, data URIs | JWT, URL parameters |
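Converting between the two variants is a matter of swapping two characters and handling padding. The following is a minimal sketch; toBase64Url and fromBase64Url are hypothetical helper names, and the sketch assumes well-formed input.

```javascript
// Standard Base64 -> Base64URL: swap the two special characters, drop padding.
function toBase64Url(base64) {
  return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Base64URL -> standard Base64: swap back and restore the padding.
function fromBase64Url(base64url) {
  const base64 = base64url.replace(/-/g, "+").replace(/_/g, "/");
  const padding = (4 - (base64.length % 4)) % 4;
  return base64 + "=".repeat(padding);
}

// Two bytes chosen so that the output contains both "+" and "/".
const standard = btoa(String.fromCharCode(0xfb, 0xff));
console.log(standard);                             // "+/8="
console.log(toBase64Url(standard));                // "-_8"
console.log(fromBase64Url(toBase64Url(standard))); // "+/8="
```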
Real-World Use Cases
- HTTP Basic Authentication: The Authorization: Basic header contains username:password encoded in Base64. This is encoding, not security, so always use HTTPS alongside it.
- Data URIs: Embed small images directly in HTML or CSS: data:image/png;base64,iVBORw0KGgo.... Useful for icons under 2KB to avoid extra HTTP requests.
- JWT tokens: The header and payload sections of a JSON Web Token are Base64URL-encoded JSON objects.
- Email attachments: MIME uses Base64 to encode binary attachments for transmission through text-only SMTP servers.
- API payloads: When an API needs to accept binary data (file uploads, images) in a JSON body, Base64 encoding is the standard approach.
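Two of these use cases are easy to sketch with built-ins. The credentials below are the sample pair from RFC 7617, decodeJwtPayload and toSegment are hypothetical helper names, and the demo token is unsigned and built on the spot purely for illustration. Reading a payload this way does not verify the token.

```javascript
// HTTP Basic Authentication: "username:password" in standard Base64.
// The credentials are the sample pair used in RFC 7617.
const authHeader = "Basic " + btoa("Aladdin:open sesame");
console.log(authHeader); // "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="

// JWT: read the payload (second segment) WITHOUT verifying the signature.
// decodeJwtPayload is a hypothetical helper name.
function decodeJwtPayload(token) {
  const segment = token.split(".")[1];
  // Base64URL -> standard Base64, then restore the omitted padding.
  const base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  // Decode to bytes first so UTF-8 claims survive.
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Build an unsigned demo token so the example is self-contained.
const toSegment = (obj) =>
  btoa(JSON.stringify(obj))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
const demoToken = [toSegment({ alg: "none" }), toSegment({ sub: "alice" }), ""].join(".");

console.log(decodeJwtPayload(demoToken)); // { sub: "alice" }
```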
Browser Implementation
JavaScript provides btoa() (binary to ASCII) and atob() (ASCII to binary) for Base64 encoding. However, btoa() throws an InvalidCharacterError when the string contains any character above U+00FF. A common UTF-8-safe pattern is:
// Encode (UTF-8 safe)
const encoded = btoa(
  new TextEncoder()
    .encode(text)
    .reduce((s, b) => s + String.fromCharCode(b), "")
);
// Decode (UTF-8 safe)
const decoded = new TextDecoder().decode(
  Uint8Array.from(atob(encoded), (c) => c.charCodeAt(0))
);
When NOT to Use Base64
- Large files — The 33% overhead is significant for files over a few KB. Use multipart form uploads or binary protocols instead.
- Security — Base64 provides zero security. Use encryption for sensitive data.
- Performance-critical paths — Encoding and decoding add CPU overhead. For high-throughput systems, transmit binary directly when the protocol supports it.
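As a rough illustration of the large-file point, the sketch below compares a Base64-in-JSON body with a multipart form for the same bytes. It assumes an environment with global FormData and Blob (browsers, or a recent Node.js); the file name and the endpoint in the final comment are hypothetical.

```javascript
// Stand-in for real file contents: 3,000 zero bytes.
const fileBytes = new Uint8Array(3000);

// Option 1: Base64 inside JSON. The payload grows by about a third.
const jsonBody = JSON.stringify({
  filename: "photo.jpg",
  data: btoa(String.fromCharCode(...fileBytes)),
});

// Option 2: multipart form upload. The bytes travel as-is.
const form = new FormData();
form.append("file", new Blob([fileBytes], { type: "image/jpeg" }), "photo.jpg");

console.log(jsonBody.length);       // more than 4,000 characters
console.log(form.get("file").size); // 3000 bytes

// Sending it would look like this (hypothetical endpoint, not executed here):
// await fetch("https://api.example.com/upload", { method: "POST", body: form });
```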
Key Takeaways
- Base64 converts binary to text using a 64-character alphabet
- It adds ~33% size overhead — use it for small payloads
- Base64URL (with - and _) is the right choice for URLs and JWTs
- It is encoding, never encryption
- Browser btoa() needs a UTF-8 workaround for non-ASCII text
Frequently Asked Questions
- Is Base64 encoding the same as encryption?
- No. Base64 is encoding, not encryption. Anyone can decode a Base64 string instantly without a key. Never use Base64 to protect sensitive data — use proper encryption (AES, RSA) instead.
- Why does Base64 make data larger?
- Base64 converts every 3 bytes of input into 4 ASCII characters, resulting in approximately 33% size overhead. This is the trade-off for ensuring the data can be safely transmitted through text-only channels.
- What is the difference between Base64 and Base64URL?
- Standard Base64 uses + and / characters, which have special meaning in URLs. Base64URL (RFC 4648 Section 5) replaces + with - and / with _, and usually omits padding = characters. JWT tokens use Base64URL encoding without padding.