Toolverse

JSON vs YAML: Syntax, Trade-Offs, and the Norway Problem

A new developer joins your team and opens a Kubernetes deployment file for the first time. It is 200 lines of YAML with nested maps, multi-line strings, and anchor references. They ask: "Why don't we just use JSON?" The answer involves trade-offs between human readability, machine parseability, and the sharp edges hiding in both formats.

JSON and YAML: Same Data, Different Syntax

JSON (JavaScript Object Notation, RFC 8259) and YAML (YAML Ain't Markup Language, spec 1.2.2) both represent the same data structures: mappings (objects), sequences (arrays), and scalar values (strings, numbers, booleans, null). In fact, every valid JSON document is also valid YAML — the YAML 1.2 spec was deliberately designed as a superset of JSON.

The difference is syntax. JSON uses braces {}, brackets [], and mandatory double-quoted keys. YAML uses indentation and colons, with optional quoting. A simple configuration in JSON:

{
  "server": {
    "port": 8080,
    "host": "0.0.0.0",
    "ssl": true
  }
}

The same in YAML:

server:
  port: 8080
  host: 0.0.0.0
  ssl: true

Four lines instead of seven. No braces, no quotes, no commas. For configuration files that humans read and edit daily, YAML is significantly more scannable.
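
To make the "same data" point concrete, here is a minimal sketch that loads the JSON version above with Python's standard-library json module. The variable names are illustrative, not from any tool mentioned here. A YAML loader would be expected to return an equivalent structure for the YAML version, but this sketch only exercises the JSON side.

```python
import json

# The JSON configuration from the example above, as a string.
CONFIG_JSON = """
{
  "server": {
    "port": 8080,
    "host": "0.0.0.0",
    "ssl": true
  }
}
"""

config = json.loads(CONFIG_JSON)
server = config["server"]

# JSON objects become dicts; numbers, strings, and booleans become int, str, and bool.
print(type(config).__name__)   # dict
print(server["port"] + 1)      # 8081 (an int, not a string)
print(server["host"])          # 0.0.0.0
print(server["ssl"] is True)   # True
```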

Where YAML Wins: Configuration and DevOps

YAML dominates the DevOps ecosystem. Kubernetes, Docker Compose, Ansible, GitHub Actions, CircleCI, Helm, and Prometheus all use YAML as their primary configuration format. The reasons are practical:

  • Comments — YAML supports # comments. JSON does not: the grammar in RFC 8259 has no comment syntax, so a strict parser rejects them. For configuration files that need inline documentation, this alone is decisive.
  • Multi-line strings — YAML's | (literal block) and > (folded block) scalars let you embed scripts, certificates, or SQL queries without escape character gymnastics.
  • Anchors and aliases — YAML's & and * syntax lets you define a value once and reference it elsewhere, reducing duplication in large configs.
  • Less visual noise — no trailing commas to debug, no bracket matching, fewer characters per line of meaningful data.

Where JSON Wins: APIs and Data Exchange

JSON is the default serialization format for REST APIs, with over 83% of public APIs using it as their primary response format (Postman State of APIs report, 2023). JSON's strengths:

  • Unambiguous parsing — JSON has one way to represent each value. YAML has many: under YAML 1.1 rules, yes, Yes, YES, true, and on all parse as boolean true. This implicit type coercion has caused real bugs, most famously the "Norway problem", where the country code NO is parsed as boolean false.
  • Speed — JSON parsers are simpler and faster. Python's json.loads() is 5-10× faster than PyYAML's safe_load() on equivalent documents.
  • Universal support — nearly every mainstream language has a mature JSON parser, often in its standard library. YAML requires a third-party library in most languages.
  • No indentation sensitivity — a misplaced space in YAML can silently change the data structure. JSON's braces make nesting explicit.
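
The strictness described above is easy to observe with Python's built-in json module. The following is a small sketch, not a benchmark. The helper name is_valid_json is made up for this example.

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if text parses with Python's standard-library JSON parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

# Quoted values stay strings; only the bare literals true/false/null change type.
data = json.loads('{"country": "NO", "version": "3.10", "enabled": true}')
print(data["country"])   # NO
print(data["version"])   # 3.10
print(data["enabled"])   # True

print(is_valid_json('{"port": 8080}'))          # True
print(is_valid_json('{"port": 8080,}'))         # False: trailing comma
print(is_valid_json('{"port": 8080} # note'))   # False: no comment syntax
print(is_valid_json('{port: 8080}'))            # False: keys must be double-quoted
```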

YAML Gotchas That Bite in Production

YAML's flexibility creates a category of bugs that do not exist in JSON:

  • Implicit typing — version: 3.10 parses as the float 3.1, not the string "3.10". Python 3.10 users discovered this in CI configs. Always quote version strings: version: "3.10".
  • The Norway problem — under YAML 1.1 rules, the country code NO is parsed as boolean false, and time-like strings such as 22:22 are parsed as sexagesimal (base-60) integers. YAML 1.2 tightened the rules, but many parsers still default to 1.1 behavior.
  • Indentation errors — tabs are forbidden for indentation in YAML (spec section 6.1). Mixed tabs and spaces cause parser errors whose source is invisible in many editors.
  • Security: YAML deserialization — YAML supports custom type tags that can trigger code execution. Python's yaml.load() without Loader=SafeLoader is a known remote code execution vector (CVE-2017-18342). Always use safe loading functions.
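
To show why unquoted scalars are risky, the sketch below imitates a small subset of YAML 1.1 implicit typing in plain Python. It is an illustration under stated assumptions, not a real parser. The function name resolve_yaml11 and the regular expressions are invented for this example, and octal, hex, exponents, and timestamps are deliberately left out.

```python
import re

def _case_variants(words):
    # YAML 1.1 accepts lower, Capitalized, and UPPER spellings (yes, Yes, YES).
    return {form for w in words for form in (w, w.capitalize(), w.upper())}

TRUE_WORDS = _case_variants(["yes", "true", "on"])
FALSE_WORDS = _case_variants(["no", "false", "off"])
NULL_WORDS = {"~", "null", "Null", "NULL", ""}

INT_RE = re.compile(r"[-+]?(0|[1-9][0-9]*)")
FLOAT_RE = re.compile(r"[-+]?[0-9]+\.[0-9]+")
SEXAGESIMAL_RE = re.compile(r"[-+]?[1-9][0-9]*(:[0-5]?[0-9])+")

def resolve_yaml11(scalar: str):
    """Guess the type of an unquoted scalar the way a YAML 1.1 loader might."""
    if scalar in TRUE_WORDS:
        return True
    if scalar in FALSE_WORDS:
        return False
    if scalar in NULL_WORDS:
        return None
    if INT_RE.fullmatch(scalar):
        return int(scalar)
    if FLOAT_RE.fullmatch(scalar):
        return float(scalar)
    if SEXAGESIMAL_RE.fullmatch(scalar):
        sign = -1 if scalar.startswith("-") else 1
        value = 0
        for part in scalar.lstrip("+-").split(":"):
            value = value * 60 + int(part)   # each colon shifts by a factor of 60
        return sign * value
    return scalar   # anything else stays a string

for raw in ["NO", "3.10", "22:22", "Oslo"]:
    print(repr(raw), "->", repr(resolve_yaml11(raw)))
```

With these rules, NO resolves to False, 3.10 to the float 3.1, and 22:22 to 1342, while Oslo stays a string. Quoting the value in the YAML source bypasses this guessing entirely.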

Choosing the Right Format

The choice is context-dependent:

  • API request/response bodies → JSON. It is the universal standard and faster to parse.
  • Configuration files edited by humans → YAML. Comments and readability justify the complexity.
  • Data interchange between services → JSON. Unambiguous parsing prevents silent type coercion bugs.
  • Complex templating (Helm, Ansible) → YAML with anchors, but consider JSON Schema for validation.

Many teams use both: YAML for source configuration files and JSON as the wire format. Kubernetes kubectl accepts both and can convert between them with -o json / -o yaml.

Key Takeaways

  • YAML is a superset of JSON — every JSON file is valid YAML, but not vice versa.
  • Use JSON for APIs and machine-to-machine data; use YAML for human-edited configuration files.
  • Always quote strings in YAML that could be misinterpreted as numbers, booleans, or null ("3.10", "NO", "null").
  • Never use yaml.load() without a safe loader — it is a remote code execution risk.
  • Validate YAML configs with JSON Schema to catch structural errors before deployment.
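
As a rough illustration of the last takeaway, the sketch below checks an already-parsed config against expected types. It is a hand-rolled stand-in, not JSON Schema itself: a real setup would more likely use a dedicated JSON Schema validator. The schema, key names, and validate function are hypothetical.

```python
# Hypothetical schema for illustration: each required key and its expected type.
RELEASE_SCHEMA = {"country": str, "version": str, "port": int}

def validate(config: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    for key, expected in schema.items():
        if key not in config:
            errors.append(f"{key}: missing")
            continue
        actual = type(config[key])
        # Compare exact types: bool is a subclass of int, so isinstance would accept True as a port.
        if actual is not expected:
            errors.append(f"{key}: expected {expected.__name__}, got {actual.__name__}")
    return errors

# What a YAML 1.1 loader would typically return for unquoted `country: NO` and `version: 3.10`.
unquoted = {"country": False, "version": 3.1, "port": 8080}
# The same config with both values quoted in the source file.
quoted = {"country": "NO", "version": "3.10", "port": 8080}

print(validate(unquoted, RELEASE_SCHEMA))
print(validate(quoted, RELEASE_SCHEMA))   # []
```

A type check like this turns a silent coercion into a failed pre-deployment step, which is the main reason to validate configs at all.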

Need to convert between formats? Use our JSON to YAML Converter to switch between JSON and YAML instantly in your browser. For formatting raw JSON, try the JSON formatting guide.

Frequently Asked Questions

Is every JSON file valid YAML?
Yes. YAML 1.2 was designed as a strict superset of JSON. Any valid JSON document can be parsed by a YAML 1.2 compliant parser. The reverse is not true: YAML features like comments, anchors, and unquoted strings have no JSON equivalent.

What is the Norway problem in YAML?
In YAML 1.1, the string 'NO' (the ISO country code for Norway) is parsed as boolean false because YAML treats 'no', 'NO', and 'No' as false values. This caused real bugs in internationalization configs. YAML 1.2 restricts booleans to only 'true' and 'false', but many parsers still default to 1.1 rules.

Why is YAML considered less safe than JSON?
YAML supports custom type tags that can instantiate arbitrary objects during deserialization. In Python, calling yaml.load() without SafeLoader can execute arbitrary code (CVE-2017-18342). JSON parsing is inherently safer because the format has no type tags: a parser only ever produces objects, arrays, strings, numbers, booleans, and null.