JSON Support Comparison

JSONB vs MySQL JSON: indexing, querying, performance traps, and migration implications.

On this page

JSON Support: Similar Feature Name, Very Different Tradeoffs

Both Postgres and MySQL support JSON, but their practical capabilities and performance characteristics differ. The biggest production mistake is treating JSON as a free schema shortcut.

JSON can be useful, but it can also create hidden performance debt and observability gaps.

JSON vs JSONB in Postgres

Postgres supports two JSON-related types:

JSON: stores the original text representation.
JSONB: stores a binary representation optimized for querying and indexing.

Production rule: if you query JSON fields, JSONB is typically the right choice.

MySQL JSON Type

MySQL provides a native JSON type. In practice, you use JSON functions to extract paths and filter or project values.

MySQL JSON querying can be effective, but indexing strategy and function usage determine whether queries are efficient.

Querying: Operators and Path Expressions

Postgres JSONB provides rich operators for containment and path traversal. MySQL uses JSON_EXTRACT-style functions and path expressions.

Production implication: if your query patterns are heavy on containment and flexible path filters, Postgres JSONB often offers stronger ergonomics and indexing options.

Indexing JSON: The Real Make-or-Break

JSON performance is usually decided by indexing, not by JSON parsing alone.

Postgres: GIN and Expression Indexes

Common Postgres JSONB indexing patterns include:

GIN indexes for containment queries
Expression indexes for specific extracted paths

-- Example: index a specific JSON path as text
CREATE INDEX idx_events_type
ON events ((payload->>'type'));

GIN indexes can accelerate broad JSON containment searches, but they have costs: index size and write amplification.

MySQL: Generated Columns for Indexing

A common MySQL strategy is to create generated columns extracting JSON paths, then index those columns.

-- Conceptual pattern: generated column for JSON path
ALTER TABLE events
  ADD COLUMN event_type VARCHAR(50)
  GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(payload, '$.type'))) STORED,
  ADD INDEX idx_event_type (event_type);

This turns JSON querying into indexed relational access at the cost of schema work and storage.

Partial Updates: Why JSON Can Be Expensive

A critical production difference: updating a small piece of JSON often rewrites a large value.

Large JSON documents increase write amplification.
TOAST/overflow (Postgres) or large page storage effects (MySQL) can appear.

When JSON documents grow, updates can become unexpectedly expensive.

Schema Evolution: Why JSON Feels Great Until It Hurts

JSON allows flexible schema, but production systems still need:

Validation
Defaults
Backward compatibility
Query performance stability

If you skip schema discipline, you will accumulate inconsistent shapes and expensive query logic.

When JSON Is a Good Fit

Semi-structured event payloads for logging/analytics ingestion
Feature flags or settings with limited query needs
Rapid prototyping where strict schema is premature
Storing a “raw input” alongside normalized fields

When JSON Is a Bad Fit

High-QPS filtering and joins on JSON fields
Strong integrity requirements (FKs, constraints) inside JSON
Frequent partial updates to large JSON documents
Need for global reporting and indexing across many fields

In those cases, normalize or use generated/materialized columns deliberately.

Observability and Data Quality Pitfalls

JSON increases risk of silent drift:

Different producers emit different shapes
Missing fields break queries unexpectedly
Type inconsistencies (string vs number) appear over time

Production fix: validate JSON at ingestion and enforce schema versioning within payload.

Failure Modes in Production

Unindexed JSON filters: full scans and latency spikes.
GIN bloat: large JSONB + heavy writes → huge indexes.
Generated column drift: JSON paths change, columns stop matching reality.
Large document rewrites: small updates become expensive.
Schema chaos: inconsistent payload shapes break analytics.

Operational Checklist

Define which JSON fields are query-critical; index them intentionally.
Prefer Postgres JSONB when you need rich JSON querying and indexing.
In MySQL, use generated columns for indexed access to key paths.
Limit JSON document size; avoid storing huge blobs if updates are frequent.
Validate JSON on write; enforce schema version inside payload.
Monitor query plans; ensure JSON predicates use indexes.
Measure index growth (GIN in Postgres, generated columns in MySQL).
Normalize fields that become frequently queried over time.
Load test JSON-heavy writes to catch rewrite amplification early.
Document JSON usage rules to prevent uncontrolled sprawl.

Summary

Postgres JSONB offers powerful operators and GIN/expression indexing, making it strong for queried JSON workloads, but it can produce large indexes and write amplification. MySQL JSON often relies on generated columns for indexable access. In both systems, JSON is useful for semi-structured data, but production-grade design requires intentional indexing, validation, and a plan to normalize fields that become hot.

← Storage Engine Differences

Indexing Capability Differences →