Is MySQL good for JSON?

I’ve been using this semi-relational, or SemiSQL, if you like, approach to SQL since around 2009. At first, I stored the data in TEXT or LONGTEXT columns using PHP’s serialize(), but I quickly moved to JSON.

In late 2015, with MySQL 5.7.8, our prayers were answered and native support for the JSON data type was introduced. 🙏

Let’s have a look at best practices for using JSON in MySQL 8.

Photo by Samuel Zeller on Unsplash, edit ΔO

SemiSQL vs NoSQL

For most cases, I’d recommend MongoDB as NoSQL storage for JSON data. It’s a great, scalable database system with plenty of filtering, sorting, indexing, projecting and even aggregating functions, which can stand in for basic relations.

However, if your project uses heavy relational logic, MySQL with SemiSQL approach may suit you better. Or maybe your project already uses MySQL and switching databases is simply too much work.

With MySQL’s JSON data type and its many JSON functions, you gain the advantages of both non-relational (NoSQL) and relational (SQL) database features where needed. All without having to maintain data integrity manually 🙏. At least for 1:1 relations.

Creating JSON storage

Creating a key-value table for JSON storage is simple

CREATE TABLE listings (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
data JSON NOT NULL
);
INSERT INTO listings (data) VALUES
('{
"title": "First Listing",
"public": false,
"text": "Lorem ipsum",
"address": "Klimentská 1215/25, 110 00, Prague 1"
}
'),
('{
"title": "Second Listing",
"public": true,
"text": "Dolor sit amet",
"address": "Rybná 716/24, 110 00, Prague 1"
}
');

Read the full data

SELECT id, data FROM listings;
1  {"text": "Lorem ipsum", "title": "First Listing", "public": false, "address": "Klimentská 1215/25, 110 00, Prague 1"}
2  {"text": "Dolor sit amet", "title": "Second Listing", "public": true, "address": "Rybná 716/24, 110 00, Prague 1"}

Read partial data, preserving JSON types

SELECT id, data->'$.title' as 'title', data->'$.public' as 'public'
FROM listings;
1 "First Listing" false (bool)
2 "Second Listing" true (bool)

Read partial data, unquoting JSON types

SELECT
id,
data->>'$.title' 'title',
data->>'$.public' 'public'
FROM listings;
1 First Listing false (string)
2 Second Listing true (string)

The -> and ->> operators are shortcuts for the JSON_EXTRACT(column, path) and JSON_UNQUOTE(JSON_EXTRACT(column, path)) functions respectively.
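
For illustration, the unquoting query above could also be written with the long-form functions; the two should return identical results:

# long-form equivalent of data->>'$.title' and data->>'$.public'
SELECT
id,
JSON_UNQUOTE(JSON_EXTRACT(data, '$.title')) 'title',
JSON_UNQUOTE(JSON_EXTRACT(data, '$.public')) 'public'
FROM listings;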

I do not recommend unquoting types other than strings, especially booleans, as this would break logical operations.

SELECT
id,
data->>'$.title' 'title',
data->'$.public' 'public'
FROM listings;
1 First Listing false (bool)
2 Second Listing true (bool)

Updating data

You can update the whole data object simply via

UPDATE listings SET data = '{ ... }' WHERE id = 1;

Or update single keys via

UPDATE listings
SET data = JSON_SET(data, '$.address', CAST('{
"street": "Klimentská 1215/25",
"city": "Prague 1",
"zip": "CZ-11000",
"gps": [50.0920, 14.4304]
}
' AS JSON))
WHERE id = 1;

You can also use JSON data for filters, like this:

UPDATE listings
SET data = JSON_SET(data, '$.address', CAST('{
"street": "Rybná 716/24",
"city": "Prague 1",
"zip": "CZ-11000",
"gps": [50.0903, 14.4260]
}
' AS JSON))
WHERE data->>'$.title' = 'Second Listing';

For multi-row updates, I prefer INSERT … ON DUPLICATE KEY UPDATE:

# for updating the whole data packages
INSERT INTO listings (id, data) VALUES
(1, '{ ... }'),
(2, '{ ... }')
ON DUPLICATE KEY UPDATE data = VALUES(data);

# for updating only specified keys
# unfortunately, this breaks on ACID checks, as described below
INSERT INTO listings (id, data) VALUES
(1, '{ "public": false }'),
(2, '{ "public": true, "reserved": false }')
ON DUPLICATE KEY UPDATE data = JSON_MERGE_PATCH(data, VALUES(data));

Indexes and virtual (generated) columns

The true beauty of MySQL’s JSON functions lies with generated columns.

ALTER TABLE listings
ADD title VARCHAR(255)
AS (data->>'$.title') STORED
AFTER id,
ADD public BIT
AS (data->'$.public' = true) STORED
AFTER title,
ADD country VARCHAR(2)
AS (SUBSTRING(data->>'$.address.zip', 1, 2)) STORED
NOT NULL
AFTER public,
ADD city VARCHAR(15)
AS (data->>'$.address.city') STORED
NOT NULL
AFTER country,
ADD gps POINT
AS (POINT(data->'$.address.gps[0]', data->'$.address.gps[1]')) STORED
NOT NULL
AFTER city;

NOT NULL columns will also ACID–check JSON’s integrity 👍
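
As a quick sketch of that check (using the columns defined above): a document without an address should now be rejected, because the NOT NULL generated columns come out NULL.

# should fail with a "Column 'country' cannot be null"-style error,
# since data->>'$.address.zip' and friends evaluate to NULL here
INSERT INTO listings (data) VALUES
('{ "title": "Broken Listing", "public": false, "text": "No address" }');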

Now you can select data like this

SELECT id, title, data->>'$.text' 'text', country, city
FROM listings
WHERE public AND title LIKE 'Second%';
2 Second Listing Dolor sit amet CZ Prague 1

Indexes

It’s always a good idea to add indexes, even redundant ones, especially if your table sees considerably more reads than writes, which is the case most of the time.

ALTER TABLE listings
ADD INDEX (public DESC,country),
ADD INDEX (public DESC),
ADD INDEX (country),
ADD INDEX (city),
ADD INDEX (title),
ADD SPATIAL (gps);

These indexes require the GENERATED columns to be STORED, as in not VIRTUAL. (Strictly speaking, InnoDB can put a plain secondary index on a VIRTUAL column, but SPATIAL and FULLTEXT indexes cannot use one.) Also, a SPATIAL index only supports NOT NULL columns.
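
For illustration (the zip column here is just a throwaway example, not part of the final schema below), a plain index on a VIRTUAL column is fine; it’s the SPATIAL and FULLTEXT ones that need STORED:

# InnoDB allows a regular secondary index on a VIRTUAL generated column
ALTER TABLE listings
ADD zip VARCHAR(8) AS (data->>'$.address.zip') VIRTUAL,
ADD INDEX (zip);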

Projection columns

List–view projection

Let’s say the JSON object in your data column is heavier than that:

UPDATE listings
SET data = JSON_SET(data,
'$.contact', CAST('{
"name": "Lorem ipsum",
"phone": "+000000000000",
"email": ""
}
' AS JSON),
'$.photos', CAST('[
{"url": "...", "width": 3240, "height": 1620},
{"url": "...", "width": 2880, "height": 1620},
{"url": "...", "width": 1620, "height": 3240}
]
' AS JSON)
# plus plenty of other data
)
WHERE id = 2;

You’d want to load the data column when viewing an entry’s detail, but for list views, it’s better not to load the whole big package.

Let’s make a projection of only what you need, using a VIRTUAL column.

ALTER TABLE listings
ADD listing JSON AS
(JSON_OBJECT(
'title', data->'$.title',
'photo', data->'$.photos[0]',
'address', CONCAT_WS(', ',
data->>'$.address.street',
data->>'$.address.city',
SUBSTRING(data->>'$.address.zip', 1, 2)
),
'gps', data->'$.address.gps'
))
VIRTUAL AFTER gps;

Virtual columns are not stored, obviously, and are only generated when selected.

SELECT listing
FROM listings
WHERE public;
2 {
"gps": [50.0903, 14.426],
"photo": {"url": "...", "width": 3240, "height": 1620},
"title": "Second Listing",
"address": "Klimentská 1215/25, Prague 1, CZ"
}

Fulltext projections

Let’s extract a fulltext string from data and store it for a FULLTEXT index.

ALTER TABLE listings
ADD ft VARCHAR(255) AS
(IF(public, CONCAT_WS(' ',
data->'$.title',
data->'$.text',
JSON_EXTRACT(data->'$.address', '$.*'),
JSON_EXTRACT(data->'$.contact', '$.*')
), NULL))
STORED AFTER listing,
ADD FULLTEXT (ft);

Now you can perform a fulltext-optimized search like this:

SELECT id, listing
FROM listings
WHERE MATCH(ft) AGAINST('Rybn*' IN BOOLEAN MODE);
2 {
"gps": [50.0903, 14.426],
"photo": {"url": "...", "width": 3240, "height": 1620},
"title": "Second Listing",
"address": "Klimentská 1215/25, Prague 1, CZ"
}

Datetime and timestamp

These are not generated columns in the true sense of the word, but I consider them good practice anyway.

ALTER TABLE listings
ADD dt DATETIME
DEFAULT CURRENT_TIMESTAMP
AFTER data,
ADD INDEX (dt DESC), # optionally
ADD ts TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP
AFTER dt;

Column dt keeps the row’s insert datetime, while ts shows the row’s last update timestamp.
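
A small usage sketch (the intervals and limits are only illustrative):

# latest listings first; the (dt DESC) index can serve the sort
SELECT id, title, dt
FROM listings
ORDER BY dt DESC
LIMIT 20;

# rows changed within the last day, e.g. for an incremental sync
SELECT id, ts
FROM listings
WHERE ts >= NOW() - INTERVAL 1 DAY;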

Overall table structure

TL;DR: the full structure of a smart JSON-based table for semi-relational purposes would look like this

CREATE TABLE listings (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) AS
(data->>'$.title') STORED,
public BIT AS
(data->'$.public' = true) STORED,
country VARCHAR(2) AS
(SUBSTRING(data->>'$.address.zip', 1, 2)) STORED NOT NULL,
city VARCHAR(15) AS
(data->>'$.address.city') STORED NOT NULL,
gps POINT AS
(POINT(data->'$.address.gps[0]', data->'$.address.gps[1]'))
STORED NOT NULL,
listing JSON AS
(JSON_OBJECT(
'title', data->'$.title',
'photo', data->'$.photos[0]',
'address', CONCAT_WS(', ',
data->>'$.address.street',
data->>'$.address.city',
SUBSTRING(data->>'$.address.zip', 1, 2)
),
'gps', data->'$.address.gps'
)) VIRTUAL,
ft VARCHAR(255) AS
(IF(public, CONCAT_WS(' ',
data->'$.title',
data->'$.text',
JSON_EXTRACT(data->'$.address', '$.*'),
JSON_EXTRACT(data->'$.contact', '$.*')
), NULL)) STORED,
data JSON NOT NULL,
dt DATETIME
DEFAULT CURRENT_TIMESTAMP NOT NULL,
ts TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP NOT NULL,

KEY (public DESC, country),
KEY (public DESC),
KEY (country),
KEY (city),
KEY (title),
SPATIAL KEY (gps),
KEY (dt DESC),
FULLTEXT KEY (ft)
);

Quirks to watch for

Retyping

Casting JSON types into MySQL types can be tricky from time to time, especially when using ->> (JSON_UNQUOTE). For example, booleans go like this:

{"public": true}       SELECT data->'$.public'              true
{"public": "true"} SELECT data->'$.public' "true"
{"public": "true"} SELECT data->>'$.public' true
{"public": true} SELECT (data->'$.public' = true) 1
{"public": true} SELECT (data->>'$.public' = true) 0
{"public": "true"} SELECT (data->>'$.public' = true) 0

Combining bits (1, 0) and booleans (true, false) in logical operations might lead to unexpected results.
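
A concrete sketch of how this bites in a WHERE clause, using the listings above:

# compares the JSON boolean with true, matches the second listing
SELECT id FROM listings WHERE data->'$.public' = true;

# compares the unquoted string 'true' with 1, matches nothing
SELECT id FROM listings WHERE data->>'$.public' = true;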

Order inside JSON objects

Will not be preserved.
{ "title": ... , "public": ... , "text": ... }
will become
{ "text": ... , "title": ... , "public": ... }

If that got you worried: order inside arrays will be preserved, of course.
[5, 2, 1] will stay [5, 2, 1] no matter what 😇
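
You can see the normalization for yourself with a quick CAST (the values are only illustrative):

SELECT
CAST('{"title": "x", "public": true, "text": "y"}' AS JSON) obj,
CAST('[5, 2, 1]' AS JSON) arr;
# obj: {"text": "y", "title": "x", "public": true}
# arr: [5, 2, 1]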

Not a big deal, but better to keep this in mind.
