从Mysql8.0.17开始,支持在json列上添加多值索引。多值索引会为一条记录添加多条索引记录,查找时,通过索引能快速定位到记录。
要使用多值索引,先通过select version()看一下版本是否支持。
JSON字段的创建和查询
Mysql的json类型是schemaless的,可以插入任意符合json格式的数据,包括数组和对象。
例如,有一张用户表,包括用户标签(tag)、扩展信息(extInfo)都是json类型。tag存放的是字符串数组,extInfo存放的是一个对象,包含age(数值)和address(字符串)。
create table demo_user ( name varchar(64), tag json, extInfo json );
在不加索引的情况下,可以插入任意符合json格式的数据,每行记录的内容都可以不一样。
insert into demo_user(name, tag, extInfo) values ('小明', '["t1","t2"]', '{"age": 20, "address": "a1"}'); insert into demo_user(name, tag, extInfo) values ('小红', '["t1","t2"]', '{"age": 20, "address": 20}'); insert into demo_user(name, tag, extInfo) values ('小李', '["t1","t2"]', '{"age": "aaa", "address": "a1"}'); insert into demo_user(name, tag, extInfo) values ('小花', '{"age": "aaa", "address": "a1"}', '{"age": "aaa", "address": "a1"}'); select * from demo_user; +------+-------------------------------+-------------------------------+ |name |tag |extInfo | +------+-------------------------------+-------------------------------+ |小明 |["t1", "t2"] |{"age": 20, "address": "a1"} | |小红 |["t1", "t2"] |{"age": 20, "address": 20} | |小李 |["t1", "t2"] |{"age": "aaa", "address": "a1"}| |小花 |{"age": "aaa", "address": "a1"}|{"age": "aaa", "address": "a1"}| +------+-------------------------------+-------------------------------+
还没添加索引,索引按照tag、age、adress查询是会走全表扫描。
explain select * from demo_user where json_contains(tag, '"t1"'); +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where| +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ explain select * from demo_user where json_contains(extInfo->'$.age', '20'); +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where| +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"'); +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |ALL |null |null|null |null|4 |100 |Using where| +--+-----------+---------+----------+----+-------------+----+-------+----+----+--------+-----------+
添加多值索引
添加多值索引的方式和添加普通索引类似,可以通过create table、alter table或create index添加索引,唯一的区别是对索引列的描述有差别。json是一个schemaless,在创建索引的时候要明确要为哪个key建立索引,并且告知列是什么类型。
通过cast(expr AS type [ARRAY])的方式将json转成指定类型的ARRAY,column是需要加索引的json列,type是json里key的类型。
type的可选值有很多,我们需要用到字符串和整形,其他的在官方文档里有描述[cast函数]
- char(n),转成varchar类型的字符串
- unsigned,转成无符号类型的bigint
-- 将tag里的值转成字符类型 alter table demo_user add index tag( (cast(tag as char(64) array)) ); -- 将extInfo里的age转成无符号整形 alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) ); -- 将extInfo里的address转成字符类型 alter table demo_user add index address( (cast(extInfo->'$.address' as char(255) array)) );
这时候,因为之前插入的数据可能无法转成对应的类型,就会在建立索引的时候报错。
例如,在给tag添加索引时,因为有第四条记录的tag值是{"age": "aaa", "address": "a1"},所以提示JSON Object类型无法转成array类型。
alter table demo_user add index tag( (cast(tag as char(64) array)) ); --- [42000][1235] This version of MySQL doesn't yet support 'CAST-ing JSON OBJECT type to array'
给age添加索引时,提示数据截断,因为第三条里age值为"aaa",转成整数是为空。
alter table demo_user add index age( (cast(extInfo->'$.age' as unsigned array)) ); --- [22001][3903] Data truncation: Invalid JSON value for CAST for functional index 'age'.
当我们把数据修正后就能正常添加上索引,再看看执行explain看一下查询是否走索引:
explain select * from demo_user where json_contains(tag, '"t1"'); +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |range|tag |tag|259 |null|4 |100 |Using where| +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ explain select * from demo_user where json_contains(extInfo->'$.age', '20'); +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |range|age |age|9 |null|2 |100 |Using where| +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ explain select * from demo_user where json_contains(extInfo->'$.address', '"a1"'); +--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type |possible_keys|key |key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |range|address |address|1023 |null|3 |100 |Using where| +--+-----------+---------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
可以看到3个查询语句都走到了索引。
除了json_contains能走索引,member of()和json_overlaps()也能用到索引,但是这几个函数的功能是不一样的,具体要查询文档说明。
explain select * from demo_user where 't1' member of (tag); +--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+ |id|select_type|table |partitions|type|possible_keys|key|key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |ref |tag |tag|259 |const|4 |100 |Using where| +--+-----------+---------+----------+----+-------------+---+-------+-----+----+--------+-----------+ explain select * from demo_user where JSON_OVERLAPS(tag, '"t1"'); +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |id|select_type|table |partitions|type |possible_keys|key|key_len|ref |rows|filtered|Extra | +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+ |1 |SIMPLE |demo_user|null |range|tag |tag|259 |null|4 |100 |Using where| +--+-----------+---------+----------+-----+-------------+---+-------+----+----+--------+-----------+
总结
json列能添加多值索引,提高了json的查询性能。同时因为索引要指定列的类型,索引列就不能使用任意类型,这会失去一部分schemaless的能力。需要注意的是,使用cast函数添加索引时,要选择合适type,不然会出现类型转换失败的异常。
参考
JSON类型文档:MySQL :: MySQL 8.3 Reference Manual :: 11.5 The JSON Data Type
多值索引文档:MySQL :: MySQL 8.3 Reference Manual :: 13.1.15 CREATE INDEX Statement
cast函数文档:MySQL :: MySQL 8.3 Reference Manual :: 12.10 Cast Functions and Operators