MySQL Sharding
Table-sharding guidelines
- Shard primary keys must be globally unique and should not be too long
- No cross-database transactions
- Every query or update on a sharded table must carry the sharding column
- No fuzzy (LIKE) matching on the sharding column
- Prefer vertical splitting first
- A time-range query should not span too long a period; ideally it touches only 1-2 shard tables
- Keep the number of tables in a single database within 300
- Avoid cross-database joins
- Optimization order
  1. Optimize SQL and indexes
  2. Add a cache (Redis)
  3. Master-slave or master-master replication with read-write splitting
  4. MySQL's built-in partitioned tables
  5. Vertical splitting
  6. Horizontal splitting
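The first guideline above (shard keys that are globally unique but not overly long) is usually met with a snowflake-style ID, which is also what the SNOWFLAKE key generator used later in these notes produces. A minimal sketch of the idea, hypothetical code rather than ShardingSphere's actual implementation:

```java
// Snowflake-style ID layout: ~41-bit timestamp | 10-bit worker id | 12-bit sequence.
// Illustrative sketch only, not ShardingSphere's SNOWFLAKE generator.
public class SnowflakeSketch {
    private final long workerId;   // 0..1023, must be unique per node
    private long lastMillis = -1L;
    private long sequence = 0L;

    public SnowflakeSketch(long workerId) {
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now == lastMillis) {
            sequence = (sequence + 1) & 0xFFF;   // stay within 12 bits
            if (sequence == 0) {                 // sequence exhausted: spin until the next millisecond
                while ((now = System.currentTimeMillis()) <= lastMillis) { }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return (now << 22) | (workerId << 12) | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch gen = new SnowflakeSketch(1);
        System.out.println(gen.nextId() < gen.nextId()); // ids increase monotonically
    }
}
```

Because the high bits are a timestamp, the IDs are roughly time-ordered, which keeps index inserts cheap while staying unique across shards.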
Partitioning
With partitioning there is still only one table: logically the table is unchanged, while physically its data is spread across different areas of disk.
Partitioned tables cannot use foreign-key constraints.
Partitioning focuses on improving read/write performance; table sharding focuses on improving concurrency.
Range partitioning
Rows are placed into different partitions by store_id: < 6, 6 to 10, and >= 11.
create table test(
    id int not null,
    report_updated timestamp not null default current_timestamp on update current_timestamp,
    store_id int not null default 0,
    test_id int not null default 0,
    test_date date not null
) engine=myisam default charset=utf8
partition by range(store_id)(
    partition p0 values less than (6),
    partition p1 values less than (11),
    partition p4 values less than maxvalue
);
By TIMESTAMP range
partition by range(unix_timestamp(report_updated))(
    partition p0 values less than (unix_timestamp('2008-01-01 00:00:00')),
    partition p1 values less than (unix_timestamp('2008-04-01 00:00:00')),
    partition p9 values less than maxvalue
);
By date
Adding the COLUMNS keyword allows non-integer and multi-column ranges. Note that only column names are allowed inside the COLUMNS parentheses (functions are not supported), and with multiple columns the range values must be increasing:
partition by range columns(test_date)(
    partition p0 values less than ('1960-01-01'),
    partition p1 values less than ('1970-01-01'),
    partition p4 values less than maxvalue
)
By multi-column range
partition by range columns(store_id,test_id)(
    partition p0 values less than (0,10),
    partition p1 values less than (10,20),
    partition p5 values less than (maxvalue,maxvalue)
)
By year
partition by range(year(test_date))(
    partition p0 values less than (1991),
    partition p1 values less than (1996),
    partition p4 values less than MAXVALUE
);
Old data can then be dropped directly with alter table test drop partition p0.
List partitioning
drop table if exists staff;
create table staff(
    id int not null,
    store_id int not null default 0
)
partition by list(store_id)(
    partition pNorth values in (3,5,6,9,17),
    partition pEast values in (1,2,10,11,19,20),
    partition pCentral values in (7,8,15,16)
);
Rows are routed to partitions by category; inserting a store_id not listed in any partition raises an error.
Hash partitioning
Hash partitioning is mainly used to distribute data evenly across a predetermined number of partitions; the parentheses may only contain an integer column or a function returning a deterministic integer.
The LINEAR keyword selects linear hash partitioning.
create table test_linear(
    id int not null default 0
) engine=innodb default charset=utf8
partition by linear hash(id)
partitions 10; -- number of partitions
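Plain (non-LINEAR) HASH partitioning simply takes the partitioning expression modulo the partition count, while LINEAR hash uses a powers-of-two scheme that makes adding and dropping partitions cheaper. The plain rule, sketched in Java purely for illustration:

```java
public class HashPartitionSketch {
    // A row whose partitioning expression evaluates to `value` lands in partition value mod N.
    static int hashPartition(long value, int partitions) {
        return (int) Math.floorMod(value, (long) partitions);
    }

    public static void main(String[] args) {
        System.out.println(hashPartition(7, 10));  // 7
        System.out.println(hashPartition(23, 10)); // 3
    }
}
```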
Key partitioning
KEY partitioning is similar to HASH; only the hash function used differs.
create table staff(
    id int not null,
    store_id int not null default 0
)
partition by linear key(store_id)
partitions 4;
Horizontal splitting
Horizontal table sharding
Split a table into multiple tables based on some column, following a fixed routing rule.
Example: user -> user0 (rows with user_id % 2 == 0), user1 (rows with user_id % 2 == 1)
All tables share the same schema; each holds different rows; together they form the full data set.
Consider this when a single table has grown so large that it hurts performance.
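The user_id % 2 routing above can be sketched as a small helper (hypothetical code, not a real sharding API):

```java
public class TableRouter {
    // user_id % 2 == 0 -> user0, user_id % 2 == 1 -> user1
    static String routeTable(long userId) {
        return "user" + (userId % 2);
    }

    public static void main(String[] args) {
        System.out.println(routeTable(8L)); // user0
        System.out.println(routeTable(9L)); // user1
    }
}
```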
Horizontal database sharding
Split a database into multiple databases based on some column, following a fixed routing rule.
db(user0,user1) -> db0(user0,user1), db1(user0,user1) | rows with user_id % 2 == 0 go to db0, == 1 go to db1
All databases share the same schema; each holds different rows; together they form the full data set.
Consider database sharding when overall system concurrency is high and a single machine becomes the bottleneck.
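The database-routing rule above (user_id % 2 picks db0 or db1) looks the same, one level up; again a hypothetical helper for illustration:

```java
public class DbRouter {
    // user_id % 2 == 0 -> db0, user_id % 2 == 1 -> db1; each db holds user0 and user1
    static String routeDb(long userId) {
        return "db" + (userId % 2);
    }

    public static void main(String[] args) {
        System.out.println(routeDb(4L)); // db0
        System.out.println(routeDb(5L)); // db1
    }
}
```

Note that in practice the database and table routing rules should use different parts of the key, otherwise some real tables never receive data.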
Vertical splitting
Vertical table splitting
Split a table's columns into multiple tables by access frequency.
Example: user(id,name,address) -> user_base(id,name), user_other(id,address)
The resulting tables have different schemas
and different data; they usually share at least one column, normally the primary key, used to associate the rows.
Consider vertical table splitting when the row count is modest but the table has many columns and mixes hot and cold data.
After splitting, reading the full record requires combining the two tables. Don't use JOIN: a join couples the two tables together and forces them onto the same database instance.
Do the association in the business/service layer instead: fetch the base-table and extension-table rows separately, then combine them via the shared key to get the complete record.
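Joining base and extension rows in the service layer, as described above, can be sketched like this (hypothetical record types standing in for whatever your ORM returns):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ServiceJoinSketch {
    record UserBase(long id, String name) {}
    record UserOther(long id, String address) {}
    record UserFull(long id, String name, String address) {}

    // Fetch both tables separately, then associate rows by the shared primary key in memory.
    static List<UserFull> assemble(List<UserBase> bases, List<UserOther> others) {
        Map<Long, String> addrById = others.stream()
                .collect(Collectors.toMap(UserOther::id, UserOther::address));
        return bases.stream()
                .map(b -> new UserFull(b.id(), b.name(), addrById.get(b.id())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<UserFull> full = assemble(
                List.of(new UserBase(1, "alice")),
                List.of(new UserOther(1, "shanghai")));
        System.out.println(full.get(0).address()); // shanghai
    }
}
```

The two fetches can hit different database instances, which is exactly what an SQL JOIN would forbid.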
Vertical database splitting
Split by table: move tables into different databases according to business domain.
Example: db(product,config) -> db_config(config), db_product(product)
Each database has a different schema and different data;
all databases together form the full data set.
Consider vertical database splitting when overall concurrency is high and the system can be divided into business modules.
ShardingSphere
A distributed database ecosystem
https://shardingsphere.apache.org
inline (row expression) sharding strategy
<!-- druid -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid-spring-boot-starter</artifactId>
    <version>1.2.6</version>
</dependency>
<!-- sharding-jdbc spring boot starter -->
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
    <version>4.0.1</version>
</dependency>
// Application startup class
// Exclude DruidDataSourceAutoConfigure from auto-configuration
@SpringBootApplication(exclude = {DruidDataSourceAutoConfigure.class})
- Configuration
spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in databases m1 and m2
          key-generator: # primary key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            inline:
              sharding-column: id # sharding column
              algorithm-expression: course_$->{id%2+1} # rule: id % 2 + 1, e.g. id=1 -> 1%2+1=2 -> course_2
          database-strategy: # database sharding strategy
            inline:
              sharding-column: id # sharding column
              algorithm-expression: m$->{id%2+1} # rule: id % 2 + 1, e.g. id=1 -> 1%2+1=2 -> m2
@Data
@TableName("course")
public class Course {
    @TableId("id")
    private Long id;
    @TableField("name")
    private String name;
    @TableField("score")
    private Integer score;
    @TableField("user_id")
    private Long userId;
}
- Insert example
@RunWith(SpringRunner.class)
@SpringBootTest
class ShardingSphereApplicationTests {
    @Resource
    CourseMapper courseMapper;

    @Test
    void contextLoads() {
        for (int i = 0; i < 10; i++) {
            Course course = new Course();
            course.setScore(i * i);
            course.setName("name" + i);
            course.setUserId((long) (i * i));
            courseMapper.insert(course);
        }
    }
}
// Database routing sends odd/even ids to m1 or m2
// Table routing sends odd/even ids to course_1 or course_2
// Exact query
@Test
void query() {
    QueryWrapper queryWrapper = new QueryWrapper();
    queryWrapper.orderByDesc("id");
    queryWrapper.eq("id", 1663463025083662338L);
    List<Course> courseList = courseMapper.selectList(queryWrapper);
    courseList.forEach(item -> System.out.println(item));
}

// The inline strategy does not support range queries
@Test
void queryByRange() {
    QueryWrapper queryWrapper = new QueryWrapper();
    queryWrapper.orderByDesc("id");
    // throws an exception
    queryWrapper.between("id", 1663463022755823617L, 1663463025083662338L);
    List<Course> courseList = courseMapper.selectList(queryWrapper);
    courseList.forEach(item -> System.out.println(item));
}
standard sharding strategy
Supports both exact and range queries
- Configuration file
spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in databases m1 and m2
          key-generator: # primary key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            standard:
              sharding-column: id # sharding column
              precise-algorithm-class-name: com.zwq.sharding.standard.MyPreciseTable # table routing for exact queries
              range-algorithm-class-name: com.zwq.sharding.standard.MyRangeTable # table routing for range queries
          database-strategy: # database sharding strategy
            standard:
              sharding-column: id # sharding column
              precise-algorithm-class-name: com.zwq.sharding.standard.MyPreciseDS # database routing for exact queries
              range-algorithm-class-name: com.zwq.sharding.standard.MyRangeDS # database routing for range queries
- Custom strategy classes
public class MyPreciseTable implements PreciseShardingAlgorithm<Long> {
    @Override
    public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
        // collection: available real tables: course_1, course_2
        // table routing for exact queries
        // logical table name
        String logicTableName = preciseShardingValue.getLogicTableName();
        // name of the sharding column (id)
        String id = preciseShardingValue.getColumnName();
        // value of the sharding column
        Long idValue = preciseShardingValue.getValue();
        // implement the rule: course_$->{id%2+1}
        BigInteger value = BigInteger.valueOf(idValue);
        BigInteger modValue = value.mod(new BigInteger("2")).add(new BigInteger("1"));
        String key = logicTableName + "_" + modValue;
        // check the real table exists
        if (collection.contains(key)) {
            return key;
        }
        throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
    }
}
public class MyPreciseDS implements PreciseShardingAlgorithm<Long> {
    @Override
    public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
        // collection: available data sources: m1, m2
        // database routing for exact queries
        // value of the sharding column
        Long idValue = preciseShardingValue.getValue();
        // implement the rule: m$->{id%2+1}
        BigInteger value = BigInteger.valueOf(idValue);
        BigInteger modValue = value.mod(new BigInteger("2")).add(new BigInteger("1"));
        String key = "m" + modValue;
        // check the real database exists
        if (collection.contains(key)) {
            return key;
        }
        throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
    }
}
public class MyRangeTable implements RangeShardingAlgorithm<Long> {
    @Override
    public Collection<String> doSharding(Collection collection, RangeShardingValue rangeShardingValue) {
        // logical table name
        String logicTableName = rangeShardingValue.getLogicTableName();
        // bounds: select * ... between 1 and 100 -> lower 1, upper 100
        Comparable upperEndpoint = rangeShardingValue.getValueRange().upperEndpoint();
        Comparable lowerEndpoint = rangeShardingValue.getValueRange().lowerEndpoint();
        // a range query must hit every table, so return all real tables
        return Arrays.asList(logicTableName + "_1", logicTableName + "_2");
    }
}
public class MyRangeDS implements RangeShardingAlgorithm<Long> {
    @Override
    public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Long> rangeShardingValue) {
        // logical table name
        String logicTableName = rangeShardingValue.getLogicTableName();
        // bounds: select * ... between 1 and 100 -> lower 1, upper 100
        Comparable upperEndpoint = rangeShardingValue.getValueRange().upperEndpoint();
        Comparable lowerEndpoint = rangeShardingValue.getValueRange().lowerEndpoint();
        // a range query must hit every database, so return all data sources
        return Arrays.asList("m1", "m2");
    }
}
- Range query
@Test
void queryByRange() {
    QueryWrapper queryWrapper = new QueryWrapper();
    queryWrapper.orderByDesc("id");
    queryWrapper.between("id", 1663463022755823617L, 1663463025083662338L);
    List<Course> courseList = courseMapper.selectList(queryWrapper);
    courseList.forEach(item -> System.out.println(item));
}
complex (composite) sharding strategy
Supports operating on multiple sharding keys; all sharding columns must have the same type
spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in databases m1 and m2
          key-generator: # primary key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            complex:
              sharding-columns: id, user_id # sharding columns
              algorithm-class-name: com.zwq.sharding.complex.MyComplexTable # table routing
          database-strategy: # database sharding strategy
            complex:
              sharding-columns: id, user_id # sharding columns
              algorithm-class-name: com.zwq.sharding.complex.MyComplexDS # database routing
- Custom strategies
public class MyComplexDS implements ComplexKeysShardingAlgorithm<Long> {
    @Override
    public Collection<String> doSharding(Collection<String> collection, ComplexKeysShardingValue<Long> complexKeysShardingValue) {
        // range values for id, exact values for user_id
        Range<Long> id = complexKeysShardingValue.getColumnNameAndRangeValuesMap().get("id");
        Collection<Long> userId = complexKeysShardingValue.getColumnNameAndShardingValuesMap().get("user_id");
        List<String> res = new ArrayList();
        for (Long item : userId) {
            BigInteger bigInteger = BigInteger.valueOf(item);
            // rule: modulo 2, plus one
            BigInteger target = bigInteger.mod(new BigInteger("2")).add(new BigInteger("1"));
            res.add("m" + target);
        }
        return res;
    }
}
public class MyComplexTable implements ComplexKeysShardingAlgorithm<Long> {
    @Override
    public Collection<String> doSharding(Collection<String> collection, ComplexKeysShardingValue<Long> complexKeysShardingValue) {
        // range values for id, exact values for user_id
        Range<Long> id = complexKeysShardingValue.getColumnNameAndRangeValuesMap().get("id");
        Collection<Long> userId = complexKeysShardingValue.getColumnNameAndShardingValuesMap().get("user_id");
        List<String> res = new ArrayList();
        for (Long item : userId) {
            BigInteger bigInteger = BigInteger.valueOf(item);
            // rule: modulo 2, plus one
            BigInteger target = bigInteger.mod(new BigInteger("2")).add(new BigInteger("1"));
            res.add(complexKeysShardingValue.getLogicTableName() + "_" + target);
        }
        return res;
    }
}
hint sharding strategy
Routes by sharding values supplied explicitly through a Hint API rather than extracted from the SQL
spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in databases m1 and m2
          key-generator: # primary key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            hint:
              algorithm-class-name: com.zwq.sharding.hint.MyHint
public class MyHint implements HintShardingAlgorithm<Integer> {
    @Override
    public Collection<String> doSharding(Collection<String> collection, HintShardingValue<Integer> hintShardingValue) {
        String key = hintShardingValue.getLogicTableName() + "_" + hintShardingValue.getValues().toArray()[0];
        if (collection.contains(key)) {
            return Arrays.asList(key);
        }
        throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
    }
}
- Query course_2 explicitly
@Test
void queryHint() {
    HintManager manager = HintManager.getInstance();
    manager.addTableShardingValue("course", 2);
    List<Course> courseList = courseMapper.selectList(null);
    courseList.forEach(item -> System.out.println(item));
    manager.close();
}
Broadcast tables
The table's data is kept identical across all databases
spring:
  sharding:
    tables: # sharded tables
      dict:
        key-generator: # primary key generation: snowflake
          column: id
          type: SNOWFLAKE
    broadcast-tables: dict # broadcast table: the dict table's data stays identical in every database
@Data
public class Dict {
    @TableId("id")
    private Long id;
    @TableField("value")
    private String value;
    @TableField("status")
    private Integer status;
}
@Test
void addDict() {
    Dict dict = new Dict();
    dict.setStatus(1);
    dict.setValue("生效");
    dictMapper.insert(dict);
}
The same row is inserted into the dict table of every database
Binding tables
Without binding, a LEFT JOIN between sharded logical tables produces a Cartesian product and returns extra rows
spring:
  shardingsphere:
    sharding:
      binding-tables[0]: user, dict
With the binding configured, the Cartesian-product problem disappears and the join pairs up normally: user1 left join dict1, user2 left join dict2