MySQL Sharding


Sharding Guidelines

  1. Shard keys must be globally unique and should not be too long
  2. No cross-database transactions
  3. Every operation on a sharded table must include the sharding column
  4. No fuzzy (LIKE) queries on the sharding column
  5. Prefer vertical splitting first
  6. Time-range queries should stay short, spanning at most 1-2 shard tables
  7. Keep the number of tables per database under 300
  8. Avoid cross-database joins
  • Optimization order
    1. Optimize SQL and indexes  2. Add caching (Redis)  3. Master-slave/master-master replication with read-write splitting  4. MySQL's built-in partitioned tables  5. Vertical splitting  6. Horizontal splitting

Partitioning

Partitioning keeps a single table: logically it is still one table, while physically the data is spread across different disks.
A partitioned table cannot use foreign key constraints.
Partitioning focuses on improving read/write performance; table sharding focuses on improving concurrency.

Range Partitioning

  • Rows are placed into different partitions by store_id: < 6, 6-10, >= 11

    create table test(
        id int not null,
        report_updated timestamp not null default current_timestamp on update current_timestamp,
        store_id int not null default 0,
        test_id int not null default 0,
        test_date date not null
    )engine=myisam default charset=utf8
    partition by range(store_id)(
        partition p0 values less than (6),
        partition p1 values less than (11),
        partition p4 values less than maxvalue
    );
  • By TIMESTAMP range

    partition by range(unix_timestamp(report_updated))(
      partition p0 values less than (unix_timestamp('2008-01-01 00:00:00')),
      partition p1 values less than (unix_timestamp('2008-04-01 00:00:00')),
      partition p9 values less than maxvalue
    );
  • By date
    Adding the COLUMNS keyword allows non-integer and multi-column ranges. Note that only column names are allowed inside the COLUMNS parentheses (functions are not supported), and multi-column range tuples must be increasing:

    partition by range columns(test_date)(
      partition p0 values less than ('1960-01-01'),
      partition p1 values less than ('1970-01-01'),
      partition p4 values less than maxvalue
    )
  • Multi-column ranges

    partition by range columns(store_id,test_id)(
      partition p0 values less than (0,10),
      partition p1 values less than (10,20),
      partition p5 values less than (maxvalue,maxvalue)
    )
  • By year

    partition by range(year(test_date))(
      partition p0 values less than (1991),
      partition p1 values less than (1996),
      partition p4 values less than MAXVALUE
    );
    Old data can then be removed directly with alter table test drop partition p0

List Partitioning

drop table if exists staff;
create table staff(
  id int not null,
  store_id int not null default 0
)
partition by list(store_id)(
  partition pNorth values in (3,5,6,9,17),
  partition pEast values in (1,2,10,11,19,20),
  partition pCentral values in (7,8,15,16)
);
Rows are placed into partitions by category; inserting a store_id outside all the lists above raises an error.

Hash Partitioning

Hash partitioning is mainly used to spread rows evenly across a predetermined number of partitions. The expression in parentheses must be an integer column or a function returning a deterministic integer.
The LINEAR keyword selects linear hash partitioning.

create table test_linear(
  id int not null default 0
)engine=innodb default charset=utf8
partition by linear hash(id) 
partitions 10;  -- number of partitions
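LINEAR HASH trades a perfectly even spread for much cheaper partition management (adding or coalescing partitions moves fewer rows). As a rough illustration, the lookup described in the MySQL reference manual can be sketched in Java; the class and method names are mine, not MySQL's:

```java
public class LinearHash {
    // Sketch of MySQL's LINEAR HASH lookup: V is the smallest power of
    // two >= the partition count; the value is masked with V-1, then
    // folded down until it lands on an existing partition.
    public static int partition(long value, int numPartitions) {
        int v = Integer.highestOneBit(numPartitions);
        if (v < numPartitions) {
            v <<= 1; // round up to the next power of two
        }
        int n = (int) (value & (v - 1));
        while (n >= numPartitions) {
            v >>= 1;
            n = n & (v - 1);
        }
        return n;
    }

    public static void main(String[] args) {
        // with 10 partitions, V = 16: 13 & 15 = 13 >= 10, fold: 13 & 7 = 5
        System.out.println(partition(13, 10)); // 5
        System.out.println(partition(7, 10));  // 7
    }
}
```

This is why consecutive values do not land on consecutive partitions under LINEAR HASH, unlike plain HASH's simple modulus.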

Key Partitioning

Key partitioning is similar to Hash partitioning; only the hash function differs (KEY uses the server's internal hash, so non-integer columns are allowed)

create table staff(
  id int not null,
  store_id int not null default 0
)
partition by linear key(store_id)
partitions 4;

Horizontal Splitting

Horizontal Table Splitting

Split one table into multiple tables according to a rule on some column.
Example: user -> user0 (rows with user_id % 2 = 0), user1 (rows with user_id % 2 = 1)
Every table has the same structure; each holds different rows; together they hold the full data set.
Consider this when a single table has grown large enough to hurt performance.
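The user_id % 2 rule above amounts to a one-line routing helper; a minimal sketch (class and table names are illustrative, ids assumed non-negative):

```java
// Sketch of modulo routing for horizontal table splitting:
// even user_id -> user0, odd user_id -> user1.
public class TableRouter {
    public static String route(long userId) {
        return "user" + (userId % 2);
    }

    public static void main(String[] args) {
        System.out.println(route(8)); // user0
        System.out.println(route(9)); // user1
    }
}
```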

Horizontal Database Splitting

Split one database into multiple databases according to a rule on some column.
Example: db(user0,user1) -> db0(user0,user1), db1(user0,user1); rows with user_id % 2 = 0 go to db0, = 1 go to db1
Every database has the same structure; each holds different rows; together they hold the full data set.
Consider this when overall concurrency is high and a single machine is the bottleneck.

Vertical Splitting

Vertical Table Splitting

Split a table's columns into multiple tables by access frequency.
Example: user(id,name,address) -> user_base(id,name), user_other(id,address)
Each table has a different structure and different data; the tables share at least one column, usually the primary key, for correlating rows.
Consider this when row counts are modest but the table has many columns and mixes hot and cold data.
After the split, reading full rows requires both tables. Don't use a JOIN for this: a JOIN couples the two tables together (they must sit on the same database instance).
Do the correlation in the business/Service layer instead: fetch the base row and the extension row separately, then merge them on the shared key.
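The Service-layer assembly described above can be sketched as follows; plain maps stand in for the user_base and user_other mappers, and every name here is illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: assemble user_base + user_other in the service layer
// instead of a cross-table JOIN. The two maps stand in for the two
// mappers/tables; both are keyed by the shared primary key.
public class UserAssembler {
    public static Map<String, String> loadFullUser(
            long id,
            Map<Long, String> baseTable,    // user_base: id -> name
            Map<Long, String> otherTable) { // user_other: id -> address
        Map<String, String> full = new HashMap<>();
        full.put("id", String.valueOf(id));
        full.put("name", baseTable.get(id));     // query 1: base table
        full.put("address", otherTable.get(id)); // query 2: extension table
        return full;
    }

    public static void main(String[] args) {
        Map<Long, String> base = new HashMap<>();
        base.put(1L, "alice");
        Map<Long, String> other = new HashMap<>();
        other.put(1L, "Beijing");
        System.out.println(loadFullUser(1L, base, other));
    }
}
```

The two lookups are independent, so they can also run in parallel or hit different caches, which a SQL JOIN would not allow.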

Vertical Database Splitting

Move whole tables into different databases grouped by business domain.
Example: db(product,config) -> db_config(config), db_product(product)
Each database has a different structure and different data.
All databases together hold the full data set.
Consider this when overall concurrency is high and the system splits cleanly into business modules.

ShardingSphere

A distributed database ecosystem
https://shardingsphere.apache.org

inline row expression sharding strategy

<!-- druid -->
   <dependency>
       <groupId>com.alibaba</groupId>
       <artifactId>druid-spring-boot-starter</artifactId>
       <version>1.2.6</version>
   </dependency>
   <!--sharding jdbc springboot-->
   <dependency>
       <groupId>org.apache.shardingsphere</groupId>
       <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
       <version>4.0.1</version>
   </dependency>
// application main class
// exclude the DruidDataSourceAutoConfigure auto-configuration
@SpringBootApplication(exclude = {DruidDataSourceAutoConfigure.class})
  • Configuration
    spring:
      main:
        allow-bean-definition-overriding: true
      shardingsphere:
        # print SQL
        props:
          sql:
            show: true
        datasource:
          names: m1,m2 # data sources (databases)
          m1:
            type: com.alibaba.druid.pool.DruidDataSource
            driver-class-name: com.mysql.cj.jdbc.Driver
            url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
            username: zwq
            password: zwqzwq
          m2:
            type: com.alibaba.druid.pool.DruidDataSource
            driver-class-name: com.mysql.cj.jdbc.Driver
            url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
            username: zwq
            password: zwqzwq
        sharding:
          tables: # sharded tables
            course: # logical table: course
              actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in m1 and m2
              key-generator: # key generation: snowflake
                column: id
                type: SNOWFLAKE
                props:
                  worker:
                    id: 1
              table-strategy: # table sharding strategy
                inline:
                  sharding-column: id # sharding column
                  algorithm-expression: course_$->{id%2+1} # e.g. id 1 -> 1%2+1=2 -> course_2
              database-strategy: # database sharding strategy
                inline:
                  sharding-column: id # sharding column
                  algorithm-expression: m$->{id%2+1} # e.g. id 1 -> 1%2+1=2 -> m2
    
    @Data
    @TableName("course")
    public class Course {
        @TableId("id")
        private Long id;
        @TableField("name")
        private String name;
        @TableField("score")
        private Integer score;
        @TableField("user_id")
        private Long userId;
    }
  • Insert code
    @RunWith(SpringRunner.class)
    @SpringBootTest
    public class ShardingSphereApplicationTests {
        @Resource
        CourseMapper courseMapper;
        @Test
        public void contextLoads() {
            for (int i = 0; i < 10; i++) {
                Course course = new Course();
                course.setScore(i * i);
                course.setName("name" + i);
                course.setUserId((long) (i * i));
                courseMapper.insert(course);
            }
        }
    }
    // database routing: rows land in m1 or m2 depending on id parity
    // table routing: rows land in course_1 or course_2 depending on id parity
    // query
    @Test
    public void query() {
        QueryWrapper<Course> queryWrapper = new QueryWrapper<>();
        queryWrapper.orderByDesc("id");
        queryWrapper.eq("id", 1663463025083662338L);
        List<Course> courseList = courseMapper.selectList(queryWrapper);
        courseList.forEach(System.out::println);
    }
    // the inline strategy does not support range queries
    @Test
    public void queryByRange() {
        QueryWrapper<Course> queryWrapper = new QueryWrapper<>();
        queryWrapper.orderByDesc("id");
        // this throws an error
        queryWrapper.between("id", 1663463022755823617L, 1663463025083662338L);
        List<Course> courseList = courseMapper.selectList(queryWrapper);
        courseList.forEach(System.out::println);
    }
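The SNOWFLAKE key generator configured above issues 64-bit ids. In the common snowflake layout (which ShardingSphere's generator follows in outline) these are 41 bits of millisecond timestamp, 10 bits of worker id (the worker.id property above), and 12 bits of per-millisecond sequence. A minimal sketch of that bit layout, not ShardingSphere's actual implementation:

```java
// Illustrative sketch of the common 64-bit snowflake id layout:
// 41-bit timestamp | 10-bit worker id | 12-bit sequence.
public class SnowflakeLayout {
    static final int WORKER_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    public static long compose(long timestampMs, long workerId, long sequence) {
        return (timestampMs << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static long workerId(long id) {
        return (id >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1);
    }

    public static long sequence(long id) {
        return id & ((1 << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = compose(123456789L, 1L, 7L);
        System.out.println(workerId(id)); // 1
        System.out.println(sequence(id)); // 7
    }
}
```

Because the sequence occupies the low bits, expressions such as course_$->{id%2+1} effectively route on the sequence portion of the id.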

standard sharding strategy

Supports both exact-match and range queries

  • Configuration file
    spring:
      main:
        allow-bean-definition-overriding: true
      shardingsphere:
        # print SQL
        props:
          sql:
            show: true
        datasource:
          names: m1,m2 # data sources (databases)
          m1:
            type: com.alibaba.druid.pool.DruidDataSource
            driver-class-name: com.mysql.cj.jdbc.Driver
            url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
            username: zwq
            password: zwqzwq
          m2:
            type: com.alibaba.druid.pool.DruidDataSource
            driver-class-name: com.mysql.cj.jdbc.Driver
            url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
            username: zwq
            password: zwqzwq
        sharding:
          tables: # sharded tables
            course: # logical table: course
              actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in m1 and m2
              key-generator: # key generation: snowflake
                column: id
                type: SNOWFLAKE
                props:
                  worker:
                    id: 1
              table-strategy: # table sharding strategy
                standard:
                  sharding-column: id # sharding column
                  precise-algorithm-class-name: com.zwq.sharding.standard.MyPreciseTable # exact-match table strategy
                  range-algorithm-class-name: com.zwq.sharding.standard.MyRangeTable # range-query table strategy
              database-strategy: # database sharding strategy
                standard:
                  sharding-column: id # sharding column
                  precise-algorithm-class-name: com.zwq.sharding.standard.MyPreciseDS # exact-match database strategy
                  range-algorithm-class-name: com.zwq.sharding.standard.MyRangeDS # range-query database strategy
  • Custom strategy classes
    public class MyPreciseTable implements PreciseShardingAlgorithm<Long> {
        @Override
        public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
            // collection: available real tables (course_1, course_2)
            // table strategy for exact-match queries
            // logical table name
            String logicTableName = preciseShardingValue.getLogicTableName();
            // name of the sharding column (id)
            String id = preciseShardingValue.getColumnName();
            // value of the sharding column
            Long idValue = preciseShardingValue.getValue();
            // implement course_$->{id%2+1}
            BigInteger value = BigInteger.valueOf(idValue);
            BigInteger modValue = value.mod(new BigInteger("2")).add(new BigInteger("1"));
            String key = logicTableName + "_" + modValue;
            // check that the real table exists
            if (collection.contains(key)) {
                return key;
            }
            throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
        }
    }
    public class MyPreciseDS implements PreciseShardingAlgorithm<Long> {
        @Override
        public String doSharding(Collection<String> collection, PreciseShardingValue<Long> preciseShardingValue) {
            // collection: available data sources (m1, m2)
            // database strategy for exact-match queries
            // value of the sharding column
            Long idValue = preciseShardingValue.getValue();
            // implement m$->{id%2+1}
            BigInteger value = BigInteger.valueOf(idValue);
            BigInteger modValue = value.mod(new BigInteger("2")).add(new BigInteger("1"));
            String key = "m" + modValue;
            // check that the real data source exists
            if (collection.contains(key)) {
                return key;
            }
            throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
        }
    }
    public class MyRangeTable implements RangeShardingAlgorithm<Long> {
        @Override
        public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Long> rangeShardingValue) {
            // logical table name
            String logicTableName = rangeShardingValue.getLogicTableName();
            // bounds: for select ... between 1 and 100 these are 100 and 1
            Long upperEndpoint = rangeShardingValue.getValueRange().upperEndpoint();
            Long lowerEndpoint = rangeShardingValue.getValueRange().lowerEndpoint();
            // a range query has to hit every shard, so return all real tables
            return Arrays.asList(logicTableName + "_1", logicTableName + "_2");
        }
    }
    public class MyRangeDS implements RangeShardingAlgorithm<Long> {
        @Override
        public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Long> rangeShardingValue) {
            // logical table name
            String logicTableName = rangeShardingValue.getLogicTableName();
            // bounds: for select ... between 1 and 100 these are 100 and 1
            Long upperEndpoint = rangeShardingValue.getValueRange().upperEndpoint();
            Long lowerEndpoint = rangeShardingValue.getValueRange().lowerEndpoint();
            // a range query has to hit every database, so return all data sources
            return Arrays.asList("m1", "m2");
        }
    }
  • Range query
    @Test
    public void queryByRange() {
        QueryWrapper<Course> queryWrapper = new QueryWrapper<>();
        queryWrapper.orderByDesc("id");
        queryWrapper.between("id", 1663463022755823617L, 1663463025083662338L);
        List<Course> courseList = courseMapper.selectList(queryWrapper);
        courseList.forEach(System.out::println);
    }

complex composite sharding strategy

Supports multiple sharding keys; all sharding columns must have the same type

spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in m1 and m2
          key-generator: # key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            complex:
              sharding-columns: id, user_id # sharding columns
              algorithm-class-name: com.zwq.sharding.complex.MyComplexTable # table strategy
          database-strategy: # database sharding strategy
            complex:
              sharding-columns: id, user_id # sharding columns
              algorithm-class-name: com.zwq.sharding.complex.MyComplexDS # database strategy
  • Custom strategies
    public class MyComplexDS implements ComplexKeysShardingAlgorithm<Long> {
        @Override
        public Collection<String> doSharding(Collection<String> collection, ComplexKeysShardingValue<Long> complexKeysShardingValue) {
            Range<Long> id = complexKeysShardingValue.getColumnNameAndRangeValuesMap().get("id");
            Collection<Long> userId = complexKeysShardingValue.getColumnNameAndShardingValuesMap().get("user_id");
            List<String> res = new ArrayList<>();
            for (Long item : userId) {
                BigInteger bigInteger = BigInteger.valueOf(item);
                // strategy: mod 2 plus 1
                BigInteger target = bigInteger.mod(new BigInteger("2")).add(new BigInteger("1"));
                res.add("m" + target);
            }
            return res;
        }
    }
    public class MyComplexTable implements ComplexKeysShardingAlgorithm<Long> {
        @Override
        public Collection<String> doSharding(Collection<String> collection, ComplexKeysShardingValue<Long> complexKeysShardingValue) {
            Range<Long> id = complexKeysShardingValue.getColumnNameAndRangeValuesMap().get("id");
            Collection<Long> userId = complexKeysShardingValue.getColumnNameAndShardingValuesMap().get("user_id");
            List<String> res = new ArrayList<>();
            for (Long item : userId) {
                BigInteger bigInteger = BigInteger.valueOf(item);
                // strategy: mod 2 plus 1
                BigInteger target = bigInteger.mod(new BigInteger("2")).add(new BigInteger("1"));
                res.add(complexKeysShardingValue.getLogicTableName() + "_" + target);
            }
            return res;
        }
    }

hint sharding strategy

A strategy that takes its sharding values from an explicit Hint instead of extracting them from the SQL

spring:
  main:
    allow-bean-definition-overriding: true
  shardingsphere:
    # print SQL
    props:
      sql:
        show: true
    datasource:
      names: m1,m2 # data sources (databases)
      m1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
      m2:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://localhost:3306/test2?serverTimezone=GMT%2B8
        username: zwq
        password: zwqzwq
    sharding:
      tables: # sharded tables
        course: # logical table: course
          actual-data-nodes: m$->{1..2}.course_$->{1..2} # real tables: course_1 and course_2 in m1 and m2
          key-generator: # key generation: snowflake
            column: id
            type: SNOWFLAKE
            props:
              worker:
                id: 1
          table-strategy: # table sharding strategy
            hint:
              algorithm-class-name: com.zwq.sharding.hint.MyHint
public class MyHint implements HintShardingAlgorithm<Integer> {
    @Override
    public Collection<String> doSharding(Collection<String> collection, HintShardingValue<Integer> hintShardingValue) {
        String key = hintShardingValue.getLogicTableName() + "_" + hintShardingValue.getValues().toArray()[0];
        if (collection.contains(key)) {
            return Arrays.asList(key);
        }
        throw new UnsupportedOperationException("route " + key + " is not supported, please check your config");
    }
}
  • Query course_2 explicitly
    @Test
    public void queryHint() {
        HintManager manager = HintManager.getInstance();
        manager.addTableShardingValue("course", 2);
        List<Course> courseList = courseMapper.selectList(null);
        courseList.forEach(System.out::println);
        manager.close();
    }

Broadcast Tables

Every database keeps an identical copy of the table's data

spring:
  shardingsphere:
    sharding:
      tables: # sharded tables
        dict:
          key-generator: # key generation: snowflake
            column: id
            type: SNOWFLAKE
      broadcast-tables: dict # broadcast table: the dict table stays identical across all databases
@Data
public class Dict {
    @TableId("id")
    private Long id;
    @TableField("value")
    private String value;
    @TableField("status")
    private Integer status;
}
@Test
void addDict() {
    Dict dict = new Dict();
    dict.setStatus(1);
    dict.setValue("active");
    dictMapper.insert(dict);
}

The same row is inserted into the dict table of every database

Binding Tables

Without binding, a direct LEFT JOIN between sharded logical tables produces a Cartesian product and returns extra rows

spring:
  shardingsphere:
    sharding:
      binding-tables[0]: user, dict

With binding tables configured, the Cartesian product goes away and the join runs pairwise as expected: user1 LEFT JOIN dict1, user2 LEFT JOIN dict2

