mysql high-availability architecture is designed to handle high concurrency and large traffic!

mysql high-availability architecture is designed to handle high concurrency and large traffic!

Main introduction: introduction to replication function, mysql binary log, mysql replication topology, high-availability framework, single point of failure, read-write separation and load balancing introduction, etc.

Introduction to mysql replication function

mysql replication function provides sharing of read load

Problems solved by replication

Realize data distribution on different servers

Incremental use of binary log

Doesn't require much bandwidth

However, the use of row-based replication will put a certain pressure on bandwidth when making large-scale changes, especially in cross-IDC environments.

Realize data distribution on different servers

Achieve load balancing of data reading

Need to cooperate with other components to complete

Use DNS polling to connect program reads to different backup databases,

Use proxy methods such as LVS and haproxy

Non-sharing architecture, the same data is distributed on multiple servers

Enhanced data security

Use standby database backup to reduce the load of the main database

Replication is not a substitute for backup

Realize database high availability and failover

Realize online data upgrade

mysql binary log

mysql service layer log

Binary log

Slow check log

General log

mysql storage engine layer log

innodb log

Redo log

Rollback log

Records all modification events to the mysql database, including addition, deletion, modification, and check events, and modification events to the table structure

Binary log format

Segment-based format (record sql statement)

binlog_format = statement


The amount of log records is relatively small, saving disk and network I/O, only modify or insert one record


Context information must be recorded

Ensure that the execution results of the statement on the slave server and the master server are consistent

For specific functions such as uuid() and user(), such non-deterministic functions still cannot be replicated, which may cause inconsistent data in the primary and secondary servers replicated by mysql

Line-based format

binlog_format = ROW

When the same sql statement modifies 10,000 pieces of data, the segment-based log format will only record this sql statement, and the row-based log format will have 10,000 records to record the data modification of each row.


Make mysql master-slave replication more secure

Modifying each row of data is more efficient than segment-based replication

When the data in the database is modified by mistake, and there is no backup to restore, we can analyze the binary log and reverse the data modification operation recorded in the log to achieve the purpose of restoring the data.


Large log format

binlog_row_image = [full|minimal|noblob]

Mixed log format

binlog_format = mixed


According to the sql statement, the system decides to choose between segment-based and row-based log formats

The size of the data is determined by the SQL statement executed

The impact of mysql binary log format on replication

SQL statement-based replication (SBR)


The amount of logs generated is small, which saves network transmission i/o

It is not mandatory that the table definitions of the master and slave databases are exactly the same

More flexible than row-based replication


For non-deterministic events, the consistency of master-slave replication data cannot be guaranteed

Modifications to stored procedures, triggers, and custom functions may also cause data inconsistencies

Compared to the row-based copy method, more row locks are required when executed from above

Row-based replication


Can be applied to any sql replication including non-deterministic functions, stored procedures, etc.

Can reduce the use of database locks


The table structure of the master and slave data is required to be the same, otherwise the replication may be interrupted

Cannot execute trigger separately from above

How mysql replication works


The master writes changes to the binary log

Read the main binary log changes and write them to relay_log

Log point-based replication

GTID-based replication

Replay the log in relay_log from above

The log based on the sql segment is to re-execute the recorded sql from the library

The row-based log is to directly apply the changes to the database on the slave database

Log point-based replication

Configuration steps based on log point replication

Create a replication account on the main DB server

create user'repl' @'ip段'identified by'password'
grant replication slave on. to'repl' @'ip段'

Configure the slave database server

bin_log = mysql-bin
server_id = 101
relay_log = mysql-relay-bin
log_slave_update = on
read_only = on

Initialize data from the server

mysqldump --master-data=2 -single-transaction
xtrabackup --slave-info

Start the replication link

change master to master_host="master_host_ip",master_user='repl',master_password='password' master_log_file='mysql_log_file_name',master_log_pos=4;

Pros and cons


It is the earliest replication technology supported by mysql, with relatively few bugs

No restrictions on sql query

It's easier to deal with failures


Failover is difficult to regain the log point information of the new master

GTID-based replication

What is GTID

GTID is the global transaction id, which guarantees that a unique id can be generated in the replication cluster for each transaction submitted on the master; GTID=source_id:transaction_id

bin_log =/usr/local/mysql/log/mysql-bin
server_id = 100
gtid_mode = on

Start GTID-based replication

change master to master_host="master_host_ip",master_user='repl',master_password='password', master_auto_position = 1;

Pros and cons


It can be very convenient to professionally troubleshoot

The slave library will not lose any modifications on the master library


Fault handling is more complicated

There are certain restrictions on the executed sql

Questions to consider when choosing a copy mode

The mysql version used

Replication architecture and master-slave switching method

High-availability management components used

Support for the application

mysql replication topology

Before mysql5.7, a slave library can only have one master library

After mysql5.7, support one-slave multi-master architecture

One-master and multiple-slave replication topology


Simple configuration

Multiple slave libraries can be used to share the read load


Use different slave libraries for different businesses

Put a slave database to the remote IDC for disaster recovery

Share the read load of the main library

Master-master replication topology

Configuration considerations

It is best to separate the tables operated by the two masters

Use the following two parameters to control the generation of auto-increment id

auto_increment_increment = 2
auto_increment_offset = 1 | 2

Main matters of master-master replication configuration in master-slave mode

Only one main server provides external services

A server is in a read-only state and only used as a hot standby

The switch will only be performed when the main library that provides external services fails or is scheduled for maintenance

Make the original standby library the main library, and the original main library will become the new standby library, and handle the read-only or offline status, and then go online again after the maintenance is completed

Ensure that the initial data on the two servers are the same

Make sure that binlog has been started on the two servers and have different server_id

Enable read_only on the initial backup

You can also assign several slave libraries to the master library

Cascade replication

mysql replication performance optimization

Factors affecting master-slave delay

The time the main library writes to the binary log

Solution: Control the transaction size of the main database and split large transactions

Binary log transmission time

Solution: Use the mixed log format or set set binlog_row_image=minimal

By default, there is only one sql thread from the library, and the concurrent modification on the master becomes serial from the top

Solution: Use multi-threaded replication. In mysql5.7, SQL threads can be allocated according to the logic clock.

Configuration steps:

stop slave
set global slave_parallel_type ='logical_clock'
set global slave_parallel_workers = 4
start slave

mysql replication common problem handling

Master-slave replication errors caused by data corruption or loss

Errors caused by unexpected downtime of the master library or the slave library


Use skip binary log events

The method of injecting empty transactions first restores the interrupted replication link

Use other methods to compare the data on the master and slave servers

The binary log on the main library is corrupted

The relay log on the standby database is damaged

Master-slave replication error caused by data modification on the slave library

Problems that cannot be solved by mysql replication

Share the write load of the database

Automatic failover and master-slave switchover

Provide read-write separation function

High-availability framework what is high-availability

High availability HA (High Avalilability) refers to improving the availability of systems and applications by minimizing downtime caused by routine maintenance operations (planned) and sudden system crashes (unplanned)

Indicates a commonly used factor for high availability

Normal available time

Percentage of the whole year

Causes of system unavailability

Severe master-slave delay

Master-slave replication interrupted

A lot of blockage caused by locks

Server downtime caused by software and hardware failures, etc.

How to achieve high availability

Avoid factors that cause the system to be unavailable and reduce the time that the system is unavailable

Establish a complete monitoring and alarm system

Perform recovery test on backup data

Correctly configure the database environment

Archive and clean up unnecessary data

Increase system redundancy to ensure that it can be restored as soon as possible when the system is unavailable

Avoid a single point of failure

Master-slave switchover and failover

the reason

A server has run out of disk space,

Poor performance sql

Table structure and index are not optimized

Master-slave data is inconsistent

Human error

Single point of failure

Single point of failure means that there is only one component that provides the same function in a system. If this component fails, it will affect the normal use of the entire system. Each component that makes up the application system may become a single point.

How to avoid mysql single point of failure

Use sun shared storage or drdb disk replication to solve MySQL single point of failure



Use multi-write cluster or ndb cluster to solve MySQL single point of failure

How to solve the single point problem of the main server

After the main server is switched, how to notify the application of the ip address of the new main server

How to check if the mysql master server is available

How to deal with the kind of replication relationship between the slave server and the new master server

MMM architecture introduction

Multi-Master Replication Manager

What functions does MMM provide

MMM monitors the health of mysql master-slave replication

Failover and automatic configuration of other slave-to-master replication when the main library is down

How to find the log synchronization point of the new master library log point corresponding to the slave library

How to deal with inconsistent data in multiple slave libraries

Provides read and write virtual ip, when the main server has a problem, the virtual ip can be automatically migrated

MMM architecture

Resources required for MMM deployment

MMM advantages and disadvantages


Developed using perl scripting language and fully open source

Provides read and write vip (virtual ip) to make server role changes transparent to front-end applications

MMM provides delay monitoring from the server


The release time is relatively early and does not support the new replication function of MySQL

No read load function

When switching between master and slave, it is easy to cause data loss

MMM monitoring service has a single point of failure

Introduction to MHA architecture

Master High Avalilability

Provided functions

Monitor whether the main database service is available

When the master DB is unavailable, a new master database server is elected from multiple slave servers

Provides master-slave switching and failover functions

MHA master-slave switching process

Attempt to save the binary log from the failed primary database

Election of a new candidate master server from multiple candidate slave servers

Synchronize differential binary data between the alternate master server and other slave servers

Apply the binary log saved from the original db server

Introduction to read-write separation and load balancing

One of the main purposes of mysql master-slave replication configuration: in order to share the read load of the master library

Why should read and write separation

Can only be written on the master

Both master and slave read operations are possible

Two ways to separate read and write

Program to achieve read-write separation


The developer controls what kind of query is executed in the slave library, so it is more flexible

There is a program that directly connects to the database, so the performance loss is relatively small


Increase the workload of development and make the program code more complex

Think control is easy to make mistakes

Middleware realizes read-write separation


The middleware analyzes the query syntax and automatically completes the separation of reading and writing

Transparent to the program, no adjustments to existing programs


The middle layer is added, so there is a loss in query efficiency

Delay-sensitive services cannot be automatically executed in the main library

The difference between read-write separation and read load balancing

The problem of read-write separation is how to execute different SQL statements in different roles of the replication cluster

Read load balancing mainly solves how to share the same load for databases with the same role

How to achieve read load balancing







Reference: mysql high-availability architecture design to handle high concurrency and large traffic! -Cloud + Community-Tencent Cloud