MySQL Group Replication Configuration

MySQL Group Replication is a MySQL native solution providing high availability and automatic failover without external dependencies. It uses a group communication system to coordinate transactions across multiple server instances, ensuring consistency and enabling applications to detect primary failures and reconnect automatically. This comprehensive guide covers installation, configuration, operation, and best practices for deploying MySQL Group Replication in single-primary and multi-primary modes.

Overview and Topology Options

MySQL Group Replication provides two primary topologies: single-primary mode, where one server accepts write traffic while others serve as read replicas, and multi-primary mode, where all servers accept writes with automatic conflict resolution. Single-primary mode offers simpler application integration but accepts writes only on the primary. Multi-primary mode eliminates single points of failure for write operations but requires application awareness of distributed writes.

Both modes provide automatic failover, read-your-writes consistency guarantees, and transaction ordering. The group uses a distributed algorithm to reach consensus on the order and consistency of transactions, ensuring all servers execute transactions identically.

Installation and Prerequisites

Install MySQL 8.0 or later with Group Replication support. Ensure all nodes run the same MySQL version for compatibility:

# Ubuntu 20.04/22.04
curl -fsSL https://dev.mysql.com/get/mysql-apt-config_0.8.19-1_all.deb -o /tmp/mysql-apt-config.deb
sudo dpkg -i /tmp/mysql-apt-config.deb
sudo apt-get update
sudo apt-get install -y mysql-server mysql-client

# CentOS/RHEL 8/9
sudo dnf module enable mysql:8.0
sudo dnf install -y mysql-server mysql-client mysql-community-server

# Verify installation
mysql --version

Create a MySQL system user for replication:

sudo useradd -r -s /bin/false mysql 2>/dev/null || true

Initialize MySQL data directory:

# Ubuntu/Debian
sudo mysql_install_db --user=mysql --datadir=/var/lib/mysql

# RHEL/CentOS
sudo mysqld --initialize --user=mysql --datadir=/var/lib/mysql

Enable and start MySQL service:

sudo systemctl enable mysql
sudo systemctl start mysql

# Initial login (empty password or check /root/.mysql_secret)
sudo mysql -u root

Configure binary logging, required for Group Replication:

sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

Add these essential settings:

[mysqld]
# Enable binary logging
log_bin=mysql-bin
binlog_format=ROW
binlog_checksum=CRC32

# Group Replication settings
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE"
server_id=1
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON

# Network settings
bind_address=192.168.1.10
report_host=192.168.1.10
report_port=3306

# Performance settings
transaction_write_set_extraction=XXHASH64
transaction_isolation=READ_COMMITTED
slave_parallel_workers=4
slave_parallel_type=LOGICAL_CLOCK

Restart MySQL to apply settings:

sudo systemctl restart mysql
sudo systemctl status mysql

Verify settings are applied:

mysql -u root -e "SHOW VARIABLES LIKE 'log_bin';"
mysql -u root -e "SHOW VARIABLES LIKE 'gtid_mode';"
mysql -u root -e "SHOW VARIABLES LIKE 'binlog_format';"

Single-Primary Mode Setup

Single-primary mode routes all writes through one primary server, with other servers handling reads. This simplifies application integration. Bootstrap the first node by enabling Group Replication:

mysql -u root << 'EOF'
-- Install Group Replication plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

-- Create replication user
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

-- Configure replication user
CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password' 
  FOR CHANNEL 'group_replication_recovery';

-- Check plugin installation
SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS 
  WHERE PLUGIN_NAME='group_replication';
EOF

Configure Group Replication settings:

mysql -u root << 'EOF'
SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_start_on_boot=OFF;
SET GLOBAL group_replication_consistency='EVENTUAL';
SET GLOBAL group_replication_recovery_use_ssl=ON;
SET GLOBAL group_replication_recovery_complete_at='TRANSACTIONS_CERTIFIED';

-- For single-primary mode
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_enforce_update_everywhere_checks=OFF;
EOF

Add to MySQL configuration file for persistence:

sudo nano /etc/mysql/mysql.conf.d/group-replication.cnf

Add:

[mysqld]
# Group Replication settings
plugin-load-add='group_replication.so'
group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
group_replication_start_on_boot=OFF
group_replication_consistency=EVENTUAL
group_replication_single_primary_mode=ON
group_replication_enforce_update_everywhere_checks=OFF
group_replication_exit_state_action=READ_ONLY

Bootstrap the group with the first node:

mysql -u root << 'EOF'
-- Bootstrap the group - ONLY on initial startup
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- Verify primary status
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
EOF

Prepare the second node. First, copy binary logs from primary to ensure consistency:

# On secondary node, configure with unique server_id
sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

# Change server_id=2
# Change report_host=192.168.1.11

Perform initial synchronization:

# On secondary node, configure and start
mysql -u root << 'EOF'
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

-- Configure with same group UUID as primary
SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_start_on_boot=OFF;

-- Join the group (primary will transfer state)
START GROUP_REPLICATION;

-- Verify member status
SELECT * FROM performance_schema.replication_group_members;
EOF

Add subsequent nodes to the group:

# On third node (192.168.1.12)
mysql -u root << 'EOF'
-- Same configuration as second node
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_start_on_boot=OFF;

START GROUP_REPLICATION;

-- Verify all three members joined
SELECT * FROM performance_schema.replication_group_members;
EOF

Multi-Primary Mode Configuration

Multi-primary mode allows writes on all nodes, enabling write redundancy and simplified disaster recovery. Configure for multi-primary setup:

mysql -u root << 'EOF'
-- Stop the group if running (all nodes)
STOP GROUP_REPLICATION;

-- Configure multi-primary mode
SET GLOBAL group_replication_single_primary_mode=OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks=ON;
SET GLOBAL group_replication_consistency='AFTER_PRIMARY_ORDER';

-- Restart the group
-- ONLY on primary during bootstrap:
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- On secondary nodes:
START GROUP_REPLICATION;

-- Verify multi-primary status
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
EOF

Enable certification-based conflict detection:

mysql -u root << 'EOF'
-- Enable write set-based conflict detection
SET GLOBAL transaction_write_set_extraction='XXHASH64';

-- Configure applier settings for parallel execution
SET GLOBAL slave_parallel_workers=4;
SET GLOBAL slave_parallel_type='LOGICAL_CLOCK';

-- Verify settings
SHOW VARIABLES LIKE 'transaction_write_set_extraction';
SHOW VARIABLES LIKE 'slave_parallel%';
EOF

Member Management

Add and remove members from the group dynamically without full restart. Add a new node to an active group:

# New node configuration (configure MySQL as before with unique server_id)
mysql -u root << 'EOF'
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_recovery_complete_at='TRANSACTIONS_CERTIFIED';

-- The new node will perform recovery by fetching state from group
START GROUP_REPLICATION;

-- Monitor recovery progress
SELECT * FROM performance_schema.replication_group_members;
EOF

Remove a node from the group:

# On the node to remove
mysql -u root << 'EOF'
STOP GROUP_REPLICATION;

-- On any other group member to expel the node
SELECT * FROM performance_schema.replication_group_members;
EOF

Monitor member join and removal:

# Monitor recovery process
mysql -u root << 'EOF'
SELECT * FROM performance_schema.replication_group_member_stats
  WHERE CHANNEL_NAME='group_replication_applier'\G

-- Check distributed recovery status
SELECT CHANNEL_NAME, COUNT_RECEIVED FROM performance_schema.replication_connection_status
  WHERE CHANNEL_NAME='group_replication_recovery';
EOF

Conflict Detection and Resolution

In multi-primary mode, MySQL automatically detects and handles conflicting transactions:

mysql -u root << 'EOF'
-- View certification-based conflict information
SELECT * FROM performance_schema.replication_group_member_stats
  WHERE MEMBER_ID = @@GLOBAL.server_uuid\G

-- Monitor certification events
SHOW STATUS LIKE 'Replica_parallel_workers';
SHOW STATUS LIKE 'Replica_exec_row_stmt_events_with_error';
EOF

Create test tables to simulate multi-primary writes:

mysql -u root << 'EOF'
CREATE DATABASE IF NOT EXISTS testdb;
USE testdb;

CREATE TABLE test_table (
  id INT PRIMARY KEY AUTO_INCREMENT,
  data VARCHAR(255) NOT NULL,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;

-- Insert test data from multiple nodes
INSERT INTO test_table (data) VALUES ('test from node1');
INSERT INTO test_table (data) VALUES ('test from node2');

-- Verify replication consistency
SELECT * FROM test_table;
EOF

Monitor for conflict-related errors:

# Check for replication errors
mysql -u root << 'EOF'
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_STATE, MEMBER_ROLE, 
       COUNT_TRANSACTIONS_IN_QUEUE, COUNT_TRANSACTIONS_CHECKED,
       COUNT_TRANSACTIONS_ROWS_VALIDATING
  FROM performance_schema.replication_group_member_stats\G
EOF

Monitoring and Health Checks

Implement comprehensive monitoring of the Group Replication cluster:

#!/bin/bash
# /usr/local/bin/check-group-replication.sh

mysql -u root << 'EOF'
-- Cluster member status
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE,
       MEMBER_VERSION FROM performance_schema.replication_group_members
  ORDER BY MEMBER_STATE;

-- Group replication stats
SELECT MEMBER_ID, COUNT_TRANSACTIONS_IN_QUEUE, 
       COUNT_TRANSACTIONS_CHECKED, COUNT_TRANSACTIONS_CONFLICTS_DETECTED,
       COUNT_TRANSACTIONS_ROLLED_BACK, COUNT_TRANSACTIONS_VALIDATING
  FROM performance_schema.replication_group_member_stats;

-- Replication lag
SELECT CHANNEL_NAME, THREAD_ID, SERVICE_STATE, LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE FROM performance_schema.replication_applier_status;
EOF

Set up alerts for critical conditions:

#!/bin/bash
# /usr/local/bin/monitor-gr-health.sh

CRITICAL_THRESHOLD=10

# Check if any member is offline
OFFLINE_COUNT=$(mysql -u root -sN -e \
  "SELECT COUNT(*) FROM performance_schema.replication_group_members \
   WHERE MEMBER_STATE != 'ONLINE'")

if [ "$OFFLINE_COUNT" -gt 0 ]; then
  echo "ALERT: $OFFLINE_COUNT group members offline"
  exit 1
fi

# Check queue depth
QUEUE_DEPTH=$(mysql -u root -sN -e \
  "SELECT MAX(COUNT_TRANSACTIONS_IN_QUEUE) \
   FROM performance_schema.replication_group_member_stats")

if [ "$QUEUE_DEPTH" -gt "$CRITICAL_THRESHOLD" ]; then
  echo "WARNING: Transaction queue depth at $QUEUE_DEPTH"
fi

echo "Group Replication status: OK"
exit 0

Failover and Recovery

Group Replication handles automatic failover in single-primary mode when the primary fails:

# Simulate primary failure on node1
sudo systemctl stop mysql

# Monitor automatic failover on other nodes
mysql -u root -h 192.168.1.11 << 'EOF'
-- Watch member status change
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
  
-- New primary should be automatically elected
EOF

Perform manual failover to a specific secondary:

# Verify current state
mysql -u root << 'EOF'
SELECT MEMBER_HOST, MEMBER_ROLE FROM performance_schema.replication_group_members;
EOF

# Initiate manual failover (primary action)
mysql -u root << 'EOF'
-- Primary initiates failover
GROUP_REPLICATION_SET_AS_PRIMARY('aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee');
EOF

Recover a failed or restarted node:

# Restart the failed primary
sudo systemctl start mysql

# Monitor recovery
mysql -u root << 'EOF'
-- Watch member rejoin group
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;

-- Verify recovery completion
SHOW MASTER STATUS\G
SHOW REPLICA STATUS FOR CHANNEL 'group_replication_recovery'\G
EOF

Performance Tuning

Optimize Group Replication for your workload:

# Configure applier threads for parallel execution
mysql -u root << 'EOF'
SET GLOBAL slave_parallel_workers=8;
SET GLOBAL slave_parallel_type='LOGICAL_CLOCK';
SET GLOBAL slave_preserve_commit_order=ON;

-- Verify settings
SHOW VARIABLES LIKE 'slave_parallel%';
EOF

Adjust flow control settings to prevent replication lag:

mysql -u root << 'EOF'
-- Relaxed flow control for better throughput
SET GLOBAL group_replication_flow_control_mode='QUOTA';
SET GLOBAL group_replication_flow_control_certifier_threshold=25000;
SET GLOBAL group_replication_flow_control_release_percent=10;

-- Or strict consistency
SET GLOBAL group_replication_consistency='STRONG';
EOF

Monitor and optimize based on actual workload:

# Create monitoring script
mysql -u root << 'EOF'
-- Track performance metrics
SHOW STATUS LIKE 'Threads_created';
SHOW STATUS LIKE 'Aborted%';
SHOW STATUS LIKE 'Bytes_received';
SHOW STATUS LIKE 'Questions';
EOF

Operational Procedures

Perform rolling restart for updates:

#!/bin/bash
# Restart group members sequentially

for node in "192.168.1.11" "192.168.1.12"; do
  echo "Restarting $node"
  ssh mysql@$node "sudo systemctl stop mysql"
  sleep 5
  
  # Apply updates
  ssh mysql@$node "sudo apt-get update && sudo apt-get upgrade -y"
  
  ssh mysql@$node "sudo systemctl start mysql"
  sleep 10
  
  # Verify node rejoined
  mysql -u root -h $node -e \
    "SELECT MEMBER_STATE FROM performance_schema.replication_group_members \
     WHERE MEMBER_HOST='$node';"
done

# Finally restart primary
ssh [email protected] "sudo systemctl restart mysql"

Backup the group consistently:

# Backup on a secondary node without disruption
mysql -u root -h 192.168.1.11 << 'EOF'
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
EOF

# In another session, copy data
mysqldump -h 192.168.1.11 -u root --all-databases --master-data=2 > backup.sql

# Unlock
mysql -u root -h 192.168.1.11 -e "UNLOCK TABLES;"

# Verify backup
grep -i "CHANGE MASTER" backup.sql

Conclusion

MySQL Group Replication provides a native, self-healing high availability solution without external dependencies. Choose single-primary mode for traditional replication patterns with automatic failover, or multi-primary mode for write scalability and true redundancy. Proper configuration of binary logging, GTID mode, and Group Replication settings ensures consistent data across all members. Regular monitoring, testing of failover scenarios, and documented operational procedures maintain cluster health and reliability. By understanding both the benefits and limitations of Group Replication, you can design database infrastructure that meets your application's availability and consistency requirements.

MySQL Group Replication configuration

En esta página