MySQL Group Replication Configuración

MySQL Group Replication is a MySQL native solution providing high availability and automatic failover without external dependencies. It uses a group communication system to coordinate transactions across multiple server instances, ensuring consistency and enabling applications to detect primary failures and reconnect automatically. Esta guía completa cubre installation, configuration, operation, and best practices for deploying MySQL Group Replication in single-primary and multi-primary modes.

Descripción General and Topology Options

MySQL Group Replication proporciona two primary topologies: single-primary mode, where one server accepts write traffic while others serve as read réplicas, and multi-primary mode, where all servers accept writes with automatic conflict resolution. Single-primary mode offers simpler application integration but accepts writes only on the primary. Multi-primary mode eliminates single points of failure for write operations but requires application awareness of distributed writes.

Both modes provide automatic failover, read-your-writes consistency guarantees, and transaction ordering. The group uses a distributed algorithm to reach consensus on the order and consistency of transactions, ensuring all servers execute transactions identically.

Instalación and Requisitos

Instala MySQL 8.0 or later with Group Replication support. Ensure all nodos run the same MySQL version for compatibility:

# Ubuntu 20.04/22.04
curl -fsSL https://dev.mysql.com/get/mysql-apt-config_0.8.19-1_all.deb -o /tmp/mysql-apt-config.deb
sudo dpkg -i /tmp/mysql-apt-config.deb
sudo apt-get update
sudo apt-get install -y mysql-server mysql-client

# CentOS/RHEL 8/9
sudo dnf module enable mysql:8.0
sudo dnf install -y mysql-server mysql-client mysql-community-server

# Verifica installation
mysql --version

Crea a MySQL system user for replication:

sudo useradd -r -s /bin/false mysql 2>/dev/null || true

Initialize MySQL data directory:

# Ubuntu/Debian
sudo mysql_install_db --user=mysql --datadir=/var/lib/mysql

# RHEL/CentOS
sudo mysqld --initialize --user=mysql --datadir=/var/lib/mysql

Habilita and start MySQL servicio:

sudo systemctl enable mysql
sudo systemctl start mysql

# Initial login (empty password or verifica /root/.mysql_secret)
sudo mysql -u root

Configura binary logging, required for Group Replication:

sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

Add these essential settings:

[mysqld]
# Habilita binary logging
log_bin=mysql-bin
binlog_format=ROW
binlog_checksum=CRC32

# Group Replication settings
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE"
server_id=1
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON

# Red settings
bind_address=192.168.1.10
report_host=192.168.1.10
report_port=3306

# Performance settings
transaction_write_set_extraction=XXHASH64
transaction_isolation=READ_COMMITTED
slave_parallel_workers=4
slave_parallel_type=LOGICAL_CLOCK

Reinicia MySQL to apply settings:

sudo systemctl restart mysql
sudo systemctl status mysql

Verifica settings are applied:

mysql -u root -e "SHOW VARIABLES LIKE 'log_bin';"
mysql -u root -e "SHOW VARIABLES LIKE 'gtid_mode';"
mysql -u root -e "SHOW VARIABLES LIKE 'binlog_format';"

Single-Primary Mode Configuración

Single-primary mode routes all writes through one primary server, with other servers handling reads. This simplifies application integration. Bootstrap the first nodo by enabling Group Replication:

mysql -u root << 'EOF'
-- Instala Group Replication plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

-- Crea replication user
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

-- Configura replication user
CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password' 
  FOR CHANNEL 'group_replication_recovery';

-- Check plugin installation
SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS 
  WHERE PLUGIN_NAME='group_replication';
EOF

Configura Group Replication settings:

mysql -u root << 'EOF'
SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_start_on_boot=OFF;
SET GLOBAL group_replication_consistency='EVENTUAL';
SET GLOBAL group_replication_recovery_use_ssl=ON;
SET GLOBAL group_replication_recovery_complete_at='TRANSACTIONS_CERTIFIED';

-- For single-primary mode
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_enforce_update_everywhere_checks=OFF;
EOF

Add to MySQL configuration file for persistence:

sudo nano /etc/mysql/mysql.conf.d/group-replication.cnf

Add:

[mysqld]
# Group Replication settings
plugin-load-add='group_replication.so'
group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
group_replication_start_on_boot=OFF
group_replication_consistency=EVENTUAL
group_replication_single_primary_mode=ON
group_replication_enforce_update_everywhere_checks=OFF
group_replication_exit_state_action=READ_ONLY

Bootstrap the group with the first nodo:

mysql -u root << 'EOF'
-- Bootstrap the group - ONLY on initial startup
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- Verifica primary status
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
EOF

Prepare the second nodo. First, copy binary logs from primary to asegúrate de que consistency:

# On secondary nodo, configure with unique server_id
sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

# Change server_id=2
# Change report_host=192.168.1.11

Perform initial synchronization:

# On secondary nodo, configure and start
mysql -u root << 'EOF'
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

-- Configura with same group UUID as primary
SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_start_on_boot=OFF;

-- Join the group (primary will transfer state)
START GROUP_REPLICATION;

-- Verifica member status
SELECT * FROM performance_schema.replication_group_members;
EOF

Add subsequent nodos to the group:

# On third nodo (192.168.1.12)
mysql -u root << 'EOF'
-- Same configuration as second nodo
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_single_primary_mode=ON;
SET GLOBAL group_replication_start_on_boot=OFF;

START GROUP_REPLICATION;

-- Verifica all three members joined
SELECT * FROM performance_schema.replication_group_members;
EOF

Multi-Primary Mode Configuración

Multi-primary mode allows writes on all nodos, enabling write redundancy and simplified disaster recovery. Configura for multi-primary setup:

mysql -u root << 'EOF'
-- Detén the group if running (all nodos)
STOP GROUP_REPLICATION;

-- Configura multi-primary mode
SET GLOBAL group_replication_single_primary_mode=OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks=ON;
SET GLOBAL group_replication_consistency='AFTER_PRIMARY_ORDER';

-- Reinicia the group
-- ONLY on primary during bootstrap:
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- On secondary nodos:
START GROUP_REPLICATION;

-- Verifica multi-primary status
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
EOF

Habilita certification-based conflict detection:

mysql -u root << 'EOF'
-- Habilita write set-based conflict detection
SET GLOBAL transaction_write_set_extraction='XXHASH64';

-- Configura applier settings for parallel execution
SET GLOBAL slave_parallel_workers=4;
SET GLOBAL slave_parallel_type='LOGICAL_CLOCK';

-- Verifica settings
SHOW VARIABLES LIKE 'transaction_write_set_extraction';
SHOW VARIABLES LIKE 'slave_parallel%';
EOF

Member Gestión

Add and remove members from the group dynamically without full restart. Add a new nodo to an active group:

# New nodo configuration (configure MySQL as before with unique server_id)
mysql -u root << 'EOF'
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT GROUP_REPLICATION_ADMIN ON *.* TO 'repl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='repl_password'
  FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
SET GLOBAL group_replication_recovery_complete_at='TRANSACTIONS_CERTIFIED';

-- The new nodo will perform recovery by fetching state from group
START GROUP_REPLICATION;

-- Monitorea recovery progress
SELECT * FROM performance_schema.replication_group_members;
EOF

Remueve a nodo from the group:

# On the nodo to remove
mysql -u root << 'EOF'
STOP GROUP_REPLICATION;

-- On any other group member to expel the nodo
SELECT * FROM performance_schema.replication_group_members;
EOF

Monitorea member join and removal:

# Monitorea recovery process
mysql -u root << 'EOF'
SELECT * FROM performance_schema.replication_group_member_stats
  WHERE CHANNEL_NAME='group_replication_applier'\G

-- Check distributed recovery status
SELECT CHANNEL_NAME, COUNT_RECEIVED FROM performance_schema.replication_connection_status
  WHERE CHANNEL_NAME='group_replication_recovery';
EOF

Conflict Detection and Resolution

In multi-primary mode, MySQL automatically detects and handles conflicting transactions:

mysql -u root << 'EOF'
-- View certification-based conflict information
SELECT * FROM performance_schema.replication_group_member_stats
  WHERE MEMBER_ID = @@GLOBAL.server_uuid\G

-- Monitorea certification events
SHOW STATUS LIKE 'Replica_parallel_workers';
SHOW STATUS LIKE 'Replica_exec_row_stmt_events_with_error';
EOF

Crea test tables to simulate multi-primary writes:

mysql -u root << 'EOF'
CREATE DATABASE IF NOT EXISTS testdb;
USE testdb;

CREATE TABLE test_table (
  id INT PRIMARY KEY AUTO_INCREMENT,
  data VARCHAR(255) NOT NULL,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;

-- Insert test data from multiple nodos
INSERT INTO test_table (data) VALUES ('test from node1');
INSERT INTO test_table (data) VALUES ('test from node2');

-- Verifica replication consistency
SELECT * FROM test_table;
EOF

Monitorea for conflict-related errors:

# Check for replication errors
mysql -u root << 'EOF'
SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_STATE, MEMBER_ROLE, 
       COUNT_TRANSACTIONS_IN_QUEUE, COUNT_TRANSACTIONS_CHECKED,
       COUNT_TRANSACTIONS_ROWS_VALIDATING
  FROM performance_schema.replication_group_member_stats\G
EOF

Monitoreo and Health Checks

Implement comprehensive monitoring of the Group Replication cluster:

#!/bin/bash
# /usr/local/bin/verifica-group-replication.sh

mysql -u root << 'EOF'
-- Cluster member status
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE,
       MEMBER_VERSION FROM performance_schema.replication_group_members
  ORDER BY MEMBER_STATE;

-- Group replication stats
SELECT MEMBER_ID, COUNT_TRANSACTIONS_IN_QUEUE, 
       COUNT_TRANSACTIONS_CHECKED, COUNT_TRANSACTIONS_CONFLICTS_DETECTED,
       COUNT_TRANSACTIONS_ROLLED_BACK, COUNT_TRANSACTIONS_VALIDATING
  FROM performance_schema.replication_group_member_stats;

-- Replication lag
SELECT CHANNEL_NAME, THREAD_ID, SERVICE_STATE, LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE FROM performance_schema.replication_applier_status;
EOF

Set up alerts for critical conditions:

#!/bin/bash
# /usr/local/bin/monitor-gr-health.sh

CRITICAL_THRESHOLD=10

# Check if any member is offline
OFFLINE_COUNT=$(mysql -u root -sN -e \
  "SELECT COUNT(*) FROM performance_schema.replication_group_members \
   WHERE MEMBER_STATE != 'ONLINE'")

if [ "$OFFLINE_COUNT" -gt 0 ]; then
  echo "ALERT: $OFFLINE_COUNT group members offline"
  exit 1
fi

# Check queue depth
QUEUE_DEPTH=$(mysql -u root -sN -e \
  "SELECT MAX(COUNT_TRANSACTIONS_IN_QUEUE) \
   FROM performance_schema.replication_group_member_stats")

if [ "$QUEUE_DEPTH" -gt "$CRITICAL_THRESHOLD" ]; then
  echo "WARNING: Transaction queue depth at $QUEUE_DEPTH"
fi

echo "Group Replication status: OK"
exit 0

Failover and Recovery

Group Replication handles automatic failover in single-primary mode when the primary fails:

# Simulate primary failure on node1
sudo systemctl stop mysql

# Monitorea automatic failover on other nodos
mysql -u root -h 192.168.1.11 << 'EOF'
-- Watch member status change
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;
  
-- New primary should be automatically elected
EOF

Perform manual failover to a specific secondary:

# Verifica current state
mysql -u root << 'EOF'
SELECT MEMBER_HOST, MEMBER_ROLE FROM performance_schema.replication_group_members;
EOF

# Initiate manual failover (primary action)
mysql -u root << 'EOF'
-- Primary initiates failover
GROUP_REPLICATION_SET_AS_PRIMARY('aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee');
EOF

Recover a failed or restarted nodo:

# Reinicia the failed primary
sudo systemctl start mysql

# Monitorea recovery
mysql -u root << 'EOF'
-- Watch member rejoin group
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members;

-- Verifica recovery completion
SHOW MASTER STATUS\G
SHOW REPLICA STATUS FOR CHANNEL 'group_replication_recovery'\G
EOF

Performance Tuning

Optimiza Group Replication for your workload:

# Configura applier threads for parallel execution
mysql -u root << 'EOF'
SET GLOBAL slave_parallel_workers=8;
SET GLOBAL slave_parallel_type='LOGICAL_CLOCK';
SET GLOBAL slave_preserve_commit_order=ON;

-- Verifica settings
SHOW VARIABLES LIKE 'slave_parallel%';
EOF

Adjust flow control settings to prevent replication lag:

mysql -u root << 'EOF'
-- Relaxed flow control for better throughput
SET GLOBAL group_replication_flow_control_mode='QUOTA';
SET GLOBAL group_replication_flow_control_certifier_threshold=25000;
SET GLOBAL group_replication_flow_control_release_percent=10;

-- Or strict consistency
SET GLOBAL group_replication_consistency='STRONG';
EOF

Monitorea and optimize based on actual workload:

# Crea monitoring script
mysql -u root << 'EOF'
-- Track performance metrics
SHOW STATUS LIKE 'Threads_created';
SHOW STATUS LIKE 'Aborted%';
SHOW STATUS LIKE 'Bytes_received';
SHOW STATUS LIKE 'Questions';
EOF

Operational Procedures

Perform rolling restart for updates:

#!/bin/bash
# Reinicia group members sequentially

for nodo in "192.168.1.11" "192.168.1.12"; do
  echo "Restarting $nodo"
  ssh mysql@$nodo "sudo systemctl stop mysql"
  sleep 5
  
  # Apply updates
  ssh mysql@$nodo "sudo apt-get update && sudo apt-get upgrade -y"
  
  ssh mysql@$nodo "sudo systemctl start mysql"
  sleep 10
  
  # Verifica nodo rejoined
  mysql -u root -h $nodo -e \
    "SELECT MEMBER_STATE FROM performance_schema.replication_group_members \
     WHERE MEMBER_HOST='$nodo';"
done

# Finally restart primary
ssh [email protected] "sudo systemctl restart mysql"

Respalda the group consistently:

# Respalda on a secondary nodo without disruption
mysql -u root -h 192.168.1.11 << 'EOF'
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
EOF

# In another session, copy data
mysqldump -h 192.168.1.11 -u root --all-databases --master-data=2 > backup.sql

# Unlock
mysql -u root -h 192.168.1.11 -e "UNLOCK TABLES;"

# Verifica backup
grep -i "CHANGE MASTER" backup.sql

Conclusión

MySQL Group Replication proporciona a native, self-healing high availability solution without external dependencies. Choose single-primary mode for traditional replication patterns with automatic failover, or multi-primary mode for write scalability and true redundancy. Proper configuration of binary logging, GTID mode, and Group Replication settings ensures consistent data across all members. Regular monitoring, testing of failover scenarios, and documented operational procedures maintain cluster health and reliability. By understanding both the benefits and limitations of Group Replication, you can design database infrastructure that meets your application's availability and consistency requirements.

Configuración de replicación de grupos MySQL

On this page