Linux, Redis, NeoKylin

How to install and configure a 3-master, 3-replica Redis cluster on NeoKylin V10 Linux servers, and how to resolve the "Waiting for the cluster to join" problem

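Part 0 Requirements and scenario

We have a 3-master, 3-replica Redis cluster running on a set of three CentOS Linux servers, and these three servers need to have their memory downsized. The plan is to stop the Redis cluster, reduce the memory, reboot the three virtual machines, and then start the three-node Redis cluster again. Because we had never performed this kind of operation before, I needed three machines on which to rehearse it: install and configure a Redis cluster, shut it down, and then restart it.

This document records the steps performed on three NeoKylin V10 Linux servers: installing a 3-master, 3-replica Redis cluster, and shutting down and starting that cluster. It also covers a problem encountered during installation: Waiting for the cluster to join.

To stay as close to the production environment as possible, Redis runs on ports 6001 and 7001. In other words, each machine runs two Redis instances: one on port 6001, intended as a master, and one on port 7001, intended as a replica.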

Part 1 Machine information

IP address   Operating system                          Hostname    Spec (CPU/memory)   Ports
10.0.9.63    Kylin Linux Advanced Server V10 (Lance)   czmaster    8C32G               6001 and 7001
10.0.9.64    Kylin Linux Advanced Server V10 (Lance)   czworker1   8C16G               6001 and 7001
10.0.9.65    Kylin Linux Advanced Server V10 (Lance)   czworker2   16C24G              6001 and 7001

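Part 2 Installation and configuration

Note 📢: Steps 1 through 6 must be run on each of the three machines. Step 7 only needs to be run on any one of them.

1 Download and extract Redis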

cd /data/
wget http://download.redis.io/releases/redis-5.0.5.tar.gz
tar -zxvf redis-5.0.5.tar.gz 

2 Compile and install Redis

cd redis-5.0.5/
make     
make install PREFIX=/data/redis-5.0.5/

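3 Create the log and pidfile directories

# Create the log directory
mkdir -p /data/redis-5.0.5/logs
# Create the directory referenced by the pidfile setting in step 4
mkdir -p /data/redis-5.0.5/pidfile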

4 Edit the configuration files

mkdir -p /data/redis-5.0.5/6001
mkdir -p /data/redis-5.0.5/7001

vi /data/redis-5.0.5/6001/redis.conf
# Contents:
daemonize yes
masterauth ETsb&11p
requirepass ETsb&11p
pidfile /data/redis-5.0.5/pidfile/redis_6001.pid
port 6001
tcp-backlog 511
timeout 0
tcp-keepalive 0
loglevel notice
logfile /data/redis-5.0.5/logs/redis_6001.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /data/redis-5.0.5/6001/
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
cluster-require-full-coverage no
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
protected-mode no

vi /data/redis-5.0.5/7001/redis.conf
# Contents:
daemonize yes
masterauth ETsb&11p
requirepass ETsb&11p
pidfile /data/redis-5.0.5/pidfile/redis_7001.pid
port 7001
tcp-backlog 511
timeout 0
tcp-keepalive 0
loglevel notice
logfile /data/redis-5.0.5/logs/redis_7001.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /data/redis-5.0.5/7001/
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
cluster-require-full-coverage no
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
protected-mode no
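
The two configuration files above differ only in the port number, which appears in the pidfile, port, logfile, and dir lines. As a minimal sketch, the 7001 file could be derived from the 6001 file with sed instead of being typed twice. The helper name derive_conf and its arguments are illustrative and not part of the original procedure.

```shell
# Sketch: derive one instance's redis.conf from another by swapping the port number.
# derive_conf is a hypothetical helper; the default base path is the one used in this article.

derive_conf() {
    local src_port="$1" dst_port="$2" base="${3:-/data/redis-5.0.5}"
    mkdir -p "$base/$dst_port"
    # Replace every occurrence of the source port (pidfile, port, logfile, dir).
    sed "s/$src_port/$dst_port/g" "$base/$src_port/redis.conf" > "$base/$dst_port/redis.conf"
}
```

For example, `derive_conf 6001 7001` followed by `diff 6001/redis.conf 7001/redis.conf` shows exactly which lines changed. The replacement is global, so it is worth reviewing the diff in case some unrelated value happens to contain the same digits.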

5 Add firewall rules to open ports 6001, 7001, 16001, and 17001

firewall-cmd --zone=public --add-port=7001/tcp --permanent
firewall-cmd --zone=public --add-port=6001/tcp --permanent
firewall-cmd --zone=public --add-port=17001/tcp --permanent
firewall-cmd --zone=public --add-port=16001/tcp --permanent
firewall-cmd --reload
firewall-cmd --list-ports

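Note: ports 16001 and 17001 are the cluster bus ports, used for internal communication between the Redis cluster nodes, while 6001 and 7001 serve clients. In Redis 5 the cluster bus port is the client port plus 10000; for example, with the default Redis port 6379, the cluster bus port is 16379.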
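
The port arithmetic in the note can be sketched as a small helper that derives the bus port for each client port and prints the matching firewall-cmd lines. Nothing is executed; the output is meant to be reviewed and run by hand. The function names bus_port and print_firewall_cmds are illustrative.

```shell
# Sketch: derive the cluster bus port (client port + 10000) and print the
# firewall-cmd commands for both ports. Function names are hypothetical.

bus_port() {
    echo $(( $1 + 10000 ))
}

print_firewall_cmds() {
    for port in "$@"; do
        for p in "$port" "$(bus_port "$port")"; do
            echo "firewall-cmd --zone=public --add-port=${p}/tcp --permanent"
        done
    done
    echo "firewall-cmd --reload"
}
```

Calling `print_firewall_cmds 6001 7001` prints four add-port lines (6001, 16001, 7001, 17001) followed by the reload line.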

6 Start the Redis service

cd /data/redis-5.0.5/
bin/redis-server 6001/redis.conf

cd /data/redis-5.0.5/
bin/redis-server 7001/redis.conf

Run the commands above on each of the three machines.

10.0.9.63:

[root@czmaster redis-5.0.5]# cd /data/redis-5.0.5/
[root@czmaster redis-5.0.5]# bin/redis-server 6001/redis.conf
[root@czmaster redis-5.0.5]#
[root@czmaster redis-5.0.5]# cd /data/redis-5.0.5/
[root@czmaster redis-5.0.5]# bin/redis-server 7001/redis.conf
[root@czmaster redis-5.0.5]# ps -ef|grep redis
root     2898809       1  0 11:51 ?        00:00:23 bin/redis-server *:6001 [cluster]
root     2898818       1  0 11:51 ?        00:00:26 bin/redis-server *:7001 [cluster]
root     2971444  660537  0 14:17 pts/0    00:00:00 grep redis
[root@czmaster redis-5.0.5]#

10.0.9.64:

[root@czworker1 redis-5.0.5]# cd /data/redis-5.0.5/
[root@czworker1 redis-5.0.5]# bin/redis-server 6001/redis.conf
[root@czworker1 redis-5.0.5]#
[root@czworker1 redis-5.0.5]# cd /data/redis-5.0.5/
[root@czworker1 redis-5.0.5]# bin/redis-server 7001/redis.conf                            
[root@czworker1 redis-5.0.5]# ps -ef|grep redis
root     3946026       1  0 11:51 ?        00:00:25 bin/redis-server *:6001 [cluster]
root     3946031       1  0 11:51 ?        00:00:26 bin/redis-server *:7001 [cluster]
root     3963948 3380448  0 14:16 pts/0    00:00:00 grep redis
[root@czworker1 redis-5.0.5]#

10.0.9.65:

[root@czworker2 redis-5.0.5]# cd /data/redis-5.0.5/
[root@czworker2 redis-5.0.5]# bin/redis-server 6001/redis.conf
[root@czworker2 redis-5.0.5]#
[root@czworker2 redis-5.0.5]# cd /data/redis-5.0.5/
[root@czworker2 redis-5.0.5]# bin/redis-server 7001/redis.conf
[root@czworker2 redis-5.0.5]# ps -ef|grep redis
root     2438906       1  0 11:51 ?        00:00:24 bin/redis-server *:6001 [cluster]
root     2438913       1  0 11:51 ?        00:00:27 bin/redis-server *:7001 [cluster]
root     2556635 3008695  0 14:17 pts/0    00:00:00 grep redis
[root@czworker2 redis-5.0.5]#
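
Since every machine starts the same two instances, the start commands could be wrapped in a loop. The sketch below assumes the install path used in this article; REDIS_HOME and start_instances are illustrative names, not something Redis provides.

```shell
# Sketch: start every local instance from its own config file.
# REDIS_HOME and start_instances are hypothetical names.

start_instances() {
    local home="${REDIS_HOME:-/data/redis-5.0.5}"
    for port in "$@"; do
        # With "daemonize yes" in redis.conf, redis-server returns right away.
        (cd "$home" && bin/redis-server "$port/redis.conf") || {
            echo "failed to start instance on port $port" >&2
            return 1
        }
    done
}
```

On each machine this would be called as `start_instances 6001 7001`, followed by `ps -ef|grep redis` to confirm both processes are running in cluster mode.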

7 Create the Redis cluster

Run this on 10.0.9.63 only:

[root@czmaster redis-5.0.5]# bin/redis-cli -a "ETsb&11p" --cluster create 10.0.9.63:6001 10.0.9.63:7001 10.0.9.64:6001 10.0.9.64:7001 10.0.9.65:6001 10.0.9.65:7001 --cluster-replicas 1
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.0.9.64:7001 to 10.0.9.63:6001
Adding replica 10.0.9.65:7001 to 10.0.9.64:6001
Adding replica 10.0.9.63:7001 to 10.0.9.65:6001
M: b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001
  slots:[0-5460] (5461 slots) master
S: 1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001
  replicates 70acb193c5eb2d2b4e9354136ea4025810541405
M: 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001
  slots:[5461-10922] (5462 slots) master
S: fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001
  replicates b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d
M: 70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001
  slots:[10923-16383] (5461 slots) master
S: d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001
  replicates 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 10.0.9.63:6001)
M: b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001
  slots:[0-5460] (5461 slots) master
  1 additional replica(s)
S: d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001
  slots: (0 slots) slave
  replicates 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111
S: 1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001
  slots: (0 slots) slave
  replicates 70acb193c5eb2d2b4e9354136ea4025810541405
M: 70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001
  slots:[10923-16383] (5461 slots) master
  1 additional replica(s)
M: 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001
  slots:[5461-10922] (5462 slots) master
  1 additional replica(s)
S: fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001
  slots: (0 slots) slave
  replicates b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@czmaster redis-5.0.5]#  

This completes the creation of a 3-master, 3-replica Redis cluster across the three servers.
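
The create command lists every host:port pair, and with --cluster-replicas 1 the six nodes become three masters with one replica each. In the output above, each replica was placed on a different host from its master. As a rough sketch, the node list could be generated rather than typed by hand. The helper names node_list and print_create_cmd are illustrative, and the password is left as a placeholder.

```shell
# Sketch: build the host:port list for "redis-cli --cluster create".
# node_list and print_create_cmd are hypothetical helpers.

node_list() {
    local hosts="$1" ports="$2" out=""
    for h in $hosts; do
        for p in $ports; do
            out="$out $h:$p"
        done
    done
    # Drop the leading space.
    echo "${out# }"
}

# Print (not run) the create command so it can be reviewed first.
print_create_cmd() {
    echo "bin/redis-cli -a '<password>' --cluster create $(node_list "$1" "$2") --cluster-replicas 1"
}
```

For this article's machines the call would be `print_create_cmd "10.0.9.63 10.0.9.64 10.0.9.65" "6001 7001"`, which yields the same node order as the command in step 7.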

8 View the cluster information

You can view the cluster information from any node, through either port 6001 or port 7001. For example, on 10.0.9.63 run:

# Port 6001: commands
bin/redis-cli -p 6001

auth ETsb&11p

cluster nodes
cluster info
exit

# Port 7001: commands
bin/redis-cli -p 7001

auth ETsb&11p

cluster nodes
cluster info
exit

# Port 6001: output
[root@czmaster redis-5.0.5]# bin/redis-cli -p 6001
127.0.0.1:6001>
127.0.0.1:6001> auth ETsb&11p
OK
127.0.0.1:6001>
127.0.0.1:6001> cluster nodes
d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001@17001 slave 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 0 1705302217859 6 connected
1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001@17001 slave 70acb193c5eb2d2b4e9354136ea4025810541405 0 1705302216855 5 connected
70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001@16001 master - 0 1705302218000 5 connected 10923-16383
393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001@16001 master - 0 1705302218000 3 connected 5461-10922
fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001@17001 slave b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 0 1705302218561 4 connected
b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001@16001 myself,master - 0 1705302217000 1 connected 0-5460
127.0.0.1:6001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:21160
cluster_stats_messages_pong_sent:20906
cluster_stats_messages_sent:42066
cluster_stats_messages_ping_received:20901
cluster_stats_messages_pong_received:21160
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:42066
127.0.0.1:6001> exit
[root@czmaster redis-5.0.5]#

# Port 7001: output
[root@czmaster redis-5.0.5]# bin/redis-cli -p 7001
127.0.0.1:7001>
127.0.0.1:7001> auth ETsb&11p
OK
127.0.0.1:7001>
127.0.0.1:7001> cluster nodes
fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001@17001 slave b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 0 1705302032527 4 connected
393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001@16001 master - 0 1705302031624 3 connected 5461-10922
1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001@17001 myself,slave 70acb193c5eb2d2b4e9354136ea4025810541405 0 1705302031000 2 connected
70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001@16001 master - 0 1705302032127 5 connected 10923-16383
d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001@17001 slave 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 0 1705302032000 6 connected
b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001@16001 master - 0 1705302032628 1 connected 0-5460
127.0.0.1:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:5
cluster_stats_messages_ping_sent:20180
cluster_stats_messages_pong_sent:20366
cluster_stats_messages_meet_sent:3
cluster_stats_messages_sent:40549
cluster_stats_messages_ping_received:20363
cluster_stats_messages_pong_received:20183
cluster_stats_messages_meet_received:3
cluster_stats_messages_received:40549
127.0.0.1:7001> exit
[root@czmaster redis-5.0.5]#
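
The two fields that matter most in the output above are cluster_state in cluster info and the role flags in cluster nodes. The sketch below pulls them out of the command output so that a check can be scripted. Both helpers read from stdin, so they work on redis-cli output or on a saved file; their names are illustrative.

```shell
# Sketch: summarize cluster health from "cluster info" and "cluster nodes" text.
# cluster_state and count_roles are hypothetical helper names.

cluster_state() {
    # The reply may use CRLF line endings; strip the CR before matching.
    tr -d '\r' | awk -F: '$1 == "cluster_state" { print $2 }'
}

count_roles() {
    # Field 3 of each "cluster nodes" line holds the flags, e.g. "myself,master".
    tr -d '\r' | awk '
        $3 ~ /master/ { m++ }
        $3 ~ /slave/  { s++ }
        END { printf "masters=%d replicas=%d\n", m, s }'
}
```

A possible invocation is `bin/redis-cli -p 6001 -a '<password>' cluster info | cluster_state`. For the healthy cluster shown above this would print ok, and count_roles fed with the cluster nodes output would report three masters and three replicas.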

9 Shut down the Redis cluster

To shut down the Redis cluster, stop the Redis instances on ports 6001 and 7001 on every node.

# Shut down 7001
bin/redis-cli -p 7001
auth ETsb&11p
shutdown save
exit

# Shut down 6001
bin/redis-cli -p 6001
auth ETsb&11p
shutdown save
exit

# Verify that the ports are no longer listening
netstat -anp|grep 7001
netstat -anp|grep 6001
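
The interactive shutdown above could also be run non-interactively, which is convenient when it has to be repeated on three machines. This is a sketch under the assumption that the password is supplied in an environment variable; REDIS_HOME, REDIS_PASS, and stop_instances are illustrative names. As the warning in step 7 points out, passing the password with -a on the command line may not be safe.

```shell
# Sketch: shut down every local instance with SHUTDOWN SAVE, without a prompt.
# REDIS_HOME, REDIS_PASS and stop_instances are hypothetical names.

stop_instances() {
    local home="${REDIS_HOME:-/data/redis-5.0.5}"
    for port in "$@"; do
        "$home/bin/redis-cli" -p "$port" -a "$REDIS_PASS" shutdown save || {
            echo "shutdown failed on port $port" >&2
            return 1
        }
    done
}
```

To follow the same order as above, the call would be `REDIS_PASS='<password>' stop_instances 7001 6001`, followed by the netstat checks.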

10 Restart the Redis cluster

Without rebuilding the Redis cluster, first shut it down as described above; then, on each server in turn, start the Redis instances on ports 6001 and 7001.

# Start
cd /data/redis-5.0.5/
bin/redis-server 6001/redis.conf

cd /data/redis-5.0.5/
bin/redis-server 7001/redis.conf

netstat -anp|grep 6001
netstat -anp|grep 7001

On startup, each instance automatically reads its own nodes.conf file, and the Redis cluster comes back up.
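
After a restart it can take a moment for the nodes to find each other again, so a scripted restart may want to wait until the cluster reports a healthy state before continuing. The following is a minimal polling sketch, assuming the password is supplied in an environment variable; wait_cluster_ok, REDIS_HOME, and REDIS_PASS are illustrative names.

```shell
# Sketch: poll one node until "cluster info" reports cluster_state:ok.
# wait_cluster_ok, REDIS_HOME and REDIS_PASS are hypothetical names.

wait_cluster_ok() {
    local port="$1" tries="${2:-30}" home="${REDIS_HOME:-/data/redis-5.0.5}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # stderr is discarded to hide the -a password warning.
        if "$home/bin/redis-cli" -p "$port" -a "$REDIS_PASS" cluster info 2>/dev/null \
            | tr -d '\r' | grep -q '^cluster_state:ok$'; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "cluster did not reach state ok after $tries attempts" >&2
    return 1
}
```

A call such as `REDIS_PASS='<password>' wait_cluster_ok 6001 60` polls once per second for up to a minute and returns non-zero if the cluster never reports ok.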

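Part 3 The "Waiting for the cluster to join" problem and how to resolve it

While creating the Redis cluster, I ran into Waiting for the cluster to join: the command hung at that message and the cluster was never created. After some analysis, the cause turned out to be the firewall. Ports 16001 and 17001 had not been opened, so cluster creation failed.

The fix is to first stop the Redis instances on ports 6001 and 7001 on all three nodes, and then delete /data/redis-5.0.5/6001/nodes.conf and /data/redis-5.0.5/7001/nodes.conf on each node. If nodes.conf is not deleted, Redis reads the previous nodes.conf again on the next start by default. If the cluster was in a bad state before the restart, Redis loads that same bad cluster configuration, and the cluster remains in a bad state afterwards.

Resolution: open ports 16001 and 17001 in the firewall (step 5), stop the Redis instances on ports 6001 and 7001 on all three nodes, delete the nodes.conf files, start the instances again, and then re-run the cluster creation command from step 7.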
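
The two manual parts of this fix, checking that the bus ports are reachable and clearing the stale cluster state, can be sketched as helpers. The port check relies on bash's /dev/tcp pseudo-device and the coreutils timeout command, and it only gives a meaningful answer while the Redis instances on the target host are running. All helper names are illustrative.

```shell
# Sketch: pre-check the cluster ports and clear stale cluster state.
# Helper names are hypothetical. Run clear_nodes_conf only after the
# instances on this host have been stopped.

# Succeeds if a TCP connection to host:port opens within 2 seconds.
# bash is invoked explicitly because /dev/tcp is a bash feature.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report every client and bus port on the given hosts that cannot be reached.
check_cluster_ports() {
    local failed=0
    for host in "$@"; do
        for port in 6001 7001 16001 17001; do
            if ! port_open "$host" "$port"; then
                echo "unreachable: $host:$port"
                failed=1
            fi
        done
    done
    return "$failed"
}

# Remove nodes.conf for both local instances; other files are left alone.
clear_nodes_conf() {
    local base="${1:-/data/redis-5.0.5}"
    rm -f "$base/6001/nodes.conf" "$base/7001/nodes.conf"
}
```

Running `check_cluster_ports 10.0.9.64 10.0.9.65` from 10.0.9.63 before step 7 would have listed 16001 and 17001 as unreachable in the situation described above, pointing at the firewall before the create command had a chance to hang.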

Part 4 References and links

How to restart a Redis cluster

https://blog.csdn.net/justry_deng/article/details/89205155

How to resolve the Waiting for the cluster to join problem

https://linux.m2osw.com/redis-infamous-waiting-cluster-join-message
