PostgreSQL pg_auto_failover High Availability 1: Setting Up a pg_auto_failover Cluster
PostgreSQL pg_auto_failover High Availability 2: Operating a pg_auto_failover Cluster
pg_autoctl
+ create Create a pg_auto_failover node, or formation
+ drop Drop a pg_auto_failover node, or formation
+ config Manages the pg_autoctl configuration
+ show Show pg_auto_failover information
+ enable Enable a feature on a formation
+ disable Disable a feature on a formation
+ get Get a pg_auto_failover node, or formation setting
+ set Set a pg_auto_failover node, or formation setting
+ perform Perform an action orchestrated by the monitor
activate Activate a Citus worker from the Citus coordinator
run Run the pg_autoctl service (monitor or keeper)
stop signal the pg_autoctl service for it to stop
reload signal the pg_autoctl for it to reload its configuration
status Display the current status of the pg_autoctl service
help print help message
version print pg_autoctl version
pg_autoctl create
monitor Initialize a pg_auto_failover monitor node
postgres Initialize a pg_auto_failover standalone postgres node
coordinator Initialize a pg_auto_failover citus coordinator node
worker Initialize a pg_auto_failover citus worker node
formation Create a new formation on the pg_auto_failover monitor
pg_autoctl drop
monitor Drop the pg_auto_failover monitor
node Drop a node from the pg_auto_failover monitor
formation Drop a formation on the pg_auto_failover monitor
pg_autoctl config
check Check pg_autoctl configuration
get Get the value of a given pg_autoctl configuration variable
set Set the value of a given pg_autoctl configuration variable
pg_autoctl show
uri Show the postgres uri to use to connect to pg_auto_failover nodes
events Prints monitor's state of nodes in a given formation and group
state Prints monitor's state of nodes in a given formation and group
settings Print replication settings for a formation from the monitor
standby-names Prints synchronous_standby_names for a given group
file List pg_autoctl internal files (config, state, pid)
systemd Print systemd service file for this node
pg_autoctl enable
secondary Enable secondary nodes on a formation
maintenance Enable Postgres maintenance mode on this node
ssl Enable SSL configuration on this node
monitor Enable a monitor for this node to be orchestrated from
pg_autoctl disable
secondary Disable secondary nodes on a formation
maintenance Disable Postgres maintenance mode on this node
ssl Disable SSL configuration on this node
monitor Disable the monitor for this node
pg_autoctl get
+ node get a node property from the pg_auto_failover monitor
+ formation get a formation property from the pg_auto_failover monitor
pg_autoctl get node
replication-quorum get replication-quorum property from the monitor
candidate-priority get candidate property from the monitor
pg_autoctl get formation
settings get replication settings for a formation from the monitor
number-sync-standbys get number_sync_standbys for a formation from the monitor
pg_autoctl set
+ node set a node property on the monitor
+ formation set a formation property on the monitor
pg_autoctl set node
metadata set metadata on the monitor
replication-quorum set replication-quorum property on the monitor
candidate-priority set candidate property on the monitor
pg_autoctl set formation
number-sync-standbys set number-sync-standbys for a formation on the monitor
pg_autoctl perform
failover Perform a failover for given formation and group
switchover Perform a switchover for given formation and group
promotion Perform a failover that promotes a target node
1. pg_auto_failover failover node settings
Use pg_autoctl get formation settings to view the replication settings of the cluster:
root@ubuntu11:~# pg_autoctl get formation settings
Context | Name | Setting | Value
----------+----------+---------------------------+------------------------------------
formation | default | number_sync_standbys | 0
primary | ubuntu12 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_60)'
node | ubuntu13 | candidate priority | 50
node | ubuntu12 | candidate priority | 50
node | ubuntu13 | replication quorum | true
node | ubuntu12 | replication quorum | true
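Parameter descriptions
number_sync_standbys: the number of synchronous standbys
number_sync_standbys sets how many standbys must acknowledge a write synchronously. With a value of 0, the primary keeps accepting reads and writes after a standby fails. With a value greater than 0, writes on the primary are suspended whenever fewer than number_sync_standbys standbys are available.
Setting number_sync_standbys
pg_autoctl set formation number-sync-standbys 1
Replication quorum
This property is true or false and defaults to true, meaning the node may act as a synchronous standby. When it is false, the node is replicated to asynchronously.
Candidate Priority
This property is the promotion priority of a standby. It accepts any value from 0 to 100 and defaults to 50. The higher the priority, the more likely the node is to be chosen as the new primary. A node with priority 0 is never chosen as the new primary.
When candidates have the same priority, the monitor selects the standby with the most advanced LSN. If the LSNs are also the same, one of those standbys is chosen at random.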
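The selection rule described above can be sketched in a few lines of Python. This is only a simplified model of the rule as stated in this article, not pg_auto_failover's actual implementation; the `Standby` class and the `pick_candidate` and `lsn_to_int` helpers are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/5039798' into a comparable integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass
class Standby:  # hypothetical container, not a pg_auto_failover type
    name: str
    candidate_priority: int
    lsn: str


def pick_candidate(standbys: list[Standby]) -> Optional[Standby]:
    """Pick the promotion candidate following the rule described in the text."""
    # Priority 0 means "never promote this node".
    eligible = [s for s in standbys if s.candidate_priority > 0]
    if not eligible:
        return None
    # Highest priority wins; equal priorities are broken by the most advanced LSN.
    # On an exact tie max() returns the first entry, where the text says the
    # choice is random.
    return max(eligible, key=lambda s: (s.candidate_priority, lsn_to_int(s.lsn)))


nodes = [
    Standby("ubuntu12", 50, "0/5039798"),
    Standby("ubuntu14", 10, "0/5039A10"),
]
chosen = pick_candidate(nodes)
print(chosen.name if chosen else "no candidate")
```

Here ubuntu12 is chosen even though ubuntu14 has the more advanced LSN, because priority is compared before LSN.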
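Testing turned up an interesting behavior. With both the primary and the standby healthy:
1. From pg_auto_failover's point of view, synchronous_standby_names is 'ANY 1 (pgautofailover_standby_60)'.
2. From the PostgreSQL instance's point of view, synchronous_commit = on and synchronous_standby_names = ANY 1 (pgautofailover_standby_60).
This means the primary and the standby are replicating synchronously.
However, when the standby is shut down to simulate a standby failure, pg_auto_failover reports synchronous_standby_names as ''. Querying synchronous_standby_names on the primary at that point also returns '', which means replication has been automatically downgraded to asynchronous. The queries below show the values on the primary while both nodes are healthy.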
select name,setting from pg_settings where name like '%synchronous_commit%';
name |setting|
------------------+-------+
synchronous_commit|on |
select name,setting from pg_settings where name like '%synchronous_standby_names%' ;
name |setting |
-------------------------+---------------------------------+
synchronous_standby_names|ANY 1 (pgautofailover_standby_60)|
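To enforce strict synchronous replication, run pg_autoctl set formation number-sync-standbys 1. Doing so shows that a formation with one primary and a single standby does not support it, since number_sync_standbys = 1 needs at least two standbys.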
2. pg_auto_failover failover monitoring parameters
The monitor parameters related to pgautofailover failover are listed below; see https://pg-auto-failover.readthedocs.io/en/main/ref/configuration.html#configuration
select name, setting, unit, short_desc from pg_settings where name ~ '^pgautofailover\.';
name |setting |unit|short_desc |
------------------------------------------------+--------+----+----------------------------------------------------------------------------------------------------+
pgautofailover.enable_sync_wal_log_threshold |16777216| |Don't enable synchronous replication until secondary xlog is within this many bytes of the primary's|
pgautofailover.health_check_max_retries |2 | |Maximum number of re-tries before marking a node as failed. |
pgautofailover.health_check_period |5000 |ms |Duration between each check (in milliseconds). |
pgautofailover.health_check_retry_delay |2000 |ms |Delay between consecutive retries. |
pgautofailover.health_check_timeout |5000 |ms |Connect timeout (in milliseconds). |
pgautofailover.node_considered_unhealthy_timeout|20000 |ms |Mark node unhealthy if last ping was over this long ago |
pgautofailover.primary_demote_timeout |30000 |ms |Give the primary this long to drain before promoting the secondary |
pgautofailover.promote_wal_log_threshold |16777216| |Don't promote secondary unless xlog is with this many bytes of the master |
pgautofailover.startup_grace_period |10000 |ms |Wait for at least this much time after startup before initiating a failover. |
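The raw values above are easier to read once converted. The sketch below only does unit arithmetic on the values from the table; the `humanize` helper is a hypothetical name, and the "bytes" unit for the two WAL thresholds is taken from their short_desc text because the unit column is empty for them.

```python
# Values copied from the pg_settings output above: (setting, unit).
SETTINGS = {
    "pgautofailover.enable_sync_wal_log_threshold": (16777216, "bytes"),
    "pgautofailover.health_check_max_retries": (2, ""),
    "pgautofailover.health_check_period": (5000, "ms"),
    "pgautofailover.health_check_retry_delay": (2000, "ms"),
    "pgautofailover.health_check_timeout": (5000, "ms"),
    "pgautofailover.node_considered_unhealthy_timeout": (20000, "ms"),
    "pgautofailover.primary_demote_timeout": (30000, "ms"),
    "pgautofailover.promote_wal_log_threshold": (16777216, "bytes"),
    "pgautofailover.startup_grace_period": (10000, "ms"),
}


def humanize(value: int, unit: str) -> str:
    """Render milliseconds as seconds and bytes as MiB; leave other values alone."""
    if unit == "ms":
        return f"{value / 1000:g} s"
    if unit == "bytes":
        return f"{value / (1024 * 1024):g} MiB"
    return str(value)


for name, (value, unit) in SETTINGS.items():
    print(f"{name:<50} {humanize(value, unit)}")
```

Both WAL thresholds come out as 16 MiB, which equals the default PostgreSQL WAL segment size.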
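3. pg_auto_failover primary/standby switching
3.1 Automatic failover
Restarting the pgautofailover service on the primary simulates a failure. The standby is automatically promoted to primary, and once the original primary is back up it rejoins the cluster as a standby.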
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 9: 0/5039798 | read-only | secondary | secondary
ubuntu12 | 68 | ubuntu12:9300 | 9: 0/5039798 | read-write | primary | primary
root@ubuntu11:~#
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 10: 0/5039960 | read-write | wait_primary | wait_primary
ubuntu12 | 68 | ubuntu12:9300 | 9: 0/5039810 | none | demoted | catchingup
root@ubuntu11:~#
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 10: 0/5039A10 | read-write | primary | primary
ubuntu12 | 68 | ubuntu12:9300 | 10: 0/5039A10 | read-only | secondary | secondary
root@ubuntu11:~#
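When watching a failover like the one above from a script, it can help to turn the pg_autoctl show state table into records. The following is a minimal sketch that parses the text layout shown in this article; `parse_show_state` and `converged_primary` are hypothetical helpers, and the parsing assumes the column layout stays as printed here.

```python
from __future__ import annotations

from typing import Optional

SAMPLE = """\
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 10: 0/5039A10 | read-write | primary | primary
ubuntu12 | 68 | ubuntu12:9300 | 10: 0/5039A10 | read-only | secondary | secondary
"""


def parse_show_state(text: str) -> list[dict[str, str]]:
    """Parse the table printed by `pg_autoctl show state` into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        if "|" not in line:  # skips prompts and the ----+---- separator line
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def converged_primary(rows: list[dict[str, str]]) -> Optional[str]:
    """Return the node whose reported and assigned state are both 'primary'."""
    for row in rows:
        if row["Reported State"] == "primary" and row["Assigned State"] == "primary":
            return row["Name"]
    return None


print(converged_primary(parse_show_state(SAMPLE)))
```

During the transition (for example while a node is in wait_primary), `converged_primary` returns None, which is a simple signal that the failover has not finished yet.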
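3.2 Manual switchover
Run pg_autoctl perform switchover on the standby that is to be promoted, or on the monitor node; the transcript below shows the current node being promoted to primary.
With one primary and several standbys, pg_autoctl perform switchover can be run on any node. The switch follows Candidate Priority: the standby with the higher Candidate Priority takes over as primary.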
root@ubuntu12:~# pg_autoctl perform switchover
05:12:23 213044 INFO Targetting group 0 in formation "default"
05:12:23 213044 INFO Listening monitor notifications about state changes in formation "default" and group 0
05:12:23 213044 INFO Following table displays times when notifications are received
Time | Name | Node | Host:Port | Current State | Assigned State
---------+----------+-------+---------------+---------------------+--------------------
05:12:23 | ubuntu13 | 60 | ubuntu13:9300 | primary | draining
05:12:23 | ubuntu12 | 68 | ubuntu12:9300 | secondary | prepare_promotion
05:12:23 | ubuntu12 | 68 | ubuntu12:9300 | prepare_promotion | prepare_promotion
05:12:23 | ubuntu12 | 68 | ubuntu12:9300 | prepare_promotion | stop_replication
05:12:23 | ubuntu13 | 60 | ubuntu13:9300 | primary | demote_timeout
05:12:23 | ubuntu13 | 60 | ubuntu13:9300 | draining | demote_timeout
05:12:23 | ubuntu13 | 60 | ubuntu13:9300 | demote_timeout | demote_timeout
05:12:24 | ubuntu12 | 68 | ubuntu12:9300 | stop_replication | stop_replication
05:12:24 | ubuntu12 | 68 | ubuntu12:9300 | stop_replication | wait_primary
05:12:24 | ubuntu13 | 60 | ubuntu13:9300 | demote_timeout | demoted
05:12:24 | ubuntu13 | 60 | ubuntu13:9300 | demoted | demoted
05:12:24 | ubuntu12 | 68 | ubuntu12:9300 | wait_primary | wait_primary
05:12:24 | ubuntu13 | 60 | ubuntu13:9300 | demoted | catchingup
05:12:25 | ubuntu13 | 60 | ubuntu13:9300 | catchingup | catchingup
05:12:26 | ubuntu13 | 60 | ubuntu13:9300 | catchingup | secondary
05:12:26 | ubuntu13 | 60 | ubuntu13:9300 | secondary | secondary
05:12:26 | ubuntu12 | 68 | ubuntu12:9300 | wait_primary | primary
05:12:26 | ubuntu12 | 68 | ubuntu12:9300 | primary | primary
root@ubuntu12:~#
The monitor node also shows that the primary and standby roles have changed:
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 11: 0/5039CC0 | read-only | secondary | secondary
ubuntu12 | 68 | ubuntu12:9300 | 11: 0/5039CC0 | read-write | primary | primary
3.3 Manual failover
Run pg_autoctl perform failover on the monitor node; it promotes the standby to primary.
root@ubuntu11:~# pg_autoctl perform failover
05:15:57 774844 INFO Waiting 60 secs for a notification with state "primary" in formation "default" and group 0
05:15:57 774844 INFO Listening monitor notifications about state changes in formation "default" and group 0
05:15:57 774844 INFO Following table displays times when notifications are received
Time | Name | Node | Host:Port | Current State | Assigned State
---------+----------+-------+---------------+---------------------+--------------------
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | primary | draining
05:15:57 | ubuntu13 | 60 | ubuntu13:9300 | secondary | prepare_promotion
05:15:57 | ubuntu13 | 60 | ubuntu13:9300 | prepare_promotion | prepare_promotion
05:15:57 | ubuntu13 | 60 | ubuntu13:9300 | prepare_promotion | stop_replication
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | primary | demote_timeout
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | draining | demote_timeout
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | demote_timeout | demote_timeout
05:15:58 | ubuntu13 | 60 | ubuntu13:9300 | stop_replication | stop_replication
05:15:58 | ubuntu13 | 60 | ubuntu13:9300 | stop_replication | wait_primary
05:15:58 | ubuntu12 | 68 | ubuntu12:9300 | demote_timeout | demoted
05:15:58 | ubuntu12 | 68 | ubuntu12:9300 | demoted | demoted
05:15:58 | ubuntu13 | 60 | ubuntu13:9300 | wait_primary | wait_primary
05:15:58 | ubuntu12 | 68 | ubuntu12:9300 | demoted | catchingup
05:15:59 | ubuntu12 | 68 | ubuntu12:9300 | catchingup | catchingup
05:16:00 | ubuntu12 | 68 | ubuntu12:9300 | catchingup | secondary
05:16:00 | ubuntu12 | 68 | ubuntu12:9300 | secondary | secondary
05:16:00 | ubuntu13 | 60 | ubuntu13:9300 | wait_primary | primary
05:16:00 | ubuntu13 | 60 | ubuntu13:9300 | primary | primary
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 12: 0/503A0E8 | read-write | primary | primary
ubuntu12 | 68 | ubuntu12:9300 | 12: 0/503A0E8 | read-only | secondary | secondary
root@ubuntu11:~#
root@ubuntu11:~#
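The notification tables printed by perform switchover and perform failover are long, but each node follows a short path of assigned states. The sketch below extracts that path from a few rows copied from the failover transcript above; `assigned_state_paths` is a hypothetical helper, and the parsing assumes the six-column layout shown in this article.

```python
from __future__ import annotations

# A subset of the notification rows from the failover transcript above.
NOTIFICATIONS = """\
Time | Name | Node | Host:Port | Current State | Assigned State
---------+----------+-------+---------------+---------------------+--------------------
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | primary | draining
05:15:57 | ubuntu13 | 60 | ubuntu13:9300 | secondary | prepare_promotion
05:15:57 | ubuntu13 | 60 | ubuntu13:9300 | prepare_promotion | stop_replication
05:15:57 | ubuntu12 | 68 | ubuntu12:9300 | primary | demote_timeout
05:15:58 | ubuntu13 | 60 | ubuntu13:9300 | stop_replication | wait_primary
05:15:58 | ubuntu12 | 68 | ubuntu12:9300 | demote_timeout | demoted
05:15:58 | ubuntu12 | 68 | ubuntu12:9300 | demoted | catchingup
05:16:00 | ubuntu12 | 68 | ubuntu12:9300 | catchingup | secondary
05:16:00 | ubuntu13 | 60 | ubuntu13:9300 | wait_primary | primary
"""


def assigned_state_paths(text: str) -> dict[str, list[str]]:
    """Collect, per node, the ordered list of distinct assigned states."""
    paths: dict[str, list[str]] = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Keep only data rows: six columns, and not the header row.
        if len(cells) != 6 or cells[0] == "Time":
            continue
        name, assigned = cells[1], cells[5]
        path = paths.setdefault(name, [])
        if not path or path[-1] != assigned:  # collapse consecutive repeats
            path.append(assigned)
    return paths


for node, path in assigned_state_paths(NOTIFICATIONS).items():
    print(node, "->", " -> ".join(path))
```

For this sample the old primary goes through draining, demote_timeout, demoted, catchingup and secondary, while the promoted node goes through prepare_promotion, stop_replication, wait_primary and primary.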
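4. Adding a node to pg_auto_failover
A new machine, ubuntu14, is initialized here with IP address 192.168.152.124. The host list is now as follows; on each of the first three hosts, edit the hosts mapping file and add the address and host name of ubuntu14.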
192.168.152.121 ubuntu11
192.168.152.122 ubuntu12
192.168.152.123 ubuntu13
192.168.152.124 ubuntu14
Use pg_autoctl to create the node and register it with the monitor node.
/usr/local/pgsql16/server/bin/pg_autoctl create postgres --hostname ubuntu14 --name ubuntu14 --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --auth trust --ssl-self-signed --monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
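The --monitor value in the command above is an ordinary connection URI, and a mistyped host, port, or database name is an easy way to make registration fail. As a quick sanity check, the URI can be split into its parts with Python's standard library; `describe_monitor_uri` is a hypothetical helper written for this illustration.

```python
from urllib.parse import parse_qs, urlsplit

# The monitor URI used in the command above.
MONITOR_URI = "postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require"


def describe_monitor_uri(uri: str) -> dict:
    """Split a postgres:// URI into user, host, port, database and sslmode."""
    parts = urlsplit(uri)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
        "sslmode": parse_qs(parts.query).get("sslmode", [None])[0],
    }


for key, value in describe_monitor_uri(MONITOR_URI).items():
    print(f"{key}: {value}")
```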
Output of adding the node to pg_auto_failover:
root@ubuntu14:~# su - postgres
postgres@ubuntu14:~$
postgres@ubuntu14:~$
postgres@ubuntu14:~$ /usr/local/pgsql16/server/bin/pg_autoctl create postgres --hostname ubuntu14 --name ubuntu14 --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --auth trust --ssl-self-signed --monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
06:09:25 2643 INFO Using default --ssl-mode "require"
06:09:25 2643 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
06:09:25 2643 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
06:09:25 2643 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
06:09:25 2643 INFO Started pg_autoctl postgres service with pid 2645
06:09:25 2643 INFO Started pg_autoctl node-init service with pid 2646
06:09:25 2645 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
06:09:25 2646 INFO Registered node 75 "ubuntu14" (ubuntu14:9300) in formation "default", group 0, state "wait_standby"
06:09:25 2646 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
06:09:25 2646 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
06:09:25 2646 INFO Successfully registered as "wait_standby" to the monitor.
06:09:25 2646 INFO FSM transition from "init" to "wait_standby": Start following a primary
06:09:25 2646 INFO Transition complete: current state is now "wait_standby"
06:09:26 2646 INFO FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
06:09:26 2646 INFO Initialising PostgreSQL as a hot standby
06:09:26 2646 INFO /usr/local/pgsql16/server/bin/pg_basebackup -w -d 'application_name=pgautofailover_standby_75 host=ubuntu13 port=9300 user=pgautofailover_replicator sslmode=require' --pgdata /usr/local/pgsql16/pg9300/backup/node_75 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_75
06:09:26 2646 INFO pg_basebackup: initiating base backup, waiting for checkpoint to complete
06:09:26 2646 INFO pg_basebackup: checkpoint completed
06:09:26 2646 INFO pg_basebackup: write-ahead log start point: 0/6000028 on timeline 12
06:09:26 2646 INFO pg_basebackup: starting background WAL receiver
06:09:26 2646 INFO 22759/22759 kB (100%), 0/1 tablespace (...backup/node_75/global/pg_control)
06:09:26 2646 INFO 22759/22759 kB (100%), 1/1 tablespace
06:09:26 2646 INFO pg_basebackup:
06:09:26 2646 INFO
06:09:26 2646 INFO write-ahead log end point: 0/6000100
06:09:26 2646 INFO pg_basebackup:
06:09:26 2646 INFO
06:09:26 2646 INFO waiting for background process to finish streaming ...
06:09:26 2646 INFO pg_basebackup:
06:09:26 2646 INFO
06:09:26 2646 INFO syncing data to disk ...
06:09:27 2646 INFO pg_basebackup:
06:09:27 2646 INFO
06:09:27 2646 INFO renaming backup_manifest.tmp to backup_manifest
06:09:27 2646 INFO pg_basebackup:
06:09:27 2646 INFO
06:09:27 2646 INFO base backup completed
06:09:27 2646 INFO Creating the standby signal file at "/usr/local/pgsql16/pg9300/data/standby.signal", and replication setup at "/usr/local/pgsql16/pg9300/data/postgresql-auto-failover-standby.conf"
06:09:27 2646 INFO Contents of "/usr/local/pgsql16/pg9300/data/postgresql-auto-failover-standby.conf" have changed, overwriting
06:09:27 2646 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu14"
06:09:27 2668 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
06:09:27 2646 INFO PostgreSQL started on port 9300
06:09:27 2646 INFO Fetched current list of 2 other nodes from the monitor to update HBA rules, including 2 changes.
06:09:27 2646 INFO Ensuring HBA rules for node 60 "ubuntu13" (ubuntu13:9300)
06:09:27 2646 INFO Ensuring HBA rules for node 68 "ubuntu12" (ubuntu12:9300)
06:09:27 2646 INFO Reloading Postgres configuration and HBA rules
06:09:27 2645 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2668
06:09:27 2646 INFO Transition complete: current state is now "catchingup"
06:09:27 2646 INFO keeper has been successfully initialized.
06:09:27 2643 WARN pg_autoctl service node-init exited with exit status 0
06:09:27 2645 INFO Postgres controller service received signal SIGTERM, terminating
06:09:27 2645 INFO Stopping pg_autoctl postgres service
06:09:27 2645 INFO /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
06:09:27 2643 INFO Stop pg_autoctl
postgres@ubuntu14:~$
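Then create the systemd service for the new node (pg_autoctl show systemd prints a service file for it) and check the cluster state from the monitor: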
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu13 | 60 | ubuntu13:9300 | 12: 0/7000060 | read-write | primary | primary
ubuntu12 | 68 | ubuntu12:9300 | 12: 0/7000060 | read-only | secondary | secondary
ubuntu14 | 75 | ubuntu14:9300 | 12: 0/7000060 | read-only | secondary | secondary
root@ubuntu11:~#
At the same time, pg_auto_failover automatically upgrades the cluster to strict synchronous mode (number_sync_standbys = 1):
root@ubuntu11:~# pg_autoctl get formation settings
Context | Name | Setting | Value
----------+----------+---------------------------+---------------------------------------------------------------
formation | default | number_sync_standbys | 1
primary | ubuntu13 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_68, pgautofailover_standby_75)'
node | ubuntu13 | candidate priority | 50
node | ubuntu12 | candidate priority | 50
node | ubuntu14 | candidate priority | 50
node | ubuntu13 | replication quorum | true
node | ubuntu12 | replication quorum | true
node | ubuntu14 | replication quorum | true
root@ubuntu11:~#
If the new node should not keep the default candidate-priority of 50, lower it.
On ubuntu14 itself, run pg_autoctl set node candidate-priority --name ubuntu14 10 --monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
root@ubuntu14:~# pg_autoctl set node candidate-priority --name ubuntu14 10 --monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
06:26:00 6614 INFO Waiting for the settings to have been applied to the monitor and primary node
06:26:00 6614 INFO New state is reported by node 60 "ubuntu13" (ubuntu13:9300): "apply_settings"
06:26:00 6614 INFO Setting goal state of node 60 "ubuntu13" (ubuntu13:9300) to primary after it applied replication properties change.
06:26:01 6614 INFO New state is reported by node 60 "ubuntu13" (ubuntu13:9300): "primary"
10
root@ubuntu14:~#
Then check from the monitor:
### candidate priority before the change
root@ubuntu11:~# pg_autoctl get formation settings
Context | Name | Setting | Value
----------+----------+---------------------------+---------------------------------------------------------------
formation | default | number_sync_standbys | 1
primary | ubuntu13 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_68, pgautofailover_standby_75)'
node | ubuntu13 | candidate priority | 50
node | ubuntu12 | candidate priority | 50
node | ubuntu14 | candidate priority | 50
node | ubuntu13 | replication quorum | true
node | ubuntu12 | replication quorum | true
node | ubuntu14 | replication quorum | true
### candidate priority after the change
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl get formation settings
Context | Name | Setting | Value
----------+----------+---------------------------+---------------------------------------------------------------
formation | default | number_sync_standbys | 1
primary | ubuntu13 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_68, pgautofailover_standby_75)'
node | ubuntu13 | candidate priority | 50
node | ubuntu12 | candidate priority | 50
node | ubuntu14 | candidate priority | 10
node | ubuntu13 | replication quorum | true
node | ubuntu12 | replication quorum | true
node | ubuntu14 | replication quorum | true
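Setting replication-quorum to false on ubuntu14 fails. As the error below explains, with number_sync_standbys = 1 at least 2 standbys must take part in the replication quorum, and this change would leave only 1.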
root@ubuntu14:~# pg_autoctl set node replication-quorum --name ubuntu14 false --pgdata /usr/local/pgsql16/pg9300/data/ --monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
06:30:53 7313 WARN Given --monitor URI, the --pgdata option is ignored
06:30:53 7313 INFO Connecting to monitor at "postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require"
06:30:53 7313 ERROR Monitor ERROR: can't set replication quorum to false
06:30:53 7313 ERROR Monitor DETAIL: At least 2 standby nodes are required in formation default with number_sync_standbys = 1, and only 1 would be participating in the replication quorum
06:30:53 7313 ERROR SQL query: SELECT pgautofailover.set_node_replication_quorum($1, $2, $3)
06:30:53 7313 ERROR SQL params: 'default', 'ubuntu14', 'false'
06:30:53 7313 ERROR Failed to update node replication quorum on node "ubuntu14"in formation "default" for replication_quorum: "false"
06:30:53 7313 ERROR Failed to set "replication-quorum" to "false".
root@ubuntu14:~#
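The constraint behind this error can be written down as a small check. This is a sketch that generalizes from the single case in the error message (number_sync_standbys = 1 requires 2 quorum standbys) to "number_sync_standbys + 1"; that generalization, the treatment of 0 as unconstrained, and the `quorum_change_allowed` name are assumptions made for illustration, not the monitor's actual code.

```python
def quorum_change_allowed(number_sync_standbys: int, quorum_standbys_after: int) -> bool:
    """Return True if the formation would still have enough quorum standbys.

    quorum_standbys_after is the number of standbys with replication quorum
    set to true once the proposed change has been applied.
    """
    if number_sync_standbys == 0:
        # Assumption: no minimum is enforced when nothing must be synchronous.
        return True
    # Assumption: the monitor requires number_sync_standbys + 1 quorum standbys.
    return quorum_standbys_after >= number_sync_standbys + 1


# The failing case from the transcript: two standbys (ubuntu12, ubuntu14),
# and ubuntu14 would leave the quorum, so only one remains.
print(quorum_change_allowed(number_sync_standbys=1, quorum_standbys_after=1))
```

Under this model, lowering number-sync-standbys to 0 first would make the same replication-quorum change acceptable.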
5. Other commands
List the configuration files with pg_autoctl show file
Monitor node
root@ubuntu11:~# pg_autoctl show file
File | Path
--------+----------------
Config | /root/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg
Pid | /run/user/0/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.pid
root@ubuntu11:~#
Data node
root@ubuntu12:~#
root@ubuntu12:~# pg_autoctl show file
File | Path
--------+----------------
Config | /root/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg
State | /root/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state
Init | /root/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init
Pid | /run/user/0/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.pid
Remove a node from the monitor
pg_autoctl drop node --hostname ubuntu13
Remove the files under .local
pg_autoctl drop node --destroy