How To Failover Using Streaming Replication (No repmgr)
Table of Contents
- 0. Objective
- 1. Environment Setup
- 2. Simulate and Confirm Primary (rac1) Failure
- 3. Failover Procedure: Promote rac2 (Standby1) as New Primary
- 4. Verify Failover Operation (rac2 is primary)
- 5. Reconfigure rac3 (Standby2) to Point to New Primary (rac2)
- 6. Data Validation Test – Change 1
- 7. Data Validation Test – Change 2
- 8. Rebuild Old Primary (rac1) as Standby (Optional / Later Step)
- 9. Replication Verification – Change 3
- 10. Log Verification
0. Objective : Perform Failover from rac1 → rac2
PostgreSQL natively supports failover, not switchover (unlike Oracle Data Guard).
For managed or automated switchover, use tools such as repmgr or Patroni.
Notes:
- In case of an unplanned outage or disaster recovery, if the primary becomes unreachable, promote the standby to act as the new primary.
- The old primary must then be rebuilt as a standby using data from the new primary.
- For production environments, use replication slots to prevent WAL loss.
- Synchronous replication → Zero data loss (commit waits for standby acknowledgment).
- Asynchronous replication → Possible data loss during failover.
- Always check replication lag before performing a planned failover.
1. Environment Setup
Node | IP | Role | Sync Type | Purpose |
---|---|---|---|---|
rac1 | 192.168.2.21 | Primary | async | Old primary (will fail) |
rac2 | 192.168.2.22 | Standby | async | To be promoted as new primary |
rac3 | 192.168.3.20 | Standby | async | Will remain standby after failover |
2. Simulate and Confirm Primary (rac1) Failure
DISCLAIMER: The kill command given below is just for learning purposes and should only be used on testing systems. We will not take any responsibility of any consequences or loss of data caused by this command.
[root@rac1 data]# ps -ef | grep postgres postgres 12125 1 0 07:56 ? 00:00:00 /usr/pgsql-17/bin/postgres -D /pgData/pgsql17/data postgres 12127 12125 0 07:56 ? 00:00:00 postgres: logger postgres 12128 12125 0 07:56 ? 00:00:00 postgres: checkpointer postgres 12129 12125 0 07:56 ? 00:00:00 postgres: background writer postgres 12131 12125 0 07:56 ? 00:00:00 postgres: walwriter postgres 12132 12125 0 07:56 ? 00:00:00 postgres: autovacuum launcher postgres 12133 12125 0 07:56 ? 00:00:00 postgres: archiver last was 000000010000000000000017.00000028.backup postgres 12134 12125 0 07:56 ? 00:00:00 postgres: logical replication launcher postgres 12282 12125 0 08:08 ? 00:00:00 postgres: walsender repuser 192.168.2.22(51152) streaming 0/18023510 postgres 12283 12125 0 08:08 ? 00:00:00 postgres: walsender repuser 192.168.3.20(51354) streaming 0/18023510 root 15502 3678 0 12:52 pts/0 00:00:00 grep --color=auto postgres [root@rac1 data]# kill -9 12125 [root@rac1 data]# ps -ef | grep postgres root 15513 3678 0 12:52 pts/0 00:00:00 grep --color=auto postgres [root@rac1 data]# cd [root@rac1 ~]# [postgres@rac1 ~]$ psql psql: error: connection to server on socket "/run/postgresql/.s.PGSQL.5432" failed: Connection refused Is the server running locally and accepting connections on that socket? [postgres@rac1 ~]$
3. Failover Procedure: Promote rac2 (Standby1) as New Primary
3.1 Promote rac2 to Primary
# use pg_ctl promote -D /pgData/pgsql17/data (or) psql -U postgres -c "SELECT pg_promote();" to initiate failover
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 12:57:50.553 EDT [9070] LOG: waiting for WAL to become available at 0/18023528
2025-10-17 12:57:55.552 EDT [11746] FATAL: could not connect to the primary server: connection to server at "192.168.2.21", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 12:57:55.554 EDT [9070] LOG: waiting for WAL to become available at 0/18023528
2025-10-17 12:58:00.559 EDT [11747] FATAL: could not connect to the primary server: connection to server at "192.168.2.21", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 12:58:00.562 EDT [9070] LOG: waiting for WAL to become available at 0/18023528
2025-10-17 12:58:05.562 EDT [11748] FATAL: could not connect to the primary server: connection to server at "192.168.2.21", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 12:58:05.564 EDT [9070] LOG: waiting for WAL to become available at 0/18023528
[postgres@rac2 ~]$
[postgres@rac2 ~]$ psql -U postgres -c "SELECT pg_promote();"
pg_promote
------------
t <---- Failover completed successfully.
(1 row)
[postgres@rac2 ~]$
# Verify logs
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 12:59:30.609 EDT [9070] LOG: waiting for WAL to become available at 0/18023528
2025-10-17 12:59:32.986 EDT [9070] LOG: received promote request
2025-10-17 12:59:32.986 EDT [9070] LOG: redo done at 0/180234D8 system usage: CPU: user: 0.02 s, system: 0.02 s, elapsed: 17448.30 s
2025-10-17 12:59:32.986 EDT [9070] LOG: last completed transaction was at log time 2025-10-17 08:32:12.184495-04
2025-10-17 12:59:32.987 EDT [9070] LOG: selected new timeline ID: 2
2025-10-17 12:59:33.220 EDT [9070] LOG: archive recovery complete
2025-10-17 12:59:33.257 EDT [9068] LOG: checkpoint starting: force
2025-10-17 12:59:33.261 EDT [9064] LOG: database system is ready to accept connections
2025-10-17 12:59:33.320 EDT [9068] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.017 s, total=0.064 s; sync files=2, longest=0.012 s, average=0.009 s; distance=0 kB, estimate=39825 kB; lsn=0/180235A0, redo lsn=0/18023548
2025-10-17 13:04:33.420 EDT [9068] LOG: checkpoint starting: time
2025-10-17 13:04:33.543 EDT [9068] LOG: checkpoint complete: wrote 1 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.101 s, sync=0.005 s, total=0.123 s; sync files=1, longest=0.005 s, average=0.005 s; distance=1 kB, estimate=35843 kB; lsn=0/18023C68, redo lsn=0/18023C10
[postgres@rac2 ~]$
3.2 Update pg_hba.conf
# Update pg_hba.conf with standby servers IP's
FROM:
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/pg_hba.conf
..
..
# Allow standby servers for replication
host replication repuser 192.168.2.22/32 scram-sha-256
host replication repuser 192.168.3.20/32 scram-sha-256
[postgres@rac2 ~]$
TO:
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/pg_hba.conf
..
..
# Allow standby servers for replication
host replication repuser 192.168.2.21/32 scram-sha-256
host replication repuser 192.168.3.20/32 scram-sha-256
[postgres@rac2 ~]$
3.3 Reset primary_conninfo in postgresql.auto.conf
# Update primary_conninfo info on postgresql.auto.conf
# Primary node will not have primary_conninfo entires, hence we are removing from primary node.
FROM:
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=repuser password=repuser channel_binding=prefer host=192.168.2.21 port=5432 sslmode=prefer sslnegotiation=postgres sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'rac2_standby_slot'
[postgres@rac2 ~]$
TO:
[postgres@rac2 ~]$ psql -U postgres -c "ALTER SYSTEM RESET primary_conninfo;"
ALTER SYSTEM
[postgres@rac2 ~]$ psql -U postgres -c "ALTER SYSTEM RESET primary_slot_name;"
ALTER SYSTEM
[postgres@rac2 ~]$ psql -U postgres -c "SELECT pg_reload_conf();"
pg_reload_conf
----------------
t
(1 row)
[postgres@rac2 ~]$
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
[postgres@rac2 ~]$
3.4 Restart PostgreSQL Service
[postgres@rac2 ~]$ psql -U postgres -c "SELECT pg_reload_conf();"
pg_reload_conf
----------------
t
(1 row)
[postgres@rac2 ~]$
--- OR ---
[root@rac2 ~]# systemctl stop postgresql-17.service
[root@rac2 ~]# systemctl start postgresql-17.service
4. Verify Failover Operation (rac2 is primary)
# Verify that rac2 is now primary
[postgres@rac2 ~]$ psql -U postgres -c "SELECT pg_is_in_recovery();"
pg_is_in_recovery
-------------------
f <---- false means Primary
(1 row)
[postgres@rac2 ~]$
5. Reconfigure rac3 (Standby2) to Point to New Primary (rac2)
5.1 Update primary_conninfo on rac3
# Update primary_conninfo info on postgresql.auto.conf
FROM:
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=repuser password=repuser channel_binding=prefer host=192.168.2.21 port=5432 sslmode=prefer sslnegotiation=postgres sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'rac3_standby_slot'
[postgres@rac3 ~]$
TO:
postgres=# ALTER SYSTEM SET primary_conninfo = 'user=repuser password=repuser channel_binding=prefer host=192.168.2.22 port=5432 sslmode=prefer sslnegotiation=postgres sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable';
ALTER SYSTEM
postgres=# ALTER SYSTEM SET primary_slot_name = 'rac3_standby_slot';
ALTER SYSTEM
postgres=#
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=#
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=repuser password=repuser channel_binding=prefer host=192.168.2.22 port=5432 sslmode=prefer sslnegotiation=postgres sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'rac3_standby_slot'
[postgres@rac3 ~]$
[postgres@rac3 ~]$ psql -U postgres -c "SELECT pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t <---- True means Standby
(1 row)
[postgres@rac3 ~]$
5.2 Update pg_hba.conf
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 13:00:30.625 EDT [8688] LOG: waiting for WAL to become available at 0/18023528
2025-10-17 13:00:35.627 EDT [11404] FATAL: could not connect to the primary server: connection to server at "192.168.2.21", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2025-10-17 13:00:35.629 EDT [8688] LOG: waiting for WAL to become available at 0/18023528
[postgres@rac3 ~]$
# Update pg_hba.conf
FROM:
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/pg_hba.conf
..
..
# Allow standby servers for replication
host replication repuser 192.168.2.22/32 scram-sha-256
host replication repuser 192.168.3.20/32 scram-sha-256
[postgres@rac3 ~]$
TO:
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/pg_hba.conf
..
..
# Allow standby servers for replication
host replication repuser 192.168.2.21/32 scram-sha-256
host replication repuser 192.168.3.20/32 scram-sha-256
[postgres@rac3 ~]$
5.3 Restart PostgreSQL Service
[postgres@rac3 ~]$ psql -U postgres -c "SELECT pg_reload_conf();"
pg_reload_conf
----------------
t
(1 row)
[postgres@rac3 ~]$
--- OR ---
[root@rac3 ~]# systemctl stop postgresql-17.service
[root@rac3 ~]# systemctl start postgresql-17.service
6. Data Validation Test – Change 1
6.1 Insert Data on New Primary and Verify logs
On Primary (rac2)
[postgres@rac2 ~]$ psql
psql (17.6)
Type "help" for help.
postgres=# SELECT * FROM EMP;
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
(3 rows)
postgres=# INSERT INTO emp VALUES ('KK', 'DBA LEAD', 'RCM', 'iGATE');
INSERT 0 1
postgres=# SELECT * FROM EMP;
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE <-----
(4 rows)
postgres=#
2025-10-17 13:57:30.510 EDT [12525] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:57:30.510 EDT [12525] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
6.2 Create replication slot (rac3_standby_slot) on New Primary
# On new primary
[postgres@rac2 ~]$ psql -U postgres -c "SELECT * FROM pg_create_physical_replication_slot('rac3_standby_slot');"
slot_name | lsn
-------------------+-----
rac3_standby_slot |
(1 row)
[postgres@rac2 ~]$
2025-10-17 13:59:30.571 EDT [12570] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:30.571 EDT [12570] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
2025-10-17 13:59:34.554 EDT [9068] LOG: checkpoint starting: time
2025-10-17 13:59:34.681 EDT [9068] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.102 s, sync=0.006 s, total=0.127 s; sync files=2, longest=0.004 s, average=0.003 s; distance=0 kB, estimate=32259 kB; lsn=0/18023EF0, redo lsn=0/18023E98
2025-10-17 13:59:35.572 EDT [12571] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:35.572 EDT [12571] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
2025-10-17 13:59:40.577 EDT [12572] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:40.577 EDT [12572] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
2025-10-17 13:59:45.580 EDT [12574] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:45.580 EDT [12574] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
2025-10-17 13:59:50.581 EDT [12575] ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:50.581 EDT [12575] STATEMENT: START_REPLICATION SLOT "rac3_standby_slot" 0/18000000 TIMELINE 2
[postgres@rac2 ~]$
[postgres@rac2 ~]$ psql -U postgres -c "
> SELECT
> application_name AS standby_name,
> client_addr AS standby_ip,
> state,
> sync_state,
> pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS byte_lag
> FROM pg_stat_replication;
> "
standby_name | standby_ip | state | sync_state | byte_lag
--------------+--------------+-----------+------------+----------
walreceiver | 192.168.3.20 | streaming | async | 0 <--- no lag
(1 row)
[postgres@rac2 ~]$
*** Afrer few seconds there were no errors reported in the logs.
6.3 Verify Data on Standby and logs
postgres=# SELECT * FROM EMP;
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE <----
(4 rows)
postgres=#
2025-10-17 13:59:50.581 EDT [12669] FATAL: could not start WAL streaming: ERROR: replication slot "rac3_standby_slot" does not exist
2025-10-17 13:59:50.584 EDT [8688] LOG: waiting for WAL to become available at 0/18002000
2025-10-17 13:59:55.583 EDT [12670] LOG: started streaming WAL from primary at 0/18000000 on timeline 2
2025-10-17 13:59:57.951 EDT [8686] LOG: restartpoint starting: time
2025-10-17 13:59:58.183 EDT [8686] LOG: restartpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.207 s, sync=0.009 s, total=0.232 s; sync files=3, longest=0.005 s, average=0.003 s; distance=2 kB, estimate=13284 kB; lsn=0/18023EF0, redo lsn=0/18023E98
2025-10-17 13:59:58.183 EDT [8686] LOG: recovery restart point at 0/18023E98
2025-10-17 13:59:58.183 EDT [8686] DETAIL: Last completed transaction was at log time 2025-10-17 13:55:47.944361-04.
[postgres@rac3 ~]$
*** Afrer few seconds there were no errors reported in the logs.
7. Data Validation Test – Change 2
7.1 Create Table and Insert Data on New Primary
[postgres@rac2 ~]$ psql
psql (17.6)
Type "help" for help.
postgres=# CREATE TABLE test_failover(id serial primary key, msg text);
CREATE TABLE
postgres=# INSERT INTO test_failover(msg) VALUES ('Replication working after failover');
INSERT 0 1
postgres=#
postgres=#
postgres=# SELECT * FROM test_failover;
id | msg
----+------------------------------------
1 | Replication working after failover
(1 row)
postgres=#
[postgres@rac2 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 14:29:35.647 EDT [9068] LOG: checkpoint starting: time
2025-10-17 14:29:40.208 EDT [9068] LOG: checkpoint complete: wrote 45 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=4.436 s, sync=0.092 s, total=4.561 s; sync files=38, longest=0.037 s, average=0.003 s; distance=184 kB, estimate=26162 kB; lsn=0/18077EA8, redo lsn=0/18077E50
[postgres@rac2 ~]$
*** No errors reported in the logs.
7.2 Verify Data on Standby
[postgres@rac3 ~]$ psql -U postgres -c "SELECT * FROM test_failover;"
id | msg
----+------------------------------------
1 | Replication working after failover <---- Working as expected.
(1 row)
[postgres@rac3 ~]$
[postgres@rac3 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 14:29:52.637 EDT [12847] LOG: restartpoint starting: time
2025-10-17 14:29:57.261 EDT [12847] LOG: restartpoint complete: wrote 46 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=4.537 s, sync=0.072 s, total=4.624 s; sync files=38, longest=0.015 s, average=0.002 s; distance=184 kB, estimate=184 kB; lsn=0/18077EA8, redo lsn=0/18077E50
2025-10-17 14:29:57.261 EDT [12847] LOG: recovery restart point at 0/18077E50
2025-10-17 14:29:57.261 EDT [12847] DETAIL: Last completed transaction was at log time 2025-10-17 14:28:28.964977-04.
[postgres@rac3 ~]$
*** No errors reported in the logs.
8. Rebuild Old Primary (rac1) as Standby (Optional / Later Step)
8.1 Stop PostgreSQL if Running
[root@rac1 ~]# systemctl stop postgresql-17.service
[root@rac1 ~]#
[root@rac1 ~]# ps -ef | grep postgres
root 16544 3678 0 14:43 pts/0 00:00:00 grep --color=auto postgres
[root@rac1 ~]#
8.2 Delete DATA, WAL, and ARCHIVE Files
[root@rac1 ~]# su - postgres
[postgres@rac1 ~]$
[postgres@rac1 ~]$ rm -rf /pgData/pgsql17/data/*
[postgres@rac1 ~]$ rm -rf /pgWal/pgsql17/wal/*
[postgres@rac1 ~]$ rm -rf /pgArch/pgsql17/arch/*
[postgres@rac1 ~]$
8.3 Perform Backup from Primary to Rebuild rac1 as Standby
[postgres@rac1 ~]$ pg_basebackup -h 192.168.2.22 -U repuser -p 5432 -D /pgData/pgsql17/data -Fp -Xs -P -R -C -S rac1_standby_slot
Password:
24643/24643 kB (100%), 1/1 tablespace
[postgres@rac1 ~]$
8.4 Verify postgresql.auto.conf and standby.signal
[postgres@rac1 ~]$ ls -ltr /pgData/pgsql17/data/standby.signal
-rw-------. 1 postgres postgres 0 Oct 17 14:49 /pgData/pgsql17/data/standby.signal
[postgres@rac1 ~]$
[postgres@rac1 ~]$ cat /pgData/pgsql17/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=repuser password=repuser channel_binding=prefer host=192.168.2.22 port=5432 sslmode=prefer sslnegotiation=postgres sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'rac1_standby_slot'
[postgres@rac1 ~]$
8.5 Start PostgreSQL on rac1
[root@rac1 ~]# systemctl start postgresql-17.service
[root@rac1 ~]#
[postgres@rac1 ~]$ cat /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 14:54:36.566 EDT [16742] LOG: starting PostgreSQL 17.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 64-bit
2025-10-17 14:54:36.566 EDT [16742] LOG: listening on IPv4 address "0.0.0.0", port 5432
2025-10-17 14:54:36.573 EDT [16742] LOG: listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
2025-10-17 14:54:36.586 EDT [16742] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-10-17 14:54:36.596 EDT [16748] LOG: database system was interrupted; last known up at 2025-10-17 14:49:32 EDT
2025-10-17 14:54:38.095 EDT [16748] LOG: starting backup recovery with redo LSN 0/19000028, checkpoint LSN 0/19000080, on timeline ID 2
2025-10-17 14:54:38.095 EDT [16748] LOG: entering standby mode
2025-10-17 14:54:38.106 EDT [16748] LOG: redo starts at 0/19000028
2025-10-17 14:54:38.111 EDT [16748] LOG: completed backup recovery with redo LSN 0/19000028 and end LSN 0/19000120
2025-10-17 14:54:38.111 EDT [16748] LOG: consistent recovery state reached at 0/19000120
2025-10-17 14:54:38.112 EDT [16742] LOG: database system is ready to accept read-only connections
2025-10-17 14:54:38.181 EDT [16749] LOG: started streaming WAL from primary at 0/1A000000 on timeline 2
[postgres@rac1 ~]$
*** No errors reported in the logs.
9. Replication Verification – Change 3
9.1 Load Test Data into Table on Primary (rac2)
# Verify lag from Primary node
[postgres@rac2 ~]$ psql -U postgres -c "
> SELECT
> application_name AS standby_name,
> client_addr AS standby_ip,
> state,
> sync_state,
> pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS byte_lag
> FROM pg_stat_replication;
> "
standby_name | standby_ip | state | sync_state | byte_lag
--------------+--------------+-----------+------------+----------
walreceiver | 192.168.2.21 | streaming | async | 0
walreceiver | 192.168.3.20 | streaming | async | 0
(2 rows)
[postgres@rac2 ~]$
[postgres@rac2 ~]$ psql
psql (17.6)
Type "help" for help.
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 13145
usesysid | 16452
usename | repuser
application_name | walreceiver
client_addr | 192.168.2.21
client_hostname |
client_port | 40334
backend_start | 2025-10-17 14:54:38.15459-04
backend_xmin |
state | streaming
sent_lsn | 0/1A000430
write_lsn | 0/1A000430
flush_lsn | 0/1A000430
replay_lsn | 0/1A000430
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2025-10-17 17:05:39.286255-04
-[ RECORD 2 ]----+------------------------------
pid | 12747
usesysid | 16452
usename | repuser
application_name | walreceiver
client_addr | 192.168.3.20
client_hostname |
client_port | 36018
backend_start | 2025-10-17 14:16:07.947815-04
backend_xmin |
state | streaming
sent_lsn | 0/1A000430
write_lsn | 0/1A000430
flush_lsn | 0/1A000430
replay_lsn | 0/1A000430
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2025-10-17 17:05:39.320194-04
postgres=#
# Verify Replication slots
[postgres@rac2 ~]$ psql -U postgres -c "SELECT slot_name, slot_type, database, active FROM pg_replication_slots;"
slot_name | slot_type | database | active
-------------------+-----------+----------+--------
rac3_standby_slot | physical | | t
rac1_standby_slot | physical | | t
(2 rows)
[postgres@rac2 ~]$
[postgres@rac2 ~]$ psql -U postgres -c "SELECT * FROM EMP;"
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE
(4 rows)
[postgres@rac2 ~]$ psql -U postgres -c "INSERT INTO emp VALUES ('DEV', 'DBA MANAGER', 'RCM', 'iGATE');"
INSERT 0 1
[postgres@rac2 ~]$
[postgres@rac2 ~]$ psql -U postgres -c "SELECT * FROM EMP;"
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE
DEV | DBA MANAGER | RCM | iGATE <---
(5 rows)
[postgres@rac2 ~]$
9.2 Monitor Replication on Standby 1 (rac1)
[postgres@rac1 ~]$ psql -U postgres -c "SELECT slot_name, slot_type, database, active FROM pg_replication_slots;"
slot_name | slot_type | database | active
-----------+-----------+----------+--------
(0 rows)
[postgres@rac1 ~]$
[postgres@rac1 ~]$ psql -U postgres -c "SELECT * FROM EMP;"
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE
DEV | DBA MANAGER | RCM | iGATE <---
(5 rows)
[postgres@rac1 ~]$
9.3 Monitor Replication on Standby 2 (rac3)
[postgres@rac3 ~]$ psql -U postgres -c "SELECT slot_name, slot_type, database, active FROM pg_replication_slots;"
slot_name | slot_type | database | active
-----------+-----------+----------+--------
(0 rows)
[postgres@rac3 ~]$
[postgres@rac3 ~]$ psql -U postgres -c "SELECT * FROM EMP;"
name | designation | project | company
------+-------------+---------+---------
Sugi | DBA | Jetstar | iGATE
Teja | DBA | RCM | iGATE
RAJ | DBA | RCM | iGATE
KK | DBA LEAD | RCM | iGATE
DEV | DBA MANAGER | RCM | iGATE <---
(5 rows)
[postgres@rac3 ~]$
10. Log Verification
10.1 Check Logs on Primary
[postgres@rac2 ~]$ tail -100 /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 14:54:32.813 EDT [9068] LOG: checkpoint starting: time
2025-10-17 14:54:32.860 EDT [9068] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.048 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16384 kB, estimate=24261 kB; lsn=0/1A0000B8, redo lsn=0/1A000060
2025-10-17 15:04:33.059 EDT [9068] LOG: checkpoint starting: time
2025-10-17 15:04:33.219 EDT [9068] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 2 recycled; write=0.101 s, sync=0.007 s, total=0.160 s; sync files=2, longest=0.005 s, average=0.004 s; distance=0 kB, estimate=21835 kB; lsn=0/1A000380, redo lsn=0/1A000328
[postgres@rac2 ~]$
*** No errors reported in the logs.
10.2 Check Logs on Standby 1 (rac1)
[postgres@rac1 ~]$ tail -100 /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
2025-10-17 15:04:36.818 EDT [16746] LOG: restartpoint starting: time
2025-10-17 15:04:36.946 EDT [16746] LOG: restartpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.107 s, sync=0.007 s, total=0.129 s; sync files=2, longest=0.005 s, average=0.004 s; distance=0 kB, estimate=14745 kB; lsn=0/1A000380, redo lsn=0/1A000328
2025-10-17 15:04:36.946 EDT [16746] LOG: recovery restart point at 0/1A000328
2025-10-17 15:04:36.946 EDT [16746] DETAIL: Last completed transaction was at log time 2025-10-17 15:03:38.479334-04.
[postgres@rac1 ~]$
*** No errors reported in the logs.
10.3 Check Logs on Standby 2 (rac3)
[postgres@rac3 ~]$ tail -100 /pgData/pgsql17/data/log/postgresql-Fri.log
..
..
..
2025-10-17 15:04:38.825 EDT [12847] LOG: restartpoint starting: time
2025-10-17 15:04:38.950 EDT [12847] LOG: restartpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.102 s, sync=0.008 s, total=0.125 s; sync files=2, longest=0.005 s, average=0.004 s; distance=0 kB, estimate=14745 kB; lsn=0/1A000380, redo lsn=0/1A000328
2025-10-17 15:04:38.950 EDT [12847] LOG: recovery restart point at 0/1A000328
2025-10-17 15:04:38.950 EDT [12847] DETAIL: Last completed transaction was at log time 2025-10-17 15:03:38.479334-04.
[postgres@rac3 ~]$
*** No errors reported in the logs.
Caution: Your use of any information or materials on this website is entirely at your own risk. It is provided for educational purposes only. It has been tested internally, however, we do not guarantee that it will work for you. Ensure that you run it in your test environment before using.
Thank you,
Rajasekhar Amudala
Email: br8dba@gmail.com
Linkedin: https://www.linkedin.com/in/rajasekhar-amudala/