Bright DBA

ORACLE DBA – Tips and Techniques

Skip to content
  • Home
  • DBA Monitor
    • DBA Sheet v7.0
    • OS-DB-Commands
    • Database-Scripts
    • Generate INSERT statements from the SELECT query
    • Generate INSERT statements to a flat file using a PL/SQL query
  • Oracle
    • DB-Install
      • Install Oracle AI Database 26ai
      • Install 19c GUI
      • Install 19c in silent mode
      • Install 18c
      • Install 12.2 Database Binaries in silent mode
      • Install Oracle Software in silent mode 11gR2
      • Install 10g
      • Deinstall Oracle 11gR2 Database binaries
      • Deinstall Oracle 18c Database Binaries
      • Oracle 19c Deinstall
      • Uninstall Oracle Client on Windows
    • DB-Upgrade
      • Upgrade DB using DBUA from 11.2.0.4 to 12.2.0.1
      • Upgrade DB Manually 11.2.0.4 to 12.2.0.1
      • Upgrade Oracle Database from 12.2.0.1 to 19c using DBUA
      • Upgrade Oracle Database Manually from 12.2.0.1 to 19c
      • Rolling Upgrade With an Existing Physical Standby Database
      • Upgrade TIMEZONE Version
      • Upgrade Database to 12.2 with Physical Standby
    • Patching
      • Patch Terminology
      • Apply Patch RU on Database 19c
      • 18c DB Patching
      • 18c DB Patching – Youtube
      • Apply RU on DATABASE 12.2 (CDB/PDB)
      • ROLLBACK RU from DATABASE 12.2
      • Apply DB PSU on Standby
    • DB-Admin
      • Oracle Container Database (CDB)
        • CREATE NEW PDB BY CLONING AN EXISTING PDB
        • CREATE PDB USING DBCA
        • CREATE PDB USING SEED
        • DROP PDB MANUALLY
        • STARTUP/SHUTDOWN CDB AND PDB
        • Convert Non-CDB to PDB with DBMS_PDB
        • SCHEMA REFRESH FROM 11G TO 12C PDB
      • Non-CDB
        • Store DB credentials in Oracle Wallet
        • Create database -DBCA silent mode
        • Drop Database
        • Delete database -DBCA silent mode
        • DBT-06103 The port (1,522) is already used – DBCA
        • Create DB 19C – DBCA-GUI
        • Create DB 19C – DBCA-SILENT
        • CREATE NON-CDB ON ORACLE 12C IN SILENT MODE
        • Configure OEM Express 12c
        • Create Non-CDB on Oracle 12c using DBCA (GUI)
        • How to Change MAX_STRING_SIZE on Physical Standby Environment
        • How to change SQL  prompt to show connected user and database name
        • Enable Archivelog
        • Change Archive Dest
        • How to Recover Loss of DATA – (Without a Backup!)
        • Configure OEM DBConsole manually
        • Proxy User
        • Database Link
        • DBMS_SCHEDULER
        • Enable/Disable JOBS during Maintenance
        • ORA-00054
        • Blocking Sessions
        • Create Service using DBMS_SERVICE
        • Error TNS-12543
      • Under Edit
    • BACKUP
      • How to Recover Loss of DATA – (Without a Backup!)
      • BACKUP ORACLE HOME AND INVENTORY
      • SCHEMA REFRESH FROM 11G TO 12C PDB
      • RMAN-CATALOG
      • RMAN Database Restore RAC – RAC
      • RMAN Database Restore ASM – ASM
      • RMAN Database Restore from ASM to File System
      • RMAN ACTIVE DUPLICATION ASM TO ASM
      • Backup Based RMAN Duplicate Database
  • ExaDB
    • Oracle Exadata – Overview
    • Exadata DB Server (Compute Node)
    • Exadata Storage Server (Cell Node)
    • ExaData infi Network
    • Exadata Part Replacement Runbook
    • CellCLI Commands
  • P.Tuning
    • Tuning Stage 1
      • Statistics
      • Oracle Histograms
      • Export/Import schema statistics
      • CONTROL MANAGEMENT PACK ACCESS
      • Generate Explain Plan
      • SQL Query is really hanged or not
    • Tuning Stage 2
      • STATSPACK
      • ASH
      • AWR
      • Colored SQL_ID
      • SYSAUX Tablespace
    • Tuning Stage 3
      • Troubleshooting Long Running Queries
      • SQL Tuning Advisor
      • SQLT
        • How to Install SQLT
        • How to Run SQLT
        • How to Create Custom SQL Profile
        • Uninstall SQLT
      • Tracing
      • TRCA – Trace Analyzer
    • FF Analysis (Advanced Troubleshooting)
      • Fire-Fighting Analysis (Ongoing / Live Issues)
        • Explain Plan Analysis
        • Top SQL Analysis
        • 10046 Trace Analysis
        • 10053 Trace Analysis
      • Forensic Analysis (RCA – Completed Issues)
  • HA
    • CONFIGURATION
      • Active Dataguard on Non-ASM
      • Active Dataguard on ASM
      • Convert Physical Standby Database to Snapshot Standby Database
      • Convert Snapshot Standby database to Physical Standby database
      • Configure Cascading Standby Database
      • DG Broker Configuration
      • DG Broker Error – ORA-16714
      • Disable DG Broker
      • 12.2 Active Dataguard in CDB on Non-ASM
      • Create Physical Standby using RMAN Backup Without Duplicate Command
      • Create Physical Standby using RMAN Backup with Duplicate Command
      • 12c DG Broker Configuration
    • SWITCHOVER
      • Switchover Manual No-Broker
      • Switchover with Broker
    • FAILOVER
      • Failover with Broker(No Flashback)
      • Failover with No Broker (No Flashback)
      • Failover with Broker using Flashback
      • Failover with No Broker using Flashback
    • Enable Real-Time Apply
    • Refresh Standby Database using RMAN Incremental SCN Backup
    • Resetlogs on Primary where Standby in place
    • Drop SRL
  • RAC
    • Create users, groups and Paths for Oracle RAC
    • Configure ASMLib for Oracle ASM
    • Configure UDEV Rules for Oracle ASM
    • ASM
      • Move ASM Spfile to diff. DiskGroup
      • ASM Commands
      • Instantiating disk: failed
      • CSSD won’t start automatically
    • RAC Install
      • RAC Install 11.2.0.3
      • Upgrade GI to 11.2.0.4
      • Downgrade GI to 11.2.0.3
      • root.sh failed with ORA-29783 on RAC
      • Delete Node
      • Add Node
      • Delete Node without remove software
      • Add Node Back which was DELETED without remove software
    • RAC Admin
      • How to take OCR backup on 11.2.0.4
      • Cluster Name
      • Move/Relocate OCR
      • Restore loss of all VOTE disk
      • Cluster Startup issues
      • Add SCAN
    • RAC Tools
      • CLUVFY
      • OSWatcher
    • RAC Standby 12.2
    • Create ACFS File System on RAC
  • GG
    • GoldenGate Installation
    • Unidirectional DML Replication
    • Zero Downtime Migration
    • DDL Replication
    • GoldenGate 12C Installation
    • Integrated Capture
    • Oracle GoldenGate Heterogeneous Replication
    • Credential Store
    • ADD NEW TABLE TO AN EXISTING GOLDENGATE CONFIGURATION
    • Install Oracle GoldenGate 19C for Database 12C
    • Install Oracle GoldenGate 19C for Database 19C
  • OEM
    • Install OEM 13.2
    • Install OEM 13.3
    • Add Targets Manually on EM Cloud Control 13c
    • Upgrade OMSPatcher
    • Apply Patch on OMS 13.1
    • Apply Patch on Agent 13.1
  • MS-SQL
    • Installation & Basics
      • SQL Server Basics
      • Install Microsoft SQL Server Express 2014
      • Install Only SQL Server Management Studio
      • Create Database using GUI
      • Enable TCP/IP Connection
      • Allow Remote Connection
    • Security & User Management
      • Authentication Method
      • Create a new domain user account in Windows
      • Create Windows Authentication Login
      • Create SQL Server Authenticated Login in SQL Server
      • Create Windows Group Authentication Login in SQL Server
    • Backup & Recovery
      • SQL Server – Recovery Models
      • Database Backup-FULL-DIFFERENTIAL-TLOG
      • Restore Database without Tail-log Backup
      • Restore Database with Tail-log Backup
    • High Availability
      • Log Shipping (SQL Server 2022)
      • Windows Server 2022 Failover Cluster
      • Setup 4 Node Always On Availability Group
      • Who is NT AUTHORITY\SYSTEM?
      • Add Database to Existing AG using Backup and Restore
      • Manual Failover (Planned)
      • Automatic Failover (Unplanned)
  • MySQL
    • Install MySQL 8 on Linux 7 Using RPM Packages
    • Create MySQL Database
    • Create New User Account in MySQL
    • Grant Privileges in MySQL
    • Create Roles in MySQL
    • How To Backup MySQL Database Using mysqldump
    • How to Restore Database from Single MySQL Database Backup
    • How to Restore A Single Database From A Backup File Which Contains Multiple Databases
    • How To Restore Single Table From Full Database Backup in MySQL
    • How To Configure MySQL Master-Slave Replication?
  • PostgreSQL
    • Installation & Basics
      • Install PostgreSQL 15 using yum Repository on Linux
      • Install PostgreSQL Using Source Code
      • How to START/STOP PostgreSQL 13 on Linux
      • Create Database in PostgreSQL
    • Security & User Management
      • PostgreSQL User Management
      • pg_hba conf
    • Storage & Architecture
      • Change Data Directory
      • WAL Files
      • Change pg_wal path
      • PostgreSQL Tablespace
    • Backup & Recovery
      • Enable Archive Mode
      • Disable ARCHIVELOG
      • Logical – pg_dump and pg_restore (Backup & Restore)
      • Logical – pg_dumpall (Backup & Restore)
      • Physical – pg_basebackup (Backup and Restore)
      • Backup & Restore to Another Host (No Archive Mode)
      • Backup & Restore on Same Host
      • Restore PostgreSQL to New Host using pg_basebackup + WAL Archives
      • PostgreSQL PITR – Point in Time Recovery
    • High Availability
      • Streaming Replication
      • Manual Failover – Streaming Replication
      • Synchronous Replication
      • PostgreSQL HA Setup with REPMGR + Automatic Failover
    • Postgres DBA Scripts
  • DB2
    • Install DB2 V10.5 using GUI
    • DB2 STOP and START
    • DB2 Troubleshooting Guide
  • DevOps
    • How to Install Jenkins on Linux

Always On Availability Group

Setup (4-Node) SQL Server Always On Availability Group (2022)

How to configure a 4-node SQL Server 2022 Always On Availability Group using an existing Windows Server Failover Cluster (WSFC).

Table of Contents


1. Prerequisites (All nodes)
2. Enable Always On Availability Groups (All Nodes)
3. Set Recovery Model to Full & Take Full Backup of Database) – (Primary Node)
4. Create Availability Group (Primary Node)
5. Create HADR Endpoints (All Nodes)
6. Join Secondary Replicas (All secondary)
7. Grant CREATE ANY DATABASE (All Nodes) – if only (SEEDING_MODE = AUTOMATIC)
8. Add Database to Availability Group (Primary Node)
9. Verify
10. Health Monitoring


0. Environment

ItemHostnamePublic IPSQL Replication IP
DC / Witness ServerDC01192.168.2.20NA
DB Node 1NTICBPSQLSGV01192.168.2.2110.10.10.11
DB Node 2NTICBPSQLSGV02192.168.2.2210.10.10.12
DB Node 3NTICBPSQLSGV03192.168.2.2310.10.10.13
DB Node 4NTICBPSQLSGV04192.168.2.2410.10.10.14
WSFC Cluster NameNTICBPSQLSGC05192.168.2.25NA
AG Listener NameNTICBPSQLSGL05192.168.2.26NA
Availability Group NameNTICBPSQLSGA05NANA

1. Prerequisites

  • Windows Server Failover Cluster (WSFC) already configured (4-node cluster)
  • SQL Server 2022 Enterprise Edition installed on all nodes
  • Databases must use FULL recovery model
  • Full backup taken before adding to AG
  • All nodes in the same Active Directory domain
  • Proper DNS resolution between nodes
  • Dedicated domain service account MSAPICBSGSQL for SQL Server (same on all nodes)
  • Same Directory Path on ALL SQL Nodes (Mandatory for AG)

Windows Server Failover Clustering (WSFC) is a Windows feature that groups multiple servers (nodes) into a cluster. It provides:

  • A shared cluster database (stored in the quorum) that holds metadata about all cluster resources
  • A heartbeat mechanism between nodes to detect failures
  • Resource ownership — any clustered resource (like an AG) is “owned” by exactly one node at a time

2. Enable Always On Availability Groups

Each SQL Server instance on each node must have Always On Availability Groups enabled. This is done via SQL Server Configuration Manager.

It tells the SQL Server engine to register itself with the local WSFC service and activate the HADR (High Availability Disaster Recovery) subsystem.

Run on each SQL Server node:












3. Set Recovery Model to Full & Take Full Backup of Database) – (On PRIMARY )

  • Each availability database must use the full recovery model , In Full recovery model transaction log records are never automatically truncated.
  • Backup database at least once, so the LSN chain is established.
  • Restored on secondary replicas with WITH NORECOVERY (leaving them in a restoring state, ready to receive log stream data)

3.1 Set recovery model to FULL

CREATE DATABASE TEST;

ALTER DATABASE TEST SET RECOVERY FULL;
GO

3.2 Take a full backup

BACKUP DATABASE [TEST] TO DISK = 'K:\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\Backup\TEST.bak'; 
GO


4. Create Availability Group (Primary Node)

When a DBA runs CREATE AVAILABILITY GROUP, several things happen simultaneously inside SQL Server and WSFC:

Inside SQL Server:

  • A new AG object is created in system metadata (sys.availability_groups)
  • Replica endpoints (database mirroring endpoints, typically on port 5022) are configured to handle the log stream communication between primary and secondaries.
  • The primary replica begins tracking the synchronization state of each database for each secondary.

Inside WSFC:

  • SQL Server calls Windows Cluster APIs to register the AG as a cluster resource
  • This registration stores the AG name, its owning node, and its health-check configuration into the WSFC cluster database
  • From WSFC’s perspective, the AG is now just another “resource” it needs to monitor and potentially fail over — it doesn’t understand SQL internals, it just knows “this resource is healthy or it isn’t”

The key insight here is the separation of concerns: SQL Server owns the data and synchronization logic; WSFC owns the failover orchestration. They communicate through a defined interface.

Two distinct resources get registered with WSFC:

AG Resource:

  • Represents the Availability Group itself
  • WSFC tracks which node currently owns it (i.e., which node is primary)
  • Health is determined by querying SQL Server via the resource DLL

Listener Resource (Virtual Network Name + IP):

  • A Virtual Network Name (VNN) like AG-Listener that clients use to connect
  • A Virtual IP address that floats between nodes — it’s always active on whichever node is currently primary.
  • WSFC manages bringing this IP online/offline during failovers.

These two resources are grouped together in a cluster resource group, meaning WSFC always moves them together. You’d never want the listener pointing to Node A while the AG primary is on Node B.

Run on NTICBPSQLSGV01 (Primary only)

USE master;
GO

CREATE AVAILABILITY GROUP NTICBPSQLSGA05
	WITH (
		AUTOMATED_BACKUP_PREFERENCE = PRIMARY,
		FAILURE_CONDITION_LEVEL = 3,
		HEALTH_CHECK_TIMEOUT = 30000
		)
FOR
REPLICA ON
    'NTICBPSQLSGV01' WITH (
        ENDPOINT_URL = 'TCP://10.10.10.11:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        BACKUP_PRIORITY = 30,
        PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL),
        SESSION_TIMEOUT = 10,
        SEEDING_MODE = AUTOMATIC
    ),
    'NTICBPSQLSGV02' WITH (
        ENDPOINT_URL = 'TCP://10.10.10.12:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        BACKUP_PRIORITY = 30,
        PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL),
        SESSION_TIMEOUT = 10,
        SEEDING_MODE = AUTOMATIC
    ),
    'NTICBPSQLSGV03' WITH (
        ENDPOINT_URL = 'TCP://10.10.10.13:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        BACKUP_PRIORITY = 30,
        PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL),
        SESSION_TIMEOUT = 10,
        SEEDING_MODE = AUTOMATIC
    ),
    'NTICBPSQLSGV04' WITH (
        ENDPOINT_URL = 'TCP://10.10.10.14:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE = AUTOMATIC,
        BACKUP_PRIORITY = 30,
        PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL),
        SESSION_TIMEOUT = 10,
        SEEDING_MODE = AUTOMATIC
    )
	
LISTENER 'NTICBPSQLSGL05' ( WITH IP ( ('192.168.2.26','255.255.255.0')), PORT = 1433);
GO

IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'AlwaysOn_health')
BEGIN
    ALTER EVENT SESSION [AlwaysOn_health] ON SERVER WITH (STARTUP_STATE = ON);
END

IF NOT EXISTS (SELECT * FROM sys.dm_xe_sessions WHERE name = 'AlwaysOn_health')
BEGIN
    ALTER EVENT SESSION [AlwaysOn_health] ON SERVER STATE = START;
END
GO



5. Create HADR Endpoints (All Nodes)

Why HADR Endpoint Is Required ON ALL NODES
Primary to Secondary CommunicationTransaction Log TransportThe endpoint sends transaction log records from the primary replica to secondary replicas.
Replica SynchronizationData Movement ChannelAll Always On data synchronization occurs through this endpoint.
Failover CoordinationReplica CommunicationReplicas communicate health status and synchronization state using the endpoint.
Mandatory RequirementAll AG NodesEvery SQL Server instance participating in an Availability Group must have a HADR endpoint configured.

On NTICBPSQLSGV01:

USE [master]
GO
CREATE ENDPOINT [Hadr_endpoint]
	STATE = STARTED
	AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.10.10.11))
	FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE,ENCRYPTION = DISABLED)
GO

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
	ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END

GO

USE [master]
GO
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [SQL\MSAPICBSGSQL$];
GO

ALTER AUTHORIZATION ON ENDPOINT::[Hadr_endpoint] TO [sa];
GO

Repeat on other nodes using their respective IP addresses.

On NTICBPSQLSGV02:

USE [master] 
GO
CREATE ENDPOINT [Hadr_endpoint]
	STATE = STARTED
	AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.10.10.12))
	FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE,ENCRYPTION = DISABLED)
GO

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
	ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END

GO

USE [master]
GO
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [SQL\MSAPICBSGSQL$];
GO

ALTER AUTHORIZATION ON ENDPOINT::[Hadr_endpoint] TO [sa];
GO

On NTICBPSQLSGV03:

USE [master] 
GO
CREATE ENDPOINT [Hadr_endpoint]
	STATE = STARTED
	AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.10.10.13))
	FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE,ENCRYPTION = DISABLED)
GO

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
	ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END

GO

USE [master]
GO
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [SQL\MSAPICBSGSQL$];
GO

ALTER AUTHORIZATION ON ENDPOINT::[Hadr_endpoint] TO [sa];
GO

On NTICBPSQLSGV04:

USE [master] 
GO
CREATE ENDPOINT [Hadr_endpoint]
	STATE = STARTED
	AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.10.10.14))
	FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE,ENCRYPTION = DISABLED)
GO

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
	ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END
GO

USE [master]
GO
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [SQL\MSAPICBSGSQL$];
GO

ALTER AUTHORIZATION ON ENDPOINT::[Hadr_endpoint] TO [sa];
GO


6. Join Secondary Replicas

Internal Process After JOIN
Step 1Cluster VerificationSQL Server verifies that the instance belongs to the same WSFC cluster.
Step 2Replica AuthenticationReplica authentication occurs through the HADR endpoint.
Step 3Configuration SyncPrimary replica sends Availability Group configuration details to the secondary server.
Step 4Seeding PreparationThe secondary replica prepares to receive database files through automatic seeding.
Step 5Log SynchronizationOnce database copy exists, transaction log synchronization begins.
Why This Step Is Required
Replica ParticipationMandatory StepA server cannot participate in an Availability Group until it explicitly joins using this command.
Cluster RegistrationReplica MappingRegisters the SQL Server instance as a cluster-managed replica ( secondary replica within the Availability Group).
High AvailabilityFailover SupportOnly joined replicas can participate in failover and synchronization.

Run on NTICBPSQLSGV02, NTICBPSQLSGV03 and NTICBPSQLSGV04

USE master;
GO
ALTER AVAILABILITY GROUP [NTICBPSQLSGA05] JOIN;
GO

On NTICBPSQLSGV02

 

On NTICBPSQLSGV03 

 

On NTICBPSQLSGV04

7. Grant CREATE ANY DATABASE (Only if (SEEDING_MODE = AUTOMATIC), else not required, skip this step)

  • This permission is crucial for automatic seeding (introduced in SQL Server 2016).
  • With automatic seeding, SQL Server creates the database on secondary replicas by streaming data from the primary.
  • To do this, the AG itself must have the right to create databases on those replicas — hence the grant required.

Run on all nodes – NTICBPSQLSGV01, NTICBPSQLSGV02, NTICBPSQLSGV03 and  NTICBPSQLSGV04

USE master;
GO
ALTER AVAILABILITY GROUP [NTICBPSQLSGA05]
GRANT CREATE ANY DATABASE;
GO

On NTICBPSQLSGV01:



On NTICBPSQLSGV02:
  

On NTICBPSQLSGV03:

 

On NTICBPSQLSGV04:


8. Add Database to Availability Group

Run on Primary Node (NTICBPSQLSGV01) 

USE master;
GO
ALTER AVAILABILITY GROUP [NTICBPSQLSGA05]
ADD DATABASE [TEST];
GO



9. Verify

10. Health Monitoring

This is where WSFC and SQL Server interact continuously at runtime.

hadrres.dll — The Resource DLL: This is a critical component. WSFC doesn’t know how to talk to SQL Server natively. Instead, SQL Server ships a DLL (hadrres.dll) that WSFC loads into the cluster service process. This DLL acts as an interpreter — it knows how to connect to SQL Server, run health queries, and translate the results into signals WSFC understands (healthy / degraded / failed).

The health check loop works like this:

  1. WSFC’s cluster service calls into hadrres.dll on a regular interval
  2. hadrres.dll opens a connection to the local SQL Server instance using the NT AUTHORITY\SYSTEM is the Local System account, Windows Server Failover Clustering service (ClusSvc.exe) runs under this identity by default. This means when WSFC (via hadrres.dll) connects to SQL Server, it authenticates as NT AUTHORITY\SYSTEM using Windows Authentication, NT AUTHORITY\SYSTEM login has these privileges by default CONNECT SQL, VIEW SERVER STATE, ALTER ANY AVAILABILITY GROUP.
  3. It executes sp_server_diagnostics which returns a structured XML output covering:
    • System health — CPU, memory, I/O pressure
    • Resource health — memory availability
    • Query processing — scheduler health, worker threads
    • IO subsystem — disk response times
    • Events — recent critical errors
  4. The DLL evaluates the results against thresholds and reports back to WSFC

If the health check fails enough times (based on configured HealthCheckTimeout and FailureConditionLevel), WSFC declares the resource failed and initiates failover.

11. Tips

  • Transaction log management becomes mandatory — Must schedule regular transaction log backups (on the primary or allowed secondaries) to prevent the log from growing indefinitely and filling the disk.
  • No data loss in sync mode — this is only possible because the full log chain is always available.
  • Backups still needed — Availability Groups are not a replacement for backups. AGs protect against instance/hardware failure, but not user errors, corruption, or accidental deletes — continue taking full + log backups.

12. Conclusion

  • SQL Server owns data consistency, synchronization, and log streaming
  • WSFC owns node health, resource ownership, and failover orchestration (which means the automated configuration, coordination, and management of complex computer systems)
  • NT AUTHORITY\SYSTEM + scoped grants is the narrow, secure bridge between the two
  • The listener is the stable endpoint that hides all the complexity from applications

Each piece does exactly one job, and they communicate through well-defined, minimal interfaces. That’s what makes Always On both robust and secure.


Caution: Your use of any information or materials on this website is entirely at your own risk. It is provided for educational purposes only. It has been tested internally, however, we do not guarantee that it will work for you. Ensure that you run it in your test environment before using.

Thank you,
Rajasekhar Amudala
Email: br8dba@gmail.com
Linkedin: https://www.linkedin.com/in/rajasekhar-amudala/

Rajasekhar Amudala

   Rajasekhar Amudala

Master of Business Administration (MBA)
Oracle 10g Certified Professional (OCP)
Oracle 11g Certified Professional (OCP)
Oracle 10g RAC Certified Expert (OCE)
Oracle 11g RAC Certified Expert (OCE)
Oracle Golden Gate Certified
Oracle Exadata Certified
OCI Architect Professional
About me - Rajasekhar Amudala
 

Loading Comments...