What is a PostgreSQL Replication Slot?
A replication slot in PostgreSQL is a mechanism that ensures the WAL (Write-Ahead Log) files needed by a replica (subscriber or standby server) are retained on the primary server until they are no longer needed. It essentially tells PostgreSQL.Do not delete WAL data until all connected replicas have processed it.
Why Do We Use Replication Slots?
- Reliable Streaming Replication: They prevent WAL segments from being deleted before a replica has received them. This ensures data consistency and avoids data loss during replication.
- Support for Logical Replication: In logical replication (e.g., streaming data to a different schema or database), slots track the replication progress.
- Resilient Replica Recovery: If a replica is temporarily down, WAL files are preserved until the replica reconnects and catches up.
1. Physical Replication Slot:
* Used in streaming replication.
* Tied to the physical binary structure of the database.
2. Logical Replication Slot:
* Used in logical replication (e.g., pub/sub model).
* Works at the level of SQL changes (INSERT/UPDATE/DELETE).
Privileges Needed for Replication Slots
To create or drop replication slots, a user needs:
* Superuser privileges OR
* The REPLICATION role attribute.
Required privileges for:
* Creating a replication slot: pg_create_physical_replication_slot() or pg_create_logical_replication_slot()
* Dropping a slot: pg_drop_replication_slot()
Common Errors and Resolutions:
ERROR: replication slot <name> is already active
Cause: The Slot is already used by another process
Resolution: Disconnect the existing replication client or use a different slot
FATAL: could not receive data from WAL stream
ERROR: requested WAL segment has already been removed
Cause: WAL files were deleted before the replica could read them Resolution: Recreate the replica from a fresh base backup or use WAL archiving
ERROR: could not write to file "pg_wal/...": No space left on device Cause: WALs are retained because the slot is not being consumed
Resolution: Monitor replication lag, drop unused slots, or fix replica
ERROR: replication slot "<slot>" does not exist Cause: Trying to access a non-existent slot
Resolution: Verify the slot name or recreate the replication slot
ERROR: Logical replication slot <name> inactive for too long Cause: The subscriber hasn't connected in a while
Resolution: Investigate the subscriber or consider dropping the slot
Managing Replication Slots:
Create Physical Slot:
SELECT * FROM pg_create_physical_replication_slot('my_physical_slot');
Create Logical Slot:
SELECT * FROM pg_create_logical_replication_slot('my_logical_slot', 'pgoutput');
List All Slots:
SELECT * FROM pg_replication_slots;
Drop a Slot:
SELECT pg_drop_replication_slot('my_slot');
Best Practices:
- Monitor Slot Activity: Use pg_replication_slots to ensure slots are being consumed.
- Avoid Orphan Slots: Unused slots can fill the disk by retaining WALs. Drop them if not needed.
- Configure Monitoring/Alerts: Watch for lagging replicas or inactive slots.
- Use Logical Decoding Carefully: Logical slots can grow WALs fast if the subscriber is slow.
A diagram showing how replication slots work in a streaming or logical setup: