Disk Space Requirements for Vertica

In addition to actual data stored in the database, Vertica requires disk space for several data reorganization operations, such as mergeout and managing nodes in the cluster. For best results, Vertica recommends that disk utilization per node be no more than sixty percent (60%) for a K-Safe=1 database to allow such operations to proceed.
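As a rough illustration of the 60% guideline, the following Python sketch checks the utilization of the file system that holds a given path. It uses only the standard library; the function names and the idea of checking at the operating-system level are assumptions made for this example and are not part of Vertica.

```python
import shutil

# The 60% figure comes from the recommendation above for a K-Safe=1 database.
RECOMMENDED_MAX_UTILIZATION = 0.60


def utilization(total_bytes: int, used_bytes: int) -> float:
    """Return the used fraction of a disk as a value between 0 and 1."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def exceeds_recommendation(path: str,
                           limit: float = RECOMMENDED_MAX_UTILIZATION) -> bool:
    """Return True if the file system holding `path` is fuller than `limit`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return utilization(usage.total, usage.used) > limit
```

For example, calling exceeds_recommendation on a node's data directory (the actual path depends on your installation) returns True when that disk is more than 60% full.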

In addition, certain query execution operators, such as hash joins and sorts, temporarily require disk space when they cannot be completed in memory (RAM). Such operators might be encountered during queries, recovery, refreshing projections, and so on. The amount of disk space needed (known as temp space) depends on the nature of the queries, the amount of data on the node, and the number of concurrent users on the system. By default, any unused disk space on the data disk can be used as temp space. However, Vertica recommends provisioning temp space separate from data disk space.

Managing Disk Space

Vertica detects and reports low disk space conditions in the log file so you can address the issue before serious problems occur. It also detects and reports low disk space conditions via SNMP traps if enabled.

Critical disk space issues are reported sooner than other issues. For example, running out of catalog space is fatal; therefore, Vertica reports the condition earlier than less critical conditions. To avoid database corruption when available disk space falls below a certain threshold, Vertica begins to reject transactions that update the catalog or data.

A low disk space report indicates one or more hosts are running low on disk space or have a failing disk. It is imperative to add more disk space (or replace a failing disk) as soon as possible.

When Vertica reports a low disk space condition, use the DISK_RESOURCE_REJECTIONS system table to determine the types of disk space requests that are being rejected and the hosts on which they are being rejected.
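The following Python sketch shows one way to summarize rows from DISK_RESOURCE_REJECTIONS by host and request type. The query text and its column names (node_name, resource_type, rejected_reason, rejected_count) are assumptions based on the usual layout of this table; verify them against the system table reference for your Vertica version. The helper function is hypothetical and expects rows that have already been fetched with whatever client you use.

```python
from collections import defaultdict

# Assumed column names; confirm them in your version's system table reference.
REJECTIONS_QUERY = """
SELECT node_name, resource_type, rejected_reason, rejected_count
FROM disk_resource_rejections
ORDER BY rejected_count DESC;
"""


def rejections_by_node(rows):
    """Total the rejected request counts per node and per resource type.

    `rows` is an iterable of (node_name, resource_type, rejected_reason,
    rejected_count) tuples, in the column order of REJECTIONS_QUERY.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for node_name, resource_type, _reason, count in rows:
        totals[node_name][resource_type] += count
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {node: dict(types) for node, types in totals.items()}
```

The resulting dictionary maps each host to the request types being rejected there, which makes it easier to see which node needs more disk space first.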

To add disk space, see Adding Disk Space to a Node. To replace a failed disk, see Replacing Failed Disks.

Monitoring Disk Space Usage

You can use these system tables to monitor disk space usage on your cluster:

System table / Description
DISK_STORAGE / Monitors the amount of disk storage used by the database on each node.
COLUMN_STORAGE / Monitors the amount of disk storage used by each column of each projection on each node.
PROJECTION_STORAGE / Monitors the amount of disk storage used by each projection on each node.
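As a sketch of how these tables might be used, the Python example below ranks projections by the total disk space they occupy across all nodes, using rows from PROJECTION_STORAGE. The column names in the query (node_name, projection_schema, projection_name, used_bytes) are assumptions to check against your version's system table reference, and the helper function is hypothetical.

```python
# Assumed column names; confirm them in your version's system table reference.
PROJECTION_STORAGE_QUERY = """
SELECT node_name, projection_schema, projection_name, used_bytes
FROM projection_storage;
"""


def largest_projections(rows, top_n=5):
    """Return the `top_n` projections by total bytes used across all nodes.

    `rows` is an iterable of (node_name, projection_schema, projection_name,
    used_bytes) tuples, in the column order of PROJECTION_STORAGE_QUERY.
    """
    totals = {}
    for _node, schema, name, used_bytes in rows:
        key = f"{schema}.{name}"
        totals[key] = totals.get(key, 0) + used_bytes
    # Sort by total size, largest first, and keep the first top_n entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top_n]
```

Summing per projection rather than per node highlights which projections are the best candidates for review when disk space is tight.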