ISSUE TYPE
- Improvement Request
COMPONENT NAME
- Storage
CLOUDSTACK VERSION
CONFIGURATION
- CloudStack with KVM
- Ceph (RBD) Primary Storage
OS / ENVIRONMENT
N/A
SUMMARY
CloudStack relies heavily on Libvirt's storage pool support for reporting capacity and allocation.
Libvirt reports the raw capacity of a Ceph cluster, for example:
root@hv-138-b15-27:~# virsh pool-info 27f662fa-6c7d-35fe-84d7-69b044600ffe
Name:           27f662fa-6c7d-35fe-84d7-69b044600ffe
UUID:           27f662fa-6c7d-35fe-84d7-69b044600ffe
State:          running
Persistent:     no
Autostart:      no
Capacity:       1015.04 TiB
Allocation:     668.20 MiB
Available:      547.42 TiB
root@hv-138-b15-27:~#
In this case the Ceph cluster has almost 1 PB of RAW capacity, and this is what Libvirt reports.
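The same values can also be read programmatically. A minimal sketch using the libvirt Python bindings (the UUID is the pool from the output above):

import libvirt

# Connect read-only to the local libvirt daemon.
conn = libvirt.openReadOnly("qemu:///system")

# Look up the RBD-backed pool by the UUID from the virsh output above.
pool = conn.storagePoolLookupByUUIDString("27f662fa-6c7d-35fe-84d7-69b044600ffe")

# info() returns [state, capacity, allocation, available], sizes in bytes.
state, capacity, allocation, available = pool.info()

TiB = 1024 ** 4
print(f"Capacity:   {capacity / TiB:.2f} TiB")    # raw cluster capacity
print(f"Allocation: {allocation / TiB:.2f} TiB")
print(f"Available:  {available / TiB:.2f} TiB")

conn.close()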
However, due to the 3x replication configured inside Ceph, storing 1 TB of data consumes 3 TB of RAW storage.
CloudStack bases its calculation on the volumes ALLOCATED on the Ceph cluster, so let's assume we have a couple of volumes on this pool:
- 10x 1TB
- 20x 2TB
- 100x 500GB
In total we have allocated (10 × 1 TB) + (20 × 2 TB) + (100 × 500 GB) = 100 TB of storage.
However, because we replicate 3x, we will use 300TB of RAW capacity.
CloudStack now thinks that 10% is in use, but in reality we use 30% of the cluster.
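To make the mismatch concrete, here is the same calculation as a small Python sketch (volume counts and sizes from the list above):

# Volumes allocated on the pool: (count, size in TB)
volumes = [(10, 1.0), (20, 2.0), (100, 0.5)]
replication = 3          # Ceph 3x replication
raw_capacity_tb = 1000   # ~1 PB of raw cluster capacity

allocated_tb = sum(count * size for count, size in volumes)  # 100 TB
raw_used_tb = allocated_tb * replication                     # 300 TB

print(f"CloudStack's view: {allocated_tb / raw_capacity_tb:.0%} in use")  # 10%
print(f"Ceph's view:       {raw_used_tb / raw_capacity_tb:.0%} in use")   # 30%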
The disablethreshold of a storage pool can only be defined cluster-wide or zone-wide, but this does not work if you mix Ceph and NFS within the same cluster or zone.
For example, you would want to set a lower threshold for the Ceph pool (to absorb the 3x replication overhead) than for the NFS pool, but a single cluster- or zone-wide value applies to both.
IDEAS
I suggest two possible changes:
storage.overprovisioning.factor
This value is currently only allowed to be set to a value >1. If we allowed a value such as 0.3, we could underprovision the pool to compensate for the replication overhead.
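A minimal sketch of the intended effect (illustrative Python, not CloudStack's actual code; the function name is made up):

def effective_capacity(raw_capacity_tb, overprovisioning_factor):
    # CloudStack scales a pool's reported capacity by the
    # overprovisioning factor; a factor below 1 would shrink it,
    # absorbing the replication overhead libvirt does not show.
    return raw_capacity_tb * overprovisioning_factor

raw = 1000  # ~1 PB of raw Ceph capacity
print(effective_capacity(raw, 2.0))  # allowed today: 2000 TB, overprovisioned
print(effective_capacity(raw, 0.3))  # proposed: 300 TB, roughly raw / 3
                                     # to match the 3x replication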
pool.storage.allocated.capacity.disablethreshold
We should allow this setting to be configured per storage pool.
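A sketch of how such a per-pool override could resolve, falling back to the wider scopes (hypothetical lookup logic, not the current CloudStack implementation; the 0.85 default is assumed):

KEY = "pool.storage.allocated.capacity.disablethreshold"

def disable_threshold(pool, cluster, zone, global_default=0.85):
    # Prefer the most specific scope: pool > cluster > zone > global.
    for settings in (pool, cluster, zone):
        if KEY in settings:
            return settings[KEY]
    return global_default

# Mixed-storage example: a stricter threshold on the Ceph pool than on NFS.
ceph_pool = {KEY: 0.30}  # stop allocating at 30% allocated (~90% RAW with 3x)
nfs_pool = {}            # no override: falls back to the global default

print(disable_threshold(ceph_pool, {}, {}))  # 0.3
print(disable_threshold(nfs_pool, {}, {}))   # 0.85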