Add dm-cache documentation for XOSTOR #295

Open. Wants to merge 1 commit into base: master
63 changes: 63 additions & 0 deletions docs/xostor/xostor.md
@@ -852,3 +852,66 @@ Note: iptables config must also be modified to remove LINSTOR port rules (edit `
For more info:
- https://linbit.com/blog/linstors-auto-evict/
- https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-linstor-auto-evict

### How to-enable dm-cache?
Contributor

How to enable dm-cache? (typo)

Should we also add a bit of context? What does the dm-cache do?


:::warning
This feature is currently experimental and not covered by [Vates Pro Support](https://vates.tech/pricing-and-support).
:::

On each host, create a new PV and VG using your cache devices:
Contributor

On each host, create a new PV and VG using the cache devices:

```
pvcreate <CACHE_DEVICES>
vgcreate linstor_group_cache <CACHE_DEVICES>
```

Then you can enable the cache with few commands using the linstor controller.
Contributor

Then you can enable the cache with a few commands using the linstor controller:


First, verify the group to modify, this last one must start with "xcp-sr-":
Contributor

  1. Verify the group to modify. It must start with "xcp-sr-":

```
linstor storage-pool list
```

Make sure the primary resource group is configured with cache support and enable the cache on the volume group:
Contributor

  1. Make sure the primary resource group is configured with cache support and enable the cache on the volume group:

```
linstor rg modify xcp-sr-linstor_group_thin_device --layer-list drbd,cache,storage
linstor vg set-property xcp-sr-linstor_group_thin_device 0 Cache/CachePool linstor_group_cache
```
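
As a rough illustration of the two commands above, the following dry-run sketch prints them for a given group and cache pool instead of executing them. `print_cache_setup` is a hypothetical helper, not a LINSTOR command, and the prefix check only mirrors the "xcp-sr-" naming rule stated earlier; it does not query the controller.

```shell
# Hypothetical dry-run helper: prints the commands, runs nothing.
# Usage: print_cache_setup GROUP CACHE_POOL
print_cache_setup() {
    group="$1"
    cache_pool="$2"
    # Assumption taken from the text: the group name starts with "xcp-sr-".
    case "$group" in
        xcp-sr-?*) ;;
        *) echo "error: group must start with xcp-sr-" >&2; return 1 ;;
    esac
    echo "linstor rg modify $group --layer-list drbd,cache,storage"
    echo "linstor vg set-property $group 0 Cache/CachePool $cache_pool"
}

# Prints the same two commands shown in the block above.
print_cache_setup "xcp-sr-linstor_group_thin_device" "linstor_group_cache"
```

Printing the commands first makes it easy to review them before running anything against the controller.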

:::tip
You can list caches on a host using `dmsetup ls`. Also one important thing, a cache is only created on diskful resources.
:::

Contributor

List caches on a host using dmsetup ls.
A cache is only created on diskful resources.

#### How to configure the cache size?

By default, a VDI uses a cache size of 1% of its volume size. This value can be changed globally for all VDIs:
```
linstor vg set-property xcp-sr-linstor_group_thin_device 0 Cache/Cachesize <PERCENTAGE>
```

You can also override this value for a particular resource definition with:
```
linstor rd set-property <VOLUME_NAME> Cache/Cachesize <PERCENTAGE>
```
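
To get a feel for what a percentage means in practice, the sketch below computes a cache size from a volume size. It assumes the cache is simply the given percentage of the volume size, as described above; the helper name and the MiB unit are illustrative choices, and LINSTOR may round or align the real size differently.

```shell
# Hypothetical helper, for estimation only.
# Usage: cache_size_mib VOLUME_MIB PERCENT
# Prints PERCENT% of VOLUME_MIB, rounded down to a whole MiB.
cache_size_mib() {
    echo $(( $1 * $2 / 100 ))
}

# A 100 GiB VDI (102400 MiB) with the default 1% cache:
cache_size_mib 102400 1     # prints 1024
# The same VDI with a 20% cache:
cache_size_mib 102400 20    # prints 20480
```

Summing these estimates over all VDIs on a host gives a rough idea of how much space the cache volume group needs.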

It's totally arbitrary. You can go up to 20-30% for VMs with a high write rate. This should be enough to support a significant number of requests. Use around 10% for busy VMs and 1-5% for VMs with few requests. You can use 100% if you want, for example for a database on a small VDI with a lot of queries.
Contributor

It's arbitrary and you can go up to 20-30% for VMS with a high write rate.


:::warning
The use of snapshots can consume more memory than necessary due to VHD chains that are too long. It's advisable to limit their use except via XOA during backup processes.
Contributor

Due to too-long VHD chains, snapshots can consume more memory than necessary. It's advisable to limit their use to backup processes via XOA.

:::

#### How to switch between read and read-write mode?
Contributor

How to switch between read and read-write modes? (typo)


Simply use:
```
linstor vg set-property xcp-sr-linstor_group_thin_device 0 Cache/OpMode <MODE>
```

By default `writethrough` mode is used. This mode is only useful to improve read performance.
Member

I'm not sure I understand the cache system: aren't reads already "cached" by design in LINSTOR? That is, a read doesn't wait on another node but gets local disk speed. Last time I ran a benchmark, this is what I saw. So I am probably missing something obvious here.

Member Author @Wescoeur, Nov 7, 2024

DRBD can use the Linux system's buffers. But by design you don't have a cache: read requests can be sent to the local host currently in use and/or to another machine to speed up parallel requests.

The benefit of dm-cache here is that you can use HDDs (or slow SSDs) at the DRBD level and a faster device such as an NVMe drive to cache the most frequently read blocks.

Contributor

This mode is only useful for improving read performance.


With `writeback` mode enabled, a block to be written is first added to the cache layer, not to the DRBD device.
The block is moved to the DRBD device later, and the calling process (here tapdisk) is notified once the block is flushed to the cache disk.

This mode is efficient because writes don't have to wait for the flush to the local disk or for replication to the DRBD devices on other nodes.
But if you have a power outage on the machine using the cache and if it contains data, then the data is lost.
Contributor

However, if a power outage occurs on a machine using a cache that contains data, the data will be lost.

You don't have this issue with `writethrough`, but that mode only improves read performance.
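
The relationship between the "read" and "read-write" wording in the heading and the two mode names can be sketched as follows. `opmode_for` is a hypothetical helper, and the mapping is an assumption drawn from the description above (`writethrough` for read caching only, `writeback` for read and write caching); it prints the command rather than running it.

```shell
# Hypothetical helper: map a caching goal to a Cache/OpMode value.
# Only the two modes named in this section are handled.
opmode_for() {
    case "$1" in
        read)       echo "writethrough" ;;  # safe default, speeds up reads only
        read-write) echo "writeback" ;;     # also speeds up writes, data can be lost on power failure
        *) echo "error: expected 'read' or 'read-write'" >&2; return 1 ;;
    esac
}

# Dry run: print the command for read-write caching.
mode="$(opmode_for read-write)" &&
    echo "linstor vg set-property xcp-sr-linstor_group_thin_device 0 Cache/OpMode $mode"
```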