Since 2016, I’ve had a fileserver mostly just for backups. System is on 1 drive, RAID6 for files, and semi-annual cold backup.
I was playing with Photoprism, and their docs say “we recommend placing the storage folder on a local SSD drive for best performance.” In this case, the storage folder holds basically everything except the pictures themselves, such as the database files.
Up until now, if I lost any database files, it was just a matter of rebuilding them by re-indexing my photos or whatever, but I’m looking for something more robust since I’ll have some friends/family using Pixelfed, Matrix, etc.
So my question is: Is it a valid strategy to keep the database files on the SSD with some kind of nightly backup to the RAID, or should I just store the whole lot on the RAID from the get-go? Or does it even matter, if all of these databases can fit in RAM anyway?
edit: I’m just now learning of ZFS caching which might be my answer.
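From what I’m reading, you can attach an SSD to an existing pool as a read cache (L2ARC) with something like this (pool and device names here are made up):

```
zpool add tank cache /dev/nvme0n1
```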
Can you elaborate? (learning a lot at the moment).
My thought was to just copy over the whole database directory every night at like 2am. Though some of the services do offer built-in database backup tools which I assume are designed to do what you’re talking about.
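For instance, if one of them happens to use Postgres, pg_dump takes a consistent logical dump while the server keeps running, so a nightly cron line along these lines would cover it (user, database name, and backup path are made up):

```
# consistent dump without stopping the server
pg_dump -U photoprism -f /mnt/raid/backups/db-$(date +%F).sql photoprism
```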
Some databases support snapshotting (which won’t take the database down), and I believe that backup systems can be aware of the DBMS. I’m not the best person to ask about best practices, because I don’t admin a DBMS, but it’s an issue I do mention when people are talking about backups and DBMSes – if you have one, be aware that a backup system is going to have to take the DBMS into account one way or another if you want to avoid backing up a database in an inconsistent state.
Basically, you want to shut down the database before backing up. Otherwise, your backup might be mid-transaction, i.e. broken. If it’s Docker, you can just docker-compose down it, back up, and then docker-compose up, or the equivalent.
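A minimal sketch of that as a nightly cron script, assuming a docker-compose project at /srv/app and a backup target on the RAID (both paths are made up):

```
#!/bin/sh
# stop the stack so the database files are quiescent on disk
cd /srv/app && docker-compose down
# copy the data directory over to the RAID
rsync -a /srv/app/data/ /mnt/raid/backups/app/
# bring everything back up
docker-compose up -d
```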
Wouldn’t this require the service to go down for a few minutes every night?
Yup (although minutes seems long, and depending on usage, weekly might be fine). You can also combine it with updates, which require going down anyway.
Alternatively, if your databases are on a filesystem that supports snapshots (LVM, btrfs or ZFS, for instance), you can make a snapshot of the filesystem, mount the snapshot, and back up the database from it. This ensures the backup is consistent with itself (the backed-up directory was not written to between the beginning and the end of the backup).
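With ZFS, for example, it could look roughly like this (the dataset name is made up; ZFS snapshots show up read-only under .zfs/snapshot, so no separate mount step is needed):

```
# freeze an instant-in-time view of the dataset
zfs snapshot tank/db@backup
# copy from the read-only snapshot, not the live files
rsync -a /tank/db/.zfs/snapshot/backup/ /mnt/raid/backups/db/
# clean up once the copy is done
zfs destroy tank/db@backup
```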
Doesn’t this just pass the issue to when the snapshot is made? If the snapshot is created mid-database update, won’t you have the same problem?
No, because the DBMS is going to be designed to tolerate power loss in the middle of a write without being corrupted. It’ll do something vaguely like this if you are, for example, overwriting an existing record with a new one:
1. Write that you are going to make a change in a way that does not affect existing data.
2. Perform a barrier operation (which could amount to just syncing to disk, or could just tell the OS’s disk cache system to place some restrictions on how it later syncs to disk, but in any event will ensure that all writes prior to the barrier operation are on disk before any writes subsequent to it).
3. Replace the existing record. This may be destructive of existing data.
4. Potentially remove the data written in Step 1, depending upon the database format.
If the DBMS loses power and comes back up, and the data from Step 1 is present and complete, it’ll consider the operation committed and simply continue the steps from there. If Step 1 is only partially on disk, it’ll consider the operation not committed and delete it, treating the commit as not having gone through yet. From the DBMS’s standpoint, either the change happens as a whole or does not happen at all.
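To make that concrete, here’s a toy version of the journal-then-commit pattern in shell, with a single “record” stored as a plain file (all file names are made up; a real DBMS does this inside its own file formats):

```
#!/bin/sh
new_value="$1"
# Step 1: record the intent without touching the live data
echo "UPDATE record -> $new_value" > journal.tmp
# Step 2: barrier - make sure the journal entry hits disk first
sync
# Step 3: destructively replace the live record
echo "$new_value" > record.dat
sync
# Step 4: the change is durable, so the journal can go
rm journal.tmp
```

On startup after a crash, recovery just checks journal.tmp: if a complete entry is there, redo Step 3; if it’s partial or missing, do nothing, because the old record is still intact.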
That works fine for power loss or if a filesystem is snapshotted at an instant in time. Seeing a partial commit, as long as the DBMS’s view of the system was at an instant in time, is fine; if you start it up against that state, it will either treat the change as complete and committed or throw out an incomplete commit.
However, if you are a backup program happily reading the contents of a file, you may be reading a database file with no synchronization, and may wind up with bits of one or multiple commits as the backup program reads the file and the DBMS writes to it – a corrupt database after the backup is restored.
Very good to know! Thanks.