Corfu: Copy Compaction
This the fifth post in a series about Corfu. In the last post we discussed striping a log across many mirrored sets of disks for IO concurrency. In this post we will discuss deletions from the global log and copy collection techniques to free the disk space:
CORFU makes it easy for developers to build applications over the shared log without worrying about garbage collection strategies
Copy collection will also allow us to compact the log even when we don’t use deletes which the paper doesn’t mention. Why do we need compaction under write-only usage? Writes are quantized into pages written to directly by clients in an uncoordinated manner. Under realistic workloads many writes may be much smaller than the page size. Without copy compaction Corfu would squander disk space.
First let us consider deletes at arbitrary positions within our fictional shared log example. Huh? We didn’t discuss arbitrary deletes in the Paxos post. When replicating generic finite state machine commands (or SQL statements) using Paxos we need to truncate the log to stop it growing infinitely. Typically that is done by snapshotting the state of the application we are replicating then truncating old log entries up to a specific point. That’s going to be trivial to achieve with what we have described so far in this series of posts. Indeed the Corfu paper covers that simple technique in addition to deletion at arbitrary positions. Being able to delete from an arbitrary position in the log is going to make reclaiming space none trivial. Why is that helpful?
Well it’s a bit of a topic in and of itself but Apache Cassandra is a good example of how this is useful. Cassandra appends updates to keys onto the end of files using append-only semantics. It uses bloom filters and indexes of the keys within each file. This supports fast checking as to whether an update to any key is held within any file. Cassandra then supports deletion of a key and “copy compaction” that I will describe below to recycle file space. This means that for basic key-value persistence Cassandra doesn’t need any file storage other than the log-like append-only files. So by allowing for deletions at arbitrary points in the log Corfu allows you to build something very similar to Cassanda. I could even stick my neck out and say that if Cassanda was being built today it would built using the Corfu approach. If you disagree with me then please post a comment.
Back to our fictional example of a shared disk holding a log file. If we are deleting from arbitrary positions within the log file need only one bit per position to track which log entries are deleted. To collect garbage and free up disk space we simply break the file into chunks and copy them without the deleted entries. This can happen as a background activity.
With Corfu client processes get a lease from a global sequencer to append a page of data the log. We would want to tune the page size to both the application load and the physical media characteristics. If an application then needs to append a very small payload to the log it will leave most of a page empty. When we copy collect to remove garbage we can also compact adjacent small values into a single page. We only need a slightly richer file format to encode of where values start and end where within the files.