WBM: Fix stall deadlock with multiple cfs
When a DB has multiple column families (cfs) and a WriteBufferManager (WBM)
with allow_stall enabled, it can deadlock once the WBM initiates a stall.
This happens because HandleWriteBufferManagerFlush, which is called to flush
data and prevent the stall, picks only the oldest cf for flush.
With multiple cfs, flushing a single cf does not guarantee that FreeMem will
release enough memory to prevent the stall, and no other flush is scheduled.

To fix this, keep adding cfs to the flush queue until the scheduled flushes
are enough to bring mutable memory usage below mutable_limit_.
Yuval Ariel committed Apr 14, 2024
1 parent 8d850b6 commit 7d31f9e
Showing 1 changed file with 9 additions and 10 deletions.
19 changes: 9 additions & 10 deletions db/db_impl/db_impl_write.cc
@@ -1747,9 +1747,9 @@ Status DBImpl::HandleWriteBufferManagerFlush(WriteContext* write_context) {
   if (immutable_db_options_.atomic_flush) {
     SelectColumnFamiliesForAtomicFlush(&cfds);
   } else {
-    ColumnFamilyData* cfd_picked = nullptr;
-    SequenceNumber seq_num_for_cf_picked = kMaxSequenceNumber;
-
+    int64_t total_mem_to_free =
+        write_buffer_manager()->mutable_memtable_memory_usage() -
+        write_buffer_manager()->buffer_size() * 7 / 8;
     for (auto cfd : *versions_->GetColumnFamilySet()) {
       if (cfd->IsDropped()) {
         continue;
@@ -1759,16 +1759,15 @@ Status DBImpl::HandleWriteBufferManagerFlush(WriteContext* write_context) {
         // and no immutable memtables for which flush has yet to finish. If
         // we triggered flush on CFs already trying to flush, we would risk
         // creating too many immutable memtables leading to write stalls.
-        uint64_t seq = cfd->mem()->GetCreationSeq();
-        if (cfd_picked == nullptr || seq < seq_num_for_cf_picked) {
-          cfd_picked = cfd;
-          seq_num_for_cf_picked = seq;
+        auto mem_used = cfd->mem()->ApproximateMemoryUsageFast();
+        cfds.push_back(cfd);
+        total_mem_to_free -= mem_used;
+        if (total_mem_to_free <= 0) {
+          break;
         }
       }
     }
-    if (cfd_picked != nullptr) {
-      cfds.push_back(cfd_picked);
-    }
+
     MaybeFlushStatsCF(&cfds);
   }
   if (!cfds.empty()) {
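The selection loop introduced by this change can be illustrated in isolation. The following is a minimal standalone sketch, not the actual DBImpl code: `CfInfo` and `PickCfsToFlush` are hypothetical names made up for illustration, and the per-cf memory figures stand in for what the patch reads via `ApproximateMemoryUsageFast()`. It assumes, as the patch does, that the target is 7/8 of the buffer size.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a column family; not a real DB type.
struct CfInfo {
  std::string name;
  int64_t mutable_mem_bytes;  // bytes held in the mutable memtable
  bool flush_pending;         // a flush is already pending or running
};

// Greedily picks cfs to flush until the memory they hold covers the excess
// of mutable usage over 7/8 of the buffer size (the limit used in the patch).
std::vector<std::string> PickCfsToFlush(const std::vector<CfInfo>& cfs,
                                        int64_t mutable_usage,
                                        int64_t buffer_size) {
  assert(buffer_size >= 0);
  std::vector<std::string> picked;
  int64_t total_mem_to_free = mutable_usage - buffer_size * 7 / 8;
  for (const CfInfo& cf : cfs) {
    // Skip cfs with an empty mutable memtable or a flush already in flight.
    if (cf.mutable_mem_bytes == 0 || cf.flush_pending) {
      continue;
    }
    picked.push_back(cf.name);
    total_mem_to_free -= cf.mutable_mem_bytes;
    if (total_mem_to_free <= 0) {
      break;  // enough memory is scheduled to be freed
    }
  }
  return picked;
}
```

In this sketch, as in the patch, the first eligible cf is always picked, even when usage is already below the limit, because the check runs only after a cf has been added. Unlike the previous logic, cfs are taken in iteration order rather than by oldest creation sequence number.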
