Skip to content

Commit

Permalink
sstables: don't forget to read static row
Browse files Browse the repository at this point in the history
[v2: fix check for static column (don't check if the schema is not compound)
 and move want-static-columns flag inside the filtering context to avoid
 changing all the callers.]

When a CQL request asks to read only a range of clustering keys inside
a partition, we actually need to read not just these clustering rows, but
also the static columns and add them to the response (as explained by Tomek
in issue scylladb#1568).

With the current code, that CQL request is translated into an
sstable::read_row() with a clustering-key filter. But this currently
only reads the requested clustering keys - NOT the static columns.

We don't want sstable::read_row() to unconditionally read the from disk
the static columns because if, for example, they are already cached, we
might not want to read them from disk. We don't have such partial-partition
cache yet, but we are likely to have one in the future.

This patch adds in the clustering key filter object a flag of whether we
need to read the static columns (actually, it's function, returning this
flag per partition, to match the API for the clustering-key filtering).

When sstable::read_row() sees the flag for this partition is true, it also
request to read the static columns.
Currently, the code always passes "true" for this flag - because we don't
have the logic to cache partially-read partitions.

The current find_disk_ranges() code does not yet support returning a non-
contiguous byte range, so this patch, if it notices that this partition
really has static columns in addition to the range it needs to read,
falls back to reading the entire partition. This is a correct solution
(and fixes scylladb#1568) but not the most efficient solution. Because static
columns are relatively rare, let's start with this solution (correct
by less efficient when there are static columns) and providing the non-
contiguous reading support is left as a FIXME.

Fixes scylladb#1568

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1471124536-19471-1-git-send-email-nyh@scylladb.com>
  • Loading branch information
nyh authored and avikivity committed Aug 15, 2016
1 parent 4fcebd4 commit 0d00da7
Show file tree
Hide file tree
Showing 4 changed files with 45 additions and 2 deletions.
8 changes: 8 additions & 0 deletions clustering_key_filter.cc
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,10 @@ class stateless_clustering_key_filter_factory : public clustering_key_filter_fac
virtual const std::vector<range<clustering_key_prefix>>& get_ranges(const partition_key& key) override {
return _ranges;
}

virtual bool want_static_columns(const partition_key& key) override {
return true;
}
};

class partition_slice_clustering_key_filter_factory : public clustering_key_filter_factory {
Expand Down Expand Up @@ -95,6 +99,10 @@ class partition_slice_clustering_key_filter_factory : public clustering_key_filt
}
return _slice.row_ranges(*_schema, key);
}

virtual bool want_static_columns(const partition_key& key) override {
return true;
}
};

static const shared_ptr<clustering_key_filter_factory>
Expand Down
7 changes: 7 additions & 0 deletions clustering_key_filter.hh
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,9 @@ public:
// Create a clustering key filter that can be used for multiple clustering keys but they have to be sorted.
virtual clustering_key_filter get_filter_for_sorted(const partition_key&) = 0;
virtual const std::vector<range<clustering_key_prefix>>& get_ranges(const partition_key&) = 0;
// Whether we want to get the static row, in addition to the desired clustering rows
virtual bool want_static_columns(const partition_key&) = 0;

virtual ~clustering_key_filter_factory() = default;
};

Expand All @@ -65,6 +68,10 @@ public:
}
const std::vector<range<clustering_key_prefix>>& get_ranges(const partition_key& key) const;

bool want_static_columns(const partition_key& key) const {
return _factory ? _factory->want_static_columns(key) : true;
}

static const clustering_key_filtering_context create(schema_ptr, const partition_slice&);

static clustering_key_filtering_context create_no_filtering();
Expand Down
27 changes: 25 additions & 2 deletions sstables/partition.cc
Original file line number Diff line number Diff line change
Expand Up @@ -776,6 +776,21 @@ static inline clustering_key_prefix get_clustering_key(
return std::move(col.clustering);
}

static bool has_static_columns(const schema& schema, index_entry &ie) {
// We can easily check if there are any static columns in this partition,
// because the static columns always come first, so the first promoted
// index block will start with one, if there are any. The name of a static
// column is a composite beginning with a special marker (0xffff).
// But we can only assume the column name is composite if the schema is
// compound - if it isn't, we cannot have any static columns anyway.
//
// The first 18 bytes are deletion times (4+8), num blocks (4), and
// length of start column (2). Then come the actual column name bytes.
// See also composite::is_static().
auto data = ie.get_promoted_index_bytes();
return schema.is_compound() && data.size() >= 20 && data[18] == -1 && data[19] == -1;
}

future<sstable::disk_read_range>
sstables::sstable::find_disk_ranges(
schema_ptr schema, const sstables::key& key,
Expand All @@ -800,14 +815,22 @@ sstables::sstable::find_disk_ranges(
}
index_entry& ie = index_list[index_idx];
if (ie.get_promoted_index_bytes().size() >= 16) {
auto& ck_ranges = ck_filtering.get_ranges(
partition_key::from_exploded(*schema, key.explode(*schema)));
auto&& pkey = partition_key::from_exploded(*schema, key.explode(*schema));
auto& ck_ranges = ck_filtering.get_ranges(pkey);
if (ck_ranges.size() == 1 && ck_ranges[0].is_full()) {
// When no clustering filter is given to sstable::read_row(),
// we get here one range unbounded on both sides. This is fine
// (the code below will work with an unbounded range), but
// let's drop this range to revert to the classic behavior of
// reading entire sstable row without using the promoted index
} else if (ck_filtering.want_static_columns(pkey) && has_static_columns(*schema, ie)) {
// FIXME: If we need to read the static columns and also a
// non-full clustering key range, we need to return two byte
// ranges in the returned disk_read_range. We don't support
// this yet so for now let's fall back to reading the entire
// partition which is wasteful but at least correct.
// This case should be replaced by correctly adding the static
// column's blocks to the return.
} else if (ck_ranges.size() == 1) {
auto data = ie.get_promoted_index_bytes();
// note we already verified above that data.size >= 16
Expand Down
5 changes: 5 additions & 0 deletions sstables/sstables.hh
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,11 @@ public:
return _generation;
}

// read_row() reads the entire sstable row (partition) at a given
// partition key k, or a subset of this row. The subset is defined by
// a filter on the clustering keys which we want to read, which
// additionally determines also if all the static columns will also be
// returned in the result.
future<streamed_mutation_opt> read_row(
schema_ptr schema,
const key& k,
Expand Down

0 comments on commit 0d00da7

Please sign in to comment.