Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce NewMultiCfIterator API #12153

Closed
wants to merge 23 commits into from

Conversation

jaykorean
Copy link
Contributor

@jaykorean jaykorean commented Dec 15, 2023

NOTE: The behavior described below has changed in #12480 . Please refer to the PR for the expected behavior of the iterator.

This PR introduces a new public API, NewMultiCfIterator(). The new API takes a vector of column family handles to build a cross-column-family iterator, which internally maintains multiple DBIters as child iterators from a consistent database state. When a key exists in multiple column families, the iterator selects the value (and wide columns) from the first column family containing the key, following the order provided in the column_families parameter. Similar to the merging iterator, a min heap is used to iterate across the child iterators. Backward iteration and direction change functionalities will be implemented in future PRs.

The comparator used to compare keys across different column families will be derived from the iterator of the first column family specified in column_families. This comparator will be checked against the comparators from all other column families that the iterator will traverse. If there's a mismatch with any of the comparators, the initialization of the iterator will fail.

Please note that this PR is not enough for users to start using the new API. The new API and MultiCfIterator implementations are still marked as "DO NOT USE - UNDER CONSTRUCTION". This PR is just the first of many PRs that will follow soon.

This PR includes the following:

  • Introduction and partial implementation of the MultiCfIterator, which implements the generic Iterator interface. The implementation includes the construction of the iterator, SeekToFirst(), Next(), Valid(), key(), value(), and columns().
  • Unit tests to verify iteration across multiple column families in two distinct scenarios: (1) keys are unique across all column families, and (2) the same keys exist in multiple column families.

@jaykorean jaykorean force-pushed the multi_cf_iterator branch 5 times, most recently from c21d080 to 0401111 Compare December 19, 2023 22:09
@jaykorean jaykorean changed the title [WIP] Cross-ColumnFamily Iterator Introducing MultiCfIterator Dec 19, 2023
@jaykorean jaykorean changed the title Introducing MultiCfIterator Introduce MultiCfIterator Dec 19, 2023
@jaykorean jaykorean marked this pull request as ready for review December 19, 2023 22:46
@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

}
}

TEST_F(MultiCfIteratorTest, DISABLED_IterateAttributeGroups) {
Copy link
Contributor Author

@jaykorean jaykorean Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will re-enable it when attribute_groups() are implemented. Just to illustrate what they would return when implemented.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

include/rocksdb/db.h Outdated Show resolved Hide resolved
db/multi_cf_iterator_impl.cc Outdated Show resolved Hide resolved
db/multi_cf_iterator_test.cc Outdated Show resolved Hide resolved
include/rocksdb/db.h Outdated Show resolved Hide resolved
include/rocksdb/db.h Outdated Show resolved Hide resolved
include/rocksdb/db.h Outdated Show resolved Hide resolved
include/rocksdb/multi_cf_iterator.h Outdated Show resolved Hide resolved
@@ -238,9 +238,17 @@ class AttributeGroup {
WideColumns columns_;
};

inline bool operator==(const AttributeGroup& lhs, const AttributeGroup& rhs) {
return lhs.column_family() == rhs.column_family() &&
lhs.columns() == rhs.columns();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does == on WideColumns make sense? There is no documented canonical ordering, so the same set of column->value pairs could compare as !=

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Columns are always sorted (by name) before added. Not sure if I understand your concern here. Could you provide a quick example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data structure is fully defined in wide_columns.h. It is therefore reusable. This (reusable / general) comparison operator assumes an invariant on the data that the class itself does not describe. That mismatch (or omission) is asking for mismatch in expectations (potential bug).

Should this invariant always be true of WideColumns? I don't know, but that should be clear to users and developers.

Copy link
Contributor Author

@jaykorean jaykorean Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an invariant on the data that the class itself does not describe

column names and values are always compared bytewise and it's commented in the class, but I think we can do better documentation.

.. that should be clear to users and developers.

Are you worried that it is not clear enough for users and developers to know that WideColumns are always sorted by column names (by bytewise comparison) and both name and value of all columns should match for two WideColumns to be considered equal?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an invariant on the data that the class itself does not describe

column names and values are always compared bytewise and it's commented in the class, but I think we can do better documentation.

Which class? The only reference I see to something like "compared bytewise" is on operator== for WideColumn, which is outside the class. There is no reference to sorting or ordering.

Are you worried that it is not clear enough for users and developers to know that WideColumns are always sorted by column names

I don't see any documentation (APIs in db.h, wide_columns.h or wiki) indicating that sortedness is expected or provided, or whether perhaps the user ordering provided in PutEntity is preserved. I don't even see clear documentation of whether PutEntity completely overwrites a previous PutEntity on the same key or only on a column-by-column basis.

Nor exactly what Delete() does after PutEntity, just "works the same way as for regular key-values." So does it delete only the default column (since you can't specify a column)?

CC @ltamasi

and both name and value of all columns should match for two WideColumns to be considered equal?

The presumption for operator== in C++ is structural equality. The question is what relationship that has to semantic equality, which can be different in various contexts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any documentation (APIs in db.h, wide_columns.h or wiki) indicating that sortedness is expected or provided, or whether perhaps the user ordering provided in PutEntity is preserved. I don't even see clear documentation of whether PutEntity completely overwrites a previous PutEntity on the same key or only on a column-by-column basis.

Nor exactly what Delete() does after PutEntity, just "works the same way as for regular key-values." So does it delete only the default column (since you can't specify a column)?

CC @ltamasi

These points are mostly covered by the API comments at https://github.com/facebook/rocksdb/blob/main/include/rocksdb/db.h#L430-L459 , which, similarly to those of Put, use the concept of database entries associated with keys. After a PutEntity, the key is associated with that entity, regardless of any previous value (whether it's a plain KV, an existing entity, or nothing at all); after a Delete, the key is not associated with any database entry, again regardless of any previous value. There is no expectation that the input WideColumns structure is sorted, although it does get sorted by RocksDB, and thus anything read back from RocksDB will be sorted. We can extend the comments to clarify this point.

@jaykorean Sorry about the delay! Will review the PR shortly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ltamasi If I was expecting PutEntity to do partial update, I might assume it's sloppy language, overly copying from Put(), even if it's technically correct and consistent. "will be overwritten" doesn't always mean entirely overwritten.

I think part of the confusion is singular "entry" and "entity" when dealing with plural things. Even if the description of Delete() is technically correct, it's not clear that it can deal with plurals like wide columns, especially since there's no DeleteEntity (like PutEntity / GetEntity). A reasonable user would doubt whether it handles new functionality even if its un-updated description technically implies it does.

db/multi_cf_iterator_impl.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

db/db_impl/db_impl.cc Outdated Show resolved Hide resolved
db/multi_cf_iterator_impl.h Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jaykorean has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the PR title and description (and re-import or copy to phabricator), but otherwise LGTM. (Assumes for now we are moving toward the unified iterator experience.) Thanks!

@jaykorean jaykorean changed the title Introduce MultiCfIterator Introduce NewMultiCfIterator API Mar 5, 2024
@facebook-github-bot
Copy link
Contributor

@jaykorean merged this pull request in 3412195.

@jaykorean jaykorean deleted the multi_cf_iterator branch March 5, 2024 18:27
facebook-github-bot pushed a commit that referenced this pull request Mar 18, 2024
Summary:
This PR continues #12153 by implementing the missing `Iterator` APIs - `Seek()`, `SeekForPrev()`, `SeekToLast()`, and `Prev`. A MaxHeap Implementation has been added to handle the reverse direction.

The current implementation does not include upper/lower bounds yet. These will be added in subsequent PRs. The API is still marked as under construction and will be lifted after being added to the stress test.

Please note that changing the iterator direction in the middle of iteration is expensive, as it requires seeking the element in each iterator again in the opposite direction and rebuilding the heap along the way. The first `Next()` after `SeekForPrev()` requires changing the direction under the current implementation. We may optimize this in later PRs.

Pull Request resolved: #12422

Test Plan: The `multi_cf_iterator_test` has been extended to cover the API implementations.

Reviewed By: pdillinger

Differential Revision: D54820754

Pulled By: jaykorean

fbshipit-source-id: 9eb741508df0f7bad598fb8e6bd5cdffc39e81d1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants