Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a separate range classes for internal usage #12071

Closed
wants to merge 3 commits into from

Conversation

jowlyzhang
Copy link
Contributor

@jowlyzhang jowlyzhang commented Nov 16, 2023

Introduce some different range classes UserKeyRange and UserKeyRangePtr to be used by internal implementation. The Range class is used in both public APIs like DB::GetApproximateSizes, DB::GetApproximateMemTableStats, DB::GetPropertiesOfTablesInRange etc and internal implementations like ColumnFamilyData::RangesOverlapWithMemtables, VersionSet::GetPropertiesOfTablesInRange.

These APIs have different expectations of what keys this range class contain. Public API users are supposed to populate the range with the user keys without timestamp, in the same way that point lookup and range scan APIs' key input only expect the user key without timestamp. The internal APIs implementation expect a user key whose format is compatible with the user comparator, a.k.a a user key with the timestamp.

This PR contains:

  1. introducing counterpart range class UserKeyRange UserKeyRangePtr for internal implementation while leave the existing Range and RangePtr class only for public APIs. Internal implementations are updated to use this new class instead.
  2. add user-defined timestamp support for DB::GetPropertiesOfTablesInRange API and DeleteFilesInRanges API.

Test Plan:
existing tests
Added test for DB::GetPropertiesOfTablesInRange and DeleteFilesInRanges APIs for when user-defined timestamp is enabled.
The change in external_file_ingestion_job doesn't have a user-defined timestamp enabled test case coverage, will add one in a follow up PR that adds file ingestion support for UDT.

@jowlyzhang jowlyzhang marked this pull request as draft November 16, 2023 01:14
@jowlyzhang jowlyzhang changed the title Add a separate range class for internal usage Add a separate range classes for internal usage Nov 16, 2023
@jowlyzhang jowlyzhang marked this pull request as ready for review November 16, 2023 22:14
@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @jowlyzhang! Sorry about the long delay - this fell off my radar somehow.

db/db_compaction_test.cc Outdated Show resolved Hide resolved
Comment on lines 4921 to 4922
assert(ranges[i].start == nullptr || start.has_value());
assert(ranges[i].limit == nullptr || limit.has_value());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really an "if and only if" relation, right? If so, we could extend these assertions to cover the opposite direction too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching this! This assertion is indeed only partially correct. It would pass too if ranges[i].start is null but start has value. I have updated these assertions.

db/dbformat.h Outdated
Comment on lines 97 to 98
const Slice* start;
const Slice* limit;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind UserKeyRangePtr is that each boundary can be optional, right? If so, we could use std::optional<Slice> instead of const Slice*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Using a pointer type here is really an implicit way of expressing the optional nature of the bound. I have updated their types.

db/dbformat.h Outdated Show resolved Hide resolved
db/external_sst_file_ingestion_job.cc Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@jowlyzhang has updated the pull request. You must reimport the pull request before landing.

@jowlyzhang
Copy link
Contributor Author

Thanks for the PR @jowlyzhang! Sorry about the long delay - this fell off my radar somehow.

Thank you for the thorough review and improvement suggestions! No worries, this is just a refactor PR. I only imported it recently when I figured some other ongoing work can use this.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang merged this pull request in e3e8fbb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants