Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer static_cast in place of most reinterpret_cast #12308

Closed
wants to merge 5 commits into from

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Jan 30, 2024

Summary:
The following are risks associated with pointer-to-pointer reinterpret_cast:

  • Can produce the "wrong result" (crash or memory corruption). IIRC, in theory this can happen for any up-cast or down-cast for a non-standard-layout type, though in practice would only happen for multiple inheritance cases (where the base class pointer might be "inside" the derived object). We don't use multiple inheritance a lot, but we do.
  • Can mask useful compiler errors upon code change, including converting between unrelated pointer types that you are expecting to be related, and converting between pointer and scalar types unintentionally.

I can only think of some obscure cases where static_cast could be troublesome when it compiles as a replacement:

  • Going through void* could plausibly cause unnecessary or broken pointer arithmetic. Suppose we have
    struct Derived: public Base1, public Base2. If we have Derived* -> void* -> Base2* -> Derived* through reinterpret casts, this could plausibly work (though technical UB) assuming the Base2* is not dereferenced. Changing to static cast could introduce breaking pointer arithmetic.
  • Unnecessary (but safe) pointer arithmetic could arise in a case like Derived* -> Base2* -> Derived* where before the Base2 pointer might not have been dereferenced. This could potentially affect performance.

With some light scripting, I tried replacing pointer-to-pointer reinterpret_casts with static_cast and kept the cases that still compile. Most occurrences of reinterpret_cast have successfully been changed (except for java/ and third-party/). 294 changed, 257 remain.

A couple of related interventions included here:

  • Previously Cache::Handle was not actually derived from in the implementations and just used as a void* stand-in with reinterpret_cast. Now there is a relationship to allow static_cast. In theory, this could introduce pointer arithmetic (as described above) but is unlikely without multiple inheritance AND non-empty Cache::Handle.
  • Remove some unnecessary casts to void* as this is allowed to be implicit (for better or worse).

Most of the remaining reinterpret_casts are for converting to/from raw bytes of objects. We could consider better idioms for these patterns in follow-up work.

I wish there were a way to implement a template variant of static_cast that would only compile if no pointer arithmetic is generated, but best I can tell, this is not possible. AFAIK the best you could do is a dynamic check that the void* conversion after the static cast is unchanged.

Test Plan: existing tests, CI

Summary:
The following are risks associated with pointer-to-pointer reinterpret_cast:
* Can produce the "wrong result" (crash or memory corruption). IIRC, in
theory this can happen for any up-cast of down-cast for a
non-standard-layout type, though in practice would only happen for
multiple inheritance cases (where the base class pointer might be
"inside" the derived object). We don't use multiple inheritance a lot,
but we do.
* Can mask useful compiler errors upon code change, including converting
between unrelated pointer types that you are expecting to be related,
and coverting between pointer and scalar types unintentionally.

I can only think of some obscure cases where static_cast could be
troublesome when it compiles as a replacement:
* Going through `void*` could plausibly cause unnecessary or broken
pointer arithmetic. Suppose we have
`struct Derived: public Base1, public Base2`.  If we have
`Derived*` -> `void*` -> `Base2*` -> `Derived*` through
reinterpret casts, this could plausibly work (though technical UB)
assuming the `Base2*` is not dereferenced. Changing to static cast could
introduce breaking pointer arithmetic.
* Unnecessary (but safe) pointer arithmetic could arise in a case like
`Derived*` -> `Base2*` -> `Derived*` where before the Base2 pointer
might not have been dereferenced. This could potentially affect
performance.

With some light scripting, I tried replacing pointer-to-pointer
reinterpret_casts with static_cast and kept the cases that still
compile. Most occurrences of reinterpret_cast have successfully been
changed (except for java/ and third-party/). 294 changed, 257 remain.

A couple of related interventions:
* Previously Cache::Handle was not actually drived from in the
implementations and just used as a `void*` stand-in with
reinterpret_cast. Now there is a relationship to allow static_cast. In
theory, this could introduce pointer arithmetic (as described above)
but is unlikely without multiple inheritance AND non-empty Cache::Handle.
* Remove some unnecessary casts to void* as this is allowed to be
  implicit (for better or worse).

Most of the remaining reinterpret_casts are for converting to/from raw
bytes of objects. We could consider better idioms for these patterns in
follow-up work.

I wish there were a way to implement a template variant of static_cast
that would only compile if no pointer arithmetic is generated, but
best I can tell, this is not possible. AFAIK the best you could do is a
dynamic check that the void* conversion after the static cast is
unchanged.

Test Plan: existing tests, CI
@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

As an additional safety measure, we could try building the original code with clang using the -Wreinterpret-base-class and -Wundefined-reinterpret-cast flags to see if it surfaces any issues.

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@pdillinger merged this pull request in 54cb9c7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants