Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address clippy lints #160

Closed
wants to merge 23 commits into from
Closed

Conversation

lopopolo
Copy link
Contributor

This PR addresses clippy lint violations, one violation at a time, by working through cargo clippy output. This uses the default warn set of warnings in clippy.

I did run cargo clippy --fix -- --warn clippy::pedantic to see what it would do and it converted a lot of pointer as casts to ptr.cast::<_>() calls. I find this increases readability and is less footgun-y in that it preserves the constness of pointers, but I did not include these changes in this PR.

I did not include any changes to the CI pipeline in this PR.

See #158.

@@ -142,7 +137,7 @@ impl<'a> Iterator for WordIndices<'a> {

#[inline]
fn next(&mut self) -> Option<(usize, usize, &'a str)> {
while let Some((start, end, word)) = self.0.next() {
for (start, end, word) in self.0.by_ref() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears it can have the same Iterator::find simplification performed on it as the above hunk of the diff.

@lopopolo
Copy link
Contributor Author

clippy in pedantic mode reports an "unreadable literal" lint which suggests placing underscores in long literal numbers.

For example, this statement:

*cp = (b as u32 & 0b111111) | (*cp << 6);

is difficult to parse at first glance. It looked to me that the bit mask is a no-op, but with the underscore it's clear that the code masks off the top 2 bits of the byte:

*cp = (b as u32 & 0b0011_1111) | (*cp << 6);

@lopopolo
Copy link
Contributor Author

lopopolo commented May 22, 2023

These commits all fix pedantic lints and I'm a bit iffier on them. The transmute one I think is good though:

This is the CLI invocation to get a sense for what I've suppressed:

cargo clippy -- --warn clippy::pedantic \
    --allow clippy::module_name_repetitions \
    --allow clippy::must_use_candidate \
    --allow clippy::unnecessary_wraps \
    --allow clippy::missing_errors_doc \
    --allow clippy::match_same_arms \
    --allow clippy::enum_glob_use \
    --allow clippy::uninlined_format_args \
    --allow clippy::if_not_else \
    --allow clippy::single_match_else \
    --allow clippy::too_many_lines \
    --allow clippy::inline_always \
    --allow clippy::wildcard_imports \
    --allow clippy::similar_names \
    --allow clippy::ptr_as_ptr \
    --allow clippy::cast_ptr_alignment

Per upstream, the pedantic lints are intended to be liberally disabled if you disagree.

Comment on lines 235 to 242
unsafe fn ptr_add(ptr: *const u8, amt: usize) -> *const u8 {
debug_assert!(amt < ::core::isize::MAX as usize);
ptr.offset(amt as isize)
ptr.add(amt)
}

/// Decrement the given pointer by the given amount.
unsafe fn ptr_sub(ptr: *const u8, amt: usize) -> *const u8 {
debug_assert!(amt < ::core::isize::MAX as usize);
ptr.offset((amt as isize).wrapping_neg())
ptr.sub(amt)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two functions can probably be nuked and the routines from core slotted into place where they are used.

I believe these two functions on the pointer primitive were added during the great pointer reform of std.

@@ -479,7 +486,10 @@ mod bstr {
| '\x7f' => {
write!(f, "\\x{:02x}", ch as u32)?;
}
'\n' | '\r' | '\t' | _ => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with this change, I'd probably want to add a comment on why newline, carriage return, and tab are called out specially here to get their own match arm.

@@ -1,3 +1,5 @@
#![allow(clippy::single_char_add_str)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are several places in the crate where the replacement character is used in a single character string. I prefer suppressing this because it means you don't pay the cost for a UTF-8 encode at runtime.

@@ -1,3 +1,5 @@
#![allow(clippy::all)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppressed all lints because these modules appear to be generated code. clippy didn't like the use of the 'static lifetime in const definitions.

Comment on lines -76 to +78
unsafe { mem::transmute(slice) }
unsafe { &*(slice as *const [u8] as *const BStr) }
}

#[inline]
pub(crate) fn from_bytes_mut(slice: &mut [u8]) -> &mut BStr {
unsafe { mem::transmute(slice) }
unsafe { &mut *(slice as *mut [u8] as *mut BStr) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting rid of the transmute seems like a win here. This cast soup could probably use a SAFETY comment that refers to the fact that BStr is repr(transparent) wrapper around [u8].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a clippy restriction lint that can be used to enforce that all unsafe blocks have safety comments: https://rust-lang.github.io/rust-clippy/master/index.html#undocumented_unsafe_blocks

Comment on lines -3747 to 3748
let line = &self.bytes[..end + 1];
let line = &self.bytes[..=end];
self.bytes = &self.bytes[end + 1..];
Copy link
Contributor Author

@lopopolo lopopolo May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably replace this suggestion and the code with a call to self.bytes.split_at(end). This would eliminate one bounds check and one panicking branch.

Comment on lines 3766 to +3767
let line = &self.bytes[end + 1..];
self.bytes = &self.bytes[..end + 1];
self.bytes = &self.bytes[..=end];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -188,7 +188,7 @@ pub trait ByteVec: private::Sealed {
fn imp(os_str: OsString) -> Result<Vec<u8>, OsString> {
use std::os::unix::ffi::OsStringExt;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code could be made to support wasi targets as well, which has a similar OsStringExt which is identical to the unix one.

#[cfg(unix)]
#[inline]
fn imp<'a>(os_str: &'a OsStr) -> Cow<'a, [u8]> {
fn imp(os_str: &OsStr) -> Cow<'_, [u8]> {
use std::os::unix::ffi::OsStrExt;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, wasi support could be added.

@lopopolo lopopolo mentioned this pull request May 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant