Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally derive Debug for structs that allow it #1040

Draft
wants to merge 14 commits into
base: master
Choose a base branch
from
10 changes: 10 additions & 0 deletions c2rust-transpile/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ use itertools::Itertools;
use log::{info, warn};
use regex::Regex;
use serde_derive::Serialize;
use strum_macros::Display;

use crate::c_ast::Printer;
use crate::c_ast::*;
Expand Down Expand Up @@ -82,6 +83,7 @@ pub struct TranspilerConfig {
pub disable_refactoring: bool,
pub preserve_unused_functions: bool,
pub log_level: log::LevelFilter,
pub derives: Vec<Derive>,

// Options that control build files
/// Emit `Cargo.toml` and `lib.rs`
Expand All @@ -91,6 +93,14 @@ pub struct TranspilerConfig {
pub binaries: Vec<String>,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Display, PartialOrd, Ord)]
pub enum Derive {
Clone,
Copy,
Debug,
BitfieldStruct,
}

impl TranspilerConfig {
fn binary_name_from_path(file: &Path) -> String {
let file = Path::new(file.file_stem().unwrap());
Expand Down
60 changes: 50 additions & 10 deletions c2rust-transpile/src/translator/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ use dtoa;
use failure::{err_msg, format_err, Fail};
use indexmap::indexmap;
use indexmap::{IndexMap, IndexSet};
use itertools::Itertools;
use log::{error, info, trace, warn};
use proc_macro2::{Punct, Spacing::*, Span, TokenStream, TokenTree};
use syn::spanned::Spanned as _;
Expand All @@ -27,12 +28,12 @@ use c2rust_ast_builder::{mk, properties::*, Builder};
use c2rust_ast_printer::pprust::{self};

use crate::c_ast::iterators::{DFExpr, SomeId};
use crate::c_ast::*;
use crate::cfg;
use crate::convert_type::TypeConverter;
use crate::renamer::Renamer;
use crate::with_stmts::WithStmts;
use crate::{c_ast, format_translation_err};
use crate::{c_ast::*, Derive};
use crate::{ExternCrate, ExternCrateDetails, TranspilerConfig};
use c2rust_ast_exporter::clang_ast::LRValue;

Expand Down Expand Up @@ -1194,6 +1195,12 @@ struct ConvertedVariable {
pub init: TranslationResult<WithStmts<Box<Expr>>>,
}

pub struct ConvertedStructFields {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needed to be pub because convert_struct_fields is, but it seems like both of those should possibly be pub(crate). I didn't want to make that change because I didn't see very many instances of pub(crate) in c2rust-transpile. Should I leave that as is or make these crate-private?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub should be fine here. We haven't been careful with the pub API for c2rust-transpile since it's meant to be used only from the CLI (being able to call it programmatically would be nice, but it's tricky, as the C++ clang part of it isn't entirely thread-safe). So since most of the types here are pub but probably shouldn't be, it's fine to make this one pub, too.

pub field_entries: Vec<Field>,
pub contains_va_list: bool,
pub can_derive_debug: bool,
}

impl<'c> Translation<'c> {
pub fn new(
mut ast_context: TypedAstContext,
Expand Down Expand Up @@ -1628,14 +1635,12 @@ impl<'c> Translation<'c> {
}

// Gather up all the field names and field types
let (field_entries, contains_va_list) =
self.convert_struct_fields(decl_id, fields, platform_byte_size)?;
let ConvertedStructFields {
field_entries,
contains_va_list,
can_derive_debug,
} = self.convert_struct_fields(decl_id, fields, platform_byte_size)?;

let mut derives = vec![];
if !contains_va_list {
derives.push("Copy");
derives.push("Clone");
};
let has_bitfields =
fields
.iter()
Expand All @@ -1644,10 +1649,27 @@ impl<'c> Translation<'c> {
_ => unreachable!("Found non-field in record field list"),
});
if has_bitfields {
derives.push("BitfieldStruct");
self.use_crate(ExternCrate::C2RustBitfields);
}

let derives = self
.tcfg
.derives
.iter()
.flat_map(|derive| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.flat_map(|derive| {
.filter_map(|derive| {

They're actually functionally equivalent, but .filter_map is a bit clearer on the intent.

let can_derive = match derive {
Derive::Clone | Derive::Copy => !contains_va_list,
Derive::Debug => can_derive_debug,
Derive::BitfieldStruct => has_bitfields,
};
Comment on lines +1660 to +1664
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing the logic like this is much cleaner and easier to understand than how it was before. That's great!

if can_derive {
Some(derive.to_string())
} else {
None
}
})
Comment on lines +1659 to +1670
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you can split them up and it's much cleaner.

Suggested change
.flat_map(|derive| {
let can_derive = match derive {
Derive::Clone | Derive::Copy => !contains_va_list,
Derive::Debug => can_derive_debug,
Derive::BitfieldStruct => has_bitfields,
};
if can_derive {
Some(derive.to_string())
} else {
None
}
})
.filter(|derive| match derive {
Derive::Clone | Derive::Copy => !contains_va_list,
Derive::Debug => can_derive_debug,
Derive::BitfieldStruct => has_bitfields,
})
.map(|derive| format!("{derive:?}")

.collect_vec();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.collect_vec();
.collect::<Vec<_>>();

Equally simple and doesn't require itertools; same with the other .collect_vec().


let mut reprs = vec![simple_metaitem("C")];
let max_field_alignment = if is_packed {
// `__attribute__((packed))` forces a max alignment of 1,
Expand Down Expand Up @@ -1697,10 +1719,28 @@ impl<'c> Translation<'c> {
];
let repr_attr = mk().meta_list("repr", outer_reprs);
let outer_field = mk().pub_().enum_field(mk().ident_ty(inner_name));

let outer_struct_derives = self
.tcfg
.derives
.iter()
.flat_map(|derive| {
let can_derive = match derive {
Derive::Clone | Derive::Copy | Derive::Debug => true,
Derive::BitfieldStruct => false,
};
if can_derive {
Some(derive.to_string())
} else {
None
}
})
Comment on lines +1727 to +1737
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.flat_map(|derive| {
let can_derive = match derive {
Derive::Clone | Derive::Copy | Derive::Debug => true,
Derive::BitfieldStruct => false,
};
if can_derive {
Some(derive.to_string())
} else {
None
}
})
.filter(|derive| match derive {
Derive::Clone | Derive::Copy | Derive::Debug => true,
Derive::BitfieldStruct => false,
})
.map(|derive| format!("{derive:?}")

.collect_vec();

let outer_struct = mk()
.span(span)
.pub_()
.call_attr("derive", vec!["Copy", "Clone"])
.call_attr("derive", outer_struct_derives)
.meta_item_attr(AttrStyle::Outer, repr_attr)
.struct_item(name, vec![outer_field], true);

Expand Down
68 changes: 63 additions & 5 deletions c2rust-transpile/src/translator/structs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,14 @@ use std::collections::HashSet;
use std::ops::Index;

use super::named_references::NamedReference;
use super::TranslationError;
use crate::c_ast::{BinOp, CDeclId, CDeclKind, CExprId, CRecordId, CTypeId};
use super::{ConvertedStructFields, TranslationError};
use crate::c_ast::{BinOp, CDeclId, CDeclKind, CExprId, CRecordId, CTypeId, CTypeKind};
use crate::diagnostics::TranslationResult;
use crate::translator::{ExprContext, Translation, PADDING_SUFFIX};
use crate::with_stmts::WithStmts;
use c2rust_ast_builder::mk;
use c2rust_ast_printer::pprust;
use failure::format_err;
use syn::{
self, AttrStyle, BinOp as RBinOp, Expr, ExprAssign, ExprAssignOp, ExprBinary, ExprBlock,
ExprCast, ExprMethodCall, ExprUnary, Field, Meta, NestedMeta, Stmt, Type,
Expand Down Expand Up @@ -276,6 +277,52 @@ impl<'a> Translation<'a> {
Ok(reorganized_fields)
}

/// Returns true if a struct field with this type can be part of a struct that derives `Debug`
fn can_struct_field_derive_debug(&self, ctype: CTypeId) -> bool {
Comment on lines +280 to +281
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only checking if a type recursively contains any unions? If so, can you rename the fn (and other variables, fields) to contains_union. That's more similar to contains_va_list and how that's used to calculate if Clone, Copy can be derived. And it can potentially be used by other derives that have similar requirements to Debug, as a bunch probably have similar behavior in regards to unions.

let ty = self.ast_context.resolve_type(ctype);
let can_derive_debug = match ty.kind {
// Recurse into struct fields. A struct is debuggable iff all of its fields are
CTypeKind::Struct(decl_id) => {
let decl = self
.ast_context
.get_decl(&decl_id)
.ok_or_else(|| format_err!("Missing decl {:?}", decl_id))
.unwrap();
match &decl.kind {
CDeclKind::Struct { fields, .. } => {
for field_decl_id in fields.as_ref().unwrap_or(&vec![]) {
let field_decl = self
.ast_context
.get_decl(&field_decl_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.get_decl(&field_decl_id)
.get_decl(field_decl_id)

.ok_or_else(|| format_err!("Missing decl {:?}", field_decl_id))
.unwrap();
match &field_decl.kind {
CDeclKind::Field { typ, .. } => {
let can_derive_debug =
self.can_struct_field_derive_debug(typ.ctype);
if !can_derive_debug {
return false;
}
}
_ => panic!("not a field"),
}
}
true
}
_ => panic!("not a struct"),
}
}

// A union or struct containing a union cannot derive Debug
CTypeKind::Union(..) => false,

// All translated non-union C types can derive Debug
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about other recursive types like typedefs, arrays, etc.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, really good point

_ => true,
};

can_derive_debug
}

/// Here we output a struct derive to generate bitfield data that looks like this:
///
/// ```no_run
Expand All @@ -296,7 +343,7 @@ impl<'a> Translation<'a> {
struct_id: CRecordId,
field_ids: &[CDeclId],
platform_byte_size: u64,
) -> TranslationResult<(Vec<Field>, bool)> {
) -> TranslationResult<ConvertedStructFields> {
let mut field_entries = Vec::with_capacity(field_ids.len());
// We need to clobber bitfields in consecutive bytes together (leaving
// regular fields alone) and add in padding as necessary
Expand All @@ -317,6 +364,7 @@ impl<'a> Translation<'a> {
field_name
};

let mut can_derive_debug = true;
for field_type in reorganized_fields {
match field_type {
FieldType::BitfieldGroup {
Expand Down Expand Up @@ -377,10 +425,20 @@ impl<'a> Translation<'a> {

field_entries.push(field);
}
FieldType::Regular { field, .. } => field_entries.push(*field),
FieldType::Regular { field, ctype, .. } => {
// Struct is only debuggable if all regular fields are
// debuggable. Assume any fields added by the translator
// such as padding are debuggable
can_derive_debug &= self.can_struct_field_derive_debug(ctype);
field_entries.push(*field)
}
Comment on lines +428 to +434
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is more readable if it's computed similar to how contains_va_list is:

        // A struct is only debuggable if all regular fields are debuggable.
        // Assume any fields added by the translator, such as padding, are debuggable.
        let can_derive_debug = reorganized_fields.iter().all(|field| match field {
            &FieldType::Regular { ctype, .. } => self.can_struct_field_derive_debug(ctype),
            _ => true,
        });

}
}
Ok((field_entries, contains_va_list))
Ok(ConvertedStructFields {
field_entries,
contains_va_list,
can_derive_debug,
})
}

/// Here we output a block to generate a struct literal initializer in.
Expand Down
44 changes: 42 additions & 2 deletions c2rust/src/bin/c2rust-transpile.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
use clap::{Parser, ValueEnum};
use log::LevelFilter;
use regex::Regex;
use std::{fs, path::PathBuf};
use std::{fs, path::PathBuf, collections::HashSet};

use c2rust_transpile::{Diagnostic, ReplaceMode, TranspilerConfig};
use c2rust_transpile::{Derive, Diagnostic, ReplaceMode, TranspilerConfig};

const DEFAULT_DERIVES: &[Derive] = &[Derive::Clone, Derive::Copy, Derive::BitfieldStruct];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are less default derives as they are necessary derives (BitfieldStruct is necessary for bitfield structs and Copy is necessary for union fields). Also, could you make them an associated const on Derive, like this?

impl Derive {
    /// The derives that are necessary for compilation.
    const NECESSARY: &[Self] = &[Self::Clone, Self::Copy, Self::BitfieldStruct];
}


#[derive(Debug, Parser)]
#[clap(
Expand Down Expand Up @@ -155,6 +157,17 @@ struct Args {
/// Fail when the control-flow graph generates branching constructs
#[clap(long)]
fail_on_multiple: bool,

/// Add extra derived traits to generated structs in addition to the default
/// set of derives (Copy, Clone, BitfieldStruct). Specify multiple times to
/// add more than one derive. A struct will derive all traits in the set for
/// which it is eligible.
///
/// For example, a struct containing a union cannot derive Debug, so
/// `#[derive(Debug)]` will not be added to that struct regardless of
/// whether `--derive Debug` is specified.
#[clap(long = "derive", value_enum, value_name = "TRAIT")]
extra_derives: Vec<ExtraDerive>,
Comment on lines +168 to +170
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ExtraDerive is fine, but I think the option should still be --derive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still is

}

#[derive(Debug, PartialEq, Eq, ValueEnum, Clone)]
Expand All @@ -164,6 +177,20 @@ enum InvalidCodes {
CompileError,
}

#[derive(Clone, Copy, Debug, ValueEnum)]
#[clap(rename_all = "PascalCase")]
enum ExtraDerive {
Debug,
}

impl ExtraDerive {
fn to_transpiler_derive(&self) -> Derive {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are two different types because I didn't want to make c2rust-transpile depend on clap so it could implement ValueEnum. It works out because allowing --derive Clone wouldn't make much sense anyway if that's enabled by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. One minor (clippy) suggestion:

Suggested change
fn to_transpiler_derive(&self) -> Derive {
fn to_transpiler_derive(self) -> Derive {

match self {
Self::Debug => Derive::Debug,
}
}
}

fn main() {
let args = Args::parse();

Expand Down Expand Up @@ -216,6 +243,7 @@ fn main() {
emit_no_std: args.emit_no_std,
enabled_warnings: args.warn.into_iter().collect(),
log_level: args.log_level,
derives: get_derives(&args.extra_derives),
};
// binaries imply emit-build-files
if !tcfg.binaries.is_empty() {
Expand Down Expand Up @@ -261,3 +289,15 @@ fn main() {
.expect("Failed to remove temporary compile_commands.json");
}
}

fn get_derives(extra_derives: &[ExtraDerive]) -> Vec<Derive> {
// Make sure there are no dupes and sort so the derives are always in the same order
let derives_set: HashSet<_> = DEFAULT_DERIVES
.iter()
.cloned()
.chain(extra_derives.iter().map(|d| d.to_transpiler_derive()))
.collect();
let mut derives: Vec<Derive> = derives_set.into_iter().collect();
derives.sort();
derives
}
9 changes: 9 additions & 0 deletions scripts/test_translator.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,12 @@ def __init__(self, log_level: str, path: str, flags: Set[str] = set()) -> None:
self.reorganize_definitions = "reorganize_definitions" in flags
self.emit_build_files = "emit_build_files" in flags

derive_flag_prefix = "derive:"
derives = set()
for derive_flag in filter(lambda f: f.startswith(derive_flag_prefix), flags):
derives.add(derive_flag[len(derive_flag_prefix):])
self.derives = derives

def translate(self, cc_db: str, ld_lib_path: str, extra_args: List[str] = []) -> RustFile:
extensionless_file, _ = os.path.splitext(self.path)

Expand All @@ -97,6 +103,9 @@ def translate(self, cc_db: str, ld_lib_path: str, extra_args: List[str] = []) ->
if self.emit_build_files:
args.append("--emit-build-files")

for derive in self.derives:
args.extend(["--derive", derive])

if self.log_level == 'DEBUG':
args.append("--log-level=debug")

Expand Down
29 changes: 29 additions & 0 deletions tests/structs/src/debug_derive.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
//! derive:Debug

#include <stdarg.h>

typedef struct {
int a;
} Debuggable1;

typedef struct {
va_list v;
} Debuggable2;

Debuggable1 *kDebuggable1;
Debuggable2 *kDebuggable2;

// A struct containing a union cannot derive Debug, so make
// sure we don't generate a #[derive] for it
typedef struct {
struct {
struct {
union {
int d;
float e;
} c;
} b;
} a;
} NotDebuggable1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can test this is not Debug using static_assertions::assert_not_impl_any!. There's also a similar assert_impl_one! you can use to do the check you're already doing, that a type is Debug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That way you also don't need to generate a value of the types, just the type itself.


NotDebuggable1 kNotDebuggable1;
11 changes: 11 additions & 0 deletions tests/structs/src/test_debug_derive.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
//! feature_c_variadic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?


use crate::debug_derive::{rust_kDebuggable1, rust_kDebuggable2};
use std::fmt::Debug;

pub fn test_debuggable() {
unsafe {
// Make sure all debuggable structs implement `Debug`
let _debuggable: Vec<*mut dyn Debug> = vec![rust_kDebuggable1, rust_kDebuggable2];
}
}
Loading