- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
          Little improves in CString new when creating from slice
          #92124
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
  
    Little improves in CString new when creating from slice
  
  #92124
              Conversation
| r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override) | 
| I checked generated asm for  use std::ffi::{CString, NulError};
#[no_mangle]
pub extern "Rust" fn make_c_string()->Result<CString, NulError>{
    CString::new("Hello world!")
}Asm generated using command:  Old generated code    .text
    .def     @feat.00;
    .scl    3;
    .type   0;
    .endef
    .globl  @feat.00
.set @feat.00, 0
    .file   "cstring_new.a85eb7bc-cgu.0"
    .def     make_c_string;
    .scl    2;
    .type   32;
    .endef
    .section    .text,"xr",one_only,make_c_string
    .globl  make_c_string
    .p2align    4, 0x90
make_c_string:
.seh_proc make_c_string
    pushq   %rsi
    .seh_pushreg %rsi
    pushq   %rdi
    .seh_pushreg %rdi
    subq    $56, %rsp
    .seh_stackalloc 56
    .seh_endprologue
    movq    %rcx, %rsi
    leaq    __unnamed_1(%rip), %rdx
    leaq    32(%rsp), %rdi
    movl    $12, %r8d
    movq    %rdi, %rcx
    callq   _ZN70_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..SpecIntoVec$GT$8into_vec17h42e14556a4e783e0E
    movq    %rsi, %rcx
    movq    %rdi, %rdx
    callq   _ZN3std3ffi5c_str7CString4_new17hd06ee8640c50ad96E
    movq    %rsi, %rax
    addq    $56, %rsp
    popq    %rdi
    popq    %rsi
    retq
    .seh_endproc
    .section    .rdata,"dr",one_only,__unnamed_1
__unnamed_1:
    .ascii  "Hello world!"New generated code    .text
    .def     @feat.00;
    .scl    3;
    .type   0;
    .endef
    .globl  @feat.00
.set @feat.00, 0
    .file   "cstring_new.558cd7f7-cgu.0"
    .def     make_c_string;
    .scl    2;
    .type   32;
    .endef
    .section    .text,"xr",one_only,make_c_string
    .globl  make_c_string
    .p2align    4, 0x90
make_c_string:
.seh_proc make_c_string
    pushq   %rsi
    .seh_pushreg %rsi
    subq    $32, %rsp
    .seh_stackalloc 32
    .seh_endprologue
    movq    %rcx, %rsi
    leaq    __unnamed_1(%rip), %rdx
    movl    $12, %r8d
    callq   _ZN66_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$8new_impl17ha5748a2c8ce696e6E
    movq    %rsi, %rax
    addq    $32, %rsp
    popq    %rsi
    retq
    .seh_endproc
    .section    .rdata,"dr",one_only,__unnamed_1
__unnamed_1:
    .ascii  "Hello world!"It can be seen that new caller code is shorter. As for lto performance, I compiled same code used  On Unix, there is probably no difference because Rust uses  | 
| I have 2 questions: 
 Example: ; _$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$::new_impl::hcd1310bd9104df0d
_ZN83_$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17hcd1310bd9104df0dE proc near
push    rsi
sub     rsp, 20h
mov     rsi, rcx
call    _ZN75_$LT$$RF$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17h8473efdc7dbe7405E ; _$LT$$RF$$u5b$u8$u5d$$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$::new_impl::h8473efdc7dbe7405
mov     rax, rsi
add     rsp, 20h
pop     rsi
retn
_ZN83_$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17hcd1310bd9104df0dE endp | 
| Surprisingly, adding  | 
| It turns out that  | 
new when creating from slicenew when creating from slice
      | Removed all code related to new allocation mechanism. | 
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| JFYI: I wouldn't be able to participate until 15th January. I would return to this PR after that if needed. | 
Old code already contain optimization for cases with `&str` and `&[u8]` args. This commit adds a specialization for `&mut[u8]` too. Also, I added usage of old slice in search for zero bytes instead of new buffer because it produce better code for Windows on LTO builds. For other platforms, this wouldn't cause any difference because it calls `libc` anyway. Inlined `_new` method into spec trait to reduce amount of code generated to `CString::new` callers.
| @kennytm May you review my PR please? :) | 
| This seems okay to me, thanks. @bors r+ | 
| 📌 Commit 4b62a77 has been approved by  | 
…askrgr Rollup of 10 pull requests Successful merges: - rust-lang#88642 (Formally implement let chains) - rust-lang#89621 (doc: guarantee call order for sort_by_cached_key) - rust-lang#91278 (Use iterator instead of recursion in `codegen_place`) - rust-lang#92124 (Little improves in CString `new` when creating from slice) - rust-lang#92783 (Annotate dead code lint with notes about ignored derived impls) - rust-lang#92797 (Remove horizontal lines at top of page) - rust-lang#92920 (Move expr- and item-related pretty printing functions to modules) - rust-lang#93041 (Remove some unused ordering derivations based on `DefId`) - rust-lang#93051 (Add Option::is_some_with and Result::is_{ok,err}_with) - rust-lang#93062 (Update books) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup


Old code already contain optimization for cases with
&strand&[u8]args. This commit adds a specialization for&mut[u8]too.Also, I added usage of old slice in search for zero bytes instead of new buffer because it produce better code for constant inputs on Windows LTO builds. For other platforms, this wouldn't cause any difference because it calls
libcanyway.Inlined
_newmethod into spec trait to reduce amount of code generated toCString::newcallers.