-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SvUV() macro 100% of time calls Perl_sv_2uv_flags() because SvUOK_nog/Perl_sv_setuv/SVf_IVisUV #22653
Comments
How much does:
actually save? Even though 4d55d9a states that this is "Pretty much the same change as bd30fe8", but that particular short-cut is in addition to that. I wonder if it would be better to simply remove that. |
I strongly suspect @tsee tested this with a micro-benchmark, but I do not know what benchmark was used. |
builtin.c is very new (2021), and while the code works, it is not a shining good example of C or XS code and best practices -actual bug cv_set_call_checker_flags(cv, builtin->checker, newSVuv(PTR2UV(builtin)), 0); leaks refcnt -remove and reorganize all the misaligned fields in "struct BuiltinFuncDescriptor", on x64, it was 6 ptrs big. After this patch, each element is 2 ptrs big. (21*6*8)-(21*2*8) = 672 bytes saved, from interp binary, and it was almost all null bytes. -To make a bitfield take bits from U8 sub_name_len not U16 ckval, which are 0-~400 perl opcodes. less chance of, faster visibility, and less debugging to do, if very long method names get added and broken, vs breaking perl runtime engine impl details if suddenly the opcodes grow in the future. -S_warn_experimental_builtin() will likely never execute, factor the "builtin->name" pointer read the cold separate static func vs at callers. -S_export_lexical() batch ptr r/w, so CC doesn' have to again emit a r/w to ithread interp var PL_curpad after the SvREFCNT_dec() fn call. Do ++ first, in case an almost impossible exception is thrown from SvREFCNT_dec(). Paranoia, no real reason. -also note inefficiency that pad_add_name_pvn() allocates a SV head, AND upgrades it to SVt_PVCV, and right after S_export_lexical() frees that SV *, and puts our own in. No Perl API exists to allocate a pad slot, with an uninit garbage pad slot returned ready to fill. -pad_add_name_sv->pad_add_name_pvn, pad_add_name_sv() has no perf (HEK? COW?) benefits over pad_add_name_pvn(), and is only a thin wrapper. Don't use SVs in non-SV APIs if a cheaper malloc()/cstring can be used. -"SvUV(ckobj)"->"SvUVX(ckobj)", remove the unconditional "2uv()" fn call. That fix locally the major problem in Perl#22653 . Bc we created this SV, nobody knows about it, and just blindly deref with SvUVX skipping magic and stringifying overhead. ck_builtin_*()s execute alot in mktables and modern perl code, its not the BOOT: sub. -'SV *prototype = newSVpvs(""); SAVEFREESV(prototype);' dont make this temp SV over and over. Use MY_CXT to store our 3 high refcnt COW strings. Using COW shared HEK instead other COW types, since HEK code is much older and "stable", and the "new COW" API in sv.c has a very little API, and I think, untested, would assert, panic, or generically corrupt a CV * leading to a SEGV. sv_sethek() is "safer". Not using flag SVppv_STATIC and C strings, since no API exists for SVppv_STATIC. Maybe one day, SVppv_STATIC can be put in here, for now sv_sethek() is simpler. -Perl_die->Perl_die_nocontext saves CC CPU opcodes on ithread perl. -S_import_sym() do not make multiple redundant cycle through SV code for 1 use "builtin::foo" and "&foo" strings. And S_import_sym() can be called with PP ->import(); many times x 21 on script start up, b/c of many .pm's wanting these subs. -S_cv_is_builtin() we set CvFILE() on our newXS call, it would be bizzare if the CV* does NOT have our CvFILE ro string ptrs anymore. Direct ptr comparison skips the call to libc on for streq. Should be a panic if CV_DYNFILE or alien CvFILE RO string ptr is found. But KISS and only do ptr cmp for perf. -pad_findmy_sv->pad_findmy_pvn, left is thin wrap for right, skip SV * alloc and SVPV manip code and go very close to assembly with a C stack buffer and maybe inlined memcpy()s. -BOOT: newXS_flags()->newXS_len_flags(), remove strlen(), also, share the "$" and "@" PVs between the CV*s. There is a lack of perlapi, and other design discussions, why newXS_len_flags() with RO const char * proto, mallloc()s a copy, and remember Perl rounds up to 16 bytes for a PV *, and "2 PTRs" secret malloc header, so, for prototype string "$", 2 bytes. 32 bytes * 21 XSUBS = 672 or more bytes were wasted using proto arg of newXS_len_flags(). sv_sethek() fixes that. Also remember the SV*/HEK* is reused for ck_entersub_args_proto() later on. savings are disk/file size (RO data) of libperl bin, and proc unconditional startup overhead in CPU time, perm malloc() reduction, faster ->import() per package/per yy_parser
Without researching, (correct this if I am wrong) I think pp_add() the pure perl op codes, always do IV math first, then upgrade to UV math on overflow, then upgrade to NV math on overflow, plus negative numbers are popular jkjk. If pp_math() ops are IV priority first, UV math will definitely be slower if that is the 2nd or 3rd branch/test jump in the C code. Perhaps some CPUs, gaming/assembly/big data back then, signed ints were some nanoseconds faster than unsigned ints, or zero extend vs sign extend was slightly different speeds on CPUs back then. Or Steffen??? has a biz algo, where negative numbers -2billion to 0 to +2billion, kept his code in IV integer mode, while UV's constantly overflowed and degraded to slower NV/FPUs. But that is very problem specific to one user. I want to just change the SvUV() macro, so SVIV no UV flag, take fast path (no fnc call), through SvUV() macro, but the git history is so complicated, and changed in the 2010s, it needs the original authors or chip in, or someone who remembers from back then. I vaguely remember Father C 10 years ago, being very active in poking and writing fixes for the SV XS magic system, since he found and claimed and probably did fix many cases of "data loss" or data degradation (IV/UV/NV/PV casting IDK where, that destroys round trip ability) through the perlapi setters/getters. Including some kind of flaws with private IOK NOK POK vs public IOK NOK POK. Revolves around data loss inside Perl Magic API, and that is the specific commit that caused the damage (broke the fast path in SvUV). A trolling argument I have is, is its "required" to call sv_2uv() fn call on an IV, incase the IV is negative, and that is a programmer error, or security exploit, and P5 must process abort or throw large console messages? I'd really like to see this fixed, since there is tons of CPAN code storing pointers in sv_setuv()/SvUV(), that is suffering. Perhaps, if nobody has technical input, it is just best to be adventurous and change SvUV() and see how much of core breaks, and see if CPAN complaints roll in, or its crickets from CPAN, which means the SvUV change was a success. I'd like a 2nd opinion, since git blame commits, point to repairing "data loss" 10 years ago, and a slower thought process is needed. |
related to "SvUV() macro 100% of time calls Perl_sv_2uv_flags" Perl#22653 Until Perl#22653 is solved, clean up newSVuv() and remove branch to "newSViv()" that is unexplained by git blame. BUT, keep original intent and behaviour of "newSViv()" branch for now. Add asserts to guard against 0x8000,0000 == SVf_IVisUV changing. Value of SVf_IVisUV can change in the future, and there might be (I didn't git blame), logic that sign flag and SVf_IVisUV are equal. But these changes depend on SVf_IVisUV being 0x8000,0000 and must be updated if SVf_IVisUV changes. Change SvXXXV_set() to be an explicity bodyless SV head optimization. MSVC 2022 -O1 combined SET_SVANY_FOR_BODYLESS_IV() and SvIV_set(). But instead of hopes and prayers on "UB" or "ISB/IDB" of CCs that could change at random in any previous or future build number of a CC, do it explictly. Bodyless SV head API is defined by P5P, not CC vendors. 9155444 3/20/2022 3:05:10 PM Perl_newSViv: simplify by using (inline) newSV_type Fix deoptimized Perl_newSViv(). In that commit it forgot about Perl_newSVuv(). Since newSV_type() is a inline fn, and "inline" is CC domain UB optimization. And newSV_type() is far more complex than CPP macro new_SV(), and newSV_type() depends on 100% perfection from CC's LTO engine and ".o" disk format, and possibly depends on the CC breaking ISO C spec with -O3 or -O4. Which turn on extreme SEGV inducing C variable aliasing rules that few C code bases tolerate. Quick examples, a reddit comment (not credible), claims "uint8_t *" and "char *" can not be casted since the CC or CPU has 9 bit bytes or a 9 wire data bus, and ECC parity wire is 9 of 9 for "uint8_t" and 8 of 9 for "signed char" and wire 9 for "char" is the ECC parity wire. The platform's libc's fwritef(), hides the secretly converts 9 bit bytes, to standard 8 bit bytes, making the CC "ISO C compliant". My more realistic scenario, inside newSV_type(). How can the CC know, what if Perl_more_sv() or Perl_more_bodies(), calls mprotect(), modifies "static const struct body_details bodies_by_type [];", calls mprotect() again, and returns execution to newSV_type()? Just switch to new_SV(). Its a CPP macro, not subject to CC UB inlining, and new_SV() only has 1 fn call and is super light weight. Old P5P commits/ML/CPAN dev talk about this area of code being crucial to (CPAN XS) deserializing perf in perl, so perf considerations, with proper asserts, has priority over readability. Links to old core commits in Perl#22653 briefly discuss deserializing perf as rational, so this patch also follows that design idea. Perl_vnewSVpvf(), "malloc(1ch);" which in reality is "malloc(16ch)" makes no sense, since almost zero chance fmtstr+args+\0 <= 16, and perl malloc() round up, is semi-UB/a build flag default on anyways. Using guesstimate malloc(pat_len), increases chances far higher, that a realloc() inside sv_vcatpvfn_flags(), OS realloc(), will realloc() in place, not changing the ptr, esp assuming OS malloc() does bucket of power of 2 allocator algo. Assume, 40ch malloc() fmt string, bucket to 64ch by OS malloc(), throw in a %u 32b, that is max +10ch-2ch for "%u". So output is 48ch. realloc(48ch) is inplace, therefore it is a win.
On Sun, Oct 13, 2024 at 05:48:12AM -0700, bulk88 wrote:
Without researching, (correct this if I am wrong) I think pp_add() the pure perl op codes, always do IV math first, then upgrade to UV math on overflow, then upgrade to NV math on overflow,
Yes, it was effectively a way to keep an SV as an integer (now unsigned) for
values between IV_MAX and UV_MAX, rather than upgrading to NV.
The number-handling code in the core is likely to be optimised around this
assumption - i.e. it wasn't envisioned that the UV flag would typically be
set for values below IV_MAX.
…--
Technology is dominated by two types of people: those who understand what
they do not manage, and those who manage what they do not understand.
|
We had at least one bug #12993 report where a 32-bit Linux system was allocating memory above the 2G mark in a child thread. |
So are negative pointers or 1.3 GB long scalars on 32b perl officially supported by P5P or not? I remember sisyphus and rurban some years ago both claim /3GB mode and Win32 32b Perl had instant SEGVs on start up and it was unfixable. I asked did they try to fix the SEGVs, they said yes, but abandoned it after the 3 or 5th SEGV fix they did and then deleted their private branch. I've never compiled /3GB mode myself. |
As far as I know they are supported. I'd certainly try to fix such bugs if I saw them and had the attention left to do so, as I fixed #12993. |
It's been 3 weeks; I don't think we're likely to see more technical input, and this strikes me as a pragmatic, if "adventurous", position. |
Description
I was stepping debugging some XS code, and noticed, if you create a SvUV, with sv_setuv(), then later use SvUV() macro or eqv inline function, %99.999 of the time, Perl_sv_2uv_flags() will execute.
This "SvUV" created with sv_setuv(), or newSVuv()
the IV is really a C pointer/opaque pointer from a C library. 100% of the time, SvUV() macro, will execute Perl_sv_2uv_flags(). Because of
that
turns off SVf_IVisUV in final SvUV, then the SvUOK_nog() in SvUV(), says "NOT AN INTEGER" and calls the HEAVY getter Perl_sv_2uv_flags() each time. Since these are pointers, I think on 99% OSes, highest bit in an address, or negative addresses, are kernel space, and prohibited for user mode. Win32 32b and maybe some Solaris'es use and alloc high bit/negative pointer addresses, but I dont think P5 is compatible with such CPUs/C compilers/build configs.
I traced
to
Revision: 4bac9ae
Author: Chip Salzenberg chip@pobox.com
Date: 6/22/2012 6:18:18 PM
Message:
Magic flags harmonization.
traced to
Revision: 4d55d9a
Author: Steffen Müller smueller@cpan.org
Date: 11/28/2014 11:34:00 AM
Message:
Repeat newSViv optimization for newSVuv
and that quotes
Revision: bd30fe8
Author: Eric Herman eric@freesa.org
Date: 11/28/2014 9:08:04 AM
Message:
Speed up newSViv()
see also
Revision: 013abb9
Author: Nicholas Clark nick@ccl4.org
Date: 2/1/2012 4:58:14 PM
Message:
Update, correct and clarify the comment in Perl_sv_setuv().
AKA #8002
So questions, why is #define SvUOK_nog(sv) ((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV|SVs_GMG)) == (SVf_IOK|SVf_IVisUV)) used? isnt the byte size and bitfields for IV and UV 100% identical all CPUs OSes?
Doesnt SVf_IVisUV only matter for NV FP double conversion and stringify?
Does SV MAGIC somehow pass "want UV" vs "want IV" flags to the CPAN XS vtable or CPAN PP tied SCALARs?
Do we throw truncate/overflow console warnings for SvIV ptrs (SVf_IOK yes SVf_IVisUV no), passed to SvUV() getter macro, with the SVIV being "-1"? I can't find such code (XSUB level, not PP ops) in sv.c, but I will ask anyways.
This is bad enough a fix should go into maint releases.
PS, the problem in found in blead Perl 5 core XS and PP and runtime code. Because SvUV() is being used in some places to fetch pointer type integers.
Dirty fix, CPAN author side, always use setiv() SvIV() SvIVX() or ONLY use SvUVX() in XS code to store/get/read/set pointers and pointer like data, hell, just NEVER use SvUV() macro, since ONLY "(UV)-1" and "(UV)-123456" ever use the "fast path" in SvUV().
There needs to be a P5P fix for this. Vs manual hunting and patching of CPAN XS and core.
Steps to Reproduce
sv_setuv(); sv_dump(); SvUV(); and use a C step debugger inside SvUV(); statement
Expected behavior
Do not do a func call inside SvUV(); for SVIV flagged SV heads.
Perl configuration
NA, blead perl 5.41 and many other versions
The text was updated successfully, but these errors were encountered: