-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disambiguate obfuscated/non-obfuscated names #718
Comments
I don't fully understand the problem... I'd rather want to avoid another parallel obfuscated name infrastructure. Are you passing a rename map into Redex, e.g. from a previous Proguard run? A few thoughts / suggestions:
Does that help? |
We run proguard obfuscation before redex since some passes require a proguard mapping file in order to run properly. I'm not sure why this is the case but it is (or was when we first adopted redex). This means we have a bunch of obfuscated classes in the mix as well as the rename pass to further shrink the renaming although this runs second to last right before This leads to an error-prone situation where some pass developers forget that they are working with obfuscated/deobfuscated strings and try to do comparisons between them which always fails leading to problems. We've hit this several times already and were trying to come up with a fix to make the situation less error-prone. While using Every developer we have working on redex passes has made this mistake at least once. |
It sounds like in your situation the At one point, we were in the same spot. Maybe @ssj933 has more context on the Proguard deprecation history. But I wonder how often does your optimization logic rely on the names of a type/class. That's not very common for us. In most cases, we rely on the name of some key classes that are the root of a hierarchy. Most of them are not renamed either because they are external (part of JDK), or marked as "donotrename" in the dependency (Kotlin runtime).
Do you plan to add an option to plugin the Proguard rename map into the |
I'm pretty sure the proguard map is already taken into account for the
deobfuscated_name. That isn't a problem. It's just the ease with which new
developers can forget to translate between de obfuscated and obfuscated.
We have some cases where we refer to classes by their deobfuscated names
such as when we input proguard's usage.txt or are dealing with some
internal special classes.
…On Tue, Aug 30, 2022, 12:26 PM Wei Zhang ***@***.***> wrote:
It sounds like in your situation the deobfuscated_name is already
"obfuscated" by Proguard, is that right?
At one point, we were in the same spot. Maybe @ssj933
<https://github.com/ssj933> has more context on the Proguard deprecation
history.
But I wonder how often does your optimization logic rely on the names of a
type/class. That's not very common for us. In most cases, we rely on the
name of some key classes that are the root of a hierarchy. Most of them are
not renamed either because they are external (part of JDK), or marked as
"donotrename" in the dependency (Kotlin runtime).
My idea is to create two different types for obfuscated and deobfuscated
strings and have all names derived from the APK be instances of obfuscated
strings
Do you plan to add an option to plugin the Proguard rename map into the
deobfuscated_names in Redex?
—
Reply to this email directly, view it on GitHub
<#718 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AQ3JAL7TY3YG5SXNYV6LWZLV3ZN4ZANCNFSM577VSLLA>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Yeah we have to live with that. It hasn't been a major concern tho. Can I ask why is |
Many passes require that proguard keep rules have been communicated to Redex via some ProGuard config files (and even that is just a superficial sanity check), but I am not aware of any relevant passes that would require an actual pro guard (class-name) mapping. If at all possible, I suggest that you stop running Proguard to obfuscate classes first. That's the direction we've been taking. |
It's what we want, but we also want it to be obvious that we need to use If we had this typing difference, we'd know at compile-time what we made a mistake. |
IC. Thanks for explaining that. It's more clear to me what this is about now. Sounds like you are proposing adding subtypes to |
Yes, that's pretty much what I am suggesting. I think we can keep DexString
(and probably std::string as well) as a common parent and treat obfuscated
and deobfuscated types as specializations to keep compatibility with
existing code.
…On Wed, Aug 31, 2022, 12:31 PM Wei Zhang ***@***.***> wrote:
If we had this typing difference, we'd know at compile-time what we made a
mistake.
IC. Thanks for explaining that. It's more clear to me what this is about
now.
Sounds like you are proposing adding subtypes to DexString to encoding
obfuscation status into types. Or something similar? Personally I'm open to
a change like that as long as it does not require a sweeping update for
existing name handling logic in the code base.
—
Reply to this email directly, view it on GitHub
<#718 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AQ3JALZVEFZZS6YFWQKGKI3V36XKVANCNFSM577VSLLA>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
SG. I will be open to see what the change looks like. But I guess there's an underlying question about the differences in how we set things up. Like in your case, there's more interaction between the preceding Proguard and Redex. But we have deprecated Proguard a few years back. So the vast majority of the symbol obfuscation is done by Redex (towards the end of the optimization pipeline). That's maybe a worthwhile discussion by itself. |
It's a good discussion to have, deprecating proguard. But we can't do it in
our build because things start crashing when proguard does not run
obfuscation. Must have too many Gradle plugins that depend on proguardisms
that we can look into later.
…On Thu, Sep 1, 2022, 11:05 AM Wei Zhang ***@***.***> wrote:
SG. I will be open to see what the change looks like.
But I guess there's an underlying question about the differences in how we
set things up. Like in your case, there's more interaction between the
preceding Proguard and Redex. But we have deprecated Proguard a few years
back. So the vast majority of the symbol obfuscation is done by Redex
(towards the end of the optimization pipeline). That's maybe a worthwhile
discussion by itself.
—
Reply to this email directly, view it on GitHub
<#718 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AQ3JAL6YWAQA3PF7CSJLS6DV4DV6NANCNFSM577VSLLA>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Obfuscated entities in dex files may include types, field names, method names, method protos with their contained result types and (argument) type lists. Do you want subclasses for DexType, DexField, etc.? For DexField and DexMethod we already have a special thing going where we play games with the -Ref subtypes. I wonder how the different subtypes would affect existing passes. For example, strings, there's a rule that there is only one DexString* for each unique string payload. Would your subtype get a different pointer address for identical strings? This would create problems, semantic and optimization-quality-wise. In general, the Dex* concepts mirror quite closely concepts that are found in dex files, including uniqueness properties. Regarding DexStrings in particular, we've tried to make sure they are very lightweight. On an 64-bit platform, each DexString takes up 16 bytes, and doesn't have a vtable. I'd like to keep that. Anyway, I am not sure what you exactly you have in mind. Some examples for potential API changes would help :-) |
My team and I have been developing passes for use internally but we commonly hit an issue where we interact with deobfuscated names (perhaps from a hardcoded list,
usage.txt
from proguard, or other sources) but then use them with names retrieved fromDexClass
orDexType
reachable fromscope
which are generally obfuscated. This has caused many problems.Is this also an issue for others? And does anyone have ideas for a fix?
My idea is to create two different types for obfuscated and deobfuscated strings and have all names derived from the APK be instances of obfuscated strings and where APIs that read deobfuscated names (ones that developers write for input names) use the deobfuscated type.
I'm willing to do the work to refactor, but this would be a huge change and I wanted to get feedback before starting work on what may be a doomed project.
The text was updated successfully, but these errors were encountered: