Skip to content

feat: Improve JSON serialization with ModifiableVariable #236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

TrueSkrillor
Copy link
Contributor

This PR replaces the cumbersome and verbose XML serialization of ModifiableVariable objects using Jakarta with JSON serialization using Jackson. It is one of several PRs with the goal of migrating TLS-Attacker to a full JSON serialization. Jackson is already being used, for example, by the Scanner-Core for serializing scanner results. This change has become necessary for various reasons that limit TLS-Attacker and related projects. Among others, those are:

  • Verbose XML structure due to repeated opening and closing tags (especially in the case of arrays and lists).
  • Seemingly "best effort" use of XML annotations without a proper bottom-up approach. This PR designs the JSON structure from the bottom up (starting with ModifiableVariables), while adding required flags and less verbose element names where suitable.
  • Severe limitations in combination with inheritance and polymorphism in Jakarta. Jackson allows using reflection to provide named subtypes at runtime. This is particularly interesting for Protocol-Attacker where subtypes may not be known at compile-time. Note that for all subtypes in the ModifiableVariable library, the named subtypes are explicitly given to use custom, less verbose names rather than their class names.
  • Lack of XSD schema generation. This PR configures a plugin to generate an up-to-date JSON schema whenever mvn install is run.
  • Jakarta required specifying the adapter for each field. Jackson allows for global configuration (i.e., hex encoded byte arrays) using pre-defined modules. This PR introduces the ModifiableVariableModule module which configures Jackson's object mapper for (de-)serializing modifiable variables.

The resulting JSON schema for a ModifiableVariable instance can be found under src/main/resources/ModifiableVariable.schema.json.

Example (will return a ModifiableBigInteger that computes (originalValue + 10) * 5):

{
    "$schema": "ModifiableVariable.schema.json",
    "@type": "BigInteger",
    "modifications": [
        {
            "@type": "BigIntegerAdd",
            "summand": 10
        },
        {
            "@type": "BigIntegerMultiply",
            "factor": 5
        }
    ]

@TrueSkrillor TrueSkrillor requested a review from ic0ns June 19, 2025 13:09
@ic0ns
Copy link
Contributor

ic0ns commented Jun 19, 2025

To be honest - I am not so sure about this one. Like - point1 you are making is not really relevant for us - we are not storing billions and trillions of XML files - so spending a few more bytes on tags is not really worth fixing. Point 2 i dont understand. Point 3: We had this for XML - we removed it cause it was making our lives miserable. Point 4 is nice - but I dont think this is that much related to XML but just to the framework.

For me, this is a bit like switching from Java to kotlin - like - i see some benefits - but its not really fixing a real issue we are facing that I would say its worth it to switch.

Like - at the moment we kind of have both JSON and XML cause MongoDB in the crawler prefers JSON - but the next usecase is some different dataformat... the only real benefit is that we could drop one of the two imho. Maybe we can discuss in the next meeting

@TrueSkrillor
Copy link
Contributor Author

Yes, we should discuss this. I consider this PR mandatory to allow a soft switch towards Protocol-Attacker as Jakarta prevents us from introducing common superinterfaces for types used in the configuration and workflow files. I came across this issue while trying to introduce a common WorkflowTraceType interface in Protocol-Attacker, which is stored by an abstract Config base class. With Jackson, it is possible to use interfaces and specify which implementation to use. Maybe the title of this PR is misleading, but the main motivation is to move away from Jakarta, not replacing XML (which is just the logical, and from my point of view desirable, consequence).

@TrueSkrillor
Copy link
Contributor Author

TrueSkrillor commented Jun 20, 2025

Minimal example:

public abstract class Config {
    private WorkflowTraceType workflowTraceType;
}

public class TlsConfig extends Config { }

public interface WorkflowTraceType { }

public enum TlsWorkflowTraceType implements WorkflowTraceType {
    DTLS
}

With Jakarta, it is not possible to serialize TlsConfig because of the interface type WorkflowTraceType in the base class (although it works just fine programmatically). With Jackson, you can simply tell the object mapper used in TLS-Attacker to map WorkflowTraceType to TlsWorkflowTraceType by adding a module with addAbstractTypeMapping(WorkflowTraceType.class, TlsWorkflowTraceType.class). Using generics here does not solve this issue due to type erasure (Jakarta will only see the interface).

@TrueSkrillor
Copy link
Contributor Author

TrueSkrillor commented Jun 20, 2025

XmlAnyElement is catch-all, so this does not work for more than one such property.

From the documentation:

There can be only one XmlAnyElement annotated JavaBean property in a class and its super classes.

@ic0ns
Copy link
Contributor

ic0ns commented Jun 20, 2025

so this does not work for more than one such property.

what do you mean?

@ic0ns
Copy link
Contributor

ic0ns commented Jun 20, 2025

Ah I understand now 🤔

@ic0ns
Copy link
Contributor

ic0ns commented Jun 20, 2025

Ok that convinces me - the alternatives are not looking great - I dont have strong opinions on XML vs Json - for me its good enough when you can machine read it. Another big concern I have is with the "how" we use it. Cause right now, it looks like we need to add a lot of annotations as seen in your PR - which I really would want to try to avoid. Also looks like we are starring at a lot of refactoring in the downstream project with this PR that will probably keep us busy for weeks.

@TrueSkrillor
Copy link
Contributor Author

Regarding annotations, most of them are there to allow for a less verbose JSON serialization / deserialization. Currently, my PR uses the following annotations:

  • @JsonTypeInfo -> Required to allow Jackson to encode the type of an interface
  • @JsonSubTypes -> Used in conjunction with @JsonTypeInfo(use = JsonTypeInfo.Id.NAME) to map names to actual types. Can be considered optional, if we decide to use @JsonTypeInfo(use = JsonTypeInfo.Id.MINIMAL_CLASS) instead. However, I am not sure that the generated schema will include subtypes then.
  • @JsonInclude -> Only used for nicer JSON representation (i.e., leave out null values and empty lists). If we do not care, we can simply remove those annotations.
  • @JsonAutoDetect -> Can also be set globally when creating a new ObjectMapper, if we use the same autodetect mechanism for each class (that is, include all fields but no getter / setter). Equivalent to @XmlAccessorType(XmlAccessType.FIELD).
  • @JsonProperty(required = true) -> Used to mark a property as required in the schema.

On the other hand, @XmlJavaTypeAdapter and @XmlRootElement annotations can be removed without replacement.

@TrueSkrillor
Copy link
Contributor Author

In the meantime, I also thought about how we can migrate all projects to Jackson. The most promising migration path from my point of view would be to keep XML annotations in place for now, and simply add the Jackson annotations on top of it. Once all projects are updated to use Jackson instead of Jakarta, we can simply remove XML-related annotations and dependencies in all projects all at once. This would allow us to migrate to Jackson step-by-step without disrupting BOM updates.

@ic0ns
Copy link
Contributor

ic0ns commented Jun 23, 2025

* `@JsonTypeInfo` -> Required to allow Jackson to encode the type of an interface

sure

* `@JsonSubTypes` -> Used in conjunction with `@JsonTypeInfo(use = JsonTypeInfo.Id.NAME)` to map names to actual types. Can be considered optional, if we decide to use `@JsonTypeInfo(use = JsonTypeInfo.Id.MINIMAL_CLASS)` instead. However, I am not sure that the generated schema will include subtypes then.

This I want to avoid at all cost. Schema I am willing to drop for this.

* `@JsonInclude` -> Only used for _nicer_ JSON representation (i.e., leave out null values and empty lists). If we do not care, we can simply remove those annotations.

Can we configure this globally instead?

* `@JsonAutoDetect` -> Can also be set globally when creating a new `ObjectMapper`, if we use the same autodetect mechanism for each class (that is, include all fields but no getter / setter). Equivalent to `@XmlAccessorType(XmlAccessType.FIELD)`.

We should do that.

* `@JsonProperty(required = true)` -> Used to mark a property as required in the schema.

I assume this mostly affects modvar - so it should be fine.

@TrueSkrillor
Copy link
Contributor Author

TrueSkrillor commented Jun 25, 2025

This I want to avoid at all cost. Schema I am willing to drop for this.

Okay, I tested a few approaches. It seems that we can either have a correct schema or no @JsonSubTypes annotation. However, in this case, I value a proper schema (especially for higher level structures like configuration and workflow descriptions) significantly higher than removing a few lines of code. What is the reason you prefer it the other way around?

@TrueSkrillor TrueSkrillor force-pushed the feat/json-serialization branch 2 times, most recently from 8666a86 to 40a1e33 Compare June 25, 2025 08:43
@ic0ns
Copy link
Contributor

ic0ns commented Jun 25, 2025

People were constantly forgetting to add types, you sometimes had the same "style" of annotations at multiple points in the code base and then had to search all over the place where you have to add the new class. Additionally, its not flexible when you want to add more classes from the outside.

@TrueSkrillor TrueSkrillor force-pushed the feat/json-serialization branch 2 times, most recently from d25d285 to 8cc3c7b Compare June 25, 2025 09:41
@TrueSkrillor TrueSkrillor force-pushed the feat/json-serialization branch from 8cc3c7b to 475dd26 Compare June 25, 2025 09:42
@TrueSkrillor
Copy link
Contributor Author

People were constantly forgetting to add types, you sometimes had the same "style" of annotations at multiple points in the code base

This was an issue with Jakarta, but not Jackson. Jackson requires this annotation exactly once in the base class and not in multiple locations throughout the code base. With Jakarta, you were required to denote possible subtypes in all classes using the supertype. So, this should be a non-issue.

Additionally, its not flexible when you want to add more classes from the outside.

While we statically list those subtypes to be included in the schema, this does not mean that you cannot include additional named types from the outside. The object mapper provides one with the registerSubtypes() method. For example, to register a custom ModifiableShort from the outside:

ObjectMapper mapper = new ObjectMapper();
mapper.registerModule(new ModifiableVariableModule());

mapper.registerSubtypes(new NamedType(ModifiableShort.class, "Short"));
// Now, mapper can (de-)serialize ModifiableShort as a subtype of ModifiableVariable

@ic0ns
Copy link
Contributor

ic0ns commented Jun 25, 2025

but if I include additional types from the outside I cannot use the schema anymore? If I can - why cant i just directly add the types from the outside without listing them

@ic0ns
Copy link
Contributor

ic0ns commented Jun 25, 2025

Cause - like if I move "DataContainer" or something to protocol-attacker - I cannot add these annotations anymore anyways cause all the implementations are outside of protocol-attacker

@TrueSkrillor
Copy link
Contributor Author

  • @JsonInclude -> Only used for nicer JSON representation (i.e., leave out null values and empty lists). If we do not care, we can simply remove those annotations.

Can we configure this globally instead?

We can set a default inclusion globally, if we want to. However, this may result in unwanted exclusions where we want to explicitly include null or empty values. I only included the annotation where I deemed exclusion of null / empty values okay to do. The better way might be to remove these annotations altogether.

@TrueSkrillor
Copy link
Contributor Author

but if I include additional types from the outside I cannot use the schema anymore? If I can - why cant i just directly add the types from the outside without listing them

This is a limitation of the schema generation library I used here. The Jackson module for this library correctly picks up @JsonSubTypes annotated classes, but does not support detection of subtypes through the object mapper. However, the library itself is rather modular. So it should be possible to implement a custom Jackson module (or open a PR upstream) to support another use parameter.

@TrueSkrillor TrueSkrillor force-pushed the feat/json-serialization branch from c0bb9e4 to bdf4fde Compare June 25, 2025 12:55
@TrueSkrillor
Copy link
Contributor Author

TrueSkrillor commented Jun 25, 2025

Okay, I think I found a decent solution for the @JsonSubTypes issue. I implemented a custom module for the schema library that can resolve subtypes from the object mapper without requiring @JsonSubTypes to be present. I consequently removed this annotation and replaced @JsonTypeInfo(use = JsonTypeInfo.Id.NAME) with @JsonTypeInfo(use = JsonTypeInfo.Id.SIMPLE_NAME) to use the class name (without package name) instead. Subtypes are registered as part of the Jackson module using reflection.

As an added bonus, the schema is now more precise because I implemented a check to verify that subtypes match the bound types of the declared type. Now, for every type of ModifiableVariables, the modifications only list the ones actually allowed (rather than all).

For now, I kept all other annotations in place.

@TrueSkrillor TrueSkrillor changed the title feat: Replace XML with JSON serialization feat: Improve JSON serialization with ModifiableVariable Jun 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants