savannstm/marshal-rs

marshal-rs

marshal-rs is a complete Rust implementation of Ruby-lang's Marshal.

It is capable of 🔥 BLAZINGLY FAST loading of data from dumped Ruby Marshal files, as well as 🔥 BLAZINGLY FAST dumping of it back to Marshal format.

Installation

cargo add marshal-rs

Overview

This crate has two main structs, Loader and Dumper, along with helper functions that use them internally. There are three load functions (load, load_utf8 and load_binary) and a single dump function, dump.

load takes a &[u8] of Marshal data bytes (which can be read using std::fs::read) together with an optional instance_var_prefix, and returns a Value.

dump, in turn, takes a marshal_rs::Value together with an optional instance_var_prefix, and serializes it back to a Vec<u8> Marshal byte stream. It does not preserve the strings' initial encoding: all strings are written as UTF-8.

By default, the load function converts Ruby strings that include an encoding instance variable to JSON strings, and those that don't to byte arrays.

The load_utf8 function tries to convert byte arrays without an encoding instance variable to strings: it produces a string if the array is valid UTF-8, and an object otherwise.

The load_binary function converts all strings to objects.

This behavior can also be controlled in Loader by calling Loader::set_string_mode.
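The three string modes described above can be pictured with a small standalone sketch. It does not use the crate: Mode, Decoded and decode_ruby_string are hypothetical names invented for this illustration, "object" is modelled as a plain byte vector, and falling back to bytes when a string with an encoding instance variable is not valid UTF-8 is an assumption, not documented behavior.

```rust
/// Hypothetical result type for this sketch (not the crate's `Value`).
#[derive(Debug, PartialEq)]
pub enum Decoded {
    Text(String),
    Bytes(Vec<u8>),
}

/// Hypothetical mode switch mirroring `load`, `load_utf8` and `load_binary`.
#[derive(Clone, Copy)]
pub enum Mode {
    Auto,   // like `load`: only strings with an encoding ivar become text
    Utf8,   // like `load_utf8`: also try strings without an encoding ivar
    Binary, // like `load_binary`: never produce text
}

pub fn decode_ruby_string(bytes: &[u8], has_encoding_ivar: bool, mode: Mode) -> Decoded {
    match mode {
        Mode::Binary => Decoded::Bytes(bytes.to_vec()),
        Mode::Auto if !has_encoding_ivar => Decoded::Bytes(bytes.to_vec()),
        // `Auto` with an encoding ivar, or `Utf8`: keep text only if it is valid UTF-8.
        // The byte fallback for invalid UTF-8 is an assumption of this sketch.
        _ => match std::str::from_utf8(bytes) {
            Ok(text) => Decoded::Text(text.to_owned()),
            Err(_) => Decoded::Bytes(bytes.to_vec()),
        },
    }
}
```

With this sketch, the bytes of "abc" without an encoding instance variable stay bytes in Auto mode but become text in Utf8 mode, while Binary mode always keeps bytes.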

You can manage the prefix of instance variables using the instance_var_prefix argument of load and dump, or by calling Loader::set_instance_var_prefix or Dumper::set_instance_var_prefix. The passed string replaces the "@" prefix of instance variable names.
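As a rough picture of what the prefix replacement means for key names, here is a standalone sketch. The helper names and the "iv_" prefix are made up for illustration; the crate's internal handling may differ.

```rust
/// Hypothetical helper: what loading with a custom prefix does to a key name.
/// Ruby's "@" is swapped for `prefix`; names without "@" are left untouched.
pub fn with_custom_prefix(name: &str, prefix: &str) -> String {
    match name.strip_prefix('@') {
        Some(rest) => format!("{prefix}{rest}"),
        None => name.to_owned(),
    }
}

/// Hypothetical inverse: what dumping with the same prefix restores.
pub fn with_ruby_prefix(name: &str, prefix: &str) -> String {
    match name.strip_prefix(prefix) {
        Some(rest) => format!("@{rest}"),
        None => name.to_owned(),
    }
}
```

The point of the round trip is that the same prefix should be used for loading and dumping, so that "@name" read as "iv_name" is written back as "@name".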

To avoid loss of precision, floats are stored as strings.

If a Marshal file contains any extra float mantissa bits, marshal-rs discards them. The latest version of Marshal, 4.8, doesn't write them, but it still preserves them if it encounters any. marshal-rs does not.
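Since floats arrive as strings, code that needs an actual number has to parse the text itself. A minimal sketch, assuming the stored text uses the spellings Ruby's Marshal writes ("13.37", "inf", "-inf", "nan"); float_from_text is a hypothetical helper, not part of the crate:

```rust
/// Hypothetical helper: turn a float kept as text into an `f64` on demand.
/// Returns `None` if the text is not a float Rust can parse.
pub fn float_from_text(text: &str) -> Option<f64> {
    text.parse::<f64>().ok()
}
```

Keeping the text around and parsing only when needed means the original digits can be written back unchanged.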

This crate wraps serde_json::Value because it needs to cleanly track unique object instances and object metadata.

The table shows how marshal-rs serializes Ruby types to Value:

| Ruby object | Serialized to Value |
| --- | --- |
| nil | null |
| true, false | true, false |
| 1337 (Integer) | 1337 |
| 36893488147419103232 (Big Integer) | "36893488147419103232" |
| 13.37 (Float) | "13.37" |
| "ligma" (String, with instance variable) | "ligma" |
| :ligma (Symbol) | "ligma" |
| /lgma/i (Regex) | "/lgma/i" |
| [ ... ] (Array) | [ ... ] |
| Hash, Struct | IndexMap<Value, Value> |
| Object.new | IndexMap<String, Value> |
| Class, Module | null (Doesn't dump any data to Marshal) |

Value can be stringified and written to JSON using the Value::to_string function. This wraps each non-trivial value in an object that holds its metadata as object keys. If a metadata field is empty, it isn't written to JSON. For trivial values (null, bool, integer), the literal value is inserted. For example:

null becomes: null.

[null, true, 1] becomes the object:

{
    "__id": number,
    "__type": 9,
    "__value": [null, true, 1]
}

Possible __type values are defined in src/types.rs:

pub enum ValueType {
    #[default]
    Null = 0,
    Bool(bool) = 1,
    Integer(i32) = 2,
    Float(String) = 3,
    Bigint(String) = 4,
    String(String) = 5,
    Bytes(Vec<u8>) = 6,
    Symbol(String) = 7,
    Regexp(String) = 8,
    Array(Vec<Value>) = 9,
    Object(ObjectMap) = 10,
    Class = 11,
    Module = 12,
    HashMap(HashMap) = 13,
    Struct(HashMap) = 14,
}

Possible __flags values are defined in src/types.rs:

struct ValueFlags: u8 {
    const None = 0;
    const OldModule = 1;
    const UserClass = 2;
    const Data = 4;
    const UserDefined = 8;
    const UserMarshal = 16;
}

Keep in mind that __flags can also be a combination of several flags.
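Because each flag is a distinct power of two, a combined __flags number can be taken apart with bitwise AND. The sketch below reuses the numeric values listed above; the FLAGS table and the flag_names function are hypothetical helpers, not part of the crate.

```rust
/// Flag bits and names, copied from the `ValueFlags` listing above.
pub const FLAGS: [(u8, &str); 5] = [
    (1, "OldModule"),
    (2, "UserClass"),
    (4, "Data"),
    (8, "UserDefined"),
    (16, "UserMarshal"),
];

/// Hypothetical helper: list the names of all flags set in a `__flags` number.
/// An input of 0 (None) yields an empty list.
pub fn flag_names(flags: u8) -> Vec<&'static str> {
    FLAGS
        .iter()
        .filter(|&&(bit, _)| (flags & bit) != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

For example, a __flags value of 18 is 2 + 16, which is UserClass combined with UserMarshal.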

Unsafe code

In this crate, unsafe code provides the ability to replicate Marshal's behavior. It shouldn't ever cause problems.

Test coverage

Currently, tests feature dumping/loading the following values: nil, bool, positive/negative fixnum, positive/negative bignum, float (including inf, nan and negative), utf-8/non-utf-8 strings, object links, array, hashes, structs, objects (including extended with modules, with custom marshal_ methods, with custom _load/_dump methods), regexps, built-in class subclasses.

The tests also include loading/dumping RPG Maker game files and battle-testing them.

If something is missing in the tests, open an issue or submit a pull request.

Example

use std::fs::read;
use marshal_rs::{load, dump, Value};

// Note: Value supports indexing by `&str` if it's an object,
// and by `&Value` if it's a hashmap, but to use that you need
// to import `Get` trait.
// use marshal_rs::Get;

fn main() {
    // Read marshal data from file
    // let marshal_data: Vec<u8> = read("./Map001.rvdata2").unwrap();
    // For this example, we'll just take pre-defined marshal data
    let marshal_data = [0x04, 0x08, 0x30];

    // Serializing to json
    // `load` takes a `&[u8]` as argument, so `Vec<u8>` must be borrowed
    let serialized_to_json: Value = load(&marshal_data, None).unwrap();

    // Here you may stringify Value using `Value::to_string`, and
    // `std::fs::write` it to file

    // Serializing back to marshal
    // `dump` requires owned Value as argument
    let serialized_to_marshal: Vec<u8> = dump(serialized_to_json, None);

    // Here you may `std::fs::write` serialized Marshal data to file
}
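The three bytes used in the example are a complete Marshal stream: 0x04 and 0x08 are the major and minor format version (4.8), and 0x30 is the ASCII character '0', which Marshal uses as the type tag for nil. A standalone sketch of that split, with split_header as a hypothetical helper that is not part of the crate:

```rust
/// Hypothetical helper: split a Marshal byte stream into
/// (major version, minor version, remaining payload).
/// Returns `None` if the stream is too short to hold a version header.
pub fn split_header(data: &[u8]) -> Option<(u8, u8, &[u8])> {
    match data {
        [major, minor, rest @ ..] => Some((*major, *minor, rest)),
        _ => None,
    }
}
```

Applied to the example data, this gives version 4.8 and a one-byte payload holding the nil tag.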

MSRV

Minimum supported Rust version is 1.63.0.

License

Project is licensed under WTFPL.
