Diary - transformation pipeline #31

Open
@sashafirsov


The Microapplication Container loads various resource types and, along with loading, needs to transform those resources more than once. Since the transformations are aware of syntax and some semantics, it makes sense to pass an AST between them instead of working with source text. A transformation would either incrementally add elements to the AST or (for the optimizing ones) replace the content completely. The first kind could reuse the initially parsed AST; the second could (better?) create a new tree and feed it data during the initial AST traversal.
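The two kinds of transformation could be sketched roughly as below. This is only an illustration using Python's standard `xml.etree.ElementTree`, assuming the AST is held as an XML tree. The element names (`module`, `import`, `comment`, `call`) and the functions `add_marker` and `rebuild_minified` are hypothetical and not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical AST of a tiny resource, held as an XML tree.
SOURCE_AST = """
<module>
  <import from="lib-a"/>
  <comment>debug only</comment>
  <call name="render"/>
</module>
"""


def add_marker(ast: ET.Element) -> ET.Element:
    """Incremental transformation: reuse the parsed tree and add to it."""
    for node in ast.iter("import"):
        node.set("resolved", "true")  # annotate the existing node in place
    ET.SubElement(ast, "meta", {"pass": "add_marker"})
    return ast  # the same object, nothing is re-parsed


def rebuild_minified(ast: ET.Element) -> ET.Element:
    """Replacing transformation: build a new tree during one traversal."""
    fresh = ET.Element(ast.tag, dict(ast.attrib))
    for child in ast:
        if child.tag == "comment":
            continue  # dropped from the optimized output
        # A shallow copy is enough for this flat example tree.
        fresh.append(ET.Element(child.tag, dict(child.attrib)))
    return fresh


ast = ET.fromstring(SOURCE_AST)
annotated = add_marker(ast)             # mutates and returns the original tree
minified = rebuild_minified(annotated)  # a separate tree, original untouched
```

The incremental pass returns the tree it was given, so later passes see its additions without another parse. The replacing pass leaves its input intact, which keeps the original AST available to other consumers.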

Formats for AST data:

  • JSON, actually a JS object - 🤢 needs marshalling to and from the transformer.
  • Apache Arrow. Has sufficient support across programming languages; reuse of data TBD.
  • 🤩 XML DOM. The same binary data could be reused by a huge set of languages, including XSLT - the only native multithreaded language in the browser.
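To make the marshalling cost of the JSON option more concrete, here is a sketch of the conversion step a JSON-shaped AST would need before an XML-aware transformer could consume it. The node shape (`type`, `children`, plus scalar fields) and the helper `dict_to_element` are assumptions made for illustration only.

```python
import json
import xml.etree.ElementTree as ET


def dict_to_element(node: dict) -> ET.Element:
    """Convert a {"type": ..., "children": [...], ...} node into an XML element.

    Every key other than "type" and "children" becomes a string attribute.
    """
    attrs = {k: str(v) for k, v in node.items() if k not in ("type", "children")}
    element = ET.Element(node["type"], attrs)
    for child in node.get("children", []):
        element.append(dict_to_element(child))
    return element


# Hypothetical AST as it might arrive in JSON form.
json_ast = json.loads(
    '{"type": "module", "children": ['
    '{"type": "import", "from": "lib-a"},'
    '{"type": "call", "name": "render"}]}'
)

xml_ast = dict_to_element(json_ast)                   # marshalling step
serialized = ET.tostring(xml_ast, encoding="unicode")  # text form for hand-off
```

Every transformer that does not speak JSON natively would repeat a step like this in each direction, which is the overhead the XML DOM option is meant to avoid.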

XML libraries also support streaming processing, which removes the need to keep whole datasets in memory and allows the pipeline stages to run in separate threads.
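As a minimal sketch of streaming processing, the snippet below uses `iterparse` from Python's standard `xml.etree.ElementTree` to handle elements as they are parsed. The function `count_calls` and the element names are hypothetical; the input is an in-memory buffer here, but it could equally be a file or a network response.

```python
import io
import xml.etree.ElementTree as ET


def count_calls(stream) -> int:
    """Count <call> elements without retaining the content of processed ones."""
    total = 0
    for _event, element in ET.iterparse(stream, events=("end",)):
        if element.tag == "call":
            total += 1
        # Drop the element's children, text and attributes once handled.
        # The emptied element stays attached to its parent until the parent ends.
        element.clear()
    return total


# Hypothetical input document.
data = io.BytesIO(
    b"<module><call name='a'/><import from='x'/><call name='b'/></module>"
)
result = count_calls(data)
```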

In-browser XSLT 1.0 and Saxon-based XSLT transformations are the first candidates, as they are expected to provide multithreading and streaming out of the box.

Since XML libraries allow traversing data both as a stream and directly via the DOM API, transformers could benefit from both kinds of data access.

Direct DOM access could be beneficial when several passes are needed or when the initial DOM has to be altered.
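A small sketch of the multi-pass case, using Python's standard `xml.dom.minidom` as a stand-in for a DOM API. The document content and the `calls` summary element are invented for the example.

```python
from xml.dom import minidom

# Hypothetical AST held as a DOM document.
doc = minidom.parseString(
    "<module><import from='lib-a'/><call name='render'/><call name='log'/></module>"
)
root = doc.documentElement

# Pass 1: collect information from the whole tree.
call_names = [c.getAttribute("name") for c in doc.getElementsByTagName("call")]

# Pass 2: alter the initial DOM using what pass 1 found.
summary = doc.createElement("calls")
summary.setAttribute("count", str(len(call_names)))
root.insertBefore(summary, root.firstChild)

# Pass 3: remove nodes in place (iterate over a copy while mutating the tree).
for call in list(doc.getElementsByTagName("call")):
    if call.getAttribute("name") == "log":
        call.parentNode.removeChild(call)

remaining = [c.getAttribute("name") for c in doc.getElementsByTagName("call")]
```

Each pass sees the changes made by the previous one because they all work on the same in-memory tree, which a single forward-only stream would not allow.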
