A steganography tool for hiding a message in a tabular dataset, such as a CSV or Parquet file.
This tool hides a payload by permuting the rows of the dataset. The hidden message tolerates modification of the dataset thanks to a Reed-Solomon code and Luby's LT fountain code.
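To give an intuition for how row order alone can carry a payload, here is a minimal sketch that maps an integer to a permutation using the factorial number system (Lehmer code). It is an illustration only, not steganodf's actual implementation: the function names `int_to_permutation` and `permutation_to_int` are hypothetical, the rows are assumed to be distinct and sortable, and no error correction is applied.

```python
from math import factorial


def int_to_permutation(value, rows):
    """Return `rows` reordered so that the order itself encodes `value`.

    Hypothetical helper for illustration; assumes all rows are distinct.
    """
    pool = sorted(rows)  # canonical order shared by encoder and decoder
    if not 0 <= value < factorial(len(pool)):
        raise ValueError("value does not fit in this many rows")
    result = []
    for remaining in range(len(pool), 0, -1):
        # Each factorial-base digit selects one of the rows still in the pool.
        index, value = divmod(value, factorial(remaining - 1))
        result.append(pool.pop(index))
    return result


def permutation_to_int(rows):
    """Recover the integer encoded by the order of `rows`."""
    pool = sorted(rows)
    value = 0
    for position, row in enumerate(rows):
        index = pool.index(row)
        pool.pop(index)
        value += index * factorial(len(rows) - 1 - position)
    return value


if __name__ == "__main__":
    host = [(i, f"row{i}") for i in range(8)]   # 8 rows allow 8! = 40320 orders
    secret = int.from_bytes(b"hi", "big")       # payload as an integer
    stegano = int_to_permutation(secret, host)  # same rows, different order
    recovered = permutation_to_int(stegano).to_bytes(2, "big")
    print(recovered)
```

A dataset of n distinct rows can be arranged in n! ways, so its order can carry roughly log2(n!) bits. A bare scheme like this one breaks as soon as a row is added, removed, or edited, which is why the tool adds error-correcting and fountain codes on top.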
You can experiment with the Python API using this Google Colab notebook.
pip install steganodf
# Encoding
steganodf encode -m hello host.csv stegano.csv
steganodf encode -m hello host.parquet stegano.parquet
steganodf encode -m hello -p password host.parquet stegano.parquet
# Decoding
steganodf decode stegano.csv
steganodf decode stegano.csv -p password
import steganodf
import polars as pl
df = pl.read_csv("https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv")
new_df = steganodf.encode(df, "made by steganodf", password="secret")
# Extract your message
message = steganodf.decode(new_df, password="secret")

Sacha Schutz, Meganne Souprayen. Watermark tabular datasets with rows permutations and fountain code. TechRxiv. April 28, 2025. DOI: 10.36227/techrxiv.174585796.61215338/v1