1 | | -<p align="center"><img src="https://github.com/packing-box/python-dsff/raw/main/docs/pages/imgs/logo.png"></p> |
2 | | -<h1 align="center">DataSet File Format <a href="https://twitter.com/intent/tweet?text=DataSet%20File%20Format%20-%20XSLX-based%20format%20for%20handling%20datasets.%0D%0ATiny%20library%20for%20handling%20a%20dataset%20as%20an%20XSLX%20and%20for%20converting%20it%20to%20ARFF,%20CSV%20or%20a%20FilelessDataset%20structure%20as%20for%20the%20Packing%20Box.%0D%0Ahttps%3a%2f%2fgithub%2ecom%2fpacking-box%2fpython-dsff%0D%0A&hashtags=python,dsff,machinelearning"><img src="https://img.shields.io/badge/Tweet--lightgrey?logo=twitter&style=social" alt="Tweet" height="20"/></a></h1> |
3 | | -<h3 align="center">Store a dataset in XLSX-like format.</h3> |
4 | | - |
5 | | -[](https://pypi.python.org/pypi/dsff/) |
6 | | -[](https://python-dsff.readthedocs.io/en/latest/?badge=latest) |
7 | | -[](https://github.com/dhondta/python-dsff/actions/workflows/python-package.yml) |
8 | | -[](#) |
9 | | -[](https://pypi.python.org/pypi/dsff/) |
10 | | -[](https://snyk.io/test/github/packing-box/python-dsff?targetFile=requirements.txt) |
11 | | -[](https://pypi.python.org/pypi/dsff/) |
12 | | - |
13 | | - |
14 | | -This library contains code for handling the DataSet File Format (DSFF) based on the XLSX format and for converting it to [ARFF](https://www.cs.waikato.ac.nz/ml/weka/arff.html) (for use with the [Weka](https://www.cs.waikato.ac.nz/ml/weka) framework), [CSV](https://www.rfc-editor.org/rfc/rfc4180) or a [FilelessDataset structure](https://docker-packing-box.readthedocs.io/en/latest/usage/datasets.html) (from the [Packing Box](https://github.com/packing-box/docker-packing-box)). |
15 | | - |
16 | | -```sh |
17 | | -pip install --user dsff |
18 | | -``` |
19 | | - |
20 | | -## :sunglasses: Usage |
21 | | - |
22 | | -**Creating a DSFF from a FilelessDataset** |
23 | | - |
24 | | -```python |
25 | | ->>> import dsff |
26 | | ->>> with dsff.DSFF() as f: |
27 | | -...     f.write("/path/to/my-dataset")  # folder of a FilelessDataset (containing data.csv, features.json and metadata.json) |
28 | | -...     f.to_arff()                     # creates ./my-dataset.arff |
29 | | -...     f.to_csv()                      # creates ./my-dataset.csv |
30 | | -# when leaving the context, ./my-dataset.dsff is created |
31 | | -``` |
32 | | - |
33 | | -**Creating a FilelessDataset from a DSFF** |
34 | | - |
35 | | -```python |
36 | | ->>> import dsff |
37 | | ->>> with dsff.DSFF("/path/to/my-dataset.dsff") as f: |
38 | | -...     f.to_dataset()  # creates ./[dsff-title] with data.csv, features.json and metadata.json |
39 | | -``` |
40 | | - |
41 | | -## :star: Related Projects |
42 | | - |
43 | | -You may also like these: |
44 | | - |
45 | | -- [Awesome Executable Packing](https://github.com/packing-box/awesome-executable-packing): A curated list of awesome resources related to executable packing. |
46 | | -- [Bintropy](https://github.com/packing-box/bintropy): Analysis tool for estimating the likelihood that a binary contains compressed or encrypted bytes (inspired by [this paper](https://ieeexplore.ieee.org/document/4140989)). |
47 | | -- [Dataset of packed ELF files](https://github.com/packing-box/dataset-packed-elf): Dataset of ELF samples packed with many different packers. |
48 | | -- [Dataset of packed PE files](https://github.com/packing-box/dataset-packed-pe): Dataset of PE samples packed with many different packers (fork of [this repository](https://github.com/chesvectain/PackingData)). |
49 | | -- [Docker Packing Box](https://github.com/packing-box/docker-packing-box): Docker image gathering packers and tools for making datasets of packed executables. |
50 | | -- [PEiD](https://github.com/packing-box/peid): Python implementation of the well-known Packed Executable iDentifier ([PEiD](https://www.aldeid.com/wiki/PEiD)). |
51 | | -- [PyPackerDetect](https://github.com/packing-box/pypackerdetect): Packing detection tool for PE files (fork of [this repository](https://github.com/cylance/PyPackerDetect)). |
52 | | -- [REMINDer](https://github.com/packing-box/reminder): Packing detector using a simple heuristic (inspired by [this paper](https://ieeexplore.ieee.org/document/5404211)). |
53 | | - |
54 | | - |
55 | | -## :clap: Supporters |
56 | | - |
57 | | -[](https://github.com/packing-box/python-dsff/stargazers) |
58 | | - |
59 | | -[](https://github.com/packing-box/python-dsff/network/members) |
60 | | - |
61 | | -<p align="center"><a href="#"><img src="https://img.shields.io/badge/Back%20to%20top--lightgrey?style=social" alt="Back to top" height="20"/></a></p> |
62 | | - |
| 1 | +<p align="center"><img src="https://github.com/packing-box/python-dsff/raw/main/docs/pages/imgs/logo.png"></p> |
| 2 | +<h1 align="center">DataSet File Format <a href="https://twitter.com/intent/tweet?text=DataSet%20File%20Format%20-%20XSLX-based%20format%20for%20handling%20datasets.%0D%0ATiny%20library%20for%20handling%20a%20dataset%20as%20an%20XSLX%20and%20for%20converting%20it%20to%20ARFF,%20CSV%20or%20a%20FilelessDataset%20structure%20as%20for%20the%20Packing%20Box.%0D%0Ahttps%3a%2f%2fgithub%2ecom%2fpacking-box%2fpython-dsff%0D%0A&hashtags=python,dsff,machinelearning"><img src="https://img.shields.io/badge/Tweet--lightgrey?logo=twitter&style=social" alt="Tweet" height="20"/></a></h1> |
| 3 | +<h3 align="center">Store a dataset in XLSX-like format.</h3> |
| 4 | + |
| 5 | +[](https://pypi.python.org/pypi/dsff/) |
| 6 | +[](https://python-dsff.readthedocs.io/en/latest/?badge=latest) |
| 7 | +[](https://github.com/dhondta/python-dsff/actions/workflows/python-package.yml) |
| 8 | +[](#) |
| 9 | +[](https://pypi.python.org/pypi/dsff/) |
| 10 | +[](https://snyk.io/test/github/packing-box/python-dsff?targetFile=requirements.txt) |
| 11 | +[](https://pypi.python.org/pypi/dsff/) |
| 12 | + |
| 13 | + |
| 14 | +This library contains code for handling the DataSet File Format (DSFF) based on the XLSX format and for converting it to [ARFF](https://www.cs.waikato.ac.nz/ml/weka/arff.html) (for use with the [Weka](https://www.cs.waikato.ac.nz/ml/weka) framework), [CSV](https://www.rfc-editor.org/rfc/rfc4180) or a [FilelessDataset structure](https://docker-packing-box.readthedocs.io/en/latest/usage/datasets.html) (from the [Packing Box](https://github.com/packing-box/docker-packing-box)). |
| 15 | + |
| 16 | +```sh |
| 17 | +pip install --user dsff |
| 18 | +``` |
| 19 | + |
| 20 | +## :sunglasses: Usage |
| 21 | + |
| 22 | +**Creating a DSFF from a FilelessDataset** |
| 23 | + |
| 24 | +```python |
| 25 | +>>> import dsff |
| 26 | +>>> with dsff.DSFF() as f: |
| 27 | +...     f.write("/path/to/my-dataset")  # folder of a FilelessDataset (containing data.csv, features.json and metadata.json) |
| 28 | +...     f.to_arff()                     # creates ./my-dataset.arff |
| 29 | +...     f.to_csv()                      # creates ./my-dataset.csv |
| 30 | +...     f.to_db()                       # creates ./my-dataset.db (SQLite DB) |
| 31 | +# when leaving the context, ./my-dataset.dsff is created |
| 32 | +``` |
| 33 | + |
| 34 | +**Creating a FilelessDataset from a DSFF** |
| 35 | + |
| 36 | +```python |
| 37 | +>>> import dsff |
| 38 | +>>> with dsff.DSFF("/path/to/my-dataset.dsff") as f: |
| 39 | +...     f.to_dataset()  # creates ./[dsff-title] with data.csv, features.json and metadata.json |
| 40 | +``` |
| 41 | + |
| 42 | +## :star: Related Projects |
| 43 | + |
| 44 | +You may also like these: |
| 45 | + |
| 46 | +- [Awesome Executable Packing](https://github.com/packing-box/awesome-executable-packing): A curated list of awesome resources related to executable packing. |
| 47 | +- [Bintropy](https://github.com/packing-box/bintropy): Analysis tool for estimating the likelihood that a binary contains compressed or encrypted bytes (inspired by [this paper](https://ieeexplore.ieee.org/document/4140989)). |
| 48 | +- [Dataset of packed ELF files](https://github.com/packing-box/dataset-packed-elf): Dataset of ELF samples packed with many different packers. |
| 49 | +- [Dataset of packed PE files](https://github.com/packing-box/dataset-packed-pe): Dataset of PE samples packed with many different packers (fork of [this repository](https://github.com/chesvectain/PackingData)). |
| 50 | +- [Docker Packing Box](https://github.com/packing-box/docker-packing-box): Docker image gathering packers and tools for making datasets of packed executables. |
| 51 | +- [PEiD](https://github.com/packing-box/peid): Python implementation of the well-known Packed Executable iDentifier ([PEiD](https://www.aldeid.com/wiki/PEiD)). |
| 52 | +- [PyPackerDetect](https://github.com/packing-box/pypackerdetect): Packing detection tool for PE files (fork of [this repository](https://github.com/cylance/PyPackerDetect)). |
| 53 | +- [REMINDer](https://github.com/packing-box/reminder): Packing detector using a simple heuristic (inspired by [this paper](https://ieeexplore.ieee.org/document/5404211)). |
| 54 | + |
| 55 | + |
| 56 | +## :clap: Supporters |
| 57 | + |
| 58 | +[](https://github.com/packing-box/python-dsff/stargazers) |
| 59 | + |
| 60 | +[](https://github.com/packing-box/python-dsff/network/members) |
| 61 | + |
| 62 | +<p align="center"><a href="#"><img src="https://img.shields.io/badge/Back%20to%20top--lightgrey?style=social" alt="Back to top" height="20"/></a></p> |
| 63 | + |
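The usage notes in this README say a FilelessDataset folder contains `data.csv`, `features.json` and `metadata.json`. As a minimal sketch (standard library only, not part of `dsff`), a pre-flight check along these lines could be run on a folder before handing it to `f.write(...)`. The helper name `missing_dataset_files` and the constant `REQUIRED_FILES` are hypothetical, introduced here for illustration.

```python
import tempfile
from pathlib import Path

# File names taken from the README's description of a FilelessDataset folder.
REQUIRED_FILES = ("data.csv", "features.json", "metadata.json")


def missing_dataset_files(folder):
    """Return the required file names that are absent from `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demo on a throwaway folder that only holds data.csv.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data.csv").write_text("label\n")
    print(missing_dataset_files(tmp))  # ['features.json', 'metadata.json']
```

An empty list means all three files are present. The check says nothing about whether their contents are valid for the library.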
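The change above adds `f.to_db()`, which the inline comment describes as creating a SQLite database. The README does not document the schema of that file, so the sketch below makes no assumption about it: it uses only Python's standard `sqlite3` module to list whatever tables a SQLite file contains, with their row counts. The helper `list_tables` is hypothetical, and the stand-in database (table `data`, columns `label` and `entropy`) is made up for the demo and is not the layout `dsff` produces.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    with closing(sqlite3.connect(db_path)) as conn:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Table names come from the database itself, so quote them as identifiers.
        return {name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
                for name in names}


# Demo on a stand-in database (hypothetical schema, NOT the one dsff writes).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stand-in.db")
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE data (label TEXT, entropy REAL)")
        conn.executemany("INSERT INTO data VALUES (?, ?)",
                         [("upx", 7.9), ("none", 5.2)])
        conn.commit()
    print(list_tables(path))
```

Pointing `list_tables` at `./my-dataset.db` after running the usage example would show which tables the export actually contains.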