This Python application is designed to convert Microsoft Word documents (.docx) into S1000D XML format. S1000D is an international specification for technical publications, particularly in the aerospace and defense industries.
A sample Word document (sample.docx) and its corresponding XML output (sample.xml) are included in the "DEMO" folder for reference. You can use these files to test the converter and understand the expected structure of the input and output files.
Before using this converter, make sure you have the following prerequisites installed:
- Python (version 3.x)
- Flask (a Python web framework)
- lxml (a library for processing XML and HTML)
- Bootstrap (for the user interface)
You can install the required Python packages using pip:
pip install Flask lxml
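To confirm that the two pip packages are available before starting the application, a small check along these lines may help. It is only a sketch: the helper name `missing_modules` is hypothetical and is not part of this repository.

```python
from importlib.util import find_spec

# Import names for the two pip packages listed above.
REQUIRED_MODULES = ["flask", "lxml"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Bootstrap is a front-end framework rather than a Python package, so it is not covered by this check.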
- Clone or download this repository to your local machine.
- Navigate to the project directory.
- Run the Flask application by executing main.py:
  python main.py
  This will start the web application, and you should see output indicating that the server is running.
- Open your web browser and go to http://localhost:5000 to access the file upload interface.
- Click the "Choose file" button to select the Microsoft Word document (.docx) that you want to convert to S1000D XML.
- Click the "Upload" button to initiate the conversion process.
- Once the conversion is complete, you will receive a success message.
- The resulting S1000D XML file will be saved as s1000d.xml in the project directory.
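After a conversion, one quick sanity check is to confirm that the generated file parses as XML. The sketch below uses only the standard library; the function name `root_tag_if_well_formed` is hypothetical, and the check covers well-formedness only, not validity against an S1000D schema.

```python
import xml.etree.ElementTree as ET


def root_tag_if_well_formed(path):
    """Return the root element's tag, or None if the file is not well-formed XML.

    A missing file still raises an OSError, so that case stays visible.
    """
    try:
        return ET.parse(path).getroot().tag
    except ET.ParseError:
        return None


# Example usage from the project directory after a conversion:
#   print(root_tag_if_well_formed("s1000d.xml"))
```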
- main.py: The main Python script that defines the Flask web application. It handles file uploads and initiates the conversion process.
- convert.py: A Python script that performs the conversion of the Word document to S1000D XML.
- upload.html: An HTML template for the file upload interface.
- uploads/: A directory where uploaded Word documents, extracted XML content, and images are temporarily stored during the conversion process.
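The "extracted XML content" mentioned above comes from the fact that a .docx file is a ZIP archive whose main body lives in word/document.xml. The following is a minimal sketch of reading paragraph text from that part using only the standard library. It is not the code in convert.py, and the function name `read_paragraphs` is an assumption made for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_paragraphs(docx_path):
    """Return the plain text of each non-empty paragraph in a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return paragraphs


# Example usage (path relative to the project directory):
#   for line in read_paragraphs("DEMO/sample.docx"):
#       print(line)
```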
- This application is designed for educational and demonstration purposes. Depending on your specific requirements, you may need to customize the conversion process to suit your needs.
- Ensure that you have the necessary permissions to access and modify files in the project directory.
- The conversion process assumes that the input Word document follows a specific structure, with headings and lists. You may need to adjust the code if your documents have a different structure.
- The application uses Bootstrap for the user interface to make it more user-friendly.
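To illustrate the note about headings and lists, the sketch below shows one way such a structure could be mapped onto nested XML. It does not reproduce convert.py: the function name, the `(kind, text)` input format, and the `description` root element are all assumptions, and the element names (`levelledPara`, `title`, `para`, `randomList`, `listItem`) should be checked against the S1000D issue and schema you target.

```python
import xml.etree.ElementTree as ET


def build_description(blocks):
    """Build an XML tree from (kind, text) pairs.

    kind is "heading", "para", or "list_item" (hypothetical labels).
    Each heading opens a new section; consecutive list items share one list.
    """
    root = ET.Element("description")
    section = root        # container that receives paragraphs
    current_list = None   # list being filled, if the previous block was a list item

    for kind, text in blocks:
        if kind == "heading":
            section = ET.SubElement(root, "levelledPara")
            ET.SubElement(section, "title").text = text
            current_list = None
        elif kind == "list_item":
            if current_list is None:
                wrapper = ET.SubElement(section, "para")
                current_list = ET.SubElement(wrapper, "randomList")
            item = ET.SubElement(current_list, "listItem")
            ET.SubElement(item, "para").text = text
        else:
            ET.SubElement(section, "para").text = text
            current_list = None
    return root


# Example usage:
#   tree = build_description([("heading", "Overview"), ("list_item", "Step one")])
#   print(ET.tostring(tree, encoding="unicode"))
```

Documents with a different structure, such as nested headings or tables, would need extra branches in the loop, which is the kind of adjustment the note refers to.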