This project provides a set of Python scripts to convert the raw text and image files from the Macrolichen Companion of the Alaska Mainland into a structured format suitable for a Hugo static site generator.
The goal is to transform the original field guide content into an accessible and easily maintainable online resource. This digital format enhances the guide's usability by allowing for dynamic search, easier updates, and broader accessibility for scientists and enthusiasts alike.
The scripts automate the website building process by batch processing species descriptions (Word documents) and associated photographs (image files) to generate Hugo-compatible Markdown files. These output files integrate seamlessly into a Hugo website, enabling rapid deployment of and updates to the digital version of the Macrolichen Companion of the Alaska Mainland.
- Python 3.13+
- Miniforge (with conda and conda-forge channel) - For package management.
- fastexcel - For working with .xlsx files.
- Polars - For data manipulation.
- Pillow - For image manipulation.
- python-docx - For working with .docx files.
- pathlib - For object-oriented filesystem paths.
- textwrap - For dedenting strings.
- re - For string pattern matching using regular expressions.
Follow these simple steps to run these scripts on your local machine.
Before you begin, ensure you have the following installed:
- Git: For cloning the repository.
- Miniforge: Optional, but recommended. Miniforge is an open-source, minimalist package manager that provides conda and defaults to the conda-forge channel.
  - You can download the appropriate installer for your system from the Miniforge GitHub releases page.
  - Jan Kirenz has a short and sweet Miniforge setup tutorial to help you get started.
- Hugo website: The scripts are formatted for use with the Hugo Book theme ("Hugo documentation theme as simple as plain book").
- Clone the repository: Open your terminal or command prompt and clone the project repository to your local machine using Git:
git clone https://github.com/accs-uaa/akveg-lichens.git
- Navigate to the project directory: Change into the cloned project directory:
cd akveg-lichens
- Create and activate a Conda environment: Creating a dedicated Conda environment for this project will help you to manage dependencies and avoid conflicts with other Python installations.
conda create -n akveg-lichens python=3.13
conda activate akveg-lichens
- Install dependencies: Once your Conda environment is active, install the required Python libraries from the conda-forge channel.
conda env create -f environment.yml
This command will install all libraries listed in the environment.yml file, which are necessary for the scripts to run correctly.
This project provides a suite of Python scripts to process your raw .docx files and associated images, transforming them into a structured format suitable for generating a Hugo website.
You will almost certainly have to modify these scripts to match the document formatting and folder structure of your project. Nevertheless, they can serve as a jumping-off point that you can adapt to your use case.
- Prepare your input files
Ensure your raw .docx files and image files are located in the same parent directory. The script assumes that there is one sub-folder per taxon, containing both the images and the .docx files for that taxon.
The .docx files in this project followed a specific template. Instead of using headings, sections were identified at the start of each paragraph using the section name followed by a colon and a white space (e.g., Description: ). An example for Cladonia stellaris is shown below.
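As a rough illustration of how this "section name, colon, space" convention can be parsed (a minimal sketch, not the project's actual code), the snippet below groups paragraph strings into sections with a regular expression. The function name `split_sections`, the pattern, and the sample paragraphs are all hypothetical; in practice the paragraph text would likely come from python-docx (for example, the `text` of each entry in `Document(path).paragraphs`).

```python
import re

# Matches a section label at the start of a paragraph, e.g. "Description: ".
# Assumption: labels consist only of letters and spaces.
SECTION_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z ]*):\s+(?P<body>.*)$", re.DOTALL
)


def split_sections(paragraphs):
    """Group paragraph strings into a {section name: text} dictionary."""
    sections = {}
    current = None
    for text in paragraphs:
        text = text.strip()
        if not text:
            continue
        match = SECTION_PATTERN.match(text)
        if match:
            # A labelled paragraph starts a new section.
            current = match.group("name")
            sections[current] = match.group("body").strip()
        elif current is not None:
            # Paragraphs without a label continue the previous section.
            sections[current] += "\n\n" + text
    return sections


# Hypothetical placeholder paragraphs, not text from the guide.
example = [
    "Description: Placeholder text for the description.",
    "A second paragraph that continues the description.",
    "Habitat: Placeholder text for the habitat.",
]
for name, body in split_sections(example).items():
    print(f"{name} -> {body!r}")
```

A pattern this loose will also treat any paragraph that happens to begin with words followed by a colon as a new section, so a real script would probably check the label against a fixed list of expected section names.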
- Run the processing scripts
Scripts are numbered sequentially according to the order in which they should be executed.
The first script, 01_create_toc.py, establishes the Table of Contents for the Hugo website by creating folders
for each taxonomic group. Each folder is then populated with an index Markdown file that specifies the title and
order of that group within the table of contents.
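The idea behind the TOC step can be sketched as follows. This is an illustration under stated assumptions rather than the contents of 01_create_toc.py: it assumes each group's index file is named `_index.md` (Hugo's convention for section pages) and that ordering is controlled by a `weight` field in the front matter. The function name `create_toc` and the group names are hypothetical.

```python
from pathlib import Path
import tempfile
import textwrap


def create_toc(content_dir, groups):
    """Create one folder per group, each with an _index.md (title + weight)."""
    content_dir = Path(content_dir)
    created = []
    for order, (folder_name, title) in enumerate(groups, start=1):
        group_dir = content_dir / folder_name
        group_dir.mkdir(parents=True, exist_ok=True)
        # Assumed front matter: "weight" sets the position in the menu.
        front_matter = textwrap.dedent(f"""\
            ---
            title: "{title}"
            weight: {order}
            ---
            """)
        index_file = group_dir / "_index.md"
        index_file.write_text(front_matter, encoding="utf-8")
        created.append(index_file)
    return created


# Hypothetical group names; the real taxonomic groups come from the guide.
groups = [
    ("example-group-one", "Example Group One"),
    ("example-group-two", "Example Group Two"),
]
with tempfile.TemporaryDirectory() as tmp:
    for path in create_toc(Path(tmp) / "content" / "docs", groups):
        print(path.relative_to(tmp))
```

Deriving `weight` from the position in the list keeps the table of contents order in one place, so reordering groups only requires reordering the list.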
The second script, 02_format_descriptions.py, is the main processing script that will convert your .docx content into
Markdown files and copy your images.
This script will:
- Extract images found in the sub-folders of the parent directory.
- Copy these images to static/images/ within your Hugo site structure.
- Read .docx files from the specified input directory.
- Convert them into Markdown files.
- Add the corresponding images to each file.
- Save the files to the appropriate taxonomic group sub-folder (created in the TOC script).
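The image-handling steps above could look roughly like the sketch below. It is not taken from 02_format_descriptions.py: the function names, the set of image extensions, and the folder names in the usage example are assumptions. It relies on the standard Hugo behavior that files under static/ are served from the site root, so static/images/photo.jpg becomes /images/photo.jpg on a site hosted at the root URL.

```python
from pathlib import Path
import shutil
import tempfile

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def copy_taxon_images(parent_dir, images_dir):
    """Copy images from each taxon sub-folder; return {taxon: [file names]}."""
    parent_dir, images_dir = Path(parent_dir), Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    copied = {}
    for taxon_dir in sorted(p for p in parent_dir.iterdir() if p.is_dir()):
        names = []
        for image in sorted(taxon_dir.iterdir()):
            # Skip .docx files and anything else that is not an image.
            if image.suffix.lower() in IMAGE_SUFFIXES:
                shutil.copy2(image, images_dir / image.name)
                names.append(image.name)
        copied[taxon_dir.name] = names
    return copied


def image_markdown(names):
    """Build Markdown image links that point at /images/."""
    return "\n\n".join(f"![{Path(n).stem}](/images/{n})" for n in names)


# Usage with a throwaway directory tree and hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    taxon = Path(tmp) / "raw" / "example_taxon"
    taxon.mkdir(parents=True)
    (taxon / "example_photo.jpg").write_bytes(b"")
    (taxon / "example_taxon.docx").write_bytes(b"")
    copied = copy_taxon_images(Path(tmp) / "raw", Path(tmp) / "static" / "images")
    print(image_markdown(copied["example_taxon"]))
```

Because all images land in a single folder, two taxa with identically named photos would overwrite each other; prefixing the file name with the taxon name is one way to avoid that.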
- View your site locally
Once the content and images are processed, navigate to your Hugo site's root directory and start Hugo's built-in local server to preview the generated site:
cd path/to/your/hugo/site
hugo server
- Build your Hugo site
When you are happy with the way things look, compile all your Markdown content and static assets into the public/ directory, ready for deployment.
hugo
This project is provided under the GNU General Public License v3.0. It is free to use and modify in part or in whole.
- Project Maintainer: Amanda Droghini
- Email: adroghini (at) alaska.edu
- GitHub: @adroghini


