feat(v1.1.0): Add HTML output and improve sequence matching algorithm
- Add HTML output support for lightweight browser-viewable results
- Completely overhaul the DNA sequence marking algorithm
- Implement cross-line pattern matching to detect sequences spanning multiple lines
- Fix bugs with space handling in the spaced matching mode
- Improve code documentation with detailed function docstrings
- Refactor variable names for better consistency with Python coding standards
This release significantly improves pattern detection accuracy and adds
an alternative output format for users without Microsoft Word.
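
The commit does not show the marking code itself, so the following is only a minimal sketch of how space-ignoring, cross-line matching could work. The function name `find_spaced_matches` and its return shape are hypothetical, and it assumes the wrapped lines of one sequence are passed in order.

```python
def find_spaced_matches(lines, pattern):
    """Find pattern in the given lines, ignoring spaces and line breaks.

    Hypothetical helper, not BioAlign's actual implementation.
    `lines` are the consecutive formatted lines of ONE sequence.
    Returns a list of matches; each match is a list of
    (line_index, column) positions of the matched characters.
    """
    pattern = pattern.replace(" ", "").upper()
    if not pattern:
        return []

    # Flatten every non-space character and remember where it came from.
    chars = []
    positions = []
    for line_no, line in enumerate(lines):
        for col, ch in enumerate(line):
            if ch != " ":
                chars.append(ch.upper())
                positions.append((line_no, col))
    flat = "".join(chars)

    # Search the flattened text, then map hits back to (line, column).
    matches = []
    start = flat.find(pattern)
    while start != -1:
        matches.append(positions[start:start + len(pattern)])
        start = flat.find(pattern, start + 1)  # step by 1 to keep overlaps
    return matches


if __name__ == "__main__":
    # "CTG" occurs once across the line break and once inside line 2.
    for match in find_spaced_matches(["ATG CCT", "GAA CTG"], "CTG"):
        print(match)
```

Searching a flattened copy and mapping hits back to their original positions keeps the search itself a plain substring lookup, while the highlighter still receives exact coordinates in the triplet-formatted text.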
README.md: 26 additions & 11 deletions
@@ -1,8 +1,9 @@
-# BioAlign - DNA Sequence Alignment Tool
+# BioAlign - DNA Sequence Alignment and Marking Tool
 
-BioAlign is a user-friendly tool for DNA sequence alignment and visualization. It uses Clustal Omega for alignment and creates nicely formatted Word documents with customizable sequence highlighting.
+BioAlign is a user-friendly tool for DNA sequence alignment, marking and visualization. It uses Clustal Omega for alignment and creates nicely formatted HTML and Word documents with sequence highlighting.
 
 ## Table of Contents
+
 - [Features](#features)
 - [Installation](#installation)
 - [Usage](#usage)
@@ -27,8 +28,9 @@ BioAlign is a user-friendly tool for DNA sequence alignment and visualization. I
 - Automatic DNA sequence alignment using Clustal Omega
 - Triplet notation formatting for better readability
 - Search and highlight specific DNA sequences in the alignment
-- Option for spaced or exact sequence matching
-- Different highlight colors for each sequence (optional)
+- Advanced pattern matching with both exact and space-ignoring modes
+- Output to both HTML and Word documents with highlighted matches
 - Caching of alignment results for unchanged sequences
 
 ## Installation
@@ -39,6 +41,7 @@ No installation required! The release zip file contains everything you need:
 2. Run `start.bat` to launch the application
 
 The package includes:
+
 - Embedded Python 3.13.2 runtime
 - Clustal Omega 1.2.2 executable
 - All required Python dependencies
@@ -58,6 +61,7 @@ Create or edit the `sequences.json` file in the application folder. This file sh
 ```
 
 Where:
+
 - Each key is the sequence name
 - Each value is the DNA sequence
 - You can add as many sequences as needed
@@ -76,26 +80,32 @@ When prompted:
 
 1. Enter a DNA sequence to search for (e.g., "CTG") or leave empty to disable highlighting
 2. Choose search mode:
-- `exact`: Matches only exact sequences without spaces
-- `spaced`: Matches sequences allowing for spaces between nucleotides
-3. Choose whether to use separate colors for each sequence (yes/no)
+- `exact`: Matches only exact sequences including spaces
+- `spaced`: Ignores spaces during matching, useful for finding patterns across triplet notation
+
+The improved spaced mode can now correctly identify patterns that span across the triplet spaces and sequence lines in the formatted output.
 
 ### Output
 
 The program generates:
+
 - `sequences.fasta`: The input file for Clustal Omega
 - `sequences.aln`: The alignment result from Clustal Omega
-- `sequences.docx`: The final Word document with formatted alignment and highlighting
+- `sequences.html`: HTML document with formatted alignment and highlighting
+- `sequences.docx`: Word document with formatted alignment and highlighting
+
+The HTML output provides a lightweight, browser-viewable alternative that doesn't require Microsoft Word to open.
 
 ## Example
 
-For the provided example sequences, searching for "CTG" with spaced mode enabled will highlight this pattern in all sequences, allowing you to easily compare variations.
+For the provided example sequences, searching for "CTG" with spaced mode enabled will highlight this pattern in all sequences, allowing you to easily compare variations. The improved algorithm will now correctly find instances even when they span across the spaces in triplet notation.
 
 ## Notes
 
 - The tool caches alignment results to avoid redundant calculations
 - The Word document uses Courier New font for consistent spacing
-- Highlighting uses yellow by default, or green/yellow/pink when using separate colors
+- The HTML output uses the same monospace formatting for consistency with the Word document
+- The improved marking algorithm can now detect patterns that span across multiple lines of the same sequence
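
The README changes describe a monospace HTML output with highlighted matches but do not show how it is produced. As a rough sketch only, highlighted characters could be wrapped in `<mark>` inside a `<pre>` block. The helper names `highlight_line` and `render_page`, the `marked` mapping, and the choice of `<mark>` are assumptions for illustration, not taken from the BioAlign source.

```python
import html


def highlight_line(line, columns):
    """Wrap the characters at the given column indices in <mark> tags.

    Hypothetical helper; each character is HTML-escaped first.
    """
    columns = set(columns)
    parts = []
    for col, ch in enumerate(line):
        escaped = html.escape(ch)
        parts.append(f"<mark>{escaped}</mark>" if col in columns else escaped)
    return "".join(parts)


def render_page(lines, marked):
    """Build a minimal HTML page from formatted alignment lines.

    `marked` maps a line index to the column indices to highlight.
    A monospace font is used so columns stay aligned, as in the Word output.
    """
    body = "\n".join(
        highlight_line(line, marked.get(i, ())) for i, line in enumerate(lines)
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<style>pre { font-family: 'Courier New', monospace; }</style></head>\n"
        f"<body><pre>{body}</pre></body></html>\n"
    )


if __name__ == "__main__":
    page = render_page(["ATG CCT", "GAA CTG"], {0: [5, 6], 1: [0]})
    with open("sequences.html", "w", encoding="utf-8") as handle:
        handle.write(page)
```

Marking one character at a time is verbose but keeps the logic trivial when a match is interrupted by triplet spaces or a line break.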