The exponential growth of genomic datasets calls for robust, scalable, and reversible methods of DNA sequence encoding that can support downstream computational workflows. Among existing approaches, Chaos Game Representation (CGR) has been widely adopted for its ability to visualize and analyze nucleotide composition in a geometric framework [@almeida_analysis_2001]. However, CGR has inherent limitations: it relies on floating-point arithmetic, which introduces precision errors; it scales poorly to long sequences; and it is not reversible, which makes exact sequence reconstruction impossible. The Integer Chaos Game Representation (iCGR), introduced by Yin *et al.* [@yin_encoding_2018], addresses these shortcomings with a mathematically rigorous and fully reversible encoding scheme based on integer arithmetic. Despite its theoretical advantages, iCGR remains underutilized because no comprehensive, open-source implementation suitable for genome-scale applications exists. Apart from the illustrative prototype provided by the original authors, no available software supports full encoding, decoding, and standardized storage of iCGR coordinates in a format suited to large-scale, reproducible workflows.
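The precision limitation of floating-point CGR can be illustrated with a minimal sketch. It is not the iCGR scheme of Yin *et al.*; it only contrasts a conventional CGR iteration evaluated in double precision with the same iteration evaluated in exact rational arithmetic. The corner assignment (A, C, G, T on the unit square), the starting point (0.5, 0.5), and the function name `cgr_endpoint` are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed conventional CGR corner assignment on the unit square.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_endpoint(seq, half, start):
    """Return the final CGR point of seq.

    Each step moves the current point halfway toward the corner of the
    next nucleotide. The numeric type of `half` and `start` (float or
    Fraction) determines whether the arithmetic is rounded or exact.
    """
    x, y = start, start
    for base in seq:
        cx, cy = CORNERS[base]
        x = half * (x + cx)
        y = half * (y + cy)
    return x, y


# Two sequences that differ only in their first nucleotide.
seq_a = "A" + "G" * 100
seq_t = "T" + "G" * 100

# Double-precision evaluation: the first nucleotide's contribution is
# rounded away once it falls below the resolution of a 64-bit float.
float_a = cgr_endpoint(seq_a, 0.5, 0.5)
float_t = cgr_endpoint(seq_t, 0.5, 0.5)

# Exact rational evaluation of the same iteration keeps every bit.
exact_a = cgr_endpoint(seq_a, Fraction(1, 2), Fraction(1, 2))
exact_t = cgr_endpoint(seq_t, Fraction(1, 2), Fraction(1, 2))

print("float endpoints identical:", float_a == float_t)
print("exact endpoints identical:", exact_a == exact_t)
```

In this sketch the two distinct sequences map to the same double-precision endpoint, so the first nucleotide cannot be recovered from it, whereas the exact endpoints remain distinct. Exact rationals are used here only as a reference; their size grows with sequence length, which is the kind of cost an integer-based encoding has to manage explicitly.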