Canonicalize PSMILES

IMPORTANT NOTE: The code and data shared here are available for academic, non-commercial use only.

I recommend using the psmiles Python package that integrates canonicalization and other tools to work with PSMILES.

PSMILES (Polymer SMILES) is a chemical language for representing polymer structures. A PSMILES string contains two star symbols ([*] or *) that mark the two endpoints of the polymer repeat unit, and otherwise follows the Daylight SMILES syntax defined at OpenSMILES. Developed as part of arXiv.

The raw PSMILES syntax is ambiguous and non-unique; i.e., the same polymer may be written using many PSMILES strings:

| Polyethylene | Polyethylene oxide | Polypropylene |
| --- | --- | --- |
| `[*]C[*]` | `[*]CCO[*]` | `[*]CC([*])C` |
| `[*]CC[*]` | `[*]COC[*]` | `[*]CC(CC([*])C)C` |
| `[*]CCC[*]` | `[*]OCC[*]` | `CC([*])C[*]` |
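To illustrate where this ambiguity comes from, here is a minimal sketch that lists the equivalent strings obtained by shifting the repeat-unit window along the chain. It is not part of this package: `backbone_rotations` is a hypothetical helper, and it assumes a linear backbone of single-letter atoms with both stars written as `[*]` at the ends (no branches, rings, or bond symbols).

```python
def backbone_rotations(psmiles: str) -> list[str]:
    """List PSMILES strings that describe the same linear polymer.

    Hypothetical helper for illustration only. Assumes the input has the form
    "[*]<backbone>[*]" where <backbone> is a run of single-letter atoms.
    """
    star = "[*]"
    if not (psmiles.startswith(star) and psmiles.endswith(star)):
        raise ValueError("expected a string of the form [*]...[*]")
    backbone = psmiles[len(star):-len(star)]
    if not (backbone.isalpha() and backbone.isupper()):
        raise ValueError("only linear backbones of single-letter atoms are supported")

    rotations = []
    for i in range(len(backbone)):
        # Moving the window by i atoms along the infinite chain is the same
        # as rotating the backbone string by i characters.
        candidate = f"{star}{backbone[i:]}{backbone[:i]}{star}"
        if candidate not in rotations:
            rotations.append(candidate)
    return rotations


print(backbone_rotations("[*]CCO[*]"))
# ['[*]CCO[*]', '[*]COC[*]', '[*]OCC[*]']
```

The three strings printed are the polyethylene oxide entries from the table above.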

The canonicalization routine of this package finds a canonical version of a PSMILES string in four steps:

1. Find the shortest representation of the PSMILES string

   `[*]CCOCCO[*] -> [*]CCO[*]`

2. Make the PSMILES string cyclic

   `[*]CCO[*] -> C1 CCO C1`

3. Apply the canonicalization routine as implemented in RDKit

   `C1 CCO C1 -> C1 COC C1`

4. Break the cyclic bond

   `C1 COC C1 -> [*]COC[*]`
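The first step, finding the shortest representation, can be sketched with plain string handling. This is a simplified illustration under stated assumptions, not the package's implementation: `shortest_repeat` is a hypothetical helper that only handles linear backbones of single-letter atoms with `[*]` at both ends, and it detects repetition at the string level rather than on the molecular graph.

```python
def shortest_repeat(psmiles: str) -> str:
    """Reduce a linear PSMILES string to its smallest repeating unit.

    Hypothetical helper for illustration only. Assumes the input has the form
    "[*]<backbone>[*]" where <backbone> is a run of single-letter atoms.
    """
    star = "[*]"
    if not (psmiles.startswith(star) and psmiles.endswith(star)):
        raise ValueError("expected a string of the form [*]...[*]")
    backbone = psmiles[len(star):-len(star)]
    if not (backbone.isalpha() and backbone.isupper()):
        raise ValueError("only linear backbones of single-letter atoms are supported")

    n = len(backbone)
    for period in range(1, n + 1):
        unit = backbone[:period]
        # The unit is a valid repeat if tiling it reproduces the backbone.
        if n % period == 0 and unit * (n // period) == backbone:
            return f"{star}{unit}{star}"
    return psmiles  # not reached: period == n always matches


print(shortest_repeat("[*]CCOCCO[*]"))  # [*]CCO[*]
print(shortest_repeat("[*]CCC[*]"))     # [*]C[*]
```

Branched or ring-containing repeat units, such as the polypropylene examples, need a graph-based treatment and are outside the scope of this sketch.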

Install

pip install git+https://github.com/Ramprasad-Group/canonicalize_psmiles.git
