You have done commendable work in curating this repo.
However, a deficiency in the original sacred-texts.com Sanskrit text has inadvertently crept in here as well.
The error is in the last letter of each line (ślokārdha): the final halanta (्) is not represented.
नरॊत्तमम should be नरॊत्तमम्
उदीरयेत should be उदीरयेत्
् is missing everywhere.
We have two ways to remedy this problem:
- Check the last character of each line and, if it is अ-कारान्त (a bare consonant carrying the inherent a), append a हलन्त (्). Exceptions: valid a-kārānta words such as मम and च.
- Use an alternate data source such as https://bombay.indology.info/mahabharata/text/UD/MBh01.txt and others.
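
A minimal sketch of the first option, in Python, to make the idea concrete. The names `add_final_halanta`, `is_bare_consonant`, and `EXCEPTIONS` are hypothetical, and the exception set contains only the two words named above; a usable list would need to be much longer. The sketch assumes each line ends directly in a Devanagari letter (apart from trailing whitespace).

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halanta)

# Illustrative only: the two exceptions mentioned in this issue.
# A real exception list is an assumption left to whoever applies this.
EXCEPTIONS = {"मम", "च"}


def is_bare_consonant(ch: str) -> bool:
    """Return True if ch is a Devanagari consonant letter (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def add_final_halanta(line: str, exceptions=EXCEPTIONS) -> str:
    """Append a halanta when the line ends in a bare consonant.

    Lines whose last word is a listed exception, or whose last
    character is not a consonant letter (vowel sign, anusvara,
    visarga, punctuation, existing halanta), are returned unchanged.
    """
    stripped = line.rstrip()
    if not stripped:
        return line
    trailing = line[len(stripped):]  # preserve the newline, if any
    last_word = stripped.split()[-1]
    if last_word in exceptions:
        return line
    if is_bare_consonant(stripped[-1]):
        return stripped + VIRAMA + trailing
    return line


if __name__ == "__main__":
    for sample in ["नरॊत्तमम", "उदीरयेत", "मम"]:
        print(sample, "->", add_final_halanta(sample))
```

Lines ending in punctuation such as a daṇḍa are left untouched by this sketch, and any word that validly ends in a but is missing from the exception set would be changed incorrectly, so the output would still need review against a reliable source.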
If you could upload to the repo the programs you used for scraping the websites, scanning Sanskrit text with accent markers and OCR, turning PDFs into JSON, etc., I can create a pull request with the additional data sources.