A comprehensive Python library for parsing, formatting, and performing arithmetic operations with Chinese numerals (汉字数字). Supports traditional and simplified Chinese, regional variants, financial formats, and full mathematical operations.
- Full Arithmetic Support: All Python operators (+, -, *, /, //, %, **) with Chinese numerals
- Multi-Regional Support: Mainland China (CN), Taiwan (TW), Hong Kong (HK) variants
- Multiple Writing Systems: Simplified, Traditional, and Financial (uppercase) formats
- Decimal & Negative Numbers: Complete support for floating-point and negative values
- Intelligent Parsing: Handles complex patterns, zero-bridging, and regional differences
- Round-Trip Consistency: Parse → Format → Parse yields identical results
- Large Number Support: Up to 10^44 (載) with proper unit handling
- Profile System: Configurable formatting and parsing behavior
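The operator support listed above can be pictured with a small wrapper class that forwards Python's arithmetic operators to an underlying numeric value. This is only a sketch under stated assumptions: `NumeralSketch` and its `_apply` helper are hypothetical names, not the library's `ChineseNumber`, and the sketch stores just the number without any Chinese text formatting.

```python
import operator
from typing import Callable, Union

Number = Union[int, float]


class NumeralSketch:
    """Hypothetical numeric wrapper; not the library's ChineseNumber."""

    def __init__(self, value: Number) -> None:
        self.value = value

    def _apply(self, op: Callable[[Number, Number], Number],
               other: object, reflected: bool = False):
        # Unwrap another NumeralSketch, accept plain numbers, and let
        # Python try the other operand for anything else.
        if isinstance(other, NumeralSketch):
            other = other.value
        elif not isinstance(other, (int, float)):
            return NotImplemented
        left, right = (other, self.value) if reflected else (self.value, other)
        return NumeralSketch(op(left, right))

    # Forward operators: NumeralSketch on the left-hand side.
    def __add__(self, other): return self._apply(operator.add, other)
    def __sub__(self, other): return self._apply(operator.sub, other)
    def __mul__(self, other): return self._apply(operator.mul, other)
    def __truediv__(self, other): return self._apply(operator.truediv, other)
    def __floordiv__(self, other): return self._apply(operator.floordiv, other)
    def __mod__(self, other): return self._apply(operator.mod, other)
    def __pow__(self, other): return self._apply(operator.pow, other)

    # Reflected operators: plain number on the left-hand side.
    def __radd__(self, other): return self._apply(operator.add, other, True)
    def __rsub__(self, other): return self._apply(operator.sub, other, True)
    def __rmul__(self, other): return self._apply(operator.mul, other, True)

    def __int__(self) -> int:
        return int(self.value)

    def __float__(self) -> float:
        return float(self.value)

    def __repr__(self) -> str:
        return f"NumeralSketch({self.value!r})"


total = NumeralSketch(3500) + NumeralSketch(650)   # wrapper + wrapper
scaled = NumeralSketch(3500) * 2 + 100             # wrapper with plain ints
change = 10000 - NumeralSketch(3500)               # plain int on the left
print(total, scaled, change)
```

Returning `NotImplemented` for unsupported operands, rather than raising directly, lets Python fall back to the other operand's reflected method before it reports a `TypeError`.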
pip install hanzi-arithmetic

from hanzi_arithmetic import chinese
# Create Chinese numbers
num1 = chinese("三千五百") # 3500
num2 = chinese("六百五十") # 650
# Perform arithmetic
result = num1 + num2
print(result) # 四千一百五十 (4150)
# Mix with regular numbers
result = num1 * 2 + 100
print(result) # 七千一百 (7100)
# Access the numeric value
print(int(result)) # 7100
print(float(result)) # 7100.0

from hanzi_arithmetic import chinese, CN_EVERYDAY, TW_HK_EVERYDAY, FINANCIAL_CN
# Mainland China (Simplified)
cn_num = chinese(12000, CN_EVERYDAY)
print(cn_num) # 一万二千
# Taiwan/Hong Kong (Traditional)
tw_num = chinese(12000, TW_HK_EVERYDAY)
print(tw_num) # 一萬二千
# Financial/Banking Format
financial = chinese(12000, FINANCIAL_CN)
print(financial) # 壹万贰仟

from hanzi_arithmetic import chinese
# Chained operations
result = chinese("一万") + chinese("五千") - 3000 + chinese("八百")
print(result) # 一万二千八百
# Mixed type operations
price = chinese("九万八千")
discount = price * 0.15 # 15% discount
final_price = price - discount
print(final_price) # 八万三千三百
# Division and remainders
total = chinese("十万")
parts = total / 7
remainder = total % 7
print(f"Each part: {parts}, Remainder: {remainder}")

# Decimal parsing and formatting
decimal_num = chinese("三点一四一五九") # 3.14159
print(float(decimal_num)) # 3.14159
# Decimal arithmetic
result = chinese("十点五") + 2.3
print(result) # 十二点八

# Large number handling
trillion = chinese("一万亿") # 1 trillion (Mainland)
tw_trillion = chinese("一兆", TW_HK_EVERYDAY) # 1 trillion (Taiwan)
print(int(trillion)) # 1000000000000
print(int(tw_trillion)) # 1000000000000
# Very large numbers
huge_num = chinese("九千九百万亿")
print(f"Value: {int(huge_num):,}") # 9,900,000,000,000,000

from hanzi_arithmetic import chinese, FINANCIAL_CN, FINANCIAL_TW_HK
# Financial formats (anti-fraud uppercase)
amount = chinese(1234567, FINANCIAL_CN)
print(amount) # 壹佰贰拾叁万肆仟伍佰陆拾柒
# Banking calculations
principal = chinese("十万", FINANCIAL_CN) # 100,000
interest_rate = 0.045 # 4.5%
interest = principal * interest_rate
total = principal + interest
print(f"Principal: {principal}") # 壹拾万
print(f"Interest: {chinese(int(interest), FINANCIAL_CN)}") # 肆仟伍佰
print(f"Total: {chinese(int(total), FINANCIAL_CN)}") # 壹拾万肆仟伍佰

# Document processing
text_amounts = ["三万五千", "十二万八千", "五十万"]
total_amount = sum(chinese(amount).value for amount in text_amounts)
print(f"Total: {chinese(total_amount)}") # 六十六万三千 (35,000 + 128,000 + 500,000 = 663,000)
# Multi-format parsing (handles different input styles)
inputs = ["两万", "兩萬", "二万", "貳萬"] # Different ways to write 20,000
values = [chinese(inp).value for inp in inputs]
print(f"All equal: {all(v == 20000 for v in values)}") # True

The library uses profiles to handle different regional and formatting preferences:
from hanzi_arithmetic import (
    chinese,          # Factory function for ChineseNumber instances
    CN_EVERYDAY,      # Mainland China, everyday use
    TW_HK_EVERYDAY,   # Taiwan/Hong Kong, everyday use
    FINANCIAL_CN,     # Mainland China, financial format
    FINANCIAL_TW_HK,  # Taiwan/Hong Kong, financial format
)

number = 1234567
for profile in [CN_EVERYDAY, TW_HK_EVERYDAY, FINANCIAL_CN, FINANCIAL_TW_HK]:
    formatted = chinese(number, profile)
    print(f"{profile.__class__.__name__}: {formatted}")

| Feature | CN Everyday | TW/HK Everyday | CN Financial | TW/HK Financial |
|---|---|---|---|---|
| Large Units | 万亿 (wan yi) | 兆 (zhao) | 万亿 | 兆 |
| Script | Simplified | Traditional | Simplified | Traditional |
| Digits | 一二三... | 一二三... | 壹贰叁... | 壹貳參... |
| Zero | 零 | 零 | 零 | 零 |
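One way to picture the differences in the table is a formatter driven entirely by per-profile character tables. The sketch below is a standalone illustration, not the library's code: `FormatTable`, `TABLES`, the profile keys, and `format_integer` are hypothetical names, and it only handles non-negative integers below 10^16.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FormatTable:
    """Hypothetical per-profile character tables (not the library's types)."""
    digits: str                             # characters for 0-9
    small_units: Tuple[str, str, str]       # ten, hundred, thousand
    big_units: Tuple[Tuple[int, str], ...]  # (value, character), largest first
    keep_leading_one: bool                  # write 10 as 壹拾 rather than 十


TABLES = {
    "cn_everyday": FormatTable(
        "零一二三四五六七八九", ("十", "百", "千"),
        ((10**8, "亿"), (10**4, "万")), False),
    "tw_hk_everyday": FormatTable(
        "零一二三四五六七八九", ("十", "百", "千"),
        ((10**12, "兆"), (10**8, "億"), (10**4, "萬")), False),
    "cn_financial": FormatTable(
        "零壹贰叁肆伍陆柒捌玖", ("拾", "佰", "仟"),
        ((10**8, "亿"), (10**4, "万")), True),
    "tw_hk_financial": FormatTable(
        "零壹貳參肆伍陸柒捌玖", ("拾", "佰", "仟"),
        ((10**12, "兆"), (10**8, "億"), (10**4, "萬")), True),
}


def _below_ten_thousand(n: int, t: FormatTable) -> str:
    text, pending_zero = "", False
    ten, hundred, thousand = t.small_units
    for place, unit in ((1000, thousand), (100, hundred), (10, ten), (1, "")):
        digit = n // place % 10
        if digit == 0:
            pending_zero = bool(text)  # a gap only matters after real digits
        else:
            if pending_zero:
                text += t.digits[0]    # one 零 bridges any run of zeros
            text += t.digits[digit] + unit
            pending_zero = False
    return text


def _raw(n: int, t: FormatTable) -> str:
    for value, char in t.big_units:
        if n >= value:
            head, tail = divmod(n, value)
            text = _raw(head, t) + char
            if tail:
                if tail < value // 10:  # top position of the tail is empty
                    text += t.digits[0]
                text += _raw(tail, t)
            return text
    return _below_ten_thousand(n, t)


def format_integer(n: int, profile: str = "cn_everyday") -> str:
    if not 0 <= n < 10**16:
        raise ValueError("sketch supports 0 <= n < 10**16 only")
    t = TABLES[profile]
    if n == 0:
        return t.digits[0]
    text = _raw(n, t)
    # Everyday style drops the leading 一 of 一十; financial style keeps 壹拾.
    if not t.keep_leading_one and text.startswith(t.digits[1] + t.small_units[0]):
        text = text[1:]
    return text


for name in TABLES:
    print(name, format_integer(10**12, name), format_integer(1234567, name))
```

Because the Mainland tables stop at 亿, 10^12 falls out of the recursion as 一万亿, while the Taiwan/Hong Kong tables reach it directly with 兆.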
from hanzi_arithmetic import chinese, ChineseNumberProfile, Script, Locale

# Create custom profile
custom_profile = ChineseNumberProfile(
    script=Script.TRADITIONAL,
    locale=Locale.TW,
    use_liang_output=True,  # Use 兩 instead of 二
    accept_archaic=True,    # Accept archaic forms like 廿 (20)
)
num = chinese(2000, custom_profile)
print(num) # Uses custom formatting rules

from hanzi_arithmetic import chinese
from hanzi_arithmetic.exceptions import ParseError, FormatError

try:
    invalid_num = chinese("不是数字") # Invalid text
except ParseError as e:
    print(f"Parse error: {e}")
    print(f"Problem text: {e.text}")

# Complex numbers with zero bridging
complex_nums = [
    "一万零三", # 10,003
    "十万零五十", # 100,050
    "一千万零七", # 10,000,007
    "三千零一万", # 30,010,000
]
for num_text in complex_nums:
    num = chinese(num_text)
    print(f"{num_text} = {int(num):,}")

Use cases:
- Banking systems and financial software
- Invoice and receipt processing
- Anti-fraud number verification (financial formats)
- Accounting and bookkeeping systems
- Chinese text analysis and extraction
- Document processing and data mining
- Multilingual number normalization
- Language learning applications
- Cross-regional number format handling
- Traditional/Simplified Chinese conversion
- Cultural adaptation for different markets
- Government and legal document processing
- Historical document analysis
- Linguistic research on number systems
- Cultural studies and anthropology
- Educational tools and curricula
The main class representing a Chinese number with full arithmetic support.
class ChineseNumber:
    def __init__(self, value: Union[str, int, float, ChineseNumber],
                 profile: Optional[ChineseNumberProfile] = None)

    @property
    def value(self) -> Union[int, float] # Numeric value

    @property
    def chinese(self) -> str # Chinese text representation

    @property
    def profile(self) -> ChineseNumberProfile # Formatting profile

Convenient way to create ChineseNumber instances.
def chinese(value: Union[str, int, float, ChineseNumber],
            profile: Optional[ChineseNumberProfile] = None) -> ChineseNumber

Predefined profiles:
- CN_EVERYDAY: Mainland China, everyday usage (default)
- TW_HK_EVERYDAY: Taiwan/Hong Kong, everyday usage
- FINANCIAL_CN: Mainland China, financial/banking format
- FINANCIAL_TW_HK: Taiwan/Hong Kong, financial format

Exceptions:
- ChineseNumberError: Base exception class
- ParseError: Raised when text cannot be parsed
- FormatError: Raised when number cannot be formatted
- ValidationError: Raised for grammar rule violations
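A hierarchy with a shared base class lets callers catch every library error with one `except` clause while still handling specific cases separately. The sketch below re-declares the four exception names locally so it runs on its own. It assumes, but does not confirm, that the real classes all derive from `ChineseNumberError`. The `parse_digit` and `safe_parse` helpers are hypothetical.

```python
from typing import Optional


class ChineseNumberError(Exception):
    """Local stand-in for the base exception."""


class ParseError(ChineseNumberError):
    def __init__(self, message: str, text: str) -> None:
        super().__init__(message)
        self.text = text  # offending input, mirroring the `e.text` usage above


class FormatError(ChineseNumberError):
    """Local stand-in: a number cannot be formatted."""


class ValidationError(ChineseNumberError):
    """Local stand-in: a grammar rule is violated."""


DIGITS = {ch: i for i, ch in enumerate("零一二三四五六七八九")}


def parse_digit(text: str) -> int:
    """Hypothetical helper: parse a single digit character."""
    if text not in DIGITS:
        raise ParseError(f"not a Chinese digit: {text!r}", text)
    return DIGITS[text]


def safe_parse(text: str, default: Optional[int] = None) -> Optional[int]:
    """Catch the base class so any error in the family yields the default."""
    try:
        return parse_digit(text)
    except ChineseNumberError:
        return default


print(safe_parse("七"))          # a valid digit
print(safe_parse("不是数字"))    # falls back to the default
```

Catching the base class suits batch jobs that should skip bad input. Catching `ParseError` specifically is the better fit when the offending text needs to be reported.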
The library includes comprehensive test suites covering:
- Core integer parsing and formatting
- Decimal and negative number handling
- Arithmetic operator overloading
- Regional variant processing
- Zero-bridging and complex patterns
- Round-trip consistency (parse → format → parse)
- Financial format validation
- Error handling and edge cases
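As an illustration of the zero-bridging category, a test might look like the sketch below. To keep it runnable without the package, it tests a stand-in parser, `parse_sketch`, which is a hypothetical simplification covering simplified everyday integers only. A real test would call the library instead.

```python
DIGITS = {ch: i for i, ch in enumerate("零一二三四五六七八九")}
DIGITS["两"] = 2
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10**4, "亿": 10**8}


def parse_sketch(text: str) -> int:
    """Stand-in parser for simplified everyday integers (hypothetical)."""
    if not text:
        raise ValueError("empty input")
    total = 0    # value already closed off by 万 or 亿
    section = 0  # value of the current section below 10^4
    digit = 0    # most recent digit still waiting for a unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]  # 零 just resets the pending digit
        elif ch in SMALL_UNITS:
            section += (digit or 1) * SMALL_UNITS[ch]  # bare 十 means 10
            digit = 0
        elif ch in BIG_UNITS:
            unit = BIG_UNITS[ch]
            section += digit
            if total < unit:
                # A larger unit scales everything read so far (一万亿).
                total = (total + section) * unit
            else:
                total += section * unit
            section = digit = 0
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + section + digit


def test_zero_bridging():
    cases = {
        "一万零三": 10_003,
        "十万零五十": 100_050,
        "一千万零七": 10_000_007,
        "三千零一万": 30_010_000,
    }
    for text, expected in cases.items():
        assert parse_sketch(text) == expected, text


test_zero_bridging()
```

Keeping the cases in a dictionary makes it easy to add a regression case whenever a new bridging pattern turns up.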
Run tests with:
pytest tests/

Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request

Requirements:
- Python 3.8+
- No external dependencies for core functionality
- pytest for testing
This project is licensed under the MIT License - see the LICENSE file for details.
Chinese numerals, Hanzi numbers, Chinese digits, Traditional Chinese, Simplified Chinese, Taiwan, Hong Kong, Mainland China, financial formatting, banking numbers, arithmetic operations, number parsing, text processing, NLP, natural language processing, multilingual, localization, internationalization, 汉字数字, 中文数字, 繁体中文, 简体中文, 金融格式, 数字处理
Made with ❤️ for the Chinese language processing community.