★ 350 · Rust · MIT · sse · Updated 1 hour ago

Zhtw

A linguistic linter for Traditional Chinese (zh-TW)

Installation and configuration

{
    "mcpServers": {
        "zhtw-mcp": {
            "command": "/path/to/zhtw-mcp",
            "args": []
        }
    }
}

README summary

# zhtw-mcp

A linguistic linter for Traditional Chinese (zh-TW) that enforces Taiwan Ministry of Education (MoE) standards on vocabulary, punctuation, and character shapes. It plugs into AI coding assistants through the [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) and catches Mainland Chinese (zh-CN) regional drift before it reaches the user.

The tool enforces three official Taiwan standards:

- [Revised Handbook of Punctuation](https://language.moe.gov.tw/001/upload/files/site_content/m0001/hau/c2.htm) (《重訂標點符號手冊》修訂版) -- punctuation marks
- [Standard Form of National Characters](https://language.moe.gov.tw/001/Upload/files/SITE_CONTENT/M0001/STD/F4.HTML) (《國字標準字體》) -- character shapes
- Cross-strait vocabulary normalization, grounded in [OpenCC](https://github.com/BYVoid/OpenCC)'s TWPhrases/TWVariants datasets -- word choices

Over 1100 vocabulary rules and 15 casing rules are compiled into the binary. For ambiguous terms, the server asks the AI assistant it runs inside for help deciding -- no extra API keys required.

## Why this exists

### Modern Chinese is an inadequately standardized language

In the late Qing dynasty, scholars had to express Western concepts in a writing system with no native vocabulary for them. Whether coining new words or importing translations via Japanese (和製漢語), they assembled a literary system under enormous time pressure. Many translated terms were inconsistent, ambiguous, or contradictory. The Chinese-speaking world has lived with these deficiencies for over a century.

### Simplified Chinese made it worse

The PRC simplification effort reduced not just stroke counts but vocabulary precision. Terms that should vary by domain got flattened into single catch-all translations. Many PRC translations were coined hastily: if a term worked in one context, it spread uncritically to others.
### AI models amplify the problem

AI language models learn from web text where Simplified Chinese vastly outweighs Traditional Chinese (roughly 2.6:1 in [CC-100](https://data.statmt.org/cc-100/)). Major datasets like [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) do not even track Traditional Chinese separately. A [FAccT 2025 study](https://arxiv.org/abs/2505.22645) confirmed that most models favor zh-CN terminology when asked to write zh-TW. The output looks plausible but is not how people in Taiwan actually write.

This goes beyond character conversion. The same word often means different things across the strait:

| English | zh-CN | zh-TW | Why it matters |
|---------|-------|-------|----------------|
| concurrency | 並發 | 並行 | In zh-CN, 並行 means "parallel" -- a different concept entirely |
| parallel | 並行 | 平行 | zh-CN 並行 = "parallel"; in Taiwan, 並行 = "concurrent" |
| process (OS) | 進程 | 行程 | 進程 in Taiwan means "progress," not an OS process |
| file / document | 文件 / 文檔 | 檔案 / 文件 | 文件 in China = "file"; in Taiwan = "document" |
| render | 渲染 | 算繪 | 渲染 in Taiwan = "exaggerate" (a painting technique) |
| traverse | 遍歷 | 走訪 | 遍歷 in Taiwan is reserved for Ergodic theory (遍歷理論) |

### What this project does

Automatically check and correct zh-TW text produced by AI, catching cross-strait terminology leaks:

- Half-width punctuation (`,` `.` `:`) that should be full-width (`，` `。` `：`)
- Mainland-style `""` curly quotes replaced with Taiwan-style `「」` corner brackets
- Missing or extra CJK-Latin/digit spacing
- Mainland...
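
To make the first kinds of checks concrete, here is a minimal Rust sketch of a term lookup plus a half-width punctuation check. It is not zhtw-mcp's implementation: `TERMS`, `Finding`, `is_cjk`, `full_width`, and `lint` are hypothetical names invented for illustration, the term list is a small subset taken from the table above, and the sketch ignores context, so it would over-flag words such as 進程 or 渲染 when they are used in their legitimate Taiwan senses. Terms whose correct replacement depends on context (並行, 文件) are left out on purpose.

```rust
/// Hypothetical zh-CN -> zh-TW pairs taken from the table above.
/// The real tool compiles in over 1100 rules; this is only a sample.
const TERMS: &[(&str, &str)] = &[
    ("並發", "並行"),
    ("進程", "行程"),
    ("渲染", "算繪"),
];

/// One flagged span: where it starts (byte offset), what was found,
/// and the suggested zh-TW replacement.
#[derive(Debug, PartialEq)]
pub struct Finding {
    pub offset: usize,
    pub found: String,
    pub suggestion: String,
}

/// Rough CJK test: only the basic CJK Unified Ideographs block.
fn is_cjk(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Full-width counterpart of the three half-width marks named above.
fn full_width(c: char) -> Option<char> {
    match c {
        ',' => Some('，'),
        '.' => Some('。'),
        ':' => Some('：'),
        _ => None,
    }
}

/// Flags sample zh-CN terms and half-width punctuation that directly
/// follows a CJK character. Findings are sorted by byte offset.
pub fn lint(text: &str) -> Vec<Finding> {
    let mut findings = Vec::new();

    // Vocabulary: plain substring search, no context or word segmentation.
    for &(cn, tw) in TERMS {
        for (offset, _) in text.match_indices(cn) {
            findings.push(Finding {
                offset,
                found: cn.to_string(),
                suggestion: tw.to_string(),
            });
        }
    }

    // Punctuation: only flag a mark when the previous character is CJK,
    // so "1.0" or "a, b" in Latin text is left alone.
    let mut prev: Option<char> = None;
    for (offset, c) in text.char_indices() {
        if let (Some(p), Some(fw)) = (prev, full_width(c)) {
            if is_cjk(p) {
                findings.push(Finding {
                    offset,
                    found: c.to_string(),
                    suggestion: fw.to_string(),
                });
            }
        }
        prev = Some(c);
    }

    findings.sort_by_key(|f| f.offset);
    findings
}
```

Calling `lint("這個進程很慢,請稍候.")` in this sketch reports three findings: 進程 with the suggestion 行程, and the two half-width marks with `，` and `。`. A plain ASCII string such as `"version 1.0, ok"` produces none, because neither mark follows a CJK character.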

