BYVoid/OpenCC 5019

Conversion between Traditional and Simplified Chinese

batsh-dev-team/Batsh 3981

A language that compiles to Bash and Windows Batch

BYVoid/continuation 383

JavaScript asynchronous Continuation-Passing Style transformation (deprecated).

BYVoid/microblog 92

A tiny microblog system using Node.js for learning purposes.

BYVoid/MidiToSheetMusic 69

A tiny tool for conversion from MIDI file to sheet music image file

BYVoid/distribox 29

Decentralized Dropbox based on P2P Anti-Entropy protocol

BYVoid/byvhttpd 28

A toy http server based on Qt

BYVoid/MIPS32 28

A MIPS32 CPU implemented in VHDL

batsh-dev-team/Dlist 25

Difference list library for OCaml

issue comment BYVoid/OpenCC

Can not make an instance of OpenCC with configuration file tw2t.json

tw2t was added recently by @sgalal in https://github.com/BYVoid/OpenCC/commit/e25b0559fa523fd003d3fd3205ef2a1a381c5746. Which version are you using?

songjiz

comment created time in a month

pull request comment BYVoid/OpenCC

Feature/python cicd

Thanks. Done.

mingruimingrui

comment created time in a month

pull request comment BYVoid/OpenCC

Feature/python cicd

Thank you very much. @mingruimingrui, since you have the PyPI admin access, can you suggest how to set up the PYPI_TOKEN?

mingruimingrui

comment created time in a month

push event BYVoid/OpenCC

mingruimingrui

commit sha 2b2a0c6dce37a7e8de80e81ad308904f0e26ae22

Rename file to python.yml

mingruimingrui

commit sha d84eeb09a2a8c720ea7b3b0db7d1192a8c1c032a

Rename workflow to CI

mingruimingrui

commit sha 6bbe7b6a6fd5ddee0b2c649fff22eed8285c8acd

Test github actions to build python packages

mingruimingrui

commit sha 6d4d3c1a467972eb17c5abad8644fe53051a4e5f

Renamed step to unit-test

mingruimingrui

commit sha 7669dc7e7931fda71ceca7d70d6b2a25cf535174

Allow workflow to be triggered on feature branch

mingruimingrui

commit sha 0b3ef6de3cbdb41c36c6409668568e9ba68410d4

Fixed workflow_dispatch

mingruimingrui

commit sha 258dd9d8aa2dabd74943e7d465f35bde1eaae73b

Fixed matrix

mingruimingrui

commit sha e3d6994415de566b2f4b9f546112fbd367f588a6

Use absolute path to conda.bat in windows image

mingruimingrui

commit sha db998dda1ce92cd7ba22cc08749d8ec8016c9933

Remove input field

mingruimingrui

commit sha 2a49e3f952ebe4953e3483225cc3288c0b298b73

install curl for miniconda

mingruimingrui

commit sha 727aee4697540a4f6e78965e4533318c770e1570

Switch to wildcard pattern to find files to cache

mingruimingrui

commit sha 55c1f3f7b0cda3f4cba2bdfddd724fa30cb7ca98

Fixed typo

mingruimingrui

commit sha 91967f7970ada9e2d11afc1272c4f25fa404a68f

Moved conda init out of script

mingruimingrui

commit sha 271cbbe152fa9eab402da96a180cb916ef18bc42

Use later version of cmake

mingruimingrui

commit sha 97e455f6200b4dc99e7315cfea2ae8b6d26d1d75

Fix typo

mingruimingrui

commit sha e914e7d4a5626a46e0c9e01d0b2f64c9e977668c

Switch to better maintained pypi version of cmake

mingruimingrui

commit sha 4acb4b1e9c07e7a39cd842fa1c7592d986ec88f7

Remove steps to cache output files

mingruimingrui

commit sha 584c4120392447bbb0c788216d9995d96ce2c31e

Uncommented code to perform upload of files to pypi

mingruimingrui

commit sha 118487670a1a79c893ca3f51a81647bc28007bed

Added PYPI_TOKEN env variable

mingruimingrui

commit sha 6d7e30befd69c2c7e37ec5a0bc9a0ecfeec0391a

Remove unused artifacts

push time in a month

PR merged BYVoid/OpenCC

Feature/python cicd

Description

This pull request adds a GitHub Action that automatically builds and uploads the OpenCC Python packages to PyPI. The goal is to make deployment smoother and much less of a hassle than it is right now.

Currently, the action is set up to be triggered manually (though it can be modified to run on new releases, etc.). Only one additional step is needed: @BYVoid or @sgalal would have to help configure the package secret PYPI_TOKEN.

This was a long-overdue feature that I've only just had time to work on. It was briefly mentioned on https://github.com/BYVoid/OpenCC/issues/477.

Changes

  • Added a new GitHub Action, "Build and upload python package to PyPI"
  • Added new shell scripts to facilitate building and uploading the Python packages
    • release-pypi-linux.sh
    • release-pypi-macos.sh
    • release-pypi-windows.sh
  • Removed previous helper scripts used for python packaging.
    • bdist_array.sh
    • bdist_array.cmd
    • BUILD.md
    • python/Dockerfile.posix
+151 -187

1 comment

9 changed files

mingruimingrui

pr closed time in a month

push event BYVoid/OpenCC

Danny Lin

commit sha 455072cdc66a36d72f2361934b78dbd11d8dbf92

Add Simplified-Traditional conversion: 璇<=>璇/璿

push time in 2 months

PR merged BYVoid/OpenCC

Add Simplified-Traditional conversion: 璇<=>璇/璿
+4 -0

0 comment

4 changed files

danny0838

pr closed time in 2 months

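Pull request review comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 你=>你妳, 妳=>你奶

 奮	奋
 奼	姹
 妝	妆
+妳	你 奶

妳->奶 is an archaic variant-character conversion. I could not find it in the 2013 Table of General Standard Chinese Characters (《通用規範漢字表》, entry No. 0782) either.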

danny0838

comment created time in 2 months

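Pull request review comment BYVoid/OpenCC

Fix Simplified-Traditional conversion and regional character conversion: 冢=>冢/塚

 兒	児
 內	内
 兩	両
+冢	塚

This conversion is not part of the Japanese shinjitai (new character forms).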

danny0838

comment created time in 2 months

push event batsh-dev-team/Dlist

Darren Ldl

commit sha b6258f3340640c3e93aa4d01c81ab33a8f4036cf

Migration to dune, project restructure

Darren Ldl

commit sha 2bbf96e22766db98c964c7345551a5d177d67165

Updated dune file, added initial opam file

Darren Ldl

commit sha 4f57f7cd0231d30b51673da7a64ed2cf52c14074

Filled in basic info for dlist.opam

push time in 2 months

PR merged batsh-dev-team/Dlist

Migration to Dune

Required for https://github.com/BYVoid/Batsh/pull/75

+24 -0

3 comments

8 changed files

XVilka

pr closed time in 2 months

push event BYVoid/opencc-web

dependabot[bot]

commit sha e95b69e27ff741d8346090b0d899cb9fa3176195

Bump lodash from 4.17.15 to 4.17.19

Bumps [lodash](https://github.com/lodash/lodash) from 4.17.15 to 4.17.19.
  • [Release notes](https://github.com/lodash/lodash/releases)
  • [Commits](https://github.com/lodash/lodash/compare/4.17.15...4.17.19)

Signed-off-by: dependabot[bot] <support@github.com>

push time in 2 months

PR merged BYVoid/opencc-web

Bump lodash from 4.17.15 to 4.17.19 dependencies

Bumps lodash from 4.17.15 to 4.17.19.

Release notes (sourced from lodash's releases): 4.17.16

Commits
  • d7fbc52 Bump to v4.17.19
  • 2e1c0f2 Add npm-package
  • 1b6c282 Bump to v4.17.18
  • a370ac8 Bump to v4.17.17
  • 1144918 Rebuild lodash and docs
  • 3a3b0fd Bump to v4.17.16
  • c84fe82 fix(zipObjectDeep): prototype pollution (#4759)
  • e7b28ea Sanitize sourceURL so it cannot affect evaled code (#4518)
  • 0cec225 Fix lodash.isEqual for circular references (#4320) (#4515)
  • 94c3a81 Document matches* shorthands for over* methods (#4510) (#4514)
  • Additional commits viewable in the compare view: https://github.com/lodash/lodash/compare/4.17.15...4.17.19

Maintainer changes: This version was pushed to npm by mathias, a new releaser for lodash since your current version.

+3 -3

0 comment

1 changed file

dependabot[bot]

pr closed time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha 0a1fafbf9c703bfc6b098587f984d34f1e4d36b1

Add Simplified-Traditional conversion: 玩<=>玩/翫

push time in 2 months

PR merged BYVoid/OpenCC

Add Simplified-Traditional conversion: 玩<=>玩/翫
+6 -0

4 comments

4 changed files

danny0838

pr closed time in 2 months

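pull request comment BYVoid/OpenCC

Include a historical conversion table

The criteria for conversions such as 「垵<=>埯」, 「鼗<=>鞀」 and 「檾<=>苘」 are too vague, and it is hard to find users who actually need them. I am inclined to treat them as part of variant-character normalization.

I have brought up variant-character normalization several times before. My thinking is not yet mature, but the basic idea is to define OpenCC's character standard explicitly. The right-hand column of the current STCharacters.txt ("OpenCC Traditional"), once one-to-many cases are excluded, is that standard.

Similarly, to better convert non-standardized "Traditional" text into Simplified Chinese or into another regional Traditional standard, we need to define a normalization dictionary: first normalize the various variant characters, then perform the phrase conversion.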

danny0838

comment created time in 2 months
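
The normalize-first idea in the comment above can be sketched in Python as two passes over the text. This is only an illustration under stated assumptions: the tables NORMALIZE and CONVERT and the helper replace_longest are hypothetical names, the entries are toy data rather than OpenCC's dictionaries, and greedy longest-match replacement is assumed for simplicity. The example phrase 「沈沒」 is one raised elsewhere in this feed.

```python
# Hypothetical toy tables; NOT OpenCC's real dictionaries.
NORMALIZE = {"沈沒": "沉沒"}   # pass 1: variant spelling -> assumed standard form
CONVERT = {"沉沒": "沉没"}     # pass 2: standard Traditional -> Simplified


def replace_longest(text, table):
    """Greedy left-to-right replacement, trying the longest key first."""
    max_len = max(map(len, table), default=1)
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in table:
                out.append(table[piece])
                i += size
                break
        else:
            out.append(text[i])  # no entry matched: keep the character as-is
            i += 1
    return "".join(out)


def normalize_then_convert(text):
    """Normalize variants first, then run the actual conversion."""
    return replace_longest(replace_longest(text, NORMALIZE), CONVERT)
```

Because the normalization table here is keyed by phrase rather than by single character, a surname such as 「沈」 in 「沈先生」 passes through both stages untouched.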

pull request comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 玩<=>玩/翫

Please rebase

danny0838

comment created time in 2 months

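pull request comment BYVoid/OpenCC

Add 徵羽摩柯

Wikipedia does have its problems; the key point is that it optimizes only for forward maximum matching conversion. However, defining a "word" linguistically is too difficult, and I do not want to argue about what a real "word" is.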

sgalal

comment created time in 2 months

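pull request comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 跖<=>蹠/跖

I have already merged it. It is fine if 「蹠」 is not converted for Taiwan.

You can try searching:

"蹠疣" -跖 site:*.tw
"跖疣" -蹠 site:*.tw

The former returns far fewer results than the latter.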

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha e960c7a593a4f14dbcfe6b819a060e5213b14a24

Add Simplified-Traditional conversion: 跖<=>蹠/跖

push time in 2 months

PR merged BYVoid/OpenCC

Add Simplified-Traditional conversion: 跖<=>蹠/跖
+9 -0

4 comments

4 changed files

danny0838

pr closed time in 2 months

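pull request comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 跖<=>蹠/跖

According to Google search results restricted to Taiwanese sites (*.tw), 「跖疣」 is far more common than 「蹠疣」. Wikipedia's Taiwan Traditional variant does not convert it either.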

danny0838

comment created time in 2 months

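pull request comment BYVoid/OpenCC

Include a historical conversion table

These conversions were probably inherited in OpenCC's early days from sources that were not rigorous. Looking at them now, we could choose not to convert them at all. 「硷」 is somewhat different from the others.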

danny0838

comment created time in 2 months

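pull request comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 跖<=>蹠/跖

It seems that Taiwan and Hong Kong generally use 「跖」. Should we add a conversion to 「跖」 in the Taiwan and Hong Kong character variants?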

danny0838

comment created time in 2 months

pull request comment BYVoid/OpenCC

Add Simplified-Traditional conversion: 玩<=>玩/翫

You could add 「翫忽」.

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha d4821aa97f03fcc9b5a2a803a340f6d5ae0d67ae

Add Simplified-Traditional conversion: 沾<=>沾/霑

push time in 2 months

PR merged BYVoid/OpenCC

Add Simplified-Traditional conversion: 沾<=>沾/霑
+10 -0

0 comment

4 changed files

danny0838

pr closed time in 2 months

pull request comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 沈=>沉

「沈」 is a variant of 「沉」.

danny0838

comment created time in 2 months

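pull request comment BYVoid/OpenCC

Include a historical conversion table

Is the left column a set of "simplified characters" that are no longer in use?

They do not look like simplifications.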

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

石渠清心

commit sha 86c97c9f305ac9812ba1b1cfabc6141b7af7b6f1

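Fix the place name 「於潜」

For Yuqian Town (於潜镇), under Lin'an District, Hangzhou, the character 「於」 should not be converted to 「于」, according to the Hangzhou government's public information directory: http://www.linan.gov.cn/col/col1643132/index.html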

石渠清心

commit sha 4036fbb52c1dac502942cf6b026e837aebff9423

Fix formatting

push time in 2 months

PR merged BYVoid/OpenCC

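Fix the place name 「於潜」

For Yuqian Town (於潜镇), under Lin'an District, Hangzhou, the character 「於」 should not be converted to 「于」, according to the Hangzhou government's public information directory.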

+1 -0

2 comments

1 changed file

Felix2yu

pr closed time in 2 months

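pull request comment BYVoid/OpenCC

Include a historical conversion table

I do not quite understand this table. Could you explain what the left and right columns are, and what the table might be used for?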

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Damien Gerard

commit sha f5a8a05f5b481c7c7cef6a11142fbf822cb9ab79

fix access violation when configDirectory is empty

push time in 2 months

PR merged BYVoid/OpenCC

fix access violation when configDirectory is empty

configDirectory may be empty, and accessing back() will make it crash

+8 -4

1 comment

1 changed file

milipili

pr closed time in 2 months

pull request comment BYVoid/OpenCC

fix access violation when configDirectory is empty

Thanks.

milipili

comment created time in 2 months

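pull request comment BYVoid/OpenCC

Add Hong Kong regional phrase conversion: 幾率=>機會率/概率/機率

In that case, we still need to find experts in Hong Kong usage to establish a standard. Until such a standard exists, I am inclined not to add this complex conversion scheme.

The question of Hong Kong transliterated names is also tied to Cantonese. I do not plan to touch that part for now; otherwise it would become a very large undertaking.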

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha a65988dc69fee040a03feea21d816255bb5e5a8a

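Fix Simplified-Traditional conversion: change 「菕<=>芲(U+82B2)」 to 「菕<=>𰰨(U+30C28)」 - 「𰰨(U+30C28)」 is the simplified form of 「菕」 derived by analogy, so the two convert to each other. 「芲(U+82B2)」 is a variant of 「花」 and should not be converted to or from 「菕」.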

Danny Lin

commit sha 8539f7c84db94d16fb2daedf2cc67cac13701da0

Fix Simplified-Traditional conversion: change 「鋂<=>镅」 to 「鋂<=>𰾄」

Danny Lin

commit sha 82c00baed087a9f70a148e7dc40b52a9a135afd4

Fix Simplified-Traditional conversion: change 「鑀<=>锿」 to 「鑀<=>𰾭」

Danny Lin

commit sha 47780a6131f5e91edf2cc88b91f3255bf8a5e7f4

Fix Simplified-Traditional conversion: change 「鏚<=>戚」 to 「鏚<=>𬭭」

Danny Lin

commit sha 6d8b01e6f67562e0278985f2e1ba32bb1b8ab705

Fix Simplified-Traditional conversion: change 「𣍐<=>𠊉」 to 「𣍐<=>𫧃」

Danny Lin

commit sha ad2f130de64fa85c65cda4eb16b8e7751a9fa1cc

Fix Simplified-Traditional conversion: change 「鷿<=>䴙」 to 「鷿<=>𬸯」

Danny Lin

commit sha 36c68f9625d65b21ce763e64bbff2e43a8502cd3

Fix Simplified-Traditional conversion: change 「𪈼<=>𪉓」 to 「𪈼<=>𱊜」

Danny Lin

commit sha ba3dc28888c534e25332196c3a132d9d43de3b23

Fix Simplified-Traditional conversion: change 「𪋿<=>𪎍」 to 「𪋿<=>𫧮」

Danny Lin

commit sha 1886d6bf80d9300c6172bda67247614836c2a585

Fix Simplified-Traditional conversion: change 「㘔<=>㗷」 to 「㘔<=>𫬐」

Danny Lin

commit sha e56de896dfbb708a8cad84a9990388ebc3bdacd4

Fix Simplified-Traditional conversion: change 「繐<=>穗」 to 「繐<=>𰬸」

Danny Lin

commit sha 507dcee37297cbdf03c5a1e57170d5be64f24d09

Fix Simplified-Traditional conversion: change 「譅=>䜧」 to 「譅<=>𰶎」

Danny Lin

commit sha 4d19b732202fad01dda2783cfb689b78aca153e6

Fix Simplified-Traditional conversion: change 「㸇<=>𤎺」 to 「𤓎<=>𤎺」 and 「㸇=>𤎺」 - treat 「𤓎」 as the OpenCC standard form and 「㸇」 as a variant

push time in 2 months

PR merged BYVoid/OpenCC

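Some fixes related to SMP characters

These characters all have similar problems, so I combined them into one PR for easier discussion. If any individual character is problematic, please raise it and I will revise it or split it into a separate PR as needed.

None of the characters involved are standard characters in the Table of General Standard Chinese Characters (《通用規範漢字表》), except 「麵」 and 「麪」.

At present these characters are all involved in some clearly unreasonable conversions that Unihan does not support either, so the unreasonable conversions are removed.

The newly added, more reasonable simplified characters are all SMP characters. Except for 「麵」 and 「麪」, where the table already gives 「面」 as the preferred simplified form and the SMP analogy-derived simplified character is therefore listed as the second choice, they are all listed as the first choice.

Whether to add SMP analogy-derived simplified characters is debatable. In my view, since the table does not regulate these characters, applying analogy simplification or not is neither absolutely right nor absolutely wrong. Moreover, OpenCC already applies analogy simplification to many out-of-table characters, up to Ext-D, so deliberately leaving these out would be inconsistent. In the long run it may be better, as proposed in #217, to split standard simplification and analogy simplification into separate dictionaries, or to implement a feature that restricts the output character set, but all of that can be done later.

All of the SMP analogy-derived simplified characters involved are defined in Unihan, except 「𮮄」. Although that character is in Ext-F and has only a J source, judging by its glyph I see no reason it cannot be regarded as the analogy-derived simplified form of 「麪」, so it is included anyway.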

+28 -22

2 comments

2 changed files

danny0838

pr closed time in 2 months
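
The PR description above mentions restricting the output character set as a possible later feature. A minimal Python sketch of one way such a filter could work follows. It assumes each source character has an ordered list of candidates and that the restriction is "Basic Multilingual Plane only"; is_bmp and pick_candidate are hypothetical names, not OpenCC functions, and only the code point U+30C28 is taken from the feed.

```python
def is_bmp(text):
    """True if every code point fits in the Basic Multilingual Plane."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def pick_candidate(source, candidates, allow_supplementary):
    """Return the first acceptable candidate, else leave the source unchanged.

    Assumption: candidates are ordered by preference, and a candidate
    outside the BMP is skipped when supplementary-plane output is disallowed.
    """
    for candidate in candidates:
        if allow_supplementary or is_bmp(candidate):
            return candidate
    return source  # nothing acceptable: do not convert


# 菕 -> U+30C28, the pair mentioned in the commit list above.
ANALOGY_FORM = "\U00030C28"
```

With allow_supplementary set to False, pick_candidate("菕", [ANALOGY_FORM], False) leaves 「菕」 as it is; with True it returns the supplementary-plane form.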

pull request comment BYVoid/OpenCC

Fix one-to-many regional phrase conversions

Please split out the Taiwan part first.

danny0838

comment created time in 2 months

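pull request comment BYVoid/OpenCC

Add Hong Kong regional phrase conversion: 幾率=>機會率/概率/機率

If 「幾率」 is currently the only phrase being converted, there is no need to add a whole conversion configuration. The maintenance cost would be high.

Before adding Hong Kong regional phrases, we must first define clearly what counts as one. As far as I know, a large share of Hong Kong regional vocabulary is Cantonese, which is outside OpenCC's scope. Hong Kong's Mandarin vocabulary (Putonghua, Guoyu), on the other hand, is very fluid: it accepts both mainland and Taiwanese usage at the same time.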

danny0838

comment created time in 2 months

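pull request comment BYVoid/OpenCC

Add standard variant-character conversion

Variant-character normalization is a big problem. Let's hold off for now; we need a complete plan before starting.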

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha 11c64b0557c2ed9112db1c913f9c665057b6f497

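Remove Simplified-Traditional conversion: 几=>機
- Simplified-to-Traditional: delete 「几=>機」, because 「几」 is not the simplified form of 「機」
- Simplified-to-Traditional: change 「几率=>機率」 to 「几率=>幾率」, because 「几率」 originally means 「幾率」, and 「機率」 is a synonym coined along a different line of thinking
- Add Taiwan regional phrase conversion: 機率<=>概率/幾率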

push time in 2 months

PR merged BYVoid/OpenCC

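Remove Simplified-Traditional conversion: 几=>機
  • Simplified-to-Traditional: delete 「几=>機」, because 「几」 is not the simplified form of 「機」
  • Simplified-to-Traditional: change 「几率=>機率」 to 「几率=>幾率」, because 「几率」 originally means 「幾率」, and 「機率」 is a synonym coined along a different line of thinking
  • Add Taiwan regional phrase conversion: 幾率=>機率
  • Add reverse Taiwan regional phrase conversion: 機率=>概率, because outside Taiwan generally only 「概率」 is used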
+4 -2

8 comments

3 changed files

danny0838

pr closed time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha b08077c48344564334a9dbcb0325f486d5f6ea2d

Remove non-standard Simplified-Traditional conversion: 沈=>沉

push time in 2 months

PR merged BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 沈=>沉
+0 -1

1 comment

1 changed file

danny0838

pr closed time in 2 months

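pull request comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 沈=>沉

This one should indeed be removed.

I am wondering how variant-character normalization should handle the conversion of phrases such as 「沈沒」. It probably still needs a phrase dictionary, not just a character table, to do the normalization. I need to draft a separate document to discuss this.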

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha 84754b9dfa812d0d206a81316c5a6cb28ba11d60

Fix Simplified-Traditional conversion: 旋<=>旋鏇

push time in 2 months

PR merged BYVoid/OpenCC

Fix Simplified-Traditional conversion: 旋<=>旋鏇
+3 -2

1 comment

3 changed files

danny0838

pr closed time in 2 months

Pull request review comment BYVoid/OpenCC

fix access violation when configDirectory is empty

 ConverterPtr Config::NewFromString(const std::string& json,
     name = doc["name"].GetString();
   }
-  ConfigInternal* impl = (ConfigInternal*)internal;
-  if (configDirectory.back() == '/' || configDirectory.back() == '\\')
-    impl->configDirectory = configDirectory;
-  else
-    impl->configDirectory = configDirectory + '/';
+  if (!configDirectory.empty()) {
+    ConfigInternal* impl = (ConfigInternal*)internal;
+    if (configDirectory.back() == '/' || configDirectory.back() == '\\')
+      impl->configDirectory = configDirectory;
+    else
+      impl->configDirectory = configDirectory + '/';
+  } else {
+    impl->configDirectory.clear();

error: ‘impl’ was not declared in this scope

milipili

comment created time in 2 months
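
For readers following the review comment above, the behavior the patch appears to aim for can be restated as a small Python sketch: leave an empty directory empty, and otherwise make sure the path ends in a separator. This is an illustration only, not the project's C++ code; normalize_config_directory is a hypothetical name. In the C++ original, calling back() on an empty std::string is undefined behavior, and the compile error quoted above arises because impl is declared inside the if block but used in the else block.

```python
def normalize_config_directory(config_directory: str) -> str:
    """Return the directory with a trailing separator, or "" if it is empty."""
    if not config_directory:
        # Guard first: indexing [-1] on an empty string would raise IndexError,
        # the Python analogue of calling back() on an empty std::string.
        return ""
    if config_directory[-1] in "/\\":
        return config_directory
    return config_directory + "/"
```

Checking for emptiness before looking at the last character keeps both branches on a single, already-declared result, which is the same restructuring the reviewer's compile error points toward.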

pull request comment BYVoid/OpenCC

Japanese shinjitai (new character forms)

What does 「非簡慣優先的簡慣字體」 mean?

danny0838

comment created time in 2 months

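issue comment BYVoid/OpenCC

姊=>姐,妳=>你

「姊妹」 is still seen quite often, especially since its meaning has diverged subtly from 「姐妹」. In mainland China, particularly in the north, the former denotes a strict kinship relation (我姊妹三人, "the three of us sisters"), while the latter is used more figuratively (我的姐妹們, "my girlfriends").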

lvzhenbo

comment created time in 2 months

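Pull request review comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 薰<=>熏

 焖	燜
 焘	燾
 煴	熅
-熏	薰 燻
+熏	燻 熏

A better approach is to leave 「熏」 unchanged by default and add phrases such as 「燻肉」, 「燻烤」, 「燻蒸」 and 「煙燻」.

「熏」 itself appears in expressions such as 「氣焰熏天」 that are uncommon in spoken language.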

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha edfdef6a7f17890b78f5ec4a0dcab24712cde368

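Remove non-standard Simplified-Traditional conversion: 棂<=>櫺 - The Table of General Standard Chinese Characters (《通用規範漢字表》) and the Complete List of Simplified Characters (《簡化字總表》) give 「棂」 as the simplified form of 「欞」; neither contains 「櫺」.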

push time in 2 months

PR merged BYVoid/OpenCC

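Remove non-standard Simplified-Traditional conversion: 棂<=>櫺
  • The Table of General Standard Chinese Characters (《通用規範漢字表》) and the Complete List of Simplified Characters (《簡化字總表》) give 「棂」 as the simplified form of 「欞」; neither contains 「櫺」.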
+1 -2

0 comment

2 changed files

danny0838

pr closed time in 2 months

pull request comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 擡<=>抬

According to the definitions, 「擡」 appears to be the standard form: http://chardb.iis.sinica.edu.tw/char/12550

danny0838

comment created time in 2 months

issue comment BYVoid/OpenCC

姊=>姐,妳=>你

In 「姊妹」, 「姊」 is pronounced zǐ. 「姊」 and 「姐」 have already diverged, so there is no need to over-convert.

lvzhenbo

comment created time in 2 months

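Pull request review comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 薰<=>熏

 焖	燜
 焘	燾
 煴	熅
-熏	薰 燻
+熏	燻 熏

I suggest making 「熏」 the default. Unless there is a specific reason otherwise, keeping the original character when uncertain is better than converting it wrongly. Precision takes priority over recall.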

danny0838

comment created time in 2 months
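
The review comment above turns on which candidate in a dictionary line acts as the default. As a rough Python sketch, assuming the layout visible in the diff (a key, a tab, then space-separated candidates) and assuming the first candidate is the one used when nothing else disambiguates, the difference between the two orderings looks like this. parse_entry and default_conversion are hypothetical helper names, not part of OpenCC.

```python
def parse_entry(line):
    """Split 'key<TAB>cand1 cand2 ...' into (key, [candidates])."""
    key, _, values = line.rstrip("\n").partition("\t")
    return key, values.split(" ")


def default_conversion(line):
    """Return (key, default), assuming the first candidate is the default."""
    key, candidates = parse_entry(line)
    return key, candidates[0]


# Ordering proposed in the diff: 燻 would be produced by default.
proposed = default_conversion("熏\t燻 熏")
# Ordering suggested by the reviewer: 熏 stays unchanged by default.
suggested = default_conversion("熏\t熏 燻")
```

Under that assumption, putting 「熏」 first means ambiguous text is left alone, which matches the reviewer's stated preference for precision over recall.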

pull request comment BYVoid/OpenCC

Remove non-standard Simplified-Traditional conversion: 擡<=>抬

http://chardb.iis.sinica.edu.tw/meancompare/64E1/62AC Academia Sinica considers 「抬」 to be the simplified form of 「擡」.

danny0838

comment created time in 2 months

pull request comment BYVoid/OpenCC

Add 徵羽摩柯

That is fine; it does not have to be a "word".

sgalal

comment created time in 2 months

Pull request review comment BYVoid/OpenCC

Remove Simplified-Traditional conversion: 几=>機

 几点几	幾點幾
 几点钟	幾點鐘
 几版	幾版
-几率	機率
+几率	幾率

I suggest simply deleting this entry.

danny0838

comment created time in 2 months

push event BYVoid/OpenCC

Danny Lin

commit sha bfc27c4894418df144385286e51c51ef56884bae

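Fix Simplified-Traditional conversion: 唇<=>脣 (#435)
- The Table of General Standard Chinese Characters (《通用規範漢字表》) treats 「唇」 as the standard character and 「脣」 as a variant, so Simplified-to-Traditional no longer handles the 「脣」 version.
- 「唇」 has another meaning, "alarmed", pronounced zhēn, so Simplified-to-Traditional distinguishes 「唇=>脣唇」.
- Taiwan and Hong Kong take 「脣」 as the standard form, but most people use 「唇」 in practice, so the regional character conversion 「脣=>唇」 is added.
- 「硃脣」, 「抹硃」 and 「塗硃」 were previously written inconsistently and now uniformly use 「朱」 (which is more common in the sense of "red").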

push time in 3 months

PR merged BYVoid/OpenCC

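Fix Simplified-Traditional conversion: 唇<=>脣

「朱」 and 「硃」 were previously inconsistent, so I had to handle them here. Although the two should be synonymous in these entries, every dictionary I have consulted so far uses 「朱」.

Note: both 「硃」 and 「朱」 can refer to mercury sulfide and, by extension, to the color red, and the two were often used interchangeably in classical texts. In the sense of "red", 「朱」 is more common; when referring specifically to mercury sulfide, 「硃」 is more common, as in 「硃砂」.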

+19 -21

4 comments

5 changed files

danny0838

pr closed time in 3 months

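Pull request review comment BYVoid/OpenCC

Add one-Simplified-to-many-Traditional mapping: 你=>你妳

 奮	奋
 奼	姹
 妝	妆
+妳	你 奶

For now it can go into TSCharacters.txt. Please also create a separate ArchaicVariants.txt that lists

妳	你 奶

The purpose is to avoid forgetting about it later.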

danny0838

comment created time in 3 months

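Pull request review comment BYVoid/OpenCC

Add one-Simplified-to-many-Traditional mapping: 你=>你妳

 奮	奋
 奼	姹
 妝	妆
+妳	你 奶

Right. In theory 「妳」 is indeed a variant of 「奶」, but apart from a few classical texts that usage no longer exists.

In fact, I suggest separating this kind of rarely used variant-character conversion out of TSCharacters.txt and STCharacters.txt and putting it into a dedicated "archaic variants" file (for example ArchaicVariants.txt). Other existing archaic variants, such as 「蝎」, can be removed as well.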

danny0838

comment created time in 3 months

pull request comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 唇<=>脣

Please rebase again

danny0838

comment created time in 3 months

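pull request comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 阪<=>坂

I do not recommend converting phrases such as 「山阪」 and 「峭阪」, which hardly exist. Nowadays, whenever 「阪」 appears it generally refers to Osaka (大阪), Japan; everywhere else it is 「坂」.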

danny0838

comment created time in 3 months

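issue comment BYVoid/OpenCC

Algorithm defect makes the s2twp conversion 「正則表達式=>正規表示式」 have no effect

A better approach would be to improve the segmentation algorithm so that it no longer relies on a single dictionary for segmentation. It could segment based on the whole dictionary chain instead.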

danny0838

comment created time in 3 months
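
The suggestion in the comment above, segmenting against the whole dictionary chain rather than one dictionary, can be sketched in Python with greedy forward maximum matching. This is a simplified illustration, not OpenCC's implementation: segment is a hypothetical function, both toy dictionaries are invented, and the sketch ignores that real chain stages may be keyed in different scripts.

```python
def segment(text, dictionaries):
    """Greedy forward maximum matching over the union of all dictionary keys."""
    keys = set()
    for dictionary in dictionaries:
        keys.update(dictionary)
    max_len = max(map(len, keys), default=1)
    segments, i = [], 0
    while i < len(text):
        # Try the longest window first; stop at the first multi-character hit.
        for size in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + size] in keys:
                break
        else:
            size = 1  # no multi-character key matched here
        segments.append(text[i:i + size])
        i += size
    return segments


# Hypothetical toy data: the segmentation dictionary only knows a shorter phrase,
# while a later stage in the chain holds the longer phrase that should win.
SEGMENT_DICT = {"表达式": "表達式"}
LATER_STAGE = {"正则表达式": "正規表示式"}
```

With only SEGMENT_DICT, the text 「正则表达式」 is cut before 「表达式」, so the longer entry can never match later; passing both dictionaries keeps the five-character phrase whole.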

Pull request review comment BYVoid/OpenCC

Add one-Simplified-to-many-Traditional mapping: 你=>你妳

 奮	奋
 奼	姹
 妝	妆
+妳	你 奶

I suggest removing the 「妳->奶」 conversion.

danny0838

comment created time in 3 months

push event BYVoid/uchardet

Carbo Kuo

commit sha 4e685757780cb3c652fc6c9ec759f62888969ec9

Update README.md Update readme.

push time in 3 months

issue comment BYVoid/uchardet

Could libuchardet-ios.a support the iOS Simulator?

Please report bugs to https://gitlab.freedesktop.org/uchardet/uchardet/-/issues

YeaLink89

comment created time in 3 months

issue comment BYVoid/OpenCC

Converting speed slow since ver.1.1.x

It does seem unreasonably slow from your log, but I can't reproduce the problem on my side.

Did you try to run make benchmark?

khli

comment created time in 3 months

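issue comment BYVoid/OpenCC

Minor suggestion: when the PyPI version number is bumped, build packages for all platforms together (whether or not the code has changed)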

Agreed. Please contribute to https://github.com/BYVoid/OpenCC/blob/master/BUILD.md

wyqsmith

comment created time in 3 months

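pull request comment BYVoid/OpenCC

Add missing configuration: t2twp

t2tw, tw2t, t2hk and hk2t do not involve conversion of regional vocabulary differences. t2twp, however, implies converting mainland vocabulary to Taiwanese vocabulary.

The OpenCC standard does not include a standard for regional vocabulary.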

sgalal

comment created time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 06250570382cc9ba768743384f41ee7ba4deadfc

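Fix Simplified-Traditional conversion: 蝎=>蠍/蝎 - 「蝎」 can serve as a variant or simplified form of 「蠍」, but it is also a standard character in its own right, meaning "wood-boring insect".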

push time in 3 months

PR merged BYVoid/OpenCC

Fix Simplified-Traditional conversion: 蝎=>蠍/蝎
+2 -1

3 comments

2 changed files

danny0838

pr closed time in 3 months

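pull request comment BYVoid/OpenCC

Add one-Simplified-to-many-Traditional mapping: 你=>你妳

My reasoning is that converting 「你」 to 「妳」 can hardly be done automatically. 「它」 and 「牠」 are similar.

「奶」 is not the simplified form of 「妳」 either.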

danny0838

comment created time in 3 months

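pull request comment BYVoid/OpenCC

Add missing configuration: t2twp

I do not think this is really necessary, because

  1. Users with this need are uncommon.
  2. The phrase conversion is in fact from common mainland terms to common Taiwanese terms, and the description does not mention this.

sgalal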

comment created time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha e8c775de25671c262503b7538e7f7a26de2c6212

Fix Taiwan regional phrase conversion: 程序正義

push time in 3 months

PR merged BYVoid/OpenCC

Fix Taiwan regional phrase conversion: 程序正義

The previous version converted 「程序(不)正義」 into 「程式(不)正義」.

+2 -0

0 comment

1 changed file

danny0838

pr closed time in 3 months
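
One common way to stop a short phrase rule from firing inside a longer phrase is to add the longer phrase as an entry that maps to itself, so that longest-match conversion picks it first. The Python sketch below illustrates that mechanism only; whether the merged PR used exactly these two entries is a guess based on its "+2 -0" size, and convert, BEFORE and AFTER are hypothetical names with toy data.

```python
import re


def convert(text, table):
    """Leftmost, longest-key-first replacement using a regex alternation."""
    if not table:
        return text
    # Sorting keys longest first makes the alternation prefer longer phrases.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: table[match.group(0)], text)


# Toy stand-in for a mainland-to-Taiwan phrase table (assumed entry).
BEFORE = {"程序": "程式"}

# Same table plus two self-mapping entries that shield the legal term.
AFTER = dict(BEFORE)
AFTER.update({"程序正義": "程序正義", "程序不正義": "程序不正義"})
```

With BEFORE, 「程序正義」 is rewritten to 「程式正義」; with AFTER it is left intact, while an unrelated phrase such as 「程序設計」 is still converted to 「程式設計」.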

push event BYVoid/OpenCC

Danny Lin

commit sha 23c1b0495e9a9cc92ff9fd57067628d7fc644037

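Remove the "Simplified-Traditional conversion" of a variant character outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 毶=>毿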

push time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 5ca5f09a4385be055fdf7e0464ce23671cdeb372

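Remove the "Simplified-Traditional conversion" of a variant character outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 𡡎=>𡞱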

push time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 5f76d79940dc4b32e7736835745dc134564b9cd9

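Remove the "Simplified-Traditional conversion" of a variant character outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 𢷬=>𢭏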

push time in 3 months

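Pull request review comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 阪<=>坂

 宫商角徵羽	宫商角徵羽
 射覆	射复
 尼乾陀	尼乾陀
+山阪	山坂

Right, but this is variant-character normalization, not Simplified-Traditional conversion.

Non-standard spellings such as 「山阪」 appear not only in Traditional text but also in Simplified text, so they should not be listed solely under Traditional-to-Simplified.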

danny0838

comment created time in 3 months

issue comment BYVoid/OpenCC

巖 岩

In Japanese, the on'yomi of both 「岩」 and 「巌」 is 「がん」. Do you mean that their kun'yomi differ?

sgalal

comment created time in 3 months

pull request comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 唇<=>脣

Please rebase again.

danny0838

comment created time in 3 months

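pull request comment BYVoid/OpenCC

Add standard Traditional-to-Simplified conversion

Right, this may require code changes to implement more sophisticated variant-character normalization. I need to think carefully about how to implement it.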

danny0838

comment created time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 0ebaf36514117d1622d381bf5bc290609323f6fc

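Remove the "Simplified-Traditional conversion" of a variant character outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 睪=>睾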

push time in 3 months

PR merged BYVoid/OpenCC

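Remove the "Simplified-Traditional conversion" of a variant character outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 睪=>睾

In the Table of General Standard Chinese Characters, 「睾」 is a standard character; 「睪」 is not in the table, but it is not listed as a variant that must be converted either. The Complete List of Simplified Characters (《簡化字總表》) says 「睪」 is not simplified but does not mention 「睾」. The Unihan database does not list 「睪<=>睾」 as a Traditional-Simplified pair either.

If OpenCC took 「睪」 as its standard character, the conversion would be justifiable, but currently only Traditional-to-Simplified has 「睪=>睾」 while Simplified-to-Traditional has no 「睾=>睪」, which is inconsistent.

I think this out-of-table conversion can be removed. If we decide to keep it, Simplified-to-Traditional should add 「睾=>睪」.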

+0 -1

0 comment

1 changed file

danny0838

pr closed time in 3 months

Pull request review comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 阪<=>坂

 圣	聖
 圹	壙
 场	場
+坂	坂 阪

In general there is no reason to convert 「坂」 to 「阪」.

danny0838

comment created time in 3 months

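Pull request review comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 阪<=>坂

 宫商角徵羽	宫商角徵羽
 射覆	射复
 尼乾陀	尼乾陀
+山阪	山坂

These likewise fall under variant-character normalization, and I do not recommend adding them here. Traditional text also mostly uses 「坂」; 「阪」 is used only in 「大阪」.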

danny0838

comment created time in 3 months

pull request comment BYVoid/OpenCC

Fix Simplified-Traditional conversion: 蝎=>蠍/蝎

Please rebase again.

danny0838

comment created time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 0fd32a605eaff77c62bbe45a574c78397d46d5a6

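Remove the "Simplified-Traditional conversion" of variant characters outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 檾<=>苘𦼖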

push time in 3 months

PR merged BYVoid/OpenCC

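Remove the "Simplified-Traditional conversion" of variant characters outside the Table of General Standard Chinese Characters (《通用規範漢字表》): 檾<=>苘𦼖

None of these characters are standard characters in the Table of General Standard Chinese Characters. Unihan does not list them as Traditional-Simplified pairs either.

Mainland China uses 「苘麻」 almost exclusively; Taiwan has no hard rule, and both 「檾麻」 and 「苘麻」 are in use.

I think these out-of-table conversions can be removed.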

+0 -3

0 comment

2 changed files

danny0838

pr closed time in 3 months

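pull request comment BYVoid/OpenCC

Add one-Simplified-to-many-Traditional mapping: 你=>你妳

I do not recommend including it, because in practice it is likely hard to convert automatically; too much context is required. Besides, in Hong Kong and Taiwan 你 and 妳 are already often used interchangeably.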

danny0838

comment created time in 3 months

pull request comment BYVoid/OpenCC

Update TWPhrasesOther, 丟失/遺失

Consider adding the single conversion pair 封包遺失 | 數據包丟失 on its own.

PeterDaveHello

comment created time in 3 months

push event BYVoid/OpenCC

Danny Lin

commit sha 01a14eca297090c990c775bb782b0130d1cbc4d4

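Remove Taiwan regional character conversion: 兇=>凶 (#432) - In Taiwan both 「兇」 and 「凶」 are very commonly used, and both are classified as frequently used characters in the Big5 encoding.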

push time in 3 months

PR merged BYVoid/OpenCC

Remove Taiwan regional character conversion: 兇=>凶
+7 -8

0 comment

3 changed files

danny0838

pr closed time in 3 months
