Fixes #5
This PR adds mappings for the characters in the following blocks (as of Unicode 13)
The mappings are copied from https://github.com/Jackchows/Cangjie5 with additional fixes in Jackchows/Cangjie5#209. Some of the mappings are deliberately discarded because they do not fit within the current scope, specifically:
- `z` for CJK Compatibility Ideographs and CJK Compatibility Ideographs Supplement
- `x` (we don't have `x` mapping for CJK Ext. B)
The first commit fixes ordering issues in the current mappings. It is an editorial fix and does not have observable behaviour changes.
The second and third commits add new mappings ordered by Cangjie code. The new mappings are appended to the current mappings, so the character frequency order is not affected.
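As a rough illustration of the append step described above, here is a minimal Python sketch. It is not the script used for this PR: the `(code, character)` tuple form, the function name `append_new_mappings`, and the toy data are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical in-memory form of one mapping: (cangjie_code, character).
Mapping = Tuple[str, str]


def append_new_mappings(current: List[Mapping], new: List[Mapping]) -> List[Mapping]:
    """Return `current` unchanged, followed by unseen `new` mappings sorted by code."""
    seen = set(current)
    # Skip entries that already exist so nothing is duplicated.
    fresh = [m for m in new if m not in seen]
    # list.sort() is stable: entries sharing a code keep their incoming order.
    fresh.sort(key=lambda m: m[0])
    # Existing entries are never reordered, so their relative order is preserved.
    return current + fresh


# Toy data for illustration only.
current = [("a", "日"), ("a", "曰"), ("b", "月")]
new = [("ab", "明"), ("aa", "昌"), ("a", "日")]
merged = append_new_mappings(current, new)
```

Because the existing list is only extended and never re-sorted, whatever order it already encodes stays intact.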
While authoring this PR, I wrote two scripts. Feel free to reuse them, as Ext. H will hopefully be targeted for 2022. (link of scripts)
Cangjie5.txt of https://github.com/Jackchows/Cangjie5.
Current known issues:
Use of rotational operator `z` in specific characters:
The author of https://github.com/Jackchows/Cangjie5 deliberately used `Z` (defined in Cangjie 6 as a rotation operator, see Section 14 for the rationale) to encode these 6 characters. However, this is not consistent with what we already have for such characters in Ext. B.
We have three solutions for addressing the inconsistency here:
1. Use `z` for the specific new characters and add new mappings. The old mappings for 𠄏𠄔𣀨 will be preserved as compatibility mappings. The new mappings for 𮗙𰒥𫸪𰨇𰲞𬢆 are kept as they are. Both @LEOYoon-Tsaw and I are OK with using `z` for 𮗙𰒥𫸪𰨇𰲞𬢆, but I am open to different opinions from the community.
2. Stay with the Cangjie5 code scheme and come up with our own mappings for 𮗙𰒥𫸪𰨇𰲞𬢆. I can revise this PR with the new mappings.
3. Remove 𮗙𰒥𫸪𰨇𰲞𬢆 from the mappings and postpone until we have consensus on how to encode them.
My preference on these 3 solutions is 1 > 2 > 3.