Extracting a Japanese Font's CMAP and the Code Points of All Its Glyphs

Author: 千羽之城88 | Published: 2020-04-12 16:32

    I recently discussed this with @steve-cheug on Zhihu. These notes are a backup of that exchange.

    Getting the font's CMAP table

    ttx -t cmap KozGoPr6N-Regular.otf
    --------------------------------------
    Dumping "KozGoPr6N-Regular.otf" to "KozGoPr6N-Regular.ttx"...
    Dumping 'cmap' table...
    

    Then use the following Python script on the resulting .ttx file to extract the UIDs (Unicode code points):

    #!/usr/bin/python3
    # -*- encoding: utf-8 -*-
    
    import sys
    
    i = 0  # running count of CJK <map> entries seen
    with open("out.x", "w", encoding="utf-8") as f:
        for line in open(sys.argv[1], encoding="utf-8"):
            # ttx annotates each <map> entry with the Unicode character name,
            # so "CJK" selects the CJK entries.
            if "CJK" not in line:
                continue
            columns = line.split('"')
            if len(columns) < 4:
                continue
            i += 1
            # <map code="0xff5d" name="uniFF5D"/>
            # <map code="0xff5b" name="cid28609"/>
            # columns[1] is the code point, columns[3] is the glyph name.
            name = columns[3]
            if "uni" in name or "cid" in name:
                # A cid number is a glyph ID, not a code point,
                # so always read the code point from the code attribute.
                uid = int(columns[1], 16)
                if uid > 10000:
                    print(chr(uid) + "  " + str(i))
                    f.write(chr(uid) + '\n')
    

    This produces a file named out.x, with one character per line.
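    The line-splitting approach above depends on how ttx lays out each line. As an alternative, here is a minimal sketch that parses the dump as XML with the standard library. The function name cmap_codepoints and the SAMPLE snippet are made up for illustration; the sample only imitates the shape of a ttx cmap dump and is not taken from KozGoPr6N-Regular.ttx.

```python
import xml.etree.ElementTree as ET


def cmap_codepoints(ttx_source, minimum=10000):
    """Return the sorted, unique code points above `minimum` in a ttx cmap dump."""
    root = ET.fromstring(ttx_source)
    found = set()
    # Every subtable lists its mappings as <map code="0x..." name="..."/>.
    for entry in root.iter("map"):
        code = entry.get("code")
        if code is None:
            continue
        codepoint = int(code, 16)  # int() accepts the "0x" prefix in base 16
        if codepoint > minimum:
            found.add(codepoint)
    return sorted(found)


# Hypothetical sample imitating the structure of a ttx cmap dump.
SAMPLE = """<ttFont>
  <cmap>
    <tableVersion version="0"/>
    <cmap_format_4 platformID="3" platEncID="1" language="0">
      <map code="0x41" name="A"/>
      <map code="0x4e00" name="cid01200"/><!-- CJK UNIFIED IDEOGRAPH-4E00 -->
      <map code="0xff5d" name="uniFF5D"/><!-- FULLWIDTH RIGHT CURLY BRACKET -->
    </cmap_format_4>
  </cmap>
</ttFont>"""

for cp in cmap_codepoints(SAMPLE):
    print("%d\t%x\t%s" % (cp, cp, chr(cp)))
```

    For a real dump on disk, the root element can be obtained with ET.parse(path).getroot() instead of ET.fromstring. Unlike the script above, this sketch does not filter on the "CJK" comment, so it keeps every mapped code point above the threshold.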

    A Python script that reads the cmap table directly with fontTools:

    #! /usr/bin/env python3
    # -*- encoding: utf-8 -*-
    
    # Script to dump a TTF font's cmap table
    # 2019/12/26
    # use: python3 cmap01.py 'NotoSansSC-Kクレ.ttf'
    
    import sys
    
    from fontTools.ttLib import TTFont
    
    fontfile = sys.argv[1]
    font = TTFont(fontfile)
    
    with open("temp.x", "w", encoding="utf-8") as outfile:  # w=write, a=append
        for subtable in font['cmap'].tables:
            # Windows platform (3): Symbol (0), Unicode BMP (1), Unicode full repertoire (10)
            if subtable.platformID == 3 and subtable.platEncID in [0, 1, 10]:
                for codepoint in subtable.cmap:
                    # write the code point (decimal, hex) and the character
                    if codepoint > 10000:
                        outfile.write("%s\t%x\t%s\n" % (codepoint, codepoint, chr(codepoint)))
    

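    The output (temp.x) looks like this, with the decimal code point, the hexadecimal code point, and the character on each line: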

    12002   2ee2    ⻢
    12004   2ee4    ⻤
    12005   2ee5    ⻥
    12006   2ee6    ⻦
    12008   2ee8    ⻨
    12009   2ee9    ⻩
    12010   2eea    ⻪
    12011   2eeb    ⻫
    12012   2eec    ⻬
    12013   2eed    ⻭
    12014   2eee    ⻮
    12015   2eef    ⻯
    12016   2ef0    ⻰
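    Each row carries the same code point three times: in decimal, in hexadecimal, and as the character itself. The sketch below shows that relationship and how a row could be read back and checked. The helper names format_row and parse_row are hypothetical, and it assumes the columns are tab-separated, as written by the script.

```python
def format_row(codepoint):
    """Build one row: decimal code point, hex code point, character."""
    return "%s\t%x\t%s" % (codepoint, codepoint, chr(codepoint))


def parse_row(line):
    """Split a row back into (codepoint, character) and check its consistency."""
    decimal, hexadecimal, char = line.rstrip("\n").split("\t")
    codepoint = int(decimal)
    # All three columns must describe the same code point.
    if int(hexadecimal, 16) != codepoint or ord(char) != codepoint:
        raise ValueError("inconsistent row: %r" % line)
    return codepoint, char


row = format_row(12002)
print(row)
print(parse_row(row))
```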
    
