The \ufeff problem

Author: zhengzhoufeng | Published 2017-10-09 14:18

    https://stackoverflow.com/questions/17912307/u-ufeff-in-python-string gives the most thorough explanation. Excerpted below:

    The Unicode character U+FEFF is the byte order mark, or BOM, and is used to tell the difference between big- and little-endian UTF-16 encoding. If you decode the web page using the right codec, Python will remove it for you. Examples:

    #!python2
    #coding: utf8
    u = u'ABC'
    e8 = u.encode('utf-8')        # encode without BOM
    e8s = u.encode('utf-8-sig')   # encode with BOM
    e16 = u.encode('utf-16')      # encode with BOM
    e16le = u.encode('utf-16le')  # encode without BOM
    e16be = u.encode('utf-16be')  # encode without BOM
    print 'utf-8     %r' % e8
    print 'utf-8-sig %r' % e8s
    print 'utf-16    %r' % e16
    print 'utf-16le  %r' % e16le
    print 'utf-16be  %r' % e16be
    print
    print 'utf-8  w/ BOM decoded with utf-8     %r' % e8s.decode('utf-8')
    print 'utf-8  w/ BOM decoded with utf-8-sig %r' % e8s.decode('utf-8-sig')
    print 'utf-16 w/ BOM decoded with utf-16    %r' % e16.decode('utf-16')
    print 'utf-16 w/ BOM decoded with utf-16le  %r' % e16.decode('utf-16le')

    Note that EF BB BF is a UTF-8-encoded BOM. It is not required for UTF-8 and serves only as a signature (usually on Windows).

    Output:

    utf-8     'ABC'
    utf-8-sig '\xef\xbb\xbfABC'
    utf-16    '\xff\xfeA\x00B\x00C\x00'    # Adds BOM and encodes using native processor endian-ness.
    utf-16le  'A\x00B\x00C\x00'
    utf-16be  '\x00A\x00B\x00C'

    utf-8  w/ BOM decoded with utf-8     u'\ufeffABC'    # doesn't remove BOM if present.
    utf-8  w/ BOM decoded with utf-8-sig u'ABC'          # removes BOM if present.
    utf-16 w/ BOM decoded with utf-16    u'ABC'          # uses the BOM to pick the byte order, then removes it.
    utf-16 w/ BOM decoded with utf-16le  u'\ufeffABC'    # doesn't remove BOM if present.
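
The quoted example is Python 2. As a rough Python 3 sketch of the same decode comparisons, the following may help; the variable names and the `strip_bom` helper are illustrative additions, not part of the quoted answer.

```python
# Python 3 sketch; names such as strip_bom are hypothetical, not from the quoted answer.
text = "ABC"

e8s = text.encode("utf-8-sig")      # UTF-8 with the EF BB BF signature
kept = e8s.decode("utf-8")          # plain utf-8 keeps the BOM as U+FEFF
removed = e8s.decode("utf-8-sig")   # utf-8-sig drops a leading BOM
plain = text.encode("utf-8").decode("utf-8-sig")  # input without a BOM is accepted too

e16 = text.encode("utf-16")         # BOM followed by platform byte order
auto = e16.decode("utf-16")         # the BOM is consumed to pick the byte order


def strip_bom(s: str) -> str:
    """Remove one leading U+FEFF from text that was already decoded."""
    return s[1:] if s.startswith("\ufeff") else s


print(repr(kept))
print(repr(removed))
print(repr(plain))
print(repr(auto))
print(repr(strip_bom(kept)))
```

Decoding with `utf-8-sig` in the first place is usually the cleaner fix; stripping U+FEFF afterwards is only a fallback for text that has already been decoded elsewhere.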

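    Note that the utf-16 codec relies on the BOM to determine byte order: without one, Python cannot tell whether the data is big- or little-endian (in practice, CPython then assumes the platform's native byte order).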
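
When the encoding of incoming bytes is unknown, one possible approach is to inspect the leading bytes for a BOM before choosing a codec. The sketch below uses the BOM constants from the standard `codecs` module; the function name `decode_with_bom` and the UTF-8 default are assumptions made for illustration, and only UTF-8 and UTF-16 are handled.

```python
import codecs

# Map each BOM to a codec that consumes it while decoding.
# Only UTF-8 and UTF-16 are covered here; UTF-32 is deliberately ignored.
_BOM_CODECS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_with_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes, using a leading BOM (if any) to choose the codec.

    Hypothetical helper: falls back to `default` when no BOM is found.
    """
    for bom, codec in _BOM_CODECS:
        if data.startswith(bom):
            return data.decode(codec)
    return data.decode(default)


print(repr(decode_with_bom(b"\xef\xbb\xbfABC")))
print(repr(decode_with_bom(b"\xff\xfeA\x00B\x00C\x00")))
print(repr(decode_with_bom(b"\xfe\xff\x00A\x00B\x00C")))
print(repr(decode_with_bom(b"ABC")))
```

All four calls should yield the same three-character string with no leading U+FEFF.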
