SOFTEL Memo Developer's blog

[PHP] Encoding names and aliases available in mbstring

Problem

There are a lot of character encodings you can pass to mb_convert_encoding!

Answer

I generated a list with the following script.

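<?php
// List every encoding mbstring knows, with its MIME name and aliases.
foreach (mb_list_encodings() as $e) {
	// @ suppresses the warning raised for encodings that have no MIME name.
	echo $e . "\t" . @mb_preferred_mime_name($e) . "\t" . implode(', ', mb_encoding_aliases($e)) . "\n";
}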

On PHP 5.3.3, the output is as follows (columns: encoding name, preferred MIME name, aliases).

pass		none
auto		unknown
wchar		
byte2be		
byte2le		
byte4be		
byte4le		
BASE64	BASE64	
UUENCODE	x-uuencode	
HTML-ENTITIES	HTML-ENTITIES	HTML, html
Quoted-Printable	Quoted-Printable	qprint
7bit	7bit	
8bit	8bit	binary
UCS-4	UCS-4	ISO-10646-UCS-4, UCS4
UCS-4BE	UCS-4BE	
UCS-4LE	UCS-4LE	
UCS-2	UCS-2	ISO-10646-UCS-2, UCS2, UNICODE
UCS-2BE	UCS-2BE	
UCS-2LE	UCS-2LE	
UTF-32	UTF-32	utf32
UTF-32BE	UTF-32BE	
UTF-32LE	UTF-32LE	
UTF-16	UTF-16	utf16
UTF-16BE	UTF-16BE	
UTF-16LE	UTF-16LE	
UTF-8	UTF-8	utf8
UTF-7	UTF-7	utf7
UTF7-IMAP		
ASCII	US-ASCII	ANSI_X3.4-1968, iso-ir-6, ANSI_X3.4-1986, ISO_646.irv:1991, US-ASCII, ISO646-US, us, IBM367, cp367, csASCII
EUC-JP	EUC-JP	EUC, EUC_JP, eucJP, x-euc-jp
SJIS	Shift_JIS	x-sjis, SHIFT-JIS
eucJP-win	EUC-JP	eucJP-open, eucJP-ms
SJIS-win	Shift_JIS	SJIS-open, SJIS-ms
CP932	Shift_JIS	MS932, Windows-31J, MS_Kanji
CP51932	CP51932	cp51932
JIS	ISO-2022-JP	
ISO-2022-JP	ISO-2022-JP	
ISO-2022-JP-MS	ISO-2022-JP	ISO2022JPMS
Windows-1252	Windows-1252	cp1252
Windows-1254	Windows-1254	CP1254, CP-1254, WINDOWS-1254
ISO-8859-1	ISO-8859-1	ISO_8859-1, latin1
ISO-8859-2	ISO-8859-2	ISO_8859-2, latin2
ISO-8859-3	ISO-8859-3	ISO_8859-3, latin3
ISO-8859-4	ISO-8859-4	ISO_8859-4, latin4
ISO-8859-5	ISO-8859-5	ISO_8859-5, cyrillic
ISO-8859-6	ISO-8859-6	ISO_8859-6, arabic
ISO-8859-7	ISO-8859-7	ISO_8859-7, greek
ISO-8859-8	ISO-8859-8	ISO_8859-8, hebrew
ISO-8859-9	ISO-8859-9	ISO_8859-9, latin5
ISO-8859-10	ISO-8859-10	ISO_8859-10, latin6
ISO-8859-13	ISO-8859-13	ISO_8859-13
ISO-8859-14	ISO-8859-14	ISO_8859-14, latin8
ISO-8859-15	ISO-8859-15	ISO_8859-15
ISO-8859-16	ISO-8859-16	ISO_8859-16
EUC-CN	CN-GB	CN-GB, EUC_CN, eucCN, x-euc-cn, gb2312
CP936	CP936	CP-936, GBK
HZ	HZ-GB-2312	
EUC-TW	EUC-TW	EUC_TW, eucTW, x-euc-tw
BIG-5	BIG5	CN-BIG5, BIG-FIVE, BIGFIVE, CP950
EUC-KR	EUC-KR	EUC_KR, eucKR, x-euc-kr
UHC	UHC	CP949
ISO-2022-KR	ISO-2022-KR	
Windows-1251	Windows-1251	CP1251, CP-1251, WINDOWS-1251
CP866	CP866	CP866, CP-866, IBM-866
KOI8-R	KOI8-R	KOI8-R, KOI8R
KOI8-U	KOI8-U	KOI8-U, KOI8U
ArmSCII-8	ArmSCII-8	ArmSCII-8, ArmSCII8, ARMSCII-8, ARMSCII8
CP850	CP850	CP850, CP-850, IBM-850
JIS-ms	ISO-2022-JP	
CP50220	ISO-2022-JP	
CP50220raw	ISO-2022-JP	
CP50221	ISO-2022-JP	
CP50222	ISO-2022-JP	
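
In the listing above, Windows-31J appears as an alias of CP932, so either name should be usable as the source encoding for mb_convert_encoding. The following is a minimal sketch of that idea, not a definitive recipe; the sample bytes and the variable names are chosen here for illustration only.

```php
<?php
// "\x82\xA0" is the CP932 (Windows-31J) byte sequence for the hiragana "あ".
// The sample string is an assumption made for this example.
$sjis = "\x82\xA0";

// Convert once using the alias and once using the canonical name.
$viaAlias = mb_convert_encoding($sjis, 'UTF-8', 'Windows-31J');
$viaName  = mb_convert_encoding($sjis, 'UTF-8', 'CP932');

// Both calls should produce the same UTF-8 bytes.
var_dump($viaAlias === $viaName);
echo bin2hex($viaAlias), "\n"; // e38182 (UTF-8 for U+3042)
```

Writing the canonical name (CP932) is probably the safer habit, since alias lists can change between PHP versions.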

On PHP 7 RC6, the output is as follows.

pass		none
auto		unknown
wchar		
byte2be		
byte2le		
byte4be		
byte4le		
BASE64	BASE64	
UUENCODE	x-uuencode	
HTML-ENTITIES	HTML-ENTITIES	HTML, html
Quoted-Printable	Quoted-Printable	qprint
7bit	7bit	
8bit	8bit	binary
UCS-4	UCS-4	ISO-10646-UCS-4, UCS4
UCS-4BE	UCS-4BE	
UCS-4LE	UCS-4LE	
UCS-2	UCS-2	ISO-10646-UCS-2, UCS2, UNICODE
UCS-2BE	UCS-2BE	
UCS-2LE	UCS-2LE	
UTF-32	UTF-32	utf32
UTF-32BE	UTF-32BE	
UTF-32LE	UTF-32LE	
UTF-16	UTF-16	utf16
UTF-16BE	UTF-16BE	
UTF-16LE	UTF-16LE	
UTF-8	UTF-8	utf8
UTF-7	UTF-7	utf7
UTF7-IMAP		
ASCII	US-ASCII	ANSI_X3.4-1968, iso-ir-6, ANSI_X3.4-1986, ISO_646.irv:1991, US-ASCII, ISO646-US, us, IBM367, IBM-367, cp367, csASCII
EUC-JP	EUC-JP	EUC, EUC_JP, eucJP, x-euc-jp
SJIS	Shift_JIS	x-sjis, SHIFT-JIS
eucJP-win	EUC-JP	eucJP-open, eucJP-ms
EUC-JP-2004	EUC-JP	EUC_JP-2004
SJIS-win	Shift_JIS	SJIS-open, SJIS-ms
SJIS-Mobile#DOCOMO	Shift_JIS	SJIS-DOCOMO, shift_jis-imode, x-sjis-emoji-docomo
SJIS-Mobile#KDDI	Shift_JIS	SJIS-KDDI, shift_jis-kddi, x-sjis-emoji-kddi
SJIS-Mobile#SOFTBANK	Shift_JIS	SJIS-SOFTBANK, shift_jis-softbank, x-sjis-emoji-softbank
SJIS-mac	Shift_JIS	MacJapanese, x-Mac-Japanese
SJIS-2004	Shift_JIS	SJIS2004, Shift_JIS-2004
UTF-8-Mobile#DOCOMO	UTF-8	UTF-8-DOCOMO, UTF8-DOCOMO
UTF-8-Mobile#KDDI-A	UTF-8	
UTF-8-Mobile#KDDI-B	UTF-8	UTF-8-Mobile#KDDI, UTF-8-KDDI, UTF8-KDDI
UTF-8-Mobile#SOFTBANK	UTF-8	UTF-8-SOFTBANK, UTF8-SOFTBANK
CP932	Shift_JIS	MS932, Windows-31J, MS_Kanji
CP51932	CP51932	cp51932
JIS	ISO-2022-JP	
ISO-2022-JP	ISO-2022-JP	
ISO-2022-JP-MS	ISO-2022-JP	ISO2022JPMS
GB18030	GB18030	gb-18030, gb-18030-2000
Windows-1252	Windows-1252	cp1252
Windows-1254	Windows-1254	CP1254, CP-1254, WINDOWS-1254
ISO-8859-1	ISO-8859-1	ISO_8859-1, latin1
ISO-8859-2	ISO-8859-2	ISO_8859-2, latin2
ISO-8859-3	ISO-8859-3	ISO_8859-3, latin3
ISO-8859-4	ISO-8859-4	ISO_8859-4, latin4
ISO-8859-5	ISO-8859-5	ISO_8859-5, cyrillic
ISO-8859-6	ISO-8859-6	ISO_8859-6, arabic
ISO-8859-7	ISO-8859-7	ISO_8859-7, greek
ISO-8859-8	ISO-8859-8	ISO_8859-8, hebrew
ISO-8859-9	ISO-8859-9	ISO_8859-9, latin5
ISO-8859-10	ISO-8859-10	ISO_8859-10, latin6
ISO-8859-13	ISO-8859-13	ISO_8859-13
ISO-8859-14	ISO-8859-14	ISO_8859-14, latin8
ISO-8859-15	ISO-8859-15	ISO_8859-15
ISO-8859-16	ISO-8859-16	ISO_8859-16
EUC-CN	CN-GB	CN-GB, EUC_CN, eucCN, x-euc-cn, gb2312
CP936	CP936	CP-936, GBK
HZ	HZ-GB-2312	
EUC-TW	EUC-TW	EUC_TW, eucTW, x-euc-tw
BIG-5	BIG5	CN-BIG5, BIG-FIVE, BIGFIVE
CP950	BIG5	
EUC-KR	EUC-KR	EUC_KR, eucKR, x-euc-kr
UHC	UHC	CP949
ISO-2022-KR	ISO-2022-KR	
Windows-1251	Windows-1251	CP1251, CP-1251, WINDOWS-1251
CP866	CP866	CP866, CP-866, IBM866, IBM-866
KOI8-R	KOI8-R	KOI8-R, KOI8R
KOI8-U	KOI8-U	KOI8-U, KOI8U
ArmSCII-8	ArmSCII-8	ArmSCII-8, ArmSCII8, ARMSCII-8, ARMSCII8
CP850	CP850	CP850, CP-850, IBM850, IBM-850
JIS-ms	ISO-2022-JP	
ISO-2022-JP-2004	ISO-2022-JP-2004	
ISO-2022-JP-MOBILE#KDDI	ISO-2022-JP	ISO-2022-JP-KDDI
CP50220	ISO-2022-JP	
CP50220raw	ISO-2022-JP	
CP50221	ISO-2022-JP	
CP50222	ISO-2022-JP	
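
Both listings map each encoding name to its aliases. A lookup in the opposite direction, from a name or alias to the name mbstring lists, can be sketched roughly as below. resolve_encoding is a hypothetical helper written for this note, not part of mbstring, and it assumes that names are compared case-insensitively.

```php
<?php
// Hypothetical helper (not part of mbstring): resolve a name or alias to the
// name reported by mb_list_encodings(). Returns null when this PHP build
// does not know the encoding.
function resolve_encoding($name)
{
    foreach (mb_list_encodings() as $canonical) {
        // Direct match on the listed name (assumed case-insensitive).
        if (strcasecmp($canonical, $name) === 0) {
            return $canonical;
        }
        $aliases = mb_encoding_aliases($canonical);
        if (!is_array($aliases)) {
            continue; // no alias list available for this entry
        }
        // Otherwise look for the name among this encoding's aliases.
        foreach ($aliases as $alias) {
            if (strcasecmp($alias, $name) === 0) {
                return $canonical;
            }
        }
    }
    return null;
}

var_dump(resolve_encoding('latin1'));
var_dump(resolve_encoding('no-such-encoding'));
```

Going by the listings, 'latin1' should resolve to ISO-8859-1, and an unknown name should give null. A check like this is one way to avoid passing an unsupported name to mb_convert_encoding on an older PHP.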

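Between PHP 5.3 and PHP 7, the difference is mostly whether the encodings with -2004 or Mobile in their names are present. The PHP 7 listing also adds GB18030 and SJIS-mac, lists CP950 as its own encoding instead of an alias of BIG-5, and gains a few aliases (IBM-367, IBM866, IBM850).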
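
A comparison like this can also be done mechanically rather than by eye. Below is a rough sketch that takes two listings as arrays of tab-separated lines, in the format the script prints, and reports which encoding names were added or removed. diff_encoding_names is a hypothetical helper, and the sample input is just a few lines copied from the two listings.

```php
<?php
// Hypothetical helper: compare two listings (arrays of tab-separated lines)
// and return the encoding names that were added and removed.
function diff_encoding_names(array $oldLines, array $newLines)
{
    // The encoding name is the first tab-separated field of each line.
    $names = function (array $lines) {
        $out = array();
        foreach ($lines as $line) {
            $fields = explode("\t", $line);
            if ($fields[0] !== '') {
                $out[] = $fields[0];
            }
        }
        return $out;
    };
    $old = $names($oldLines);
    $new = $names($newLines);
    return array(
        'added'   => array_values(array_diff($new, $old)),
        'removed' => array_values(array_diff($old, $new)),
    );
}

// A few lines copied from each listing as sample input.
$php53 = array(
    "BIG-5\tBIG5\tCN-BIG5, BIG-FIVE, BIGFIVE, CP950",
    "EUC-KR\tEUC-KR\tEUC_KR, eucKR, x-euc-kr",
);
$php7 = array(
    "BIG-5\tBIG5\tCN-BIG5, BIG-FIVE, BIGFIVE",
    "CP950\tBIG5\t",
    "EUC-KR\tEUC-KR\tEUC_KR, eucKR, x-euc-kr",
);
$diff = diff_encoding_names($php53, $php7);
print_r($diff);
```

With this sample, CP950 shows up under added and nothing is removed. The sketch only compares the first column, so changes to alias lists would need a separate comparison.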

Comments (2)

kalvo October 15, 2021, 19:45

How do I convert a Windows-31J string to UTF-8?

fytko May 17, 2022, 18:56

How do I convert a Windows-31J string to UTF-8?