How to print the extended ASCII code in Java from an integer value

Posted: 2014-04-11 22:55:42

Question:
public static void main(String[] args) {
    int i = 153;
    int j = 63;
    System.out.println((char) i);
    System.out.println((char) j);
}


OUTPUT:-
?
?

I have some idea why this odd output appears, but can anyone suggest how I can also print extended ASCII characters?

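Comments:

- Possible duplicate of "JAVA extended ASCII table usage".
- Your code works as expected. For example, System.out.println((char)67); prints "C".
- Possible duplicate of "Extended Ascii doesn't work in console!".
- @Braj If you use a value above 127, it will not work as expected because of ASCII limitations...

Answer 1: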

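"Extended ASCII" 153 (0x99) is not the same character as Unicode U+0099, which is a control character. Casting an int to char in Java gives the Unicode character with that value, not the character a code page assigns to that byte.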
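
A minimal sketch of what the cast in the question actually produces. The class name CastDemo and the method describe are made up for illustration; the only library call is Character.isISOControl.

```java
public class CastDemo {
    // Describes the char obtained by casting an int: its code point and
    // whether Java classifies it as an ISO control character.
    public static String describe(int code) {
        char c = (char) code;
        return String.format("U+%04X control=%b", (int) c, Character.isISOControl(c));
    }

    public static void main(String[] args) {
        System.out.println(describe(153)); // the value from the question
        System.out.println(describe(63));  // '?' itself, a printable ASCII character
    }
}
```

So the first ? in the question's output stands in for a control character that has no glyph, while the second ? is the real character 63.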

Solution

This program should do what you want:

public class ExtendedAscii {
    // CP437 characters for byte values 0x80..0xFF; EXTENDED[0] is code 0x80.
    public static final char[] EXTENDED = { 0x00C7, 0x00FC, 0x00E9, 0x00E2,
            0x00E4, 0x00E0, 0x00E5, 0x00E7, 0x00EA, 0x00EB, 0x00E8, 0x00EF,
            0x00EE, 0x00EC, 0x00C4, 0x00C5, 0x00C9, 0x00E6, 0x00C6, 0x00F4,
            0x00F6, 0x00F2, 0x00FB, 0x00F9, 0x00FF, 0x00D6, 0x00DC, 0x00A2,
            0x00A3, 0x00A5, 0x20A7, 0x0192, 0x00E1, 0x00ED, 0x00F3, 0x00FA,
            0x00F1, 0x00D1, 0x00AA, 0x00BA, 0x00BF, 0x2310, 0x00AC, 0x00BD,
            0x00BC, 0x00A1, 0x00AB, 0x00BB, 0x2591, 0x2592, 0x2593, 0x2502,
            0x2524, 0x2561, 0x2562, 0x2556, 0x2555, 0x2563, 0x2551, 0x2557,
            0x255D, 0x255C, 0x255B, 0x2510, 0x2514, 0x2534, 0x252C, 0x251C,
            0x2500, 0x253C, 0x255E, 0x255F, 0x255A, 0x2554, 0x2569, 0x2566,
            0x2560, 0x2550, 0x256C, 0x2567, 0x2568, 0x2564, 0x2565, 0x2559,
            0x2558, 0x2552, 0x2553, 0x256B, 0x256A, 0x2518, 0x250C, 0x2588,
            0x2584, 0x258C, 0x2590, 0x2580, 0x03B1, 0x00DF, 0x0393, 0x03C0,
            0x03A3, 0x03C3, 0x00B5, 0x03C4, 0x03A6, 0x0398, 0x03A9, 0x03B4,
            0x221E, 0x03C6, 0x03B5, 0x2229, 0x2261, 0x00B1, 0x2265, 0x2264,
            0x2320, 0x2321, 0x00F7, 0x2248, 0x00B0, 0x2219, 0x00B7, 0x221A,
            0x207F, 0x00B2, 0x25A0, 0x00A0 };

    public static final char getAscii(int code) {
        if (code >= 0x80 && code <= 0xFF) {
            // Subtract 0x80, not 0x7F: with 0x7F every lookup is shifted by one
            // and code 0xFF runs past the end of the array.
            return EXTENDED[code - 0x80];
        }
        return (char) code;
    }

    public static final void printChar(int code) {
        System.out.printf("%c%n", getAscii(code));
    }

    public static void main(String[] args) {
        printChar(153);
        printChar(63);
    }
}

Output:

Ö
?

As the output above shows, the expected characters are printed correctly.
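
As an alternative to a hand-written table, the same mapping can be sketched with the JDK's own charset support. This is not part of the original answer. It assumes the "IBM437" charset is available in the running JDK, which is common but not guaranteed, since it is not one of the charsets every Java platform must support. The class name Cp437Decode and the method decode are made up for illustration.

```java
import java.nio.charset.Charset;

public class Cp437Decode {
    // Assumption: this JDK ships the IBM437 (CP437) charset. If it does not,
    // Charset.forName throws UnsupportedCharsetException when the class loads.
    private static final Charset CP437 = Charset.forName("IBM437");

    // Interprets a code in 0..255 as one CP437 byte and returns its Unicode char.
    public static char decode(int code) {
        if (code < 0 || code > 0xFF) {
            throw new IllegalArgumentException("not a single byte: " + code);
        }
        return new String(new byte[] { (byte) code }, CP437).charAt(0);
    }

    public static void main(String[] args) {
        // Print code points rather than glyphs so the result does not depend
        // on what the console is able to display.
        System.out.printf("153 -> U+%04X%n", (int) decode(153));
        System.out.printf("63  -> U+%04X%n", (int) decode(63));
    }
}
```

Charset.isSupported("IBM437") can be checked first if the program has to run on JDKs where the charset may be missing.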


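Extending the concept

I also wrote a program that prints every extended ASCII character from its Unicode value. As the output below shows, many of the characters are not displayed properly and appear as ? instead. The char values themselves are valid; the console encoding cannot represent them.

Code: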

public class ExtendedAscii {
    // CP437 characters for byte values 0x80..0xFF; EXTENDED[0] is code 0x80.
    public static final char[] EXTENDED = { 0x00C7, 0x00FC, 0x00E9, 0x00E2,
            0x00E4, 0x00E0, 0x00E5, 0x00E7, 0x00EA, 0x00EB, 0x00E8, 0x00EF,
            0x00EE, 0x00EC, 0x00C4, 0x00C5, 0x00C9, 0x00E6, 0x00C6, 0x00F4,
            0x00F6, 0x00F2, 0x00FB, 0x00F9, 0x00FF, 0x00D6, 0x00DC, 0x00A2,
            0x00A3, 0x00A5, 0x20A7, 0x0192, 0x00E1, 0x00ED, 0x00F3, 0x00FA,
            0x00F1, 0x00D1, 0x00AA, 0x00BA, 0x00BF, 0x2310, 0x00AC, 0x00BD,
            0x00BC, 0x00A1, 0x00AB, 0x00BB, 0x2591, 0x2592, 0x2593, 0x2502,
            0x2524, 0x2561, 0x2562, 0x2556, 0x2555, 0x2563, 0x2551, 0x2557,
            0x255D, 0x255C, 0x255B, 0x2510, 0x2514, 0x2534, 0x252C, 0x251C,
            0x2500, 0x253C, 0x255E, 0x255F, 0x255A, 0x2554, 0x2569, 0x2566,
            0x2560, 0x2550, 0x256C, 0x2567, 0x2568, 0x2564, 0x2565, 0x2559,
            0x2558, 0x2552, 0x2553, 0x256B, 0x256A, 0x2518, 0x250C, 0x2588,
            0x2584, 0x258C, 0x2590, 0x2580, 0x03B1, 0x00DF, 0x0393, 0x03C0,
            0x03A3, 0x03C3, 0x00B5, 0x03C4, 0x03A6, 0x0398, 0x03A9, 0x03B4,
            0x221E, 0x03C6, 0x03B5, 0x2229, 0x2261, 0x00B1, 0x2265, 0x2264,
            0x2320, 0x2321, 0x00F7, 0x2248, 0x00B0, 0x2219, 0x00B7, 0x221A,
            0x207F, 0x00B2, 0x25A0, 0x00A0 };

    public static void main(String[] args) {
        // Print every table entry, separated by commas.
        for (char c : EXTENDED) {
            System.out.printf("%s, ", new String(Character.toChars(c)));
        }
    }
}

Output:

Ç, ü, é, â, ä, à, å, ç, ê, ë, è, ï, î, ì, Ä, Å, É, æ, Æ, ô, ö, ò, û, ù, ÿ, Ö, Ü, ¢, £, ¥, ?, ƒ, á, í, ó, ú, ñ, Ñ, ª, º, ¿, ?, ¬, ½, ¼, ¡, «, », ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ß, ?, ?, ?, ?, µ, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ±, ?, ?, ?, ?, ÷, ?, °, ?, ·, ?, ?, ², ?, ,

Reference table: (source)

Dec Hex Unicode     Char    Description
--- --- -------     ----    -----------------------------------
128 80  U+00C7      Ç   latin capital letter c with cedilla
129 81  U+00FC      ü   latin small letter u with diaeresis
130 82  U+00E9      é   latin small letter e with acute
131 83  U+00E2      â   latin small letter a with circumflex
132 84  U+00E4      ä   latin small letter a with diaeresis
133 85  U+00E0      à   latin small letter a with grave
134 86  U+00E5      å   latin small letter a with ring above
135 87  U+00E7      ç   latin small letter c with cedilla
136 88  U+00EA      ê   latin small letter e with circumflex
137 89  U+00EB      ë   latin small letter e with diaeresis
138 8A  U+00E8      è   latin small letter e with grave
139 8B  U+00EF      ï   latin small letter i with diaeresis
140 8C  U+00EE      î   latin small letter i with circumflex
141 8D  U+00EC      ì   latin small letter i with grave
142 8E  U+00C4      Ä   latin capital letter a with diaeresis
143 8F  U+00C5      Å   latin capital letter a with ring above
144 90  U+00C9      É   latin capital letter e with acute
145 91  U+00E6      æ   latin small ligature ae
146 92  U+00C6      Æ   latin capital ligature ae
147 93  U+00F4      ô   latin small letter o with circumflex
148 94  U+00F6      ö   latin small letter o with diaeresis
149 95  U+00F2      ò   latin small letter o with grave
150 96  U+00FB      û   latin small letter u with circumflex
151 97  U+00F9      ù   latin small letter u with grave
152 98  U+00FF      ÿ   latin small letter y with diaeresis
153 99  U+00D6      Ö   latin capital letter o with diaeresis
154 9A  U+00DC      Ü   latin capital letter u with diaeresis
155 9B  U+00A2      ¢   cent sign
156 9C  U+00A3      £   pound sign
157 9D  U+00A5      ¥   yen sign
158 9E  U+20A7      ₧   peseta sign
159 9F  U+0192      ƒ   latin small letter f with hook
160 A0  U+00E1      á   latin small letter a with acute
161 A1  U+00ED      í   latin small letter i with acute
162 A2  U+00F3      ó   latin small letter o with acute
163 A3  U+00FA      ú   latin small letter u with acute
164 A4  U+00F1      ñ   latin small letter n with tilde
165 A5  U+00D1      Ñ   latin capital letter n with tilde
166 A6  U+00AA      ª   feminine ordinal indicator
167 A7  U+00BA      º   masculine ordinal indicator
168 A8  U+00BF      ¿   inverted question mark
169 A9  U+2310      ⌐   reversed not sign
170 AA  U+00AC      ¬   not sign
171 AB  U+00BD      ½   vulgar fraction one half
172 AC  U+00BC      ¼   vulgar fraction one quarter
173 AD  U+00A1      ¡   inverted exclamation mark
174 AE  U+00AB      «   left-pointing double angle quotation mark
175 AF  U+00BB      »   right-pointing double angle quotation mark
176 B0  U+2591      ░   light shade
177 B1  U+2592      ▒   medium shade
178 B2  U+2593      ▓   dark shade
179 B3  U+2502      │   box drawings light vertical
180 B4  U+2524      ┤   box drawings light vertical and left
181 B5  U+2561      ╡   box drawings vertical single and left double
182 B6  U+2562      ╢   box drawings vertical double and left single
183 B7  U+2556      ╖   box drawings down double and left single
184 B8  U+2555      ╕   box drawings down single and left double
185 B9  U+2563      ╣   box drawings double vertical and left
186 BA  U+2551      ║   box drawings double vertical
187 BB  U+2557      ╗   box drawings double down and left
188 BC  U+255D      ╝   box drawings double up and left
189 BD  U+255C      ╜   box drawings up double and left single
190 BE  U+255B      ╛   box drawings up single and left double
191 BF  U+2510      ┐   box drawings light down and left
192 C0  U+2514      └   box drawings light up and right
193 C1  U+2534      ┴   box drawings light up and horizontal
194 C2  U+252C      ┬   box drawings light down and horizontal
195 C3  U+251C      ├   box drawings light vertical and right
196 C4  U+2500      ─   box drawings light horizontal
197 C5  U+253C      ┼   box drawings light vertical and horizontal
198 C6  U+255E      ╞   box drawings vertical single and right double
199 C7  U+255F      ╟   box drawings vertical double and right single
200 C8  U+255A      ╚   box drawings double up and right
201 C9  U+2554      ╔   box drawings double down and right
202 CA  U+2569      ╩   box drawings double up and horizontal
203 CB  U+2566      ╦   box drawings double down and horizontal
204 CC  U+2560      ╠   box drawings double vertical and right
205 CD  U+2550      ═   box drawings double horizontal
206 CE  U+256C      ╬   box drawings double vertical and horizontal
207 CF  U+2567      ╧   box drawings up single and horizontal double
208 D0  U+2568      ╨   box drawings up double and horizontal single
209 D1  U+2564      ╤   box drawings down single and horizontal double
210 D2  U+2565      ╥   box drawings down double and horizontal single
211 D3  U+2559      ╙   box drawings up double and right single
212 D4  U+2558      ╘   box drawings up single and right double
213 D5  U+2552      ╒   box drawings down single and right double
214 D6  U+2553      ╓   box drawings down double and right single
215 D7  U+256B      ╫   box drawings vertical double and horizontal single
216 D8  U+256A      ╪   box drawings vertical single and horizontal double
217 D9  U+2518      ┘   box drawings light up and left
218 DA  U+250C      ┌   box drawings light down and right
219 DB  U+2588      █   full block
220 DC  U+2584      ▄   lower half block
221 DD  U+258C      ▌   left half block
222 DE  U+2590      ▐   right half block
223 DF  U+2580      ▀   upper half block
224 E0  U+03B1      α   greek small letter alpha
225 E1  U+00DF      ß   latin small letter sharp s
226 E2  U+0393      Γ   greek capital letter gamma
227 E3  U+03C0      π   greek small letter pi
228 E4  U+03A3      Σ   greek capital letter sigma
229 E5  U+03C3      σ   greek small letter sigma
230 E6  U+00B5      µ   micro sign
231 E7  U+03C4      τ   greek small letter tau
232 E8  U+03A6      Φ   greek capital letter phi
233 E9  U+0398      Θ   greek capital letter theta
234 EA  U+03A9      Ω   greek capital letter omega
235 EB  U+03B4      δ   greek small letter delta
236 EC  U+221E      ∞   infinity
237 ED  U+03C6      φ   greek small letter phi
238 EE  U+03B5      ε   greek small letter epsilon
239 EF  U+2229      ∩   intersection
240 F0  U+2261      ≡   identical to
241 F1  U+00B1      ±   plus-minus sign
242 F2  U+2265      ≥   greater-than or equal to
243 F3  U+2264      ≤   less-than or equal to
244 F4  U+2320      ⌠   top half integral
245 F5  U+2321      ⌡   bottom half integral
246 F6  U+00F7      ÷   division sign
247 F7  U+2248      ≈   almost equal to
248 F8  U+00B0      °   degree sign
249 F9  U+2219      ∙   bullet operator
250 FA  U+00B7      ·   middle dot
251 FB  U+221A      √   square root
252 FC  U+207F      ⁿ   superscript latin small letter n
253 FD  U+00B2      ²   superscript two
254 FE  U+25A0      ■   black square
255 FF  U+00A0      no-break space

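Comments:

- This is only valid for one code page, specifically CP437. That means these are the equivalents in a US English Windows console, but not in a UTF-8 xterm on Linux or a CP850 console in Western Europe, etc. You have to know which encoding the terminal uses.
- Is there a reason you used char[] for EXTENDED rather than int[]?
- @JamesSmith: Because a char is 2 bytes and an int is 4 bytes. That means a smaller memory footprint. (ref)
- Great answer. In my final code I changed your line return EXTENDED[code - 0x7F]; to return EXTENDED[code - 0x80];. I am not sure whether that was a bug or a mistake in my input. The results were off by one for me: input 128 -> output ü.

Answer 2: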

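"Extended ASCII" is ambiguous. ASCII itself only defines values 0 to 127; there are many extensions that define glyphs for the byte values 128 to 255. These are referred to as code pages. Some of the more common ones include:

- CP437, the standard on the original IBM PC
- ISO 8859-1 (Latin-1) and the closely related Windows code page 1252, the encoding used by most Western European-language versions of Windows for everything except the console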

You really need to know which character encoding your terminal expects, otherwise you will end up printing garbage. In Java, you should be able to check the value of Charset.defaultCharset() (Charset documentation).
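
A minimal sketch of that check. The class name ShowEncoding and the method defaultCharsetName are made up for illustration. Note that the JVM default charset is not necessarily the encoding the console uses, so treat the result as a hint rather than a guarantee.

```java
import java.nio.charset.Charset;

public class ShowEncoding {
    // Returns the name of the JVM's default charset, e.g. "UTF-8" or "windows-1252".
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        // file.encoding is a system property; it prints "null" if it is unset.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```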

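There are more ways to encode characters than single-byte "extended ASCII" code pages. Unicode requires far more than 255 code points, so various fixed-width and variable-width encodings are in common use. This page seems to be a good guide to character encoding in Java.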
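
To make the width difference concrete, here is a small sketch that counts how many bytes one character occupies in three standard charsets. The class name EncodingWidths and the method widths are made up for illustration. ISO-8859-1 cannot represent the box-drawing character U+2502, so String.getBytes substitutes a single replacement byte for it.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Byte counts for the text in ISO-8859-1, UTF-8 and UTF-16BE, in that order.
    public static int[] widths(String text) {
        return new int[] {
            text.getBytes(StandardCharsets.ISO_8859_1).length,
            text.getBytes(StandardCharsets.UTF_8).length,
            text.getBytes(StandardCharsets.UTF_16BE).length
        };
    }

    public static void main(String[] args) {
        // A, O with diaeresis, box drawings light vertical
        String[] samples = { "A", "\u00D6", "\u2502" };
        for (String s : samples) {
            int[] w = widths(s);
            System.out.printf("U+%04X: ISO-8859-1=%d UTF-8=%d UTF-16BE=%d%n",
                    (int) s.charAt(0), w[0], w[1], w[2]);
        }
    }
}
```

UTF-8 is variable-width (1, 2 and 3 bytes for these samples), while UTF-16BE uses 2 bytes for each of them.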

Answer 3:
// i and j are the int values from the question (153 and 63).
String iChar = Character.toString((char) i);
String jChar = Character.toString((char) j);

System.out.println(iChar);
System.out.println(jChar);

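Comments:

- You could simply do: String.format("%c", i)
- I tried your example, but for me the output in the Eclipse console is still ? ?. Am I missing something?
- @sakura Are you sure? 153 is some special character and 63 is "?". Check this, and try printing 97 and 98, which should give "a" and "b".
- @ZhenxiaoHao Yes. Do I need to change some setting in Eclipse to see special characters printed with sysout in the Eclipse console?
- @sakura No, it has nothing to do with the console. Can you post your current code? I tried it and it works fine.

Answer 4: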

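If this is a terminal encoding problem, I believe this answer, https://***.com/a/362006/4828060, is a quick and straightforward way around it. Just add -Dfile.encoding=some_encoding to the arguments of the java command,

for example java -Dfile.encoding=UTF-8 … MainClass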
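
If changing the command line is not an option, a comparable effect can be sketched in code by wrapping the output stream in a PrintStream with an explicit charset. This is a different technique from the -Dfile.encoding flag above, not a description of what that flag does internally. The class name ExplicitEncoding and the methods withEncoding and encode are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ExplicitEncoding {
    // Wraps a stream so that everything printed to it is encoded with the named charset.
    public static PrintStream withEncoding(OutputStream target, String charsetName) {
        try {
            return new PrintStream(target, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    // Helper for inspection: the bytes produced when text is printed with that charset.
    public static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = withEncoding(buffer, charsetName);
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Output is encoded as UTF-8 regardless of the JVM default charset.
        PrintStream utf8Out = withEncoding(System.out, "UTF-8");
        utf8Out.println('\u00D6');
    }
}
```

Whether the character then renders correctly still depends on the terminal decoding its input as UTF-8.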
