为什么Java不支持无符号整数?
在我看来,这是一个奇怪的遗漏,因为它们允许人们编写不太可能在意外的大输入上产生溢出的代码。
此外,使用无符号整数可以是一种自我文档的形式,因为它们表明无符号整型所要保存的值永远不应该是负数。
最后,在某些情况下,无符号整数对于某些运算(如除法)更有效。
包含这些的缺点是什么?
为什么Java不支持无符号整数?
在我看来,这是一个奇怪的遗漏,因为它们允许人们编写不太可能在意外的大输入上产生溢出的代码。
此外,使用无符号整数可以是一种自我文档的形式,因为它们表明无符号整型所要保存的值永远不应该是负数。
最后,在某些情况下,无符号整数对于某些运算(如除法)更有效。
包含这些的缺点是什么?
当前回答
在“C”规范中,有一些因实用主义原因而被Java抛弃的珍宝,但随着开发人员的需求(闭包等),它们正在慢慢地回归。
我提到第一个是因为它和这个讨论有关;指针值对无符号整数算术的坚持。并且,与这个主题相关的是,在Java的Signed世界中维护Unsigned语义的困难。
我猜如果有人让Dennis Ritchie的另一个自我来建议Gosling的设计团队,他会建议给Signed's一个“无穷大的零”,这样所有的地址偏移请求都会先加上他们的algeaic RING SIZE来消除负值。
这样,向数组抛出的任何偏移量都不会生成SEGFAULT。例如,在一个封装类中,我称之为RingArray的双精度对象需要unsigned行为-在“自旋转循环”上下文中:
// ...
// Housekeeping state variable
long entrycount; // A sequence number
int cycle; // Number of loops cycled
int size; // Active size of the array because size<modulus during cycle 0
int modulus; // Maximal size of the array
// Ring state variables
private int head; // The 'head' of the Ring
private int tail; // The ring iterator 'cursor'
// tail may get the current cursor position
// and head gets the old tail value
// there are other semantic variations possible
// The Array state variable
double [] darray; // The array of doubles
// somewhere in constructor
public RingArray(int modulus) {
super();
this.modulus = modulus;
tail = head = cycle = 0;
darray = new double[modulus];
// ...
}
// ...
double getElementAt(int offset){
return darray[(tail+modulus+offset%modulus)%modulus];
}
// remember, the above is treating steady-state where size==modulus
// ...
上面的RingArray永远不会从负索引中“获得”,即使恶意请求者试图这样做。记住,还有许多合法的请求用于请求先前的(负的)索引值。
注意:外层%模数去掉了对合法请求的引用,而内部%模数掩盖了明显的恶意,因为负数比-模数更负。如果这将出现在Java +..9 || 8+…+ spec,那么问题将真正成为一个“程序员不能“自我旋转”的错误”。
我相信所谓的Java unsigned int“缺陷”可以用上面的一行程序来弥补。
PS:只是为了给上面的RingArray管理提供上下文,这里有一个候选的'set'操作来匹配上面的'get'元素操作:
void addElement(long entrycount,double value){ // to be called only by the keeper of entrycount
this.entrycount= entrycount;
cycle = (int)entrycount/modulus;
if(cycle==0){ // start-up is when the ring is being populated the first time around
size = (int)entrycount; // during start-up, size is less than modulus so use modulo size arithmetic
tail = (int)entrycount%size; // during start-up
}
else {
size = modulus;
head = tail;
tail = (int)entrycount%modulus; // after start-up
}
darray[head] = value; // always overwrite old tail
}
其他回答
I once took a C++ course with someone on the C++ standards committee who implied that Java made the right decision to avoid having unsigned integers because (1) most programs that use unsigned integers can do just as well with signed integers and this is more natural in terms of how people think, and (2) using unsigned integers results in lots easy to create but difficult to debug issues such as integer arithmetic overflow and losing significant bits when converting between signed and unsigned types. If you mistakenly subtract 1 from 0 using signed integers it often more quickly causes your program to crash and makes it easier to find the bug than if it wraps around to 2^32 - 1, and compilers and static analysis tools and runtime checks have to assume you know what you're doing since you chose to use unsigned arithmetic. Also, negative numbers like -1 can often represent something useful, like a field being ignored/defaulted/unset while if you were using unsigned you'd have to reserve a special value like 2^32 - 1 or something similar.
Long ago, when memory was limited and processors did not automatically operate on 64 bits at once, every bit counted a lot more, so having signed vs unsigned bytes or shorts actually mattered a lot more often and was obviously the right design decision. Today just using a signed int is more than sufficient in almost all regular programming cases, and if your program really needs to use values bigger than 2^31 - 1, you often just want a long anyway. Once you're into the territory of using longs, it's even harder to come up with a reason why you really can't get by with 2^63 - 1 positive integers. Whenever we go to 128 bit processors it'll be even less of an issue.
字里行间,我认为逻辑是这样的:
通常,Java设计人员希望简化可用的数据类型 对于日常用途,他们认为最常见的需求是有符号的数据类型 为了实现某些算法,有时需要无符号算术,但是要实现这种算法的程序员也应该具备使用有符号数据类型进行无符号算术的知识
总的来说,我认为这是一个合理的决定。我可能会:
使字节无符号,或者至少为这一数据类型提供了有符号/无符号的替代选项,可能使用不同的名称(使它有符号有利于一致性,但什么时候需要有符号字节?) 不再使用“short”(你上次使用16位符号算术是什么时候?)
不过,只要稍加修饰,对32位以内的无符号值进行运算就不会太糟糕,而且大多数人不需要无符号64位除法或比较。
一旦有符号整型和无符号整型混合在表达式中,事情就开始变得混乱,你可能会丢失信息。将Java限制为有符号int型只能真正解决问题。我很高兴我不必担心整个有符号/无符号的问题,尽管我有时会错过字节中的第8位。
作为处理过无符号算术的人,我可以向您保证,在Java中确实没有必要使用无符号数字。
以C语言为例。让我们这样写:
unsigned int num = -7;
printf("%d", num);
你能猜到上面印的是什么吗?
-7
哇!无符号整数是负的!完全正确。没有真正的正整数。无符号整数只是一个n字节(取决于C语言中的体系结构)的值,它不为符号分配MSB。它不检查分配或读取的数字的实际符号。
在“C”规范中,有一些因实用主义原因而被Java抛弃的珍宝,但随着开发人员的需求(闭包等),它们正在慢慢地回归。
我提到第一个是因为它和这个讨论有关;指针值对无符号整数算术的坚持。并且,与这个主题相关的是,在Java的Signed世界中维护Unsigned语义的困难。
我猜如果有人让Dennis Ritchie的另一个自我来建议Gosling的设计团队,他会建议给Signed's一个“无穷大的零”,这样所有的地址偏移请求都会先加上他们的algeaic RING SIZE来消除负值。
这样,向数组抛出的任何偏移量都不会生成SEGFAULT。例如,在一个封装类中,我称之为RingArray的双精度对象需要unsigned行为-在“自旋转循环”上下文中:
// ...
// Housekeeping state variable
long entrycount; // A sequence number
int cycle; // Number of loops cycled
int size; // Active size of the array because size<modulus during cycle 0
int modulus; // Maximal size of the array
// Ring state variables
private int head; // The 'head' of the Ring
private int tail; // The ring iterator 'cursor'
// tail may get the current cursor position
// and head gets the old tail value
// there are other semantic variations possible
// The Array state variable
double [] darray; // The array of doubles
// somewhere in constructor
public RingArray(int modulus) {
super();
this.modulus = modulus;
tail = head = cycle = 0;
darray = new double[modulus];
// ...
}
// ...
double getElementAt(int offset){
return darray[(tail+modulus+offset%modulus)%modulus];
}
// remember, the above is treating steady-state where size==modulus
// ...
上面的RingArray永远不会从负索引中“获得”,即使恶意请求者试图这样做。记住,还有许多合法的请求用于请求先前的(负的)索引值。
注意:外层%模数去掉了对合法请求的引用,而内部%模数掩盖了明显的恶意,因为负数比-模数更负。如果这将出现在Java +..9 || 8+…+ spec,那么问题将真正成为一个“程序员不能“自我旋转”的错误”。
我相信所谓的Java unsigned int“缺陷”可以用上面的一行程序来弥补。
PS:只是为了给上面的RingArray管理提供上下文,这里有一个候选的'set'操作来匹配上面的'get'元素操作:
void addElement(long entrycount,double value){ // to be called only by the keeper of entrycount
this.entrycount= entrycount;
cycle = (int)entrycount/modulus;
if(cycle==0){ // start-up is when the ring is being populated the first time around
size = (int)entrycount; // during start-up, size is less than modulus so use modulo size arithmetic
tail = (int)entrycount%size; // during start-up
}
else {
size = modulus;
head = tail;
tail = (int)entrycount%modulus; // after start-up
}
darray[head] = value; // always overwrite old tail
}