Detecting endianness programmatically in a C++ program

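Is there a programmatic way to detect whether you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e., no conditional compilation).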

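Unless you're using a framework that has already been ported to both PPC and Intel processors, you will have to use conditional compilation, since the PPC and Intel platforms have completely different hardware architectures, pipelines, buses, and so on. This makes the assembly code completely different between the two.

As for finding endianness, do the following: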

short temp = 0x1234;
char* tempChar = (char*)&temp;

*tempChar will be either 0x12 or 0x34, from which you can tell the endianness: 0x12 means big-endian, 0x34 means little-endian.

2009-06-16 13:00:03

Declare an int variable:

int variable = 0xFF;

Now use char* pointers to its individual parts and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

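Depending on which one points to the 0xFF byte, you can now detect the endianness. This requires sizeof(int) > sizeof(char), but that is definitely true for the platforms discussed. When comparing, read the byte as unsigned char, since a plain char holding 0xFF may read back as -1.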
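As a rough illustration of the pointer approach described above, the sketch below walks over the bytes of a known value in address order so the layout can be seen directly. The function name bytes_in_memory_order and the 32-bit probe value are assumptions made for this example and are not part of the original answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the bytes of `value` as hex, lowest address first.
std::string bytes_in_memory_order(std::uint32_t value)
{
    // Inspecting an object through unsigned char* is permitted by the aliasing rules.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling bytes_in_memory_order(0x12345678) should give "78 56 34 12" on a little-endian host and "12 34 56 78" on a big-endian one.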

2009-06-16 13:00:05

See this article:

Here is some code to determine the type of your machine:

int num = 1;
if (*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}

2009-06-16 13:00:37

See Endianness - C-Level Code illustration.

// assuming target architecture is 32-bit = 4-Bytes
enum ENDIANNESS{ LITTLEENDIAN , BIGENDIAN , UNHANDLE };


ENDIANNESS CheckArchEndianalityV1( void )
{
    int Endian = 0x00000001; // assuming target architecture is 32-bit    

    // as Endian = 0x00000001 so MSB (Most Significant Byte) = 0x00 and LSB (Least Significant Byte) = 0x01
    // casting down to a single byte value LSB discarding higher bytes    

    return (*(char *) &Endian == 0x01) ? LITTLEENDIAN : BIGENDIAN;
}

2009-06-16 13:00:52

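You can do it by setting an int and masking off bits, but probably the easiest way is to use the built-in network byte conversion operations (since network byte order is always big-endian).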

if ( htonl(47) == 47 ) {
  // Big endian
} else {
  // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward, and pretty much impossible to mess up.
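A self-contained version of the htonl check might look like the sketch below. It assumes a POSIX system where htonl is declared in <arpa/inet.h> (on Windows it would come from <winsock2.h> instead), and the function name host_is_big_endian_via_htonl is made up for this example.

```cpp
#include <arpa/inet.h>  // htonl (POSIX; an assumption about the target platform)
#include <cstdint>

// Network byte order is big-endian, so htonl leaves the value unchanged
// only when the host is big-endian as well.
inline bool host_is_big_endian_via_htonl()
{
    const std::uint32_t probe = 0x01020304u;
    return htonl(probe) == probe;
}
```

A probe value whose byte-reversed form differs from itself is essential here; a palindromic value such as 0x01000001 would compare equal on every host.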

2009-06-16 13:00:53

int i=1;
char *c=(char*)&i;
bool littleendian=(*c==1); // test the first byte, not the pointer itself

2009-06-16 13:01:09

How about this?

#include <cstdio>

int main()
{
    unsigned int n = 1;
    char *p = 0;

    p = (char*)&n;
    if (*p == 1)
        std::printf("Little Endian\n");
    else 
        if (*(p + sizeof(int) - 1) == 1)
            std::printf("Big Endian\n");
        else
            std::printf("What the crap?\n");
    return 0;
}

2009-06-16 13:02:25

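For further details, you may want to check out this CodeProject article, Basic concepts on Endianness:

How to dynamically test for the Endian type at run time? As explained in the Computer Animation FAQ, you can use the following function to see if your code is running on a Little- or Big-Endian system:

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1 /* value truncated in the source; 1 assumed */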

int TestByteOrder()
{
   short int word = 0x0001;
   char *byte = (char *) &word;
   return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

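This code assigns the value 0001h to a 16-bit integer. A char pointer is then set to point at the first byte (the one at the lowest address) of the integer value. If that first byte is 0x01, the system is little-endian (the 0x01 sits at the lowest address, where the least significant byte goes). If it is 0x00, the system is big-endian.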

2009-06-16 13:03:53

I would do something like this:

bool isBigEndian() {
    static unsigned long x(1);
    static bool result(reinterpret_cast<unsigned char*>(&x)[0] == 0);
    return result;
}

Along these lines, you get a time-efficient function that only does the calculation once.

2009-06-16 13:06:54

I don't like the method based on type punning, since the compiler will often warn about it. That's exactly what unions are for!

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1;
}

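The principle is equivalent to the type cast suggested by others, but this is clearer, and according to C99 it is guaranteed to be correct. GCC prefers this over a direct pointer cast.

This is also much better than fixing the endianness at compile time: for operating systems that support multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc and i386, whereas it is very easy to mess things up otherwise.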

2009-06-16 13:08:04

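This is normally done at compile time (especially for performance reasons) by using the header files provided by the compiler or by creating your own. On Linux you have the header file "/usr/include/endian.h".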

2009-06-16 13:36:04

You can also do this via the preprocessor using something like the Boost header file, which can be found in Boost endian.

2009-06-16 14:44:48

Here's another C version. It defines a macro called wicked_cast() for inline type punning via C99 union literals and the non-standard __typeof__ operator.

#include <limits.h>

#if UCHAR_MAX == UINT_MAX
#error endianness irrelevant as sizeof(int) == 1
#endif

#define wicked_cast(TYPE, VALUE) \
    (((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)

_Bool is_little_endian(void)
{
    return wicked_cast(unsigned char, 1u);
}

If integers are single-byte values, endianness makes no sense and a compile-time error will be generated.

2009-06-16 17:55:05

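I'm surprised no one has mentioned the macros that the preprocessor defines by default. They vary depending on your platform, but they are much cleaner than having to write your own endian check.

For example, if we look at the built-in macros that GCC defines (on an x86-64 machine):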

:| gcc -dM -E -x c - | grep -i endian

#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - | grep -i endian

#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The ':| gcc -dM -E -x c -' magic prints out all built-in macros.)
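One way to consume such macros is sketched below. In addition to the __LITTLE_ENDIAN__ and __BIG_ENDIAN__ macros shown above, it checks __BYTE_ORDER__ with __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which recent GCC and Clang versions predefine but which the answer does not mention. The HostOrder enum and the kHostOrder constant are names invented for this example, and other compilers may well end up in the "unknown" branch.

```cpp
// Hypothetical result type for this sketch.
enum class HostOrder { little, big, unknown };

#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
constexpr HostOrder kHostOrder = HostOrder::little;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr HostOrder kHostOrder = HostOrder::big;
#elif defined(__LITTLE_ENDIAN__)
constexpr HostOrder kHostOrder = HostOrder::little;
#elif defined(__BIG_ENDIAN__)
constexpr HostOrder kHostOrder = HostOrder::big;
#else
// No known predefined macro: callers need another detection method.
constexpr HostOrder kHostOrder = HostOrder::unknown;
#endif
```

Because the result is a compile-time constant, it reflects the target the code was compiled for, not a property probed at run time.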

2009-06-20 19:15:07

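It surprises me that no one has realized that the compiler will simply optimize the test away and put a fixed result in as the return value. This renders all the code examples in the previous answers effectively useless.

The only thing that would be returned is the endianness at compile time! And yes, I tested all of the examples in the previous answers. Here is an example with Microsoft Visual C++ 9.0 (Visual Studio 2008).

Pure C code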

int32 DNA_GetEndianness(void)
{
    union
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC    _DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;    COMDAT _DNA_GetEndianness
_TEXT    SEGMENT
_DNA_GetEndianness PROC                    ; COMDAT

; 11   :     union
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   :
; 17   :     u.i = 1;
; 18   :
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov    eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

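Perhaps it is possible to turn off ANY compile-time optimization for just this function, but I don't know. Otherwise it may be possible to hardcode it in assembly, although that is not portable. And even then, that might get optimized out too. It makes me think I would need some really crappy assembler and would have to implement the same code for all existing CPUs and instruction sets, and, well... never mind.

Also, someone here said that endianness does not change at run time. WRONG. There are bi-endian machines out there. Their endianness can vary during execution. ALSO, there is not only little-endian and big-endian, but other endiannesses as well.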

2011-05-04 01:11:06

union {
    int i;
    char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

Here is another solution. It is similar to Andrew Hare's solution.

2012-10-02 10:10:30

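As stated in previous answers, use the union trick.

But there are a few problems with the approaches advised above, most notably that unaligned memory access is notoriously slow on most architectures, and some compilers won't recognize such constant predicates at all unless they are word-aligned.

Because a mere endian test is boring, here is a (template) function that will flip the input/output of an arbitrary integer according to your spec, regardless of host architecture.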

#include <stdint.h>

#define BIG_ENDIAN 1
#define LITTLE_ENDIAN 0

template <typename T>
T endian(T w, uint32_t endian)
{
    // This gets optimized out into if (endian == host_endian) return w;
    union { uint64_t quad; uint32_t islittle; } t;
    t.quad = 1;
    if (t.islittle ^ endian) return w;
    T r = 0;

    // Decent compilers will unroll this (GCC)
    // or even convert straight into single bswap (Clang)
    for (int i = 0; i < sizeof(r); i++) {
        r <<= 8;
        r |= w & 0xff;
        w >>= 8;
    }
    return r;
};

Usage:

To convert from a given endianness to host, use:

host = endian(source, endian_of_source)

To convert from host endianness to a given endianness, use:

output = endian(hostsource, endian_you_want_to_output)

The resulting code is as fast as hand-written assembly on Clang; on GCC it is a tad slower (unrolled &, <<, >>, | for every byte) but still decent.
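The byte-flipping loop at the core of that template can be looked at in isolation. The sketch below is a simplified variant rather than the author's function: it drops the host-endianness test, restricts itself to unsigned integer types (an assumption added here to keep the shifts well defined), and uses the made-up name reverse_bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverses the byte order of an unsigned integer, one byte per iteration.
template <typename T>
T reverse_bytes(T w)
{
    static_assert(std::is_unsigned<T>::value,
                  "this sketch assumes an unsigned integer type");
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Move the bytes collected so far up and append the lowest byte of w.
        r = static_cast<T>((r << 8) | (w & 0xFF));
        w = static_cast<T>(w >> 8);
    }
    return r;
}
```

For example, reverse_bytes<std::uint32_t>(0x01020304u) yields 0x04030201.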

2012-10-12 21:22:12

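The way C compilers work (at least every one I know of), the endianness has to be decided at compile time. Even for bi-endian processors (like ARM and MIPS) you have to choose the endianness at compile time.

Furthermore, the endianness is defined in all common file formats for executables (such as ELF). Although it is possible to craft a binary blob of bi-endian code (for some ARM server exploit, maybe?), it would probably have to be done in assembly.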

2012-11-25 14:56:32

bool isBigEndian()
{
    static const uint16_t m_endianCheck(0x00ff);
    return ( *((const uint8_t*)&m_endianCheck) == 0x0); 
}

2012-11-25 15:58:47

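I was going through the textbook Computer Systems: A Programmer's Perspective, and there is a problem that asks you to determine the endianness with a C program.

I used the properties of pointers to do that, as follows: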

#include <stdio.h>

int main(void){
    int i=1;
    unsigned char* ii = (unsigned char*)&i; /* explicit cast: int* does not convert implicitly */

    printf("This computer is %s endian.\n", ((ii[0]==1) ? "little" : "big"));
    return 0;
}

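An int takes up 4 bytes here, while a char takes up only 1 byte, so we can use a char pointer to point at an int with the value 1. If the computer is little-endian, the char that the pointer points to has the value 1; otherwise, its value is 0.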

2013-10-15 11:41:56

Declare:

Non-macro, C++11 solution:

union {
  uint16_t s;
  unsigned char c[2];
} constexpr static  d {1};

constexpr bool is_little_endian() {
  return d.c[0] == 1;
}

2014-05-21 04:43:39

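As pointed out by Coriiander, most (if not all) of the code here will be optimized away at compile time, so the generated binaries won't check endianness at run time.

It has been observed that a given executable shouldn't run under two different byte orders, but I have no idea whether that is always the case, and checking at compile time seems like a hack to me. So I coded this function: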

#include <stdint.h>
#include <stdlib.h>  /* malloc, free */

int* _BE = 0;

int is_big_endian() {
    if (_BE == 0) {
        uint16_t* teste = (uint16_t*)malloc(4);
        *teste = (*teste & 0x01FE) | 0x0100;
        uint8_t teste2 = ((uint8_t*) teste)[0];
        free(teste);
        _BE = (int*)malloc(sizeof(int));
        *_BE = (0x01 == teste2);
    }
    return *_BE;
}

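MinGW wasn't able to optimize this code, even though it does optimize the other code here away. I believe that is because I leave the "random" value that was in the allocated memory's lower byte as it was (at least 7 of its bits), so the compiler can't know what that random value is and doesn't optimize the function away.

I've also coded the function so that the check is only performed once, and the return value is stored for subsequent tests.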

2014-09-28 08:46:33

This is untested, but in my mind it should work, because the first byte will be 0x01 on little-endian and 0x00 on big-endian.

bool runtimeIsLittleEndian(void)
{
    volatile uint16_t i=1;
    return ((uint8_t*)&i)[0]==0x01; // 0x01=little, 0x00=big
}

2015-02-14 03:10:03

Unless the endian header is GCC-only, it provides macros you can use.

#include "endian.h"
...
if (__BYTE_ORDER == __LITTLE_ENDIAN) { ... }
else if (__BYTE_ORDER == __BIG_ENDIAN) { ... }
else { throw std::runtime_error("Sorry, this version does not support PDP Endian!"); }
...

2015-04-18 19:08:44

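The C++ way has been to use Boost, where the preprocessor checks and casts are compartmentalized away inside very thoroughly tested libraries.

The Predef library (boost/predef.h) recognizes four different kinds of endianness.

The Endian library was planned for submission to the C++ standard and supports a wide variety of operations on endian-sensitive data.

As stated in other answers, endianness support will be part of C++20.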

2015-09-11 01:14:27

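If you have access to a C++20 compiler such as GCC 8+ or Clang 7+, you can use std::endian.

Note: std::endian started out in <type_traits> but was moved to <bit> at the 2019 Cologne meeting. GCC 8 and Clang 7, 8 and 9 have it in <type_traits>, while GCC 9+ and Clang 10+ have it in <bit>.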

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big-endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little-endian system
}
else
{
    // Something else
}
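Given the header move described above, code that has to build with several compiler versions may want to test for the feature before including <bit>. The sketch below is one possible arrangement, not part of the original answer: it relies on the standard feature-test macro __cpp_lib_endian from <version>, falls back to a run-time byte inspection when the macro is absent (which includes the GCC 8 and Clang 7 to 9 case where std::endian lives in <type_traits>), and uses the invented name host_is_little_endian.

```cpp
// Pull in <version> when the compiler can tell us it exists.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>  // defines __cpp_lib_endian when std::endian is in <bit>
#  endif
#endif

#if defined(__cpp_lib_endian)
#  include <bit>
// C++20 path: the answer comes straight from the standard library.
inline bool host_is_little_endian()
{
    return std::endian::native == std::endian::little;
}
#else
// Fallback path (an assumption of this sketch): inspect the first byte of a known value.
inline bool host_is_little_endian()
{
    const unsigned int one = 1;
    return *reinterpret_cast<const unsigned char*>(&one) == 1;
}
#endif
```

Either branch exposes the same function, so callers do not need to know which one was compiled in.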

2016-07-01 09:11:07

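If you don't want conditional compilation, you can just write endian-independent code. Here is an example (taken from Rob Pike):

Reading an integer stored in little-endian order on disk, in an endian-independent way: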

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

The same code, trying to take the machine's endianness into account:

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif
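The endian-independent read can be wrapped in small helpers, roughly as sketched below. The names read_le32 and read_be32 are invented for this example, and each byte is widened to std::uint32_t before shifting, an adjustment made here so that the shift by 24 cannot overflow a signed int.

```cpp
#include <cstdint>

// Reads a 32-bit value stored least-significant byte first,
// regardless of the host's own byte order.
inline std::uint32_t read_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])
         | (static_cast<std::uint32_t>(data[1]) << 8)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 24);
}

// Same idea for data stored most-significant byte first.
inline std::uint32_t read_be32(const unsigned char* data)
{
    return (static_cast<std::uint32_t>(data[0]) << 24)
         | (static_cast<std::uint32_t>(data[1]) << 16)
         | (static_cast<std::uint32_t>(data[2]) << 8)
         |  static_cast<std::uint32_t>(data[3]);
}
```

Given the bytes 01 02 03 04, read_le32 returns 0x04030201 and read_be32 returns 0x01020304 on any host, which is why no endianness detection is needed at all.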

2017-02-17 11:58:49

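Do not use a union!

C++ does not permit type punning via unions! Reading from a union field that was not the last field written to is undefined behaviour! Many compilers support doing so as an extension, but the language makes no guarantee.

See this answer for more details: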

https://stackoverflow.com/a/11996970

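There are only two valid answers that are guaranteed to be portable.

The first answer, if you have access to a system that supports C++20, is to use std::endian from the <bit> header.

C++20 Onwards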

constexpr bool is_little_endian = (std::endian::native == std::endian::little);

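Prior to C++20, the only valid answer is to store an integer and then inspect its first byte through type punning. Unlike the use of unions, this is expressly allowed by C++'s type system.

It's also important to remember that for optimum portability static_cast should be used, because reinterpret_cast is implementation-defined.

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ... a char or unsigned char type.

C++11 Onwards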

enum class endianness
{
    little = 0,
    big = 1,
};

inline endianness get_system_endianness()
{
    const int value { 0x01 };
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01) ? endianness::little : endianness::big;
}

C++11 Onwards (without enum)

inline bool is_system_little_endian()
{
    const int value { 0x01 };
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}

C++98/C++03

inline bool is_system_little_endian()
{
    const int value = 0x01;
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}
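A related variant, not taken from the answer above, copies the object representation with std::memcpy instead of reading through a converted pointer. It is offered only as a sketch of an alternative technique; the name is_system_little_endian_memcpy is made up for this example.

```cpp
#include <cstring>

// Copies the bytes of a known integer into an unsigned char buffer
// and looks at the byte that was stored at the lowest address.
inline bool is_system_little_endian_memcpy()
{
    const unsigned int value = 0x01;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    return bytes[0] == 0x01;
}
```

Like the pointer-based versions, this returns true when the least significant byte comes first in memory.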

2019-05-17 17:56:52

A C++20 solution:

constexpr bool compare(auto const c, auto const ...a) noexcept
{
  return [&]<auto ...I>(std::index_sequence<I...>) noexcept
    {
      return ((std::uint8_t(c >> 8 * I) == a) && ...);
    }(std::make_index_sequence<sizeof...(a)>());
}

static constexpr auto is_big_endian_v{
  compare(std::uint32_t(0x01234567), 0x01, 0x23, 0x45, 0x67)
};

static constexpr auto is_little_endian_v{
  compare(std::uint32_t(0x01234567), 0x67, 0x45, 0x23, 0x01)
};

static constexpr auto is_pdp_endian_v{
  compare(std::uint32_t(0x01234567), 0x23, 0x01, 0x67, 0x45)
};

This task could be accomplished more easily, but for some reason the <bit> header is not always present. Here's a demo.

2022-08-10 11:50:43
