2024-08-15 09:00:00

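Difference between API and ABI

I am new to Linux system programming, and I came across API and ABI while reading Linux System Programming.

Definition of API:

An API defines the interfaces by which one piece of software communicates with another at the source level.

Definition of ABI:

Whereas an API defines a source interface, an ABI defines the low-level binary interface between two or more pieces of software on a particular architecture. It defines how an application interacts with itself, how an application interacts with the kernel, and how an application interacts with libraries.

How can a program communicate at a source level? What is a source level? Is it related to source code in any way? Or is the source of the library included in the main program?

The only difference I know is that an API is mostly used by programmers, while an ABI is mostly used by a compiler.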


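Current answer

ABI (Application Binary Interface): a specification for a specific hardware platform combined with the operating system. It goes one step beyond the API (Application Program Interface), which defines the calls from the application to the operating system. The ABI defines the API plus the machine language for a particular CPU family. An API does not ensure runtime compatibility, but an ABI does, because it defines the machine language, or runtime, format.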


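Other answers

An ABI refers to the layout of an object file / library and of the final binary, from the perspective of successfully linking, loading and executing certain binaries without link errors or logic errors occurring due to binary incompatibility.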

- The binary format specification (PE, COFF, ELF, .obj, .o, .a, .lib (import library, static library), .NET assembly, .pyc, COM .dll): the headers, the header format, defining where the sections are and where the import / export / exception tables are, and the format of those
- The instruction set used to encode the bytes in the code section, as well as the specific machine instructions
- The actual signature of the functions and data as defined in the API (as well as how they are represented in the binary (the next 2 points))
- The calling convention of the functions in the code section, which may be called by other binaries (particularly relevant to ABI compatibility being the functions that are actually exported)
- The way data is represented and aligned in the data section with respect to its type (particularly relevant to ABI compatibility being the data that is actually exported)
- The system call numbers or interrupt vectors hooked in the code
- The name decoration of exported functions and data
- Linker directives in object files
- Preprocessor / compiler / assembler / linker flags and directives used by the API programmer and how they are interpreted to omit, optimise, inline or change the linkage of certain symbols or code in the library or final binary (be that binary a .dll or the executable in the event of static linking)

The bytecode format of .NET C# is an ABI (general), which includes the .NET assembly .dll format. The virtual machine that interprets the bytecode has a specific ABI that is C++ based, where types need to be marshalled between native C++ types that the native code's specific ABI uses and the boxed types of the virtual machine's ABI when calling bytecode from native code and native code from bytecode. Here I am calling an ABI of a specific program a specific ABI, whereas an ABI in general, such as 'MS ABI' or 'C ABI' simply refers to the calling convention and the way structures are organised, but not a specific embodiment of the ABI by a specific binary that introduces a new level of ABI compatibility concerns.

An API refers to the set of type definitions exported by a particular library imported and used in a particular translation unit, from the perspective of the compiler of a translation unit, to successfully resolve and check type references to be able to compile a binary, and that binary will adhere to the standard of the target ABI, such that if the library that actually implements the API is also compiled to a compatible ABI, it will link and work as intended. If the API is updated the application may still compile, but there will now be a binary incompatibility and therefore a new binary needs to be used.

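The API includes:

- The functions, variables, classes, objects and constants, with their names, types and definitions, in the language in which they are coded, in a syntactically and semantically correct manner
- What those functions actually do and how to use them in the source language
- The source code files that need to be included / the binaries that need to be linked against in order to use them, and the ABI compatibility thereof

The API is what humans use. We write source code. When we write a program and want to use some library function, we write code like: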

long howManyDecibels = 123L;
int ok = livenMyHills(howManyDecibels);

and we need to know that there is a function livenMyHills(), which takes a long integer parameter. So as a programming interface it is all expressed in source code. The compiler turns this into executable instructions that conform to the implementation of this language on this particular operating system, and which in this case result in some low-level operations on an audio unit: particular bits and bytes are squirted at some hardware. So at runtime there is a lot of binary-level action going on that we don't usually see.

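At the binary level there must be a precise definition of which bytes are passed, for example the order of the bytes in a 4-byte integer, or the layout of complex data structures (are there padding bytes to align some values?). This definition is the ABI.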
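To make those two examples concrete, here is a minimal sketch in C. The struct sample and both helper functions are hypothetical names made up for illustration; the values they report depend on the platform's ABI, which is the point.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record, used only to illustrate layout. */
struct sample {
    char tag;       /* 1 byte */
    int32_t value;  /* usually 4-byte aligned, so padding may follow tag */
};

/* Returns 1 if the least significant byte of a 4-byte integer is stored
 * first in memory (little-endian), 0 otherwise. */
int is_little_endian(void) {
    uint32_t word = 0x01020304u;
    unsigned char bytes[sizeof word];
    memcpy(bytes, &word, sizeof word);
    return bytes[0] == 0x04;
}

/* Number of padding bytes the compiler placed between tag and value. */
size_t padding_after_tag(void) {
    return offsetof(struct sample, value) - sizeof(char);
}
```

Two binaries can only exchange a struct sample safely if they agree on both answers. The source code (the API) says nothing about either.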

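Your program (source code) can be compiled with modules that provide a proper API.

Your program (binary) can run on platforms that provide a proper ABI.

An API restricts the type definitions, function definitions, macros, and sometimes global variables that a library should expose.

An ABI restricts what a "platform" should provide for your program to run on. I like to consider it at three levels:

- processor level: the instruction set, the calling convention
- kernel level: the system call convention, the special file path conventions (e.g. the /proc and /sys files in Linux), etc.
- OS level: the object format, the runtime libraries, etc.

Consider a cross-compiler named arm-linux-gnueabi-gcc. "arm" indicates the processor architecture, "linux" indicates the kernel, and "gnu" indicates that its target programs use GNU's libc as the runtime library, unlike arm-linux-androideabi-gcc, whose target programs use Android's libc implementation.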
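As a rough illustration of the kernel level, the sketch below writes to a file descriptor in two ways. It assumes Linux with glibc; the two function names are hypothetical. The first goes through the libc API, the second names the system call number explicitly, which is part of the kernel ABI for a given architecture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: glibc, which declares syscall() in <unistd.h> */
#endif
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* API level: call the C library wrapper declared in <unistd.h>. */
ssize_t write_via_api(int fd, const char *text) {
    return write(fd, text, strlen(text));
}

/* Closer to the kernel ABI: pass the system call number explicitly.
 * syscall() is itself a libc helper; it places the number and arguments
 * where the kernel's system call convention expects them. */
ssize_t write_via_syscall_number(int fd, const char *text) {
    return (ssize_t)syscall(SYS_write, fd, text, strlen(text));
}
```

Calling write_via_api(STDOUT_FILENO, "hello\n") and write_via_syscall_number(STDOUT_FILENO, "hello\n") should have the same effect. The value behind SYS_write differs between architectures, which is one reason a binary built for one platform does not run on another even when the source is identical.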

I'll answer your specific questions first.

1. What is a source level? Is it related to source code in any way?

Yes, the term source level refers to the level of source code. The term level refers to the semantic level of the computation requirements as they get translated from the application domain level to the source code level, and from the source code level to the machine code level (binary codes). The application domain level refers to what end users of the software want and specify as their computation requirements. The source code level refers to what programmers make of the application-level requirements and then specify as a program in a certain language.

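2. How can a program communicate at a source level? Or is the source of the library included in the main program?

A language API refers specifically to everything that a language requires (specifies), hence the interfaces, in order to write reusable modules in that language. A reusable program conforms to these interface (API) requirements so that it can be reused in other programs in the same language. Every reuse needs to conform to the same API requirements as well. So the word "communicate" refers to reuse.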

Yes, source code (of a reusable module; in the case of C/C++, .h files) being included (copied at the pre-processing stage) is the common way of reusing code in C/C++ and is thus part of the C++ API. Even when you just write a simple function foo() in the global space of a C++ program and then call it as foo(); any number of times, that is reuse as per the C++ language API. Java classes in Java packages are reusable modules in Java. The JavaBeans specification is also a Java API, enabling reusable programs (beans) to be reused by other modules (which could be other beans) with the help of runtimes/containers (conforming to that specification).

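My SO answer on the difference between language APIs and ABIs, and on how a service-oriented API compares with a language API, should help.

Linux shared library minimal runnable API vs ABI example

This answer has been extracted from my other answer: What is an application binary interface (ABI)? I felt that it directly answers this question as well, and that the two questions are not duplicates.

In the context of shared libraries, the most important implication of "having a stable ABI" is that you don't need to recompile your programs after the library changes.

As we will see in the example below, it is possible to modify the ABI, breaking programs, even though the API is unchanged.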

main.c

#include <assert.h>
#include <stdlib.h>

#include "mylib.h"

int main(void) {
    mylib_mystruct *myobject = mylib_init(1);
    assert(myobject->old_field == 1);
    free(myobject);
    return EXIT_SUCCESS;
}

mylib.c

#include <stdlib.h>

#include "mylib.h"

mylib_mystruct* mylib_init(int old_field) {
    mylib_mystruct *myobject;
    myobject = malloc(sizeof(mylib_mystruct));
    myobject->old_field = old_field;
    return myobject;
}

mylib.h

#ifndef MYLIB_H
#define MYLIB_H

typedef struct {
    int old_field;
} mylib_mystruct;

mylib_mystruct* mylib_init(int old_field);

#endif

Compiles and runs fine with:

cc='gcc -pedantic-errors -std=c89 -Wall -Wextra'
$cc -fPIC -c -o mylib.o mylib.c
$cc -L . -shared -o libmylib.so mylib.o
$cc -L . -o main.out main.c -lmylib
LD_LIBRARY_PATH=. ./main.out

Now, suppose that for v2 of the library, we want to add a new field to mylib_mystruct called new_field.

If we added the field before old_field, as in:

typedef struct {
    int new_field;
    int old_field;
} mylib_mystruct;

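and rebuilt the library but not main.out, then the assert fails!

This is because the line:

myobject->old_field == 1

had been compiled into assembly that accesses the very first int of the struct, which is now new_field instead of the expected old_field.

Therefore this change broke the ABI.

If, however, we add new_field after old_field: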

typedef struct {
    int old_field;
    int new_field;
} mylib_mystruct;

then the old generated assembly still accesses the first int of the struct, and the program still works, because we kept the ABI stable.
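The offsets involved can be checked directly. The sketch below is not part of the original example: the three type names and read_at_v1_offset are hypothetical, introduced only so that all the layouts can coexist in one file. read_at_v1_offset mimics what the stale main.out effectively does, namely read an int at the offset old_field had when it was compiled.

```c
#include <stddef.h>
#include <string.h>

/* The three layouts discussed above, under hypothetical names
 * (the library itself calls every version mylib_mystruct). */
typedef struct { int old_field; } mystruct_v1;
typedef struct { int new_field; int old_field; } mystruct_v2_prepended;
typedef struct { int old_field; int new_field; } mystruct_v2_appended;

/* Reads an int at the offset old_field had in v1, whatever the layout
 * of the object actually passed in. */
int read_at_v1_offset(const void *object) {
    const unsigned char *base = object;
    int value;
    memcpy(&value, base + offsetof(mystruct_v1, old_field), sizeof value);
    return value;
}
```

With a mystruct_v2_prepended the function returns new_field, and with a mystruct_v2_appended it still returns old_field, which matches the two outcomes described above.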

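Here is a fully automated version of this example on GitHub.

Another way to keep this ABI stable would have been to treat mylib_mystruct as an opaque struct, and only access its fields through helper functions. This makes it easier to keep the ABI stable, but incurs a performance overhead, as we would make more function calls.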
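A minimal sketch of the opaque-struct approach follows, squeezed into one file for brevity. mylib_get_old_field, mylib_free and client_reads_old_field are hypothetical names, not part of the example library above; in a real library the first section would live in mylib.h and the last in mylib.c.

```c
#include <stdlib.h>

/* What a hypothetical opaque mylib.h would expose. */
typedef struct mylib_mystruct mylib_mystruct;  /* incomplete type: layout hidden */
mylib_mystruct *mylib_init(int old_field);
int mylib_get_old_field(const mylib_mystruct *object);  /* hypothetical accessor */
void mylib_free(mylib_mystruct *object);                /* hypothetical destructor */

/* Client code: compiles without knowing the layout, so no field offset
 * ends up baked into the client binary. */
int client_reads_old_field(void) {
    mylib_mystruct *object = mylib_init(1);
    int value;
    if (object == NULL) {
        return -1;
    }
    value = mylib_get_old_field(object);
    mylib_free(object);
    return value;
}

/* Library side: the only place that knows the layout. */
struct mylib_mystruct {
    int new_field;  /* added before old_field; clients are unaffected */
    int old_field;
};

mylib_mystruct *mylib_init(int old_field) {
    mylib_mystruct *object = malloc(sizeof *object);
    if (object != NULL) {
        object->new_field = 0;
        object->old_field = old_field;
    }
    return object;
}

int mylib_get_old_field(const mylib_mystruct *object) {
    return object->old_field;
}

void mylib_free(mylib_mystruct *object) {
    free(object);
}
```

The cost mentioned above is visible here: every field access from the client is a function call across the library boundary instead of a single load.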

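API vs ABI

In the previous example, it is interesting to note that adding new_field before old_field only broke the ABI, not the API.

This means that if we had recompiled our main.c program against the library, it would have worked regardless.

We would also have broken the API, however, if we had changed, for example, the function signature: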

mylib_mystruct* mylib_init(int old_field, int new_field);

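since in that case main.c would stop compiling altogether.

Semantic API vs programming API

We can also classify API changes into a third type: semantic changes.

The semantic API is usually a natural language description of what the API is supposed to do, usually included in the API documentation.

It is therefore possible to break the semantic API without breaking the program build itself.

For example, if we had modified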

myobject->old_field = old_field;

to:

myobject->old_field = old_field + 1;

then this would have broken neither the programming API nor the ABI, but the semantic API that main.c relies on would break.

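There are two ways to programmatically check the contract API:

- test a bunch of corner cases. Easy to do, but you might always miss one.
- formal verification. Harder to do, but produces a mathematical proof of correctness, essentially unifying documentation and tests into a "human" / machine verifiable manner! As long as there isn't a bug in your formal description, of course ;-)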
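The first option might look like the sketch below. It is only an illustration: init_documented, init_semantic_break and honours_contract are hypothetical names, and the assumed contract is "mylib_init stores its argument unchanged in old_field".

```c
#include <stdlib.h>

typedef struct { int old_field; } mylib_mystruct;
typedef mylib_mystruct *(*init_fn)(int);

/* Behaves as documented: stores the argument unchanged. */
mylib_mystruct *init_documented(int old_field) {
    mylib_mystruct *object = malloc(sizeof *object);
    if (object != NULL) {
        object->old_field = old_field;
    }
    return object;
}

/* Same signature and layout, but silently stores old_field + 1. */
mylib_mystruct *init_semantic_break(int old_field) {
    mylib_mystruct *object = malloc(sizeof *object);
    if (object != NULL) {
        object->old_field = old_field + 1;
    }
    return object;
}

/* Returns 1 if init stores every sample value unchanged, 0 otherwise.
 * INT_MAX is deliberately not sampled, since old_field + 1 would
 * overflow there, which is undefined behavior for signed int. */
int honours_contract(init_fn init) {
    static const int samples[] = { -1, 0, 1, 1000 };
    size_t i;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        mylib_mystruct *object = init(samples[i]);
        int ok;
        if (object == NULL) {
            return 0;
        }
        ok = (object->old_field == samples[i]);
        free(object);
        if (!ok) {
            return 0;
        }
    }
    return 1;
}
```

Both implementations compile and link against the same callers, so only a check like honours_contract, or a reader of the documentation, notices the difference.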

Tested in Ubuntu 18.10, GCC 8.2.0.