在我看来,Linux使用/proc/self/exe很容易但是我想知道是否有一种方便的方法来找到当前应用程序的目录在C/ c++与跨平台接口。我见过一些项目胡乱摆弄argv[0],但它似乎并不完全可靠。
如果你必须支持,比如说,Mac OS X,它没有/proc/,你会怎么做?使用#ifdefs来隔离平台特定的代码(例如NSBundle)?或者尝试从argv[0], $ path和诸如此类的东西中推断可执行文件的路径,冒着在边缘情况下发现错误的风险?
在我看来,Linux使用/proc/self/exe很容易但是我想知道是否有一种方便的方法来找到当前应用程序的目录在C/ c++与跨平台接口。我见过一些项目胡乱摆弄argv[0],但它似乎并不完全可靠。
如果你必须支持,比如说,Mac OS X,它没有/proc/,你会怎么做?使用#ifdefs来隔离平台特定的代码(例如NSBundle)?或者尝试从argv[0], $ path和诸如此类的东西中推断可执行文件的路径,冒着在边缘情况下发现错误的风险?
当前回答
依我看,没有这种办法。还有一个不明确的地方:如果同一个可执行文件有多个硬链接指向它,您希望得到什么作为答案?(硬链接实际上并不“指向”,它们是同一个文件,只是位于文件系统层次结构中的另一个位置。)
一旦execve()成功执行了一个新的二进制文件,所有关于原始程序参数的信息都将丢失。
其他回答
当然,这并不适用于所有项目。 尽管如此,QCoreApplication::applicationFilePath()在6年的c++ /Qt开发中从未让我失望过。
当然,在尝试使用它之前,应该彻底阅读文档:
警告:在Linux上,此函数将尝试从 /proc文件系统。如果失败,则假定argv[0]包含 可执行文件的绝对文件名。这个函数也假设 应用程序没有更改当前目录。
老实说,我认为#ifdef和其他类似的解决方案根本不应该在现代代码中使用。
我相信更小的跨平台库也存在。让他们把所有特定于平台的东西封装在里面。
Gregory Pakosz的whereami库使用mark4o文章中提到的api为各种平台实现了这一点。如果您“只是”需要一个适用于可移植项目的解决方案,而对各种平台的特性不感兴趣,那么这是最有趣的。
在撰写本文时,受支持的平台有:
窗户 Linux Mac iOS 安卓 QNX Neutrino FreeBSD NetBSD 蜻蜓BSD SunOS
该库由whereami.c和whereami.h组成,并在MIT和WTFPL2下获得许可。将文件放入您的项目中,包括头文件并使用它:
#include "whereami.h"
int main() {
int length = wai_getExecutablePath(NULL, 0, NULL);
char* path = (char*)malloc(length + 1);
wai_getExecutablePath(path, length, &dirname_length);
path[length] = '\0';
printf("My path: %s", path);
free(path);
return 0;
}
根据QNX Neutrino版本的不同,有不同的方法来查找用于启动运行进程的可执行文件的完整路径和名称。我将进程标识符表示为<PID>。试试下面的方法:
If the file /proc/self/exefile exists, then its contents are the requested information. If the file /proc/<PID>/exefile exists, then its contents are the requested information. If the file /proc/self/as exists, then: open() the file. Allocate a buffer of, at least, sizeof(procfs_debuginfo) + _POSIX_PATH_MAX. Give that buffer as input to devctl(fd, DCMD_PROC_MAPDEBUG_BASE,.... Cast the buffer to a procfs_debuginfo*. The requested information is at the path field of the procfs_debuginfo structure. Warning: For some reason, sometimes, QNX omits the first slash / of the file path. Prepend that / when needed. Clean up (close the file, free the buffer, etc.). Try the procedure in 3. with the file /proc/<PID>/as. Try dladdr(dlsym(RTLD_DEFAULT, "main"), &dlinfo) where dlinfo is a Dl_info structure whose dli_fname might contain the requested information.
我希望这能有所帮助。
使用/proc/self/exe是不可移植和不可靠的。在我的Ubuntu 12.04系统上,你必须是root才能阅读/遵循符号链接。这将使Boost示例和可能发布的whereami()解决方案失败。
这篇文章很长,但是讨论了实际的问题,并给出了与测试套件验证一起工作的代码。
The best way to find your program is to retrace the same steps the system uses. This is done by using argv[0] resolved against file system root, pwd, path environment and considering symlinks, and pathname canonicalization. This is from memory but I have done this in the past successfully and tested it in a variety of different situations. It is not guaranteed to work, but if it doesn't you probably have much bigger problems and it is more reliable overall than any of the other methods discussed. There are situations on a Unix compatible system in which proper handling of argv[0] will not get you to your program but then you are executing in a certifiably broken environment. It is also fairly portable to all Unix derived systems since around 1970 and even some non-Unix derived systems as it basically relies on libc() standard functionality and standard command line functionality. It should work on Linux (all versions), Android, Chrome OS, Minix, original Bell Labs Unix, FreeBSD, NetBSD, OpenBSD, BSD x.x, SunOS, Solaris, SYSV, HP-UX, Concentrix, SCO, Darwin, AIX, OS X, NeXTSTEP, etc. And with a little modification probably VMS, VM/CMS, DOS/Windows, ReactOS, OS/2, etc. If a program was launched directly from a GUI environment, it should have set argv[0] to an absolute path.
Understand that almost every shell on every Unix compatible operating system that has ever been released basically finds programs the same way and sets up the operating environment almost the same way (with some optional extras). And any other program that launches a program is expected to create the same environment (argv, environment strings, etc.) for that program as if it were run from a shell, with some optional extras. A program or user can setup an environment that deviates from this convention for other subordinate programs that it launches but if it does, this is a bug and the program has no reasonable expectation that the subordinate program or its subordinates will function correctly.
argv[0]可能的值包括:
/path/to/executable -绝对路径 ../bin/executable -相对于PWD Bin /可执行文件-相对于PWD ./foo -相对于PWD 可执行- basename,在路径中找到 Bin //可执行文件-相对于pwd,非规范 src / . ./bin/executable -相对于pwd,非规范,回溯 bin /。/echoargc -相对于pwd,非规范
你不应该看到的值:
~/bin/executable -在程序运行之前重写。 ~user/bin/executable -在程序运行之前重写 别名-在程序运行之前重写 $shellvariable -在程序运行前重写 *foo* -通配符,在程序运行前重写,不是很有用 喷火?-通配符,在程序运行前重写,不是很有用
此外,这些可能包含非规范的路径名和多层符号链接。在某些情况下,同一个程序可能有多个硬链接。例如,/bin/ls、/bin/ps、/bin/chmod、/bin/rm等可能是到/bin/busybox的硬链接。
要发现自我,请遵循以下步骤:
Save pwd, PATH, and argv[0] on entry to your program (or initialization of your library) as they may change later. Optional: particularly for non-Unix systems, separate out but don't discard the pathname host/user/drive prefix part, if present; the part which often precedes a colon or follows an initial "//". If argv[0] is an absolute path, use that as a starting point. An absolute path probably starts with "/" but on some non-Unix systems it might start with "" or a drive letter or name prefix followed by a colon. Else if argv[0] is a relative path (contains "/" or "" but doesn't start with it, such as "../../bin/foo", then combine pwd+"/"+argv[0] (use present working directory from when program started, not current). Else if argv[0] is a plain basename (no slashes), then combine it with each entry in PATH environment variable in turn and try those and use the first one which succeeds. Optional: Else try the very platform specific /proc/self/exe, /proc/curproc/file (BSD), and (char *)getauxval(AT_EXECFN), and dlgetname(...) if present. You might even try these before argv[0]-based methods, if they are available and you don't encounter permission issues. In the somewhat unlikely event (when you consider all versions of all systems) that they are present and don't fail, they might be more authoritative. Optional: check for a path name passed in using a command line parameter. Optional: check for a pathname in the environment explicitly passed in by your wrapper script, if any. Optional: As a last resort try environment variable "_". It might point to a different program entirely, such as the users shell. Resolve symlinks, there may be multiple layers. There is the possibility of infinite loops, though if they exist your program probably won't get invoked. Canonicalize filename by resolving substrings like "/foo/../bar/" to "/bar/". Note this may potentially change the meaning if you cross a network mount point, so canonization is not always a good thing. On a network server, ".." in symlink may be used to traverse a path to another file in the server context instead of on the client. In this case, you probably want the client context so canonicalization is ok. Also convert patterns like "/./" to "/" and "//" to "/". In shell, readlink --canonicalize will resolve multiple symlinks and canonicalize name. Chase may do similar but isn't installed. realpath() or canonicalize_file_name(), if present, may help.
If realpath() doesn't exist at compile time, you might borrow a copy from a permissively licensed library distribution, and compile it in yourself rather than reinventing the wheel. Fix the potential buffer overflow (pass in sizeof output buffer, think strncpy() vs strcpy()) if you will be using a buffer less than PATH_MAX. It may be easier just to use a renamed private copy rather than testing if it exists. Permissive license copy from android/darwin/bsd: https://android.googlesource.com/platform/bionic/+/f077784/libc/upstream-freebsd/lib/libc/stdlib/realpath.c
Be aware that multiple attempts may be successful or partially successful and they might not all point to the same executable, so consider verifying your executable; however, you may not have read permission — if you can't read it, don't treat that as a failure. Or verify something in proximity to your executable such as the "../lib/" directory you are trying to find. You may have multiple versions, packaged and locally compiled versions, local and network versions, and local and USB-drive portable versions, etc. and there is a small possibility that you might get two incompatible results from different methods of locating. And "_" may simply point to the wrong program.
A program using execve can deliberately set argv[0] to be incompatible with the actual path used to load the program and corrupt PATH, "_", pwd, etc. though there isn't generally much reason to do so; but this could have security implications if you have vulnerable code that ignores the fact that your execution environment can be changed in variety of ways including, but not limited, to this one (chroot, fuse filesystem, hard links, etc.) It is possible for shell commands to set PATH but fail to export it.
You don't necessarily need to code for non-Unix systems but it would be a good idea to be aware of some of the peculiarities so you can write the code in such a way that it isn't as hard for someone to port later. Be aware that some systems (DEC VMS, DOS, URLs, etc.) might have drive names or other prefixes which end with a colon such as "C:", "sys$drive:[foo]bar", and "file:///foo/bar/baz". Old DEC VMS systems use "[" and "]" to enclose the directory portion of the path though this may have changed if your program is compiled in a POSIX environment. Some systems, such as VMS, may have a file version (separated by a semicolon at the end). Some systems use two consecutive slashes as in "//drive/path/to/file" or "user@host:/path/to/file" (scp command) or "file://hostname/path/to/file" (URL). In some cases (DOS and Windows), PATH might have different separator characters — ";" vs ":" and "" vs "/" for a path separator. In csh/tsh there is "path" (delimited with spaces) and "PATH" delimited with colons but your program should receive PATH so you don't need to worry about path. DOS and some other systems can have relative paths that start with a drive prefix. C:foo.exe refers to foo.exe in the current directory on drive C, so you do need to lookup current directory on C: and use that for pwd.
一个在我的系统上的符号链接和包装器的例子:
/usr/bin/google-chrome is symlink to
/etc/alternatives/google-chrome which is symlink to
/usr/bin/google-chrome-stable which is symlink to
/opt/google/chrome/google-chrome which is a bash script which runs
/opt/google/chome/chrome
请注意,用户bill在上面发布了一个HP程序的链接,该程序处理argv[0]的三种基本情况。不过,它需要一些改变:
It will be necessary to rewrite all the strcat() and strcpy() to use strncat() and strncpy(). Even though the variables are declared of length PATHMAX, an input value of length PATHMAX-1 plus the length of concatenated strings is > PATHMAX and an input value of length PATHMAX would be unterminated. It needs to be rewritten as a library function, rather than just to print out results. It fails to canonicalize names (use the realpath code I linked to above) It fails to resolve symbolic links (use the realpath code)
因此,如果你将HP代码和realpath代码结合起来,并修复它们以抵抗缓冲区溢出,那么你应该有一些东西可以正确地解释argv[0]。
下面演示了在Ubuntu 12.04上调用同一个程序的各种方法中argv[0]的实际值。是的,这个程序不小心被命名为echoargc而不是echoargv。这是使用干净的复制脚本完成的,但在shell中手动执行会得到相同的结果(除非您显式启用别名,否则别名在脚本中不起作用)。
cat ~/src/echoargc.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
main(int argc, char **argv)
{
printf(" argv[0]=\"%s\"\n", argv[0]);
sleep(1); /* in case run from desktop */
}
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
echoargc
argv[0]="echoargc"
bin/echoargc
argv[0]="bin/echoargc"
bin//echoargc
argv[0]="bin//echoargc"
bin/./echoargc
argv[0]="bin/./echoargc"
src/../bin/echoargc
argv[0]="src/../bin/echoargc"
cd ~/bin
*echo*
argv[0]="echoargc"
e?hoargc
argv[0]="echoargc"
./echoargc
argv[0]="./echoargc"
cd ~/src
../bin/echoargc
argv[0]="../bin/echoargc"
cd ~/junk
~/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
argv[0]="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
argv[0]="/home/whitis/bin/echoargc"
ln -s ~/bin/echoargc junk1
./junk1
argv[0]="./junk1"
ln -s /home/whitis/bin/echoargc junk2
./junk2
argv[0]="./junk2"
ln -s junk1 junk3
./junk3
argv[0]="./junk3"
gnome-desktop-item-edit --create-new ~/Desktop
# interactive, create desktop link, then click on it
argv[0]="/home/whitis/bin/echoargc"
# interactive, right click on gnome application menu, pick edit menus
# add menu item for echoargc, then run it from gnome menu
argv[0]="/home/whitis/bin/echoargc"
cat ./testargcscript 2>&1 | sed -e 's/^/ /g'
#!/bin/bash
# echoargc is in ~/bin/echoargc
# bin is in path
shopt -s expand_aliases
set -v
cat ~/src/echoargc.c
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
echoargc
bin/echoargc
bin//echoargc
bin/./echoargc
src/../bin/echoargc
cd ~/bin
*echo*
e?hoargc
./echoargc
cd ~/src
../bin/echoargc
cd ~/junk
~/bin/echoargc
~whitis/bin/echoargc
alias echoit=~/bin/echoargc
echoit
echoarg=~/bin/echoargc
$echoarg
ln -s ~/bin/echoargc junk1
./junk1
ln -s /home/whitis/bin/echoargc junk2
./junk2
ln -s junk1 junk3
./junk3
这些例子说明了本文中描述的技术应该在广泛的环境中工作,以及为什么有些步骤是必要的。
编辑:现在,打印argv[0]的程序已经更新为实际找到自己。
// Copyright 2015 by Mark Whitis. License=MIT style
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
// "look deep into yourself, Clarice" -- Hanibal Lector
char findyourself_save_pwd[PATH_MAX];
char findyourself_save_argv0[PATH_MAX];
char findyourself_save_path[PATH_MAX];
char findyourself_path_separator='/';
char findyourself_path_separator_as_string[2]="/";
char findyourself_path_list_separator[8]=":"; // could be ":; "
char findyourself_debug=0;
int findyourself_initialized=0;
void findyourself_init(char *argv0)
{
getcwd(findyourself_save_pwd, sizeof(findyourself_save_pwd));
strncpy(findyourself_save_argv0, argv0, sizeof(findyourself_save_argv0));
findyourself_save_argv0[sizeof(findyourself_save_argv0)-1]=0;
strncpy(findyourself_save_path, getenv("PATH"), sizeof(findyourself_save_path));
findyourself_save_path[sizeof(findyourself_save_path)-1]=0;
findyourself_initialized=1;
}
int find_yourself(char *result, size_t size_of_result)
{
char newpath[PATH_MAX+256];
char newpath2[PATH_MAX+256];
assert(findyourself_initialized);
result[0]=0;
if(findyourself_save_argv0[0]==findyourself_path_separator) {
if(findyourself_debug) printf(" absolute path\n");
realpath(findyourself_save_argv0, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
} else {
perror("access failed 1");
}
} else if( strchr(findyourself_save_argv0, findyourself_path_separator )) {
if(findyourself_debug) printf(" relative path to pwd\n");
strncpy(newpath2, findyourself_save_pwd, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
realpath(newpath2, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
} else {
perror("access failed 2");
}
} else {
if(findyourself_debug) printf(" searching $PATH\n");
char *saveptr;
char *pathitem;
for(pathitem=strtok_r(findyourself_save_path, findyourself_path_list_separator, &saveptr); pathitem; pathitem=strtok_r(NULL, findyourself_path_list_separator, &saveptr) ) {
if(findyourself_debug>=2) printf("pathitem=\"%s\"\n", pathitem);
strncpy(newpath2, pathitem, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
realpath(newpath2, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
}
} // end for
perror("access failed 3");
} // end else
// if we get here, we have tried all three methods on argv[0] and still haven't succeeded. Include fallback methods here.
return(1);
}
main(int argc, char **argv)
{
findyourself_init(argv[0]);
char newpath[PATH_MAX];
printf(" argv[0]=\"%s\"\n", argv[0]);
realpath(argv[0], newpath);
if(strcmp(argv[0],newpath)) { printf(" realpath=\"%s\"\n", newpath); }
find_yourself(newpath, sizeof(newpath));
if(1 || strcmp(argv[0],newpath)) { printf(" findyourself=\"%s\"\n", newpath); }
sleep(1); /* in case run from desktop */
}
这里的输出证明了在之前的每一个测试中它确实找到了自己。
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
echoargc
argv[0]="echoargc"
realpath="/home/whitis/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin/echoargc
argv[0]="bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin//echoargc
argv[0]="bin//echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin/./echoargc
argv[0]="bin/./echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
src/../bin/echoargc
argv[0]="src/../bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/bin
*echo*
argv[0]="echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
e?hoargc
argv[0]="echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
./echoargc
argv[0]="./echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/src
../bin/echoargc
argv[0]="../bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/junk
~/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
rm junk1 junk2 junk3
ln -s ~/bin/echoargc junk1
./junk1
argv[0]="./junk1"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
ln -s /home/whitis/bin/echoargc junk2
./junk2
argv[0]="./junk2"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
ln -s junk1 junk3
./junk3
argv[0]="./junk3"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
上面描述的两次GUI启动也可以正确地找到程序。
There is one potential pitfall. The access() function drops permissions if the program is setuid before testing. If there is a situation where the program can be found as an elevated user but not as a regular user, then there might be a situation where these tests would fail, although it is unlikely the program could actually be executed under those circumstances. One could use euidaccess() instead. It is possible, however, that it might find an inaccessable program earlier on path than the actual user could.
如果您正在编写GPLed代码并使用GNU autotools,那么在许多操作系统(包括Windows和macOS)上处理细节的可移植方法是gnulib的reloctable -prog模块。