在我看来,Linux使用/proc/self/exe很容易但是我想知道是否有一种方便的方法来找到当前应用程序的目录在C/ c++与跨平台接口。我见过一些项目胡乱摆弄argv[0],但它似乎并不完全可靠。
如果你必须支持,比如说,Mac OS X,它没有/proc/,你会怎么做?使用#ifdefs来隔离平台特定的代码(例如NSBundle)?或者尝试从argv[0], $ path和诸如此类的东西中推断可执行文件的路径,冒着在边缘情况下发现错误的风险?
在我看来,Linux使用/proc/self/exe很容易但是我想知道是否有一种方便的方法来找到当前应用程序的目录在C/ c++与跨平台接口。我见过一些项目胡乱摆弄argv[0],但它似乎并不完全可靠。
如果你必须支持,比如说,Mac OS X,它没有/proc/,你会怎么做?使用#ifdefs来隔离平台特定的代码(例如NSBundle)?或者尝试从argv[0], $ path和诸如此类的东西中推断可执行文件的路径,冒着在边缘情况下发现错误的风险?
当前回答
您可以使用argv[0]并分析PATH环境变量。 看:一个可以找到自己的程序示例
其他回答
您可以使用argv[0]并分析PATH环境变量。 看:一个可以找到自己的程序示例
除了mark4o的答案,FreeBSD也有
const char* getprogname(void)
它应该也可以在macOS中使用。它可以通过libbsd在GNU/Linux中获得。
但是我想知道是否有一种方便的方法来找到当前应用程序的目录在C/ c++与跨平台接口。
你不能这样做(至少在Linux上)
因为一个可执行文件可以,在运行它的进程执行期间,重命名(2)它的文件路径到不同的目录(同一文件系统)。请参见syscalls(2)和inode(7)。
在Linux上,一个可执行文件(原则上)甚至可以通过调用unlink(2)来删除(3)本身。Linux内核应该继续分配该文件,直到没有进程再引用它为止。使用proc(5),你可以做一些奇怪的事情(例如重命名(2)/proc/self/exe文件,等等…)
换句话说,在Linux上,“当前应用程序的目录”的概念没有任何意义。
请参阅高级Linux编程和操作系统:三个简单的部分。
还可以在OSDEV上查看一些开源操作系统(包括FreeBSD或GNU Hurd)。其中一些提供了接近POSIX的接口(API)。
考虑使用(在获得许可的情况下)跨平台c++框架,如Qt或POCO,也许可以通过将它们移植到您喜欢的操作系统来对它们做出贡献。
一些特定于操作系统的接口:
Mac OS X: _NSGetExecutablePath() (man 3 dyld) Linux: readlink /proc/self/exe Solaris: getexecname () sysctl CTL_KERN KERN_PROC KERN_PROC_PATHNAME -1 readlink /proc/curproc/file (FreeBSD默认没有procfs) NetBSD: readlink /proc/curproc/exe DragonFly BSD: readlink /proc/curproc/file Windows: GetModuleFileName() with hModule = NULL
也有第三方库可以用来获取这些信息,比如在prideout的回答中提到的whereami,或者如果你使用Qt,评论中提到的QCoreApplication::applicationFilePath()。
可移植的(但不太可靠)方法是使用argv[0]。尽管调用程序可以将它设置为任何东西,但按照惯例,它要么被设置为可执行文件的路径名,要么被设置为使用$ path找到的名称。
一些shell,包括bash和ksh,在执行可执行文件之前将环境变量“_”设置为可执行文件的完整路径。在这种情况下,您可以使用getenv("_")来获取它。然而,这是不可靠的,因为不是所有的shell都这样做,并且它可以被设置为任何东西,或者是在执行程序之前没有更改它的父进程遗留下来的。
使用/proc/self/exe是不可移植和不可靠的。在我的Ubuntu 12.04系统上,你必须是root才能阅读/遵循符号链接。这将使Boost示例和可能发布的whereami()解决方案失败。
这篇文章很长,但是讨论了实际的问题,并给出了与测试套件验证一起工作的代码。
The best way to find your program is to retrace the same steps the system uses. This is done by using argv[0] resolved against file system root, pwd, path environment and considering symlinks, and pathname canonicalization. This is from memory but I have done this in the past successfully and tested it in a variety of different situations. It is not guaranteed to work, but if it doesn't you probably have much bigger problems and it is more reliable overall than any of the other methods discussed. There are situations on a Unix compatible system in which proper handling of argv[0] will not get you to your program but then you are executing in a certifiably broken environment. It is also fairly portable to all Unix derived systems since around 1970 and even some non-Unix derived systems as it basically relies on libc() standard functionality and standard command line functionality. It should work on Linux (all versions), Android, Chrome OS, Minix, original Bell Labs Unix, FreeBSD, NetBSD, OpenBSD, BSD x.x, SunOS, Solaris, SYSV, HP-UX, Concentrix, SCO, Darwin, AIX, OS X, NeXTSTEP, etc. And with a little modification probably VMS, VM/CMS, DOS/Windows, ReactOS, OS/2, etc. If a program was launched directly from a GUI environment, it should have set argv[0] to an absolute path.
Understand that almost every shell on every Unix compatible operating system that has ever been released basically finds programs the same way and sets up the operating environment almost the same way (with some optional extras). And any other program that launches a program is expected to create the same environment (argv, environment strings, etc.) for that program as if it were run from a shell, with some optional extras. A program or user can setup an environment that deviates from this convention for other subordinate programs that it launches but if it does, this is a bug and the program has no reasonable expectation that the subordinate program or its subordinates will function correctly.
argv[0]可能的值包括:
/path/to/executable -绝对路径 ../bin/executable -相对于PWD Bin /可执行文件-相对于PWD ./foo -相对于PWD 可执行- basename,在路径中找到 Bin //可执行文件-相对于pwd,非规范 src / . ./bin/executable -相对于pwd,非规范,回溯 bin /。/echoargc -相对于pwd,非规范
你不应该看到的值:
~/bin/executable -在程序运行之前重写。 ~user/bin/executable -在程序运行之前重写 别名-在程序运行之前重写 $shellvariable -在程序运行前重写 *foo* -通配符,在程序运行前重写,不是很有用 喷火?-通配符,在程序运行前重写,不是很有用
此外,这些可能包含非规范的路径名和多层符号链接。在某些情况下,同一个程序可能有多个硬链接。例如,/bin/ls、/bin/ps、/bin/chmod、/bin/rm等可能是到/bin/busybox的硬链接。
要发现自我,请遵循以下步骤:
Save pwd, PATH, and argv[0] on entry to your program (or initialization of your library) as they may change later. Optional: particularly for non-Unix systems, separate out but don't discard the pathname host/user/drive prefix part, if present; the part which often precedes a colon or follows an initial "//". If argv[0] is an absolute path, use that as a starting point. An absolute path probably starts with "/" but on some non-Unix systems it might start with "" or a drive letter or name prefix followed by a colon. Else if argv[0] is a relative path (contains "/" or "" but doesn't start with it, such as "../../bin/foo", then combine pwd+"/"+argv[0] (use present working directory from when program started, not current). Else if argv[0] is a plain basename (no slashes), then combine it with each entry in PATH environment variable in turn and try those and use the first one which succeeds. Optional: Else try the very platform specific /proc/self/exe, /proc/curproc/file (BSD), and (char *)getauxval(AT_EXECFN), and dlgetname(...) if present. You might even try these before argv[0]-based methods, if they are available and you don't encounter permission issues. In the somewhat unlikely event (when you consider all versions of all systems) that they are present and don't fail, they might be more authoritative. Optional: check for a path name passed in using a command line parameter. Optional: check for a pathname in the environment explicitly passed in by your wrapper script, if any. Optional: As a last resort try environment variable "_". It might point to a different program entirely, such as the users shell. Resolve symlinks, there may be multiple layers. There is the possibility of infinite loops, though if they exist your program probably won't get invoked. Canonicalize filename by resolving substrings like "/foo/../bar/" to "/bar/". Note this may potentially change the meaning if you cross a network mount point, so canonization is not always a good thing. On a network server, ".." in symlink may be used to traverse a path to another file in the server context instead of on the client. In this case, you probably want the client context so canonicalization is ok. Also convert patterns like "/./" to "/" and "//" to "/". In shell, readlink --canonicalize will resolve multiple symlinks and canonicalize name. Chase may do similar but isn't installed. realpath() or canonicalize_file_name(), if present, may help.
If realpath() doesn't exist at compile time, you might borrow a copy from a permissively licensed library distribution, and compile it in yourself rather than reinventing the wheel. Fix the potential buffer overflow (pass in sizeof output buffer, think strncpy() vs strcpy()) if you will be using a buffer less than PATH_MAX. It may be easier just to use a renamed private copy rather than testing if it exists. Permissive license copy from android/darwin/bsd: https://android.googlesource.com/platform/bionic/+/f077784/libc/upstream-freebsd/lib/libc/stdlib/realpath.c
Be aware that multiple attempts may be successful or partially successful and they might not all point to the same executable, so consider verifying your executable; however, you may not have read permission — if you can't read it, don't treat that as a failure. Or verify something in proximity to your executable such as the "../lib/" directory you are trying to find. You may have multiple versions, packaged and locally compiled versions, local and network versions, and local and USB-drive portable versions, etc. and there is a small possibility that you might get two incompatible results from different methods of locating. And "_" may simply point to the wrong program.
A program using execve can deliberately set argv[0] to be incompatible with the actual path used to load the program and corrupt PATH, "_", pwd, etc. though there isn't generally much reason to do so; but this could have security implications if you have vulnerable code that ignores the fact that your execution environment can be changed in variety of ways including, but not limited, to this one (chroot, fuse filesystem, hard links, etc.) It is possible for shell commands to set PATH but fail to export it.
You don't necessarily need to code for non-Unix systems but it would be a good idea to be aware of some of the peculiarities so you can write the code in such a way that it isn't as hard for someone to port later. Be aware that some systems (DEC VMS, DOS, URLs, etc.) might have drive names or other prefixes which end with a colon such as "C:", "sys$drive:[foo]bar", and "file:///foo/bar/baz". Old DEC VMS systems use "[" and "]" to enclose the directory portion of the path though this may have changed if your program is compiled in a POSIX environment. Some systems, such as VMS, may have a file version (separated by a semicolon at the end). Some systems use two consecutive slashes as in "//drive/path/to/file" or "user@host:/path/to/file" (scp command) or "file://hostname/path/to/file" (URL). In some cases (DOS and Windows), PATH might have different separator characters — ";" vs ":" and "" vs "/" for a path separator. In csh/tsh there is "path" (delimited with spaces) and "PATH" delimited with colons but your program should receive PATH so you don't need to worry about path. DOS and some other systems can have relative paths that start with a drive prefix. C:foo.exe refers to foo.exe in the current directory on drive C, so you do need to lookup current directory on C: and use that for pwd.
一个在我的系统上的符号链接和包装器的例子:
/usr/bin/google-chrome is symlink to
/etc/alternatives/google-chrome which is symlink to
/usr/bin/google-chrome-stable which is symlink to
/opt/google/chrome/google-chrome which is a bash script which runs
/opt/google/chome/chrome
请注意,用户bill在上面发布了一个HP程序的链接,该程序处理argv[0]的三种基本情况。不过,它需要一些改变:
It will be necessary to rewrite all the strcat() and strcpy() to use strncat() and strncpy(). Even though the variables are declared of length PATHMAX, an input value of length PATHMAX-1 plus the length of concatenated strings is > PATHMAX and an input value of length PATHMAX would be unterminated. It needs to be rewritten as a library function, rather than just to print out results. It fails to canonicalize names (use the realpath code I linked to above) It fails to resolve symbolic links (use the realpath code)
因此,如果你将HP代码和realpath代码结合起来,并修复它们以抵抗缓冲区溢出,那么你应该有一些东西可以正确地解释argv[0]。
下面演示了在Ubuntu 12.04上调用同一个程序的各种方法中argv[0]的实际值。是的,这个程序不小心被命名为echoargc而不是echoargv。这是使用干净的复制脚本完成的,但在shell中手动执行会得到相同的结果(除非您显式启用别名,否则别名在脚本中不起作用)。
cat ~/src/echoargc.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
main(int argc, char **argv)
{
printf(" argv[0]=\"%s\"\n", argv[0]);
sleep(1); /* in case run from desktop */
}
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
echoargc
argv[0]="echoargc"
bin/echoargc
argv[0]="bin/echoargc"
bin//echoargc
argv[0]="bin//echoargc"
bin/./echoargc
argv[0]="bin/./echoargc"
src/../bin/echoargc
argv[0]="src/../bin/echoargc"
cd ~/bin
*echo*
argv[0]="echoargc"
e?hoargc
argv[0]="echoargc"
./echoargc
argv[0]="./echoargc"
cd ~/src
../bin/echoargc
argv[0]="../bin/echoargc"
cd ~/junk
~/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
argv[0]="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
argv[0]="/home/whitis/bin/echoargc"
ln -s ~/bin/echoargc junk1
./junk1
argv[0]="./junk1"
ln -s /home/whitis/bin/echoargc junk2
./junk2
argv[0]="./junk2"
ln -s junk1 junk3
./junk3
argv[0]="./junk3"
gnome-desktop-item-edit --create-new ~/Desktop
# interactive, create desktop link, then click on it
argv[0]="/home/whitis/bin/echoargc"
# interactive, right click on gnome application menu, pick edit menus
# add menu item for echoargc, then run it from gnome menu
argv[0]="/home/whitis/bin/echoargc"
cat ./testargcscript 2>&1 | sed -e 's/^/ /g'
#!/bin/bash
# echoargc is in ~/bin/echoargc
# bin is in path
shopt -s expand_aliases
set -v
cat ~/src/echoargc.c
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
echoargc
bin/echoargc
bin//echoargc
bin/./echoargc
src/../bin/echoargc
cd ~/bin
*echo*
e?hoargc
./echoargc
cd ~/src
../bin/echoargc
cd ~/junk
~/bin/echoargc
~whitis/bin/echoargc
alias echoit=~/bin/echoargc
echoit
echoarg=~/bin/echoargc
$echoarg
ln -s ~/bin/echoargc junk1
./junk1
ln -s /home/whitis/bin/echoargc junk2
./junk2
ln -s junk1 junk3
./junk3
这些例子说明了本文中描述的技术应该在广泛的环境中工作,以及为什么有些步骤是必要的。
编辑:现在,打印argv[0]的程序已经更新为实际找到自己。
// Copyright 2015 by Mark Whitis. License=MIT style
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include <errno.h>
// "look deep into yourself, Clarice" -- Hanibal Lector
char findyourself_save_pwd[PATH_MAX];
char findyourself_save_argv0[PATH_MAX];
char findyourself_save_path[PATH_MAX];
char findyourself_path_separator='/';
char findyourself_path_separator_as_string[2]="/";
char findyourself_path_list_separator[8]=":"; // could be ":; "
char findyourself_debug=0;
int findyourself_initialized=0;
void findyourself_init(char *argv0)
{
getcwd(findyourself_save_pwd, sizeof(findyourself_save_pwd));
strncpy(findyourself_save_argv0, argv0, sizeof(findyourself_save_argv0));
findyourself_save_argv0[sizeof(findyourself_save_argv0)-1]=0;
strncpy(findyourself_save_path, getenv("PATH"), sizeof(findyourself_save_path));
findyourself_save_path[sizeof(findyourself_save_path)-1]=0;
findyourself_initialized=1;
}
int find_yourself(char *result, size_t size_of_result)
{
char newpath[PATH_MAX+256];
char newpath2[PATH_MAX+256];
assert(findyourself_initialized);
result[0]=0;
if(findyourself_save_argv0[0]==findyourself_path_separator) {
if(findyourself_debug) printf(" absolute path\n");
realpath(findyourself_save_argv0, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
} else {
perror("access failed 1");
}
} else if( strchr(findyourself_save_argv0, findyourself_path_separator )) {
if(findyourself_debug) printf(" relative path to pwd\n");
strncpy(newpath2, findyourself_save_pwd, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
realpath(newpath2, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
} else {
perror("access failed 2");
}
} else {
if(findyourself_debug) printf(" searching $PATH\n");
char *saveptr;
char *pathitem;
for(pathitem=strtok_r(findyourself_save_path, findyourself_path_list_separator, &saveptr); pathitem; pathitem=strtok_r(NULL, findyourself_path_list_separator, &saveptr) ) {
if(findyourself_debug>=2) printf("pathitem=\"%s\"\n", pathitem);
strncpy(newpath2, pathitem, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
newpath2[sizeof(newpath2)-1]=0;
realpath(newpath2, newpath);
if(findyourself_debug) printf(" newpath=\"%s\"\n", newpath);
if(!access(newpath, F_OK)) {
strncpy(result, newpath, size_of_result);
result[size_of_result-1]=0;
return(0);
}
} // end for
perror("access failed 3");
} // end else
// if we get here, we have tried all three methods on argv[0] and still haven't succeeded. Include fallback methods here.
return(1);
}
main(int argc, char **argv)
{
findyourself_init(argv[0]);
char newpath[PATH_MAX];
printf(" argv[0]=\"%s\"\n", argv[0]);
realpath(argv[0], newpath);
if(strcmp(argv[0],newpath)) { printf(" realpath=\"%s\"\n", newpath); }
find_yourself(newpath, sizeof(newpath));
if(1 || strcmp(argv[0],newpath)) { printf(" findyourself=\"%s\"\n", newpath); }
sleep(1); /* in case run from desktop */
}
这里的输出证明了在之前的每一个测试中它确实找到了自己。
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
echoargc
argv[0]="echoargc"
realpath="/home/whitis/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin/echoargc
argv[0]="bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin//echoargc
argv[0]="bin//echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
bin/./echoargc
argv[0]="bin/./echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
src/../bin/echoargc
argv[0]="src/../bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/bin
*echo*
argv[0]="echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
e?hoargc
argv[0]="echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
./echoargc
argv[0]="./echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/src
../bin/echoargc
argv[0]="../bin/echoargc"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
cd ~/junk
~/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
argv[0]="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
rm junk1 junk2 junk3
ln -s ~/bin/echoargc junk1
./junk1
argv[0]="./junk1"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
ln -s /home/whitis/bin/echoargc junk2
./junk2
argv[0]="./junk2"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
ln -s junk1 junk3
./junk3
argv[0]="./junk3"
realpath="/home/whitis/bin/echoargc"
findyourself="/home/whitis/bin/echoargc"
上面描述的两次GUI启动也可以正确地找到程序。
There is one potential pitfall. The access() function drops permissions if the program is setuid before testing. If there is a situation where the program can be found as an elevated user but not as a regular user, then there might be a situation where these tests would fail, although it is unlikely the program could actually be executed under those circumstances. One could use euidaccess() instead. It is possible, however, that it might find an inaccessable program earlier on path than the actual user could.