How do I convert a std::string to a char* or a const char*?


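Current answer

For a const char*, use the .c_str() method.

You can use &mystring[0] to get a char* pointer, but there are a couple of gotchas: you won't necessarily get a zero-terminated string, and you can't change the string's size. In particular, be careful not to add characters past the end of the string, or you'll get a buffer overrun (and probably a crash).

Before C++11 there was no guarantee that all of the characters would be part of the same contiguous buffer, but in practice all known implementations of std::string worked that way anyway; see Does "&s[0]" point to contiguous characters in a std::string?

Note that many string member functions will reallocate the internal buffer and invalidate any pointers you might have saved. It is best to use them immediately and then discard them.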
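As a minimal sketch of the two access paths described above, assuming C++11 or later: c_str() feeds a read-only C function, while &s[0] is used to edit characters in place without changing the size. The helper names c_length, uppercase_in_place and first_answer_example are illustrative and not part of any library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Read-only access: c_str() always yields a NUL-terminated const char*.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Writable access through &s[0]: characters are changed in place,
// the size is left alone, and nothing is written at or past s.size().
void uppercase_in_place(std::string& s) {
    if (s.empty()) return;   // avoid &s[0] on an empty string (undefined before C++11)
    char* p = &s[0];         // use immediately; a later modification of s may invalidate it
    for (std::size_t i = 0; i < s.size(); ++i) {
        p[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(p[i])));
    }
}

// Hypothetical self-check showing the expected results.
void first_answer_example() {
    std::string s = "hello";
    assert(c_length(s) == 5);
    uppercase_in_place(s);
    assert(s == "HELLO");
}
```

The pointer is obtained inside the function and never stored, which follows the "use immediately, then discard" advice.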

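Other answers

Given, say...

std::string x = "hello";

Getting a char* or const char* from a string

How to get a character pointer that's valid while x remains in scope and isn't modified further

C++11 simplifies things; the following all give access to the same internal string buffer: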

const char* p_c_str = x.c_str();
const char* p_data  = x.data();
char* p_writable_data = x.data(); // for non-const x from C++17 
const char* p_x0    = &x[0];

      char* p_x0_rw = &x[0];  // compiles iff x is not const...

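All the above pointers will hold the same value: the address of the first character in the buffer. Even an empty string has a "first character in the buffer", because C++11 guarantees to always keep an extra NUL/0 terminator character after the explicitly assigned string content (e.g. std::string("this\0that", 9) will have a buffer holding "this\0that\0").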
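The guarantees in this paragraph can be checked with a short sketch, assuming a C++11 or later compiler. The names has_terminator and terminator_example are made up for illustration.

```cpp
#include <cassert>
#include <string>

// True if the character just past the contents is the implicit terminator.
// Reading s.c_str()[s.size()] is well defined from C++11 onwards.
bool has_terminator(const std::string& s) {
    return s.c_str()[s.size()] == '\0';
}

void terminator_example() {
    std::string x("this\0that", 9);    // 9 explicit characters, one of them NUL
    assert(x.size() == 9);
    assert(x.c_str() == x.data());     // same buffer (C++11 and later)
    assert(x.data() == &x[0]);
    assert(has_terminator(x));

    std::string empty;
    assert(has_terminator(empty));     // even an empty string has a readable '\0'
}
```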

Given any of the above pointers:

char c = p[n];   // valid for n <= x.size()
                 // i.e. you can safely read the NUL at p[x.size()]

Only for the non-const pointers, p_writable_data and the one obtained from &x[0]:

p_writable_data[n] = c;
p_x0_rw[n] = c;  // valid for n <= x.size() - 1
                 // i.e. don't overwrite the implementation maintained NUL

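Writing a NUL elsewhere in the string does not change the string's size(); strings are allowed to contain any number of NULs, and std::string gives them no special treatment (same in C++03).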
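A small sketch of that last point, assuming C++11 or later: a NUL written into the middle of the string leaves size() unchanged, while C functions that rely on the sentinel see a shorter string. put_nul and embedded_nul_example are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Overwrite one character with NUL through a writable pointer.
// Precondition: n < s.size(), so the implementation's own terminator is untouched.
void put_nul(std::string& s, std::size_t n) {
    assert(n < s.size());
    char* p = &s[0];
    p[n] = '\0';
}

void embedded_nul_example() {
    std::string x = "hello";
    put_nul(x, 2);
    assert(x.size() == 5);                  // std::string still holds 5 characters
    assert(std::strlen(x.c_str()) == 2);    // C functions stop at the first NUL
    assert(x == std::string("he\0lo", 5));
}
```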

In C++03, things were considerably more complicated:

- x.data()
  - returns const char* to the string's internal buffer, which wasn't required by the Standard to conclude with a NUL (i.e. it might be ['h', 'e', 'l', 'l', 'o'] followed by uninitialised or garbage values, with accidental accesses thereto having undefined behaviour).
    - x.size() characters are safe to read, i.e. x[0] through x[x.size() - 1]
    - for empty strings, you're guaranteed some non-NULL pointer to which 0 can be safely added (hurray!), but you shouldn't dereference that pointer.
- &x[0]
  - for empty strings this has undefined behaviour (21.3.4)
    - e.g. given f(const char* p, size_t n) { if (n == 0) return; ...whatever... } you mustn't call f(&x[0], x.size()); when x.empty() - just use f(x.data(), ...).
  - otherwise, as per x.data() but:
    - for non-const x this yields a non-const char* pointer; you can overwrite string content
- x.c_str()
  - returns const char* to an ASCIIZ (NUL-terminated) representation of the value (i.e. ['h', 'e', 'l', 'l', 'o', '\0']).
  - although few if any implementations chose to do so, the C++03 Standard was worded to allow the string implementation the freedom to create a distinct NUL-terminated buffer on the fly, from the potentially non-NUL-terminated buffer "exposed" by x.data() and &x[0]
  - x.size() + 1 characters are safe to read.
  - guaranteed safe even for empty strings (['\0']).

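Consequences of accessing outside legal indices

Whichever way you get a pointer, you must not access memory further along from the pointer than the characters guaranteed present in the descriptions above. Attempts to do so have undefined behaviour, with a very real chance of application crashes and garbage results even for reads, and additionally of wholesale data or stack corruption and/or security vulnerabilities for writes.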

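When do those pointers get invalidated?

If you call some string member function that modifies the string or reserves further capacity, any pointer values returned beforehand by any of the above methods are invalidated. You can use those methods again to get another pointer. (The rules are the same as for iterators into strings.)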
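One way to live with this rule, sketched under the assumption of C++11 or later, is to remember an offset rather than a pointer and fetch a fresh pointer after every modification. append_and_relocate and invalidation_example are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append text and return a fresh pointer to the character that was at
// `offset` before the call. An offset survives reallocation; a saved
// pointer does not.
const char* append_and_relocate(std::string& s, const std::string& extra,
                                std::size_t offset) {
    assert(offset <= s.size());
    s += extra;                    // may reallocate: earlier pointers must be treated as invalid
    return s.c_str() + offset;     // re-obtain the pointer after the modification
}

void invalidation_example() {
    std::string x = "hello";
    const char* p = x.c_str() + 1;     // points at 'e' for now
    assert(*p == 'e');
    p = append_and_relocate(x, std::string(1000, '!'), 1);
    assert(*p == 'e');                 // valid again because it was re-fetched
    assert(x.size() == 1005);
}
```

The old value of p is never dereferenced after the append; it is simply overwritten with the re-fetched pointer.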

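See also How to get a character pointer valid even after x leaves scope or is modified further, below...

So, which is better to use?

From C++11, use .c_str() for ASCIIZ data, and .data() for "binary" data (explained further below).

In C++03, use .c_str() unless you are certain that .data() is adequate, and prefer .data() over &x[0] as it's safe for empty strings...

...try to understand the program well enough to use data() when appropriate, or you'll probably make other mistakes...

The ASCII NUL '\0' character guaranteed by .c_str() is used by many functions as a sentinel value denoting the end of relevant and safe-to-access data. This applies both to C++-only functions such as fstream::fstream(const char* filename, ...) and to functions shared with C such as strchr() and printf().

Given that C++03's guarantees about the buffer returned by .c_str() are a superset of those for .data(), you can always safely use .c_str(), but people sometimes don't because: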

- using .data() communicates to other programmers reading the source code that the data is not ASCIIZ (rather, you're using the string to store a block of data (which sometimes isn't even really textual)), or that you're passing it to another function that treats it as a block of "binary" data. This can be a crucial insight in ensuring that other programmers' code changes continue to handle the data properly.
- C++03 only: there's a slight chance that your string implementation will need to do some extra memory allocation and/or data copying in order to prepare the NUL-terminated buffer

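As a further hint, if a function's parameters require the (const) char* but don't insist on getting x.size(), the function probably needs an ASCIIZ input, so .c_str() is a good choice (the function needs to know where the text terminates somehow, so if it's not a separate parameter it can only be a convention such as a length prefix, a sentinel, or some fixed expected length).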
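The distinction can be illustrated with two consumers, assuming C++11 or later: one takes a pointer plus a length and so suits .data(), the other relies on the NUL sentinel and so suits .c_str(). byte_sum is a hypothetical stand-in for any length-based "binary" function; strchr is the real C library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length-based consumer (hypothetical): treats its input as a block of bytes.
unsigned byte_sum(const char* data, std::size_t n) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += static_cast<unsigned char>(data[i]);
    }
    return sum;
}

// Sentinel-based consumer: strchr stops at the first NUL, so pass c_str().
bool contains_char(const std::string& s, char c) {
    return std::strchr(s.c_str(), c) != nullptr;
}

void data_vs_c_str_example() {
    std::string text = "hello";
    assert(contains_char(text, 'l'));

    std::string blob("\x01\x00\x02", 3);   // binary data with an embedded zero byte
    assert(byte_sum(blob.data(), blob.size()) == 3u);
    assert(!contains_char(blob, '\x02'));  // the sentinel view ends at the zero byte
}
```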

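How to get a character pointer valid even after x leaves scope or is modified further

You'll need to copy the contents of the string x to a new memory area outside x. This external buffer could be in many places, such as another string or a character array variable; it may or may not have a different lifetime from x because it is in a different scope (e.g. namespace, global, static, heap, shared memory, memory-mapped file).

To copy the text from std::string x into an independent character array: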

// USING ANOTHER STRING - AUTO MEMORY MANAGEMENT, EXCEPTION SAFE
std::string old_x = x;
// - old_x will not be affected by subsequent modifications to x...
// - you can use `&old_x[0]` to get a writable char* to old_x's textual content
// - you can use resize() to reduce/expand the string
//   - resizing isn't possible from within a function passed only the char* address

std::string old_x = x.c_str(); // old_x will terminate early if x embeds NUL
// Copies ASCIIZ data but could be less efficient as it needs to scan memory to
// find the NUL terminator indicating string length before allocating that amount
// of memory to copy into, or more efficient if it ends up allocating/copying a
// lot less content.
// Example, x == "ab\0cd" -> old_x == "ab".

// USING A VECTOR OF CHAR - AUTO, EXCEPTION SAFE, HINTS AT BINARY CONTENT, GUARANTEED CONTIGUOUS EVEN IN C++03
std::vector<char> old_x(x.data(), x.data() + x.size());       // without the NUL
std::vector<char> old_x(x.c_str(), x.c_str() + x.size() + 1);  // with the NUL

// USING STACK WHERE MAXIMUM SIZE OF x IS KNOWN TO BE COMPILE-TIME CONSTANT "N"
// (a bit dangerous, as "known" things are sometimes wrong and often become wrong)
char y[N + 1];
strcpy(y, x.c_str());

// USING STACK WHERE UNEXPECTEDLY LONG x IS TRUNCATED (e.g. Hello\0->Hel\0)
char y[N + 1];
strncpy(y, x.c_str(), N);  // copy at most N, zero-padding if shorter
y[N] = '\0';               // ensure NUL terminated

// USING THE STACK TO HANDLE x OF UNKNOWN (BUT SANE) LENGTH
char* y = static_cast<char*>(alloca(x.size() + 1));  // alloca returns void*; C++ needs the cast
strcpy(y, x.c_str());

// USING THE STACK TO HANDLE x OF UNKNOWN LENGTH (NON-STANDARD GCC EXTENSION)
char y[x.size() + 1];
strcpy(y, x.c_str());

// USING new/delete HEAP MEMORY, MANUAL DEALLOC, NO INHERENT EXCEPTION SAFETY
char* y = new char[x.size() + 1];
strcpy(y, x.c_str());
//     or as a one-liner: char* y = strcpy(new char[x.size() + 1], x.c_str());
// use y...
delete[] y; // make sure no break, return, throw or branching bypasses this

// USING new/delete HEAP MEMORY, SMART POINTER DEALLOCATION, EXCEPTION SAFE
// see boost shared_array usage in Johannes Schaub's answer

// USING malloc/free HEAP MEMORY, MANUAL DEALLOC, NO INHERENT EXCEPTION SAFETY
char* y = strdup(x.c_str());
// use y...
free(y);

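Other reasons to want a char* or const char* generated from a string

So, above you've seen how to get a (const) char*, and how to make a copy of the text independent of the original string, but what can you do with it? A random smattering of examples...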

- give "C" code access to the C++ string's text, as in printf("x is '%s'", x.c_str());
- copy x's text to a buffer specified by your function's caller (e.g. strncpy(callers_buffer, x.c_str(), callers_buffer_size)), or volatile memory used for device I/O (e.g. for (const char* p = x.c_str(); *p; ++p) *p_device = *p;)
- append x's text to a character array already containing some ASCIIZ text (e.g. strcat(other_buffer, x.c_str())) - be careful not to overrun the buffer (in many situations you may need to use strncat)
- return a const char* or char* from a function (perhaps for historical reasons - clients using your existing API - or for C compatibility you don't want to return a std::string, but do want to copy your string's data somewhere for the caller)
  - be careful not to return a pointer that may be dereferenced by the caller after a local string variable to which that pointer pointed has left scope
  - some projects with shared objects compiled/linked for different std::string implementations (e.g. STLport and compiler-native) may pass data as ASCIIZ to avoid conflicts
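For the "copy into a caller-supplied buffer" case, a bounded copy that always NUL-terminates might look like the following sketch. copy_to_buffer is a hypothetical helper, not a standard function; it uses memcpy rather than strncpy so that truncation and termination are handled explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copy x's text into a caller-supplied buffer, truncating if necessary and
// always NUL-terminating. Returns the number of characters copied,
// excluding the terminator.
std::size_t copy_to_buffer(const std::string& x, char* buffer, std::size_t buffer_size) {
    if (buffer == nullptr || buffer_size == 0) return 0;
    std::size_t n = x.size() < buffer_size - 1 ? x.size() : buffer_size - 1;
    std::memcpy(buffer, x.data(), n);
    buffer[n] = '\0';
    return n;
}

void copy_example() {
    char small[4];
    std::size_t copied = copy_to_buffer("Hello", small, sizeof small);
    assert(copied == 3);
    assert(std::strcmp(small, "Hel") == 0);   // truncated but still terminated

    char big[16];
    copied = copy_to_buffer("Hello", big, sizeof big);
    assert(copied == 5);
    assert(std::strcmp(big, "Hello") == 0);
}
```

The caller owns the buffer, so the copy stays valid after the source string is modified or destroyed.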

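I am working with an API that has a lot of functions taking a char* as input.

I have created a small class to deal with this kind of problem, and it implements the RAII idiom.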

class DeepString
{
        DeepString(const DeepString& other);
        DeepString& operator=(const DeepString& other);
        char* internal_; 
    
    public:
        explicit DeepString( const std::string& toCopy): 
            internal_(new char[toCopy.size()+1]) 
        {
            strcpy(internal_,toCopy.c_str());
        }
        ~DeepString() { delete[] internal_; }
        char* str() const { return internal_; }
        const char* c_str()  const { return internal_; }
};

You can use it like this:

void aFunctionAPI(char* input);

//  other stuff

aFunctionAPI("Foo"); // not safe: if the function modifies the string
                     // literal, the program will crash
std::string myFoo("Foo");
aFunctionAPI(myFoo.c_str()); // does not compile: c_str() returns const char*
aFunctionAPI(const_cast<char*>(myFoo.c_str())); // not safe: std::string may
                                                // implement reference counting, so
                                                // this may change the value of
                                                // other strings as well
DeepString myDeepFoo(myFoo);
aFunctionAPI(myDeepFoo.str()); // this is fine

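I have called the class DeepString because it creates a deep and unique copy of an existing string (DeepString is not copyable).

Let's say, string str = "stack";

1) Converting a string to char*

  char* s_rw=&str[0]; 

The above char* (i.e. s_rw) is readable and writable, and points to the base address of the string that needs to be converted to char*.

2) Converting a string to const char*

   const char* s_r=&str[0];

The above const char* (i.e. s_r) is readable but not writable, and points to the base address of the string.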

C++17

C++17 changes the synopsis of the basic_string template, adding a non-const overload of data():

charT* data() noexcept;
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
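A typical use of a writable pointer is letting a C-style function fill the string's buffer directly. The sketch below uses the non-const data() when compiled as C++17 or later and falls back to &out[0] otherwise; std::snprintf stands in for any function that writes into a char* buffer, and format_number is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format an int by writing straight into a std::string's buffer.
std::string format_number(int value) {
    std::string out(32, '\0');   // pre-size: only [0, size()) may be written
#if __cplusplus >= 201703L
    char* p = out.data();        // non-const overload, C++17 and later
#else
    char* p = &out[0];           // equivalent writable pointer before C++17
#endif
    int written = std::snprintf(p, out.size(), "%d", value);
    assert(written >= 0 && static_cast<std::size_t>(written) < out.size());
    out.resize(static_cast<std::size_t>(written));   // drop the unused tail
    return out;
}
```

The string is sized before the pointer is taken, because the pointer never gives access beyond size().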


CharT const * from std::basic_string<CharT>

std::string const cstr = { "..." };
char const * p = cstr.data(); // or .c_str()

CharT * from std::basic_string<CharT>

std::string str = { "..." };
char * p = str.data();

C++11

CharT const * from std::basic_string<CharT>

std::string str = { "..." };
str.c_str();

CharT * from std::basic_string<CharT>

From C++11 onwards, the standard says:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

const_reference operator[](size_type pos) const;
reference operator[](size_type pos);

Returns: *(begin() + pos) if pos < size(), otherwise a reference to an object of type CharT with value CharT(); the referenced value shall not be modified.

const charT* c_str() const noexcept;
const charT* data() const noexcept;

Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
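The quoted identities can be spot-checked at run time. This is only a sanity check on a given implementation under C++11 or later, not a proof of conformance; is_contiguous and contiguity_example are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Check the quoted identities for one string:
//   &*(s.begin() + n) == &*s.begin() + n   for 0 <= n < s.size()
//   c_str() + i == &s[i]                   for i in [0, size()]
bool is_contiguous(const std::string& s) {
    for (std::size_t n = 0; n < s.size(); ++n) {
        if (&*(s.begin() + n) != &*s.begin() + n) return false;
    }
    for (std::size_t i = 0; i <= s.size(); ++i) {
        if (s.c_str() + i != &s[i]) return false;
    }
    return true;
}

void contiguity_example() {
    assert(is_contiguous("hello"));
    assert(is_contiguous(""));   // only the terminator position is checked
}
```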

There are several ways to get a non-const character pointer.

1. Use the contiguous storage of C++11

std::string foo{"text"};
auto p = &*foo.begin();

Pro

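Simple and short
Fast (the only method with no copy involved)

Cons

The final '\0' must not be altered, and is not necessarily part of the non-const memory.

2. Use std::vector<CharT>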

std::string foo{"text"};
std::vector<char> fcv(foo.data(), foo.data()+foo.size()+1u);
auto p = fcv.data();

Pro

Simple
Automatic memory handling
Dynamic

Cons

Requires a string copy

3. Use std::array<CharT, N> if N is a compile-time constant (and small enough)

std::string foo{"text"};
std::array<char, 5u> fca;
std::copy(foo.data(), foo.data()+foo.size()+1u, fca.begin());

Pro

Simple
Stack memory handling

Cons

Static
Requires a string copy

4. Raw memory allocation with automatic storage deletion

std::string foo{ "text" };
auto p = std::make_unique<char[]>(foo.size()+1u);
std::copy(foo.data(), foo.data() + foo.size() + 1u, &p[0]);

Pro

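Small memory footprint
Automatic deletion
Simple

Cons

Requires a string copy
Static (dynamic usage requires a lot more code)
Fewer features than vector or array

5. Raw memory allocation with manual handling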

std::string foo{ "text" };
char * p = nullptr;
try
{
  p = new char[foo.size() + 1u];
  std::copy(foo.data(), foo.data() + foo.size() + 1u, p);
  // handle stuff with p
  delete[] p;
}
catch (...)
{
  if (p) { delete[] p; }
  throw;
}

Pro

Maximum "control"

Con

Requires a string copy
Maximum liability / susceptibility to errors
Complex
