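In Go, a string is a primitive type, which means it is read-only, and every manipulation of it creates a new string.
So if I want to concatenate strings many times without knowing the length of the resulting string, what is the best way to do it?
The naive way would be:
var s string
for i := 0; i < 1000; i++ {
	s += getShortStringFromSomewhere()
}
return s
But that does not seem very efficient.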
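Current answer
Expanding on cd1's answer: you can use append() instead of copy(). append() grows the backing array in ever larger steps, which costs a little more memory but saves time. I added two more benchmarks on top of yours. Run locally with
go test -bench=. -benchtime=100ms
On my ThinkPad T400s it yields: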
BenchmarkAppendEmpty 50000000 5.0 ns/op
BenchmarkAppendPrealloc 50000000 3.5 ns/op
BenchmarkCopy 20000000 10.2 ns/op
Other answers
Note added in 2018
Since Go 1.10 there is a strings.Builder type; see the answer on strings.Builder for more detail.
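For readers on Go 1.10 or later, a minimal sketch of strings.Builder usage might look like the following. The helper name buildString is an assumption for illustration; WriteString and String are the standard library methods.

```go
package main

import (
	"fmt"
	"strings"
)

// buildString concatenates parts using strings.Builder (Go 1.10+).
// The zero value of strings.Builder is ready to use.
func buildString(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p) // appends p to the builder's internal buffer
	}
	return sb.String()
}

func main() {
	fmt.Println(buildString([]string{"a", "b", "c"}))
}
```

If an approximate final size is known, calling sb.Grow with that size before the loop may reduce reallocations.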
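Pre-201x answer
The benchmark code of @cd1 and of other answers is wrong. b.N is not supposed to be set inside a benchmark function. The go test tool sets it dynamically to determine whether the execution time of the test is stable.
A benchmark function should run the same test b.N times, and the test inside the loop should be identical on every iteration. So I fixed it by adding an inner loop. I also added benchmarks for some other solutions: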
package main

import (
	"bytes"
	"strings"
	"testing"
)

const (
	sss = "xfoasneobfasieongasbg"
	cnt = 10000
)

var (
	bbb      = []byte(sss)
	expected = strings.Repeat(sss, cnt)
)

func BenchmarkCopyPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		bs := make([]byte, cnt*len(sss))
		bl := 0
		for i := 0; i < cnt; i++ {
			bl += copy(bs[bl:], sss)
		}
		result = string(bs)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkAppendPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, cnt*len(sss))
		for i := 0; i < cnt; i++ {
			data = append(data, sss...)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkBufferPreAllocate(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		buf := bytes.NewBuffer(make([]byte, 0, cnt*len(sss)))
		for i := 0; i < cnt; i++ {
			buf.WriteString(sss)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkCopy(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, 64) // same size as bootstrap array of bytes.Buffer
		for i := 0; i < cnt; i++ {
			off := len(data)
			if off+len(sss) > cap(data) {
				temp := make([]byte, 2*cap(data)+len(sss))
				copy(temp, data)
				data = temp
			}
			data = data[0 : off+len(sss)]
			copy(data[off:], sss)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkAppend(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		data := make([]byte, 0, 64)
		for i := 0; i < cnt; i++ {
			data = append(data, sss...)
		}
		result = string(data)
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkBufferWrite(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var buf bytes.Buffer
		for i := 0; i < cnt; i++ {
			buf.Write(bbb)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkBufferWriteString(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var buf bytes.Buffer
		for i := 0; i < cnt; i++ {
			buf.WriteString(sss)
		}
		result = buf.String()
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}

func BenchmarkConcat(b *testing.B) {
	var result string
	for n := 0; n < b.N; n++ {
		var str string
		for i := 0; i < cnt; i++ {
			str += sss
		}
		result = str
	}
	b.StopTimer()
	if result != expected {
		b.Errorf("unexpected result; got=%s, want=%s", string(result), expected)
	}
}
Environment: OS X 10.11.6, 2.2 GHz Intel Core i7
Test results:
BenchmarkCopyPreAllocate-8 20000 84208 ns/op 425984 B/op 2 allocs/op
BenchmarkAppendPreAllocate-8 10000 102859 ns/op 425984 B/op 2 allocs/op
BenchmarkBufferPreAllocate-8 10000 166407 ns/op 426096 B/op 3 allocs/op
BenchmarkCopy-8 10000 160923 ns/op 933152 B/op 13 allocs/op
BenchmarkAppend-8 10000 175508 ns/op 1332096 B/op 24 allocs/op
BenchmarkBufferWrite-8 10000 239886 ns/op 933266 B/op 14 allocs/op
BenchmarkBufferWriteString-8 10000 236432 ns/op 933266 B/op 14 allocs/op
BenchmarkConcat-8 10 105603419 ns/op 1086685168 B/op 10000 allocs/op
Conclusion:
1. CopyPreAllocate is the fastest way; AppendPreAllocate is pretty close to No. 1, but the code is easier to write.
2. Concat has really bad performance in both speed and memory usage. Don't use it.
3. Buffer#Write and Buffer#WriteString are basically the same in speed, contrary to what @Dani-Br said in the comment. Considering that a string in Go is backed by a sequence of bytes, this makes sense.
4. bytes.Buffer basically uses the same solution as Copy, with extra bookkeeping and other overhead.
5. Copy and Append use a bootstrap size of 64, the same as bytes.Buffer.
6. Append uses more memory and allocations; I think this is related to the growth algorithm it uses. It does not grow memory as fast as bytes.Buffer.
Suggestion:
For a simple task such as the one OP describes, I would use Append or AppendPreAllocate. It is fast enough and easy to use.
If you need to read from and write to the buffer at the same time, use bytes.Buffer, of course. That is what it is designed for.
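To illustrate the second suggestion, here is a small sketch of writing to and reading from the same bytes.Buffer. The helper name readAfterWrite and its parameters are hypothetical; Next, WriteString, and String are standard bytes.Buffer methods.

```go
package main

import (
	"bytes"
	"fmt"
)

// readAfterWrite writes a and b into one bytes.Buffer, consumes up to n
// bytes from the front, and returns the consumed bytes together with the
// portion that is still unread.
func readAfterWrite(a, b string, n int) (read, rest string) {
	var buf bytes.Buffer
	buf.WriteString(a)
	buf.WriteString(b)

	// Next advances the read position; the returned slice is only valid
	// until the next read or write, so it is converted to a string at once.
	read = string(buf.Next(n))

	// String reports only the unread portion of the buffer.
	rest = buf.String()
	return read, rest
}

func main() {
	read, rest := readAfterWrite("hello, ", "world", 7)
	fmt.Printf("read=%q rest=%q\n", read, rest)
}
```

A plain byte slice with append has no notion of a read position, which is why bytes.Buffer is the more natural fit when both directions are needed.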
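If you know the total length of the string you are going to build, so that you can preallocate it, then the most efficient way to concatenate strings may be the builtin function copy. If you do not know the total length beforehand, do not use copy; read the other answers instead.
In my tests, that approach is about 3x faster than using bytes.Buffer and much, much faster (about 12,000x) than using the + operator. It also uses less memory.
I created a test case to measure this, and here are the results: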
BenchmarkConcat 1000000 64497 ns/op 502018 B/op 0 allocs/op
BenchmarkBuffer 100000000 15.5 ns/op 2 B/op 0 allocs/op
BenchmarkCopy 500000000 5.39 ns/op 0 B/op 0 allocs/op
Below is the test code:
package main

import (
	"bytes"
	"strings"
	"testing"
)

func BenchmarkConcat(b *testing.B) {
	var str string
	for n := 0; n < b.N; n++ {
		str += "x"
	}
	b.StopTimer()
	if s := strings.Repeat("x", b.N); str != s {
		b.Errorf("unexpected result; got=%s, want=%s", str, s)
	}
}

func BenchmarkBuffer(b *testing.B) {
	var buffer bytes.Buffer
	for n := 0; n < b.N; n++ {
		buffer.WriteString("x")
	}
	b.StopTimer()
	if s := strings.Repeat("x", b.N); buffer.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", buffer.String(), s)
	}
}

func BenchmarkCopy(b *testing.B) {
	bs := make([]byte, b.N)
	bl := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		bl += copy(bs[bl:], "x")
	}
	b.StopTimer()
	if s := strings.Repeat("x", b.N); string(bs) != s {
		b.Errorf("unexpected result; got=%s, want=%s", string(bs), s)
	}
}

// Go 1.10
func BenchmarkStringBuilder(b *testing.B) {
	var strBuilder strings.Builder
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		strBuilder.WriteString("x")
	}
	b.StopTimer()
	if s := strings.Repeat("x", b.N); strBuilder.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", strBuilder.String(), s)
	}
}
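Outside a benchmark, the copy approach from this answer could be wrapped in a helper along these lines. This is a sketch under the assumption that all pieces are available up front so the total length can be computed; the name concatCopy is made up for illustration.

```go
package main

import "fmt"

// concatCopy computes the total length first, allocates the result buffer
// once, and then copies every part into its place.
func concatCopy(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	bs := make([]byte, total)
	off := 0
	for _, p := range parts {
		// copy accepts a string source when the destination is a []byte
		// and returns the number of bytes copied.
		off += copy(bs[off:], p)
	}
	return string(bs)
}

func main() {
	fmt.Println(concatCopy([]string{"x", "yy", "zzz"}))
}
```

The final string(bs) conversion still copies the bytes once more, so the total cost is two allocations regardless of how many parts there are.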
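My original suggestion was
s12 := fmt.Sprint(s1,s2)
but the answer above using bytes.Buffer with WriteString() is the most efficient way.
My initial suggestion relies on reflection and a type switch. See (p *pp) doPrint and (p *pp) printArg. There is no universal Stringer() interface for basic types, as I had naively thought.
At least Sprint() uses a bytes.Buffer internally. Thus
`s12 := fmt.Sprint(s1,s2,s3,s4,...,s1000)`
is acceptable in terms of memory allocations.
=> Sprint() concatenation can be used for quick debug output. => Otherwise use bytes.Buffer ... WriteString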
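A quick sketch of how fmt.Sprint behaves as a concatenation tool. The wrapper sprintPair is hypothetical and exists only to make the behaviour easy to check. One detail worth knowing is that Sprint inserts a space between two operands only when neither of them is a string.

```go
package main

import "fmt"

// sprintPair formats two values with fmt.Sprint and returns the result.
func sprintPair(a, b interface{}) string {
	return fmt.Sprint(a, b)
}

func main() {
	fmt.Println(sprintPair("foo", "bar")) // both strings: joined with no separator
	fmt.Println(sprintPair("n=", 1))      // one string operand: still no separator
	fmt.Println(sprintPair(1, 2))         // neither is a string: a space is inserted
}
```

This makes Sprint convenient for mixing strings and numbers in debug output, but the spacing rule is a reason to be careful when the exact result matters.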
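strings.Join() from the "strings" package
If you have a type mismatch (for example, you are trying to join an int and a string), convert the value to a string first. For an int, use strconv.Itoa; a plain conversion such as string(i) yields the character with that code point, not the decimal digits.
Example: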
package main

import (
	"fmt"
	"strings"
)

var intEX = 0
var stringEX = "hello all you "
var stringEX2 = "people in here"

func main() {
	s := []string{stringEX, stringEX2}
	fmt.Println(strings.Join(s, ""))
}
Output:
hello all you people in here
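The example above declares an int but never joins it. As a sketch of the type-mismatch case, the following converts an int with strconv.Itoa before handing everything to strings.Join. The helper name joinWithNumber is made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinWithNumber joins words and then the decimal form of n, with no
// separator between the pieces.
func joinWithNumber(words []string, n int) string {
	// Copy the input so the caller's slice is never modified by append.
	parts := append([]string{}, words...)
	parts = append(parts, strconv.Itoa(n)) // int -> decimal string
	return strings.Join(parts, "")
}

func main() {
	fmt.Println(joinWithNumber([]string{"hello all you ", "people in here: "}, 42))
}
```

strings.Join computes the total length up front and allocates the result once, so it is a reasonable choice whenever the pieces are already collected in a slice.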
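A simple, easy-to-digest solution; details are in the comments. copy overwrites the elements of a slice: we slice out a single element at a time and overwrite it.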
package main

import (
	"fmt"
)

var N int = 100000

func main() {
	slice1 := make([]rune, N)
	// Efficient and fast, but needs pre-allocated memory.
	// We could check whether the limit has been reached and then grow the
	// slice using append, but that costs a copy into a new array. Also,
	// append always writes after the current length of the slice.
	for i := 0; i < N; i++ {
		copy(slice1[i:i+1], []rune{'N'})
	}
	fmt.Println(string(slice1)) // convert the runes to a string before printing

	// Simple and still fast. Every time the slice capacity is reached, we
	// pay the cost of copying the data to a new array.
	slice2 := []rune{}
	for i := 0; i < N; i++ { // i < N so that both slices hold N elements
		slice2 = append(slice2, 'N')
	}
	fmt.Println(string(slice2))
}