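String formatting: % vs. .format vs. f-string literal

There are various string formatting methods:

Python < 2.6: "Hello %s" % name
Python 2.6+: "Hello {}".format(name) (uses str.format)
Python 3.6+: f"{name}" (uses f-strings)

Which is better, and for what situations?

1. The following methods have the same outcome, so what is the difference?

name = "Alice"

"Hello %s" % name
"Hello {0}".format(name)
f"Hello {name}"

# Using named arguments:
"Hello %(kwarg)s" % {'kwarg': name}
"Hello {kwarg}".format(kwarg=name)
f"Hello {name}"

2. When does string formatting run, and how do I avoid a runtime performance penalty?

If you are trying to close a duplicate question that is just looking for a way to format a string, please use "How do I put a variable's value inside a string?".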

Current answer

For Python versions >= 3.6 (see PEP 498):

s1 = 'alpha'
s2 = 'beta'

f'{s1}{s2:>10}'

# output
'alpha      beta'
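For comparison, here is a small sketch of the same right-aligned padding written with all three styles. The variable names `left` and `right` are illustrative and not taken from the answer; the point is only that `%10s`, `{:>10}` and `{right:>10}` pad the same way.

```python
left, right = "alpha", "beta"  # illustrative values

# %-formatting right-justifies by default when a width is given.
with_percent = "%s%10s" % (left, right)

# str.format and f-strings share the same format-spec mini-language.
with_format = "{}{:>10}".format(left, right)
with_fstring = f"{left}{right:>10}"

# All three produce the same 15-character string.
assert with_percent == with_format == with_fstring
print(repr(with_fstring))  # 'alpha      beta'
```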

2018-02-14 22:42:32

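Other answers

To answer your first question: .format just seems more sophisticated in many ways. An annoying thing about % is that it can take either a single value or a tuple. You would think the following would always work: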

"Hello %s" % name

Yet if name happens to be (1, 2, 3), it will throw a TypeError. To guarantee that it always prints, you would need to write

"Hello %s" % (name,)   # supply the single argument as a single-item tuple

which is just ugly. .format doesn't have this problem. Also, in the second example you gave, the .format version looks much cleaner.
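A minimal sketch of the tuple pitfall described above. The helper names `greet_percent` and `greet_format` are hypothetical and only there to make the comparison easy to run.

```python
def greet_percent(name):
    # Wrapping the value in a one-item tuple keeps % from unpacking it.
    return "Hello %s" % (name,)


def greet_format(name):
    # str.format treats the argument as a single value, tuple or not.
    return "Hello {}".format(name)


name = (1, 2, 3)

# The naive form fails: % unpacks the tuple into three arguments,
# but the format string has only one placeholder.
try:
    "Hello %s" % name
    naive_failed = False
except TypeError:
    naive_failed = True

print(naive_failed)         # True
print(greet_percent(name))  # Hello (1, 2, 3)
print(greet_format(name))   # Hello (1, 2, 3)
```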

Use % only for backwards compatibility with Python 2.5.

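To answer your second question: string formatting happens at the same time as any other operation, namely when the string formatting expression is evaluated. Python is not a lazy language and evaluates expressions before calling functions, so the expression log.debug("some debug info: %s" % some_info) first evaluates the string to, for example, "some debug info: roflcopters are active", and only then passes that string to log.debug().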
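One common way to avoid paying for formatting that is never used is to hand the arguments to the standard logging module and let it format the message only if the record is actually emitted. The sketch below assumes the standard library `logging` module; the `Expensive` class and the logger name `format_demo` are hypothetical and exist only to count how often the value is converted to a string.

```python
import logging


class Expensive:
    """Hypothetical value that counts its own string conversions."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "roflcopters are active"


logger = logging.getLogger("format_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)              # DEBUG records are discarded
logger.addHandler(logging.NullHandler())

# Eager: the % expression is evaluated before debug() is even called.
eager = Expensive()
logger.debug("some debug info: %s" % eager)

# Deferred: logging formats the message only if the record is emitted.
lazy = Expensive()
logger.debug("some debug info: %s", lazy)

print(eager.str_calls, lazy.str_calls)  # 1 0
```

The same reasoning applies to .format and f-strings: both are evaluated at the call site, so they are also formatted even when the message is dropped.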

2011-02-22 18:49:21

If your Python is >= 3.6, f-string formatted literals are your new friend.

They are simpler, cleaner, and faster:

In [1]: params=['Hello', 'adam', 42]

In [2]: %timeit "%s %s, the answer to everything is %d."%(params[0],params[1],params[2])
448 ns ± 1.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit "{} {}, the answer to everything is {}.".format(*params)
449 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit f"{params[0]} {params[1]}, the answer to everything is {params[2]}."
12.7 ns ± 0.0129 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

2018-07-04 07:13:44

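But please be careful: I have just discovered a problem while trying to replace every % with .format in existing code. '{}'.format(unicode_string) will try to encode unicode_string and will probably fail.

Look at this Python interactive session log: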

Python 2.7.2 (default, Aug 27 2012, 19:52:55) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
; s='й'
; u=u'й'
; s
'\xd0\xb9'
; u
u'\u0439'

s is a plain byte string (the counterpart of bytes in Python 3) and u is a Unicode string (the counterpart of str in Python 3):

; '%s' % s
'\xd0\xb9'
; '%s' % u
u'\u0439'

When you pass a Unicode object to the % operator, as in the second call above, the result is a Unicode string even if the format string was not Unicode.

; '{}'.format(s)
'\xd0\xb9'
; '{}'.format(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0439' in position 0: ordinal not in range(256)

The .format method, by contrast, raises a UnicodeEncodeError, as the traceback above shows.

; u'{}'.format(s)
u'\xd0\xb9'
; u'{}'.format(u)
u'\u0439'

As the two calls above show, .format handles a Unicode argument without trouble only when the format string is itself Unicode.

; '{}'.format(u'i')
'i'

It also works when the Unicode argument can be converted to a byte string, as with the ASCII-only u'i' above.
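The session above is Python 2. As a hedged sketch for readers on Python 3, where str is always Unicode: the encoding error does not occur, but mixing bytes into a str template silently inserts the bytes repr instead of the decoded text. The variable names below are illustrative.

```python
text = "\u0439"            # str, the Cyrillic letter from the session above
raw = text.encode("utf-8")  # bytes: b'\xd0\xb9'

# Formatting a str into a str template involves no implicit encoding.
formatted_text = "{}".format(text)

# Formatting bytes into a str template inserts the repr of the bytes object.
formatted_raw = "{}".format(raw)

# Decoding explicitly gives the intended text.
decoded = "{}".format(raw.decode("utf-8"))

print(formatted_raw)  # b'\xd0\xb9'
```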

2012-09-03 18:15:42

A comparison on Python 3.6.7:

#!/usr/bin/env python
import timeit

def time_it(fn):
    """
    Measure time of execution of a function
    """
    def wrapper(*args, **kwargs):
        t0 = timeit.default_timer()
        fn(*args, **kwargs)
        t1 = timeit.default_timer()
        print("{0:.10f} seconds".format(t1 - t0))
    return wrapper


@time_it
def new_new_format(s):
    print("new_new_format:", f"{s[0]} {s[1]} {s[2]} {s[3]} {s[4]}")


@time_it
def new_format(s):
    print("new_format:", "{0} {1} {2} {3} {4}".format(*s))


@time_it
def old_format(s):
    print("old_format:", "%s %s %s %s %s" % s)


def main():
    samples = (
        ("uno", "dos", "tres", "cuatro", "cinco"),
        (1, 2, 3, 4, 5),
        (1.1, 2.1, 3.1, 4.1, 5.1),
        ("uno", 2, 3.14, "cuatro", 5.5),
    )
    for s in samples:
        new_new_format(s)
        new_format(s)
        old_format(s)
        print("-----")


if __name__ == '__main__':
    main()

Output:

new_new_format: uno dos tres cuatro cinco
0.0000170280 seconds
new_format: uno dos tres cuatro cinco
0.0000046750 seconds
old_format: uno dos tres cuatro cinco
0.0000034820 seconds
-----
new_new_format: 1 2 3 4 5
0.0000043980 seconds
new_format: 1 2 3 4 5
0.0000062590 seconds
old_format: 1 2 3 4 5
0.0000041730 seconds
-----
new_new_format: 1.1 2.1 3.1 4.1 5.1
0.0000092650 seconds
new_format: 1.1 2.1 3.1 4.1 5.1
0.0000055340 seconds
old_format: 1.1 2.1 3.1 4.1 5.1
0.0000052130 seconds
-----
new_new_format: uno 2 3.14 cuatro 5.5
0.0000053380 seconds
new_format: uno 2 3.14 cuatro 5.5
0.0000047570 seconds
old_format: uno 2 3.14 cuatro 5.5
0.0000045320 seconds
-----
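Each figure above comes from a single call that also includes print, so the numbers are noisy. The following is a rough sketch of a steadier measurement, assuming the standard library `timeit.repeat`; the `candidates` dictionary, the sample tuple and the repeat counts are illustrative choices, and the results will differ by machine and Python version.

```python
import timeit

values = ("uno", 2, 3.14, "cuatro", 5.5)  # one sample row, chosen for illustration

# Each candidate builds the same string without printing it.
candidates = {
    "f-string": lambda s=values: f"{s[0]} {s[1]} {s[2]} {s[3]} {s[4]}",
    "str.format": lambda s=values: "{0} {1} {2} {3} {4}".format(*s),
    "% operator": lambda s=values: "%s %s %s %s %s" % s,
}

NUMBER = 20_000  # calls per timing run
results = {}
for label, fn in candidates.items():
    # Take the best of several runs to reduce scheduling noise.
    best = min(timeit.repeat(fn, number=NUMBER, repeat=5))
    results[label] = best / NUMBER
    print(f"{label:>12}: {results[label] * 1e9:.1f} ns per call")
```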

2019-02-05 09:56:38

% can be helpful when formatting regular expressions. For example,

'{type_names} [a-z]{2}'.format(type_names='triangle|square')

raises IndexError, because .format reads {2} as a reference to a third positional argument. In this case, you can use:

'%(type_names)s [a-z]{2}' % {'type_names': 'triangle|square'}

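This avoids having to write the regex as '{type_names} [a-z]{{2}}'. It can be useful when you have two regexes, one of which is also used on its own without formatting, while the concatenation of both is formatted.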
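A short runnable sketch of the three variants discussed above, using the standard `re` module. The test string "square ab" is an illustrative input, not part of the answer.

```python
import re

type_names = "triangle|square"

# .format interprets {2} as positional argument 2, which does not exist.
try:
    "{type_names} [a-z]{2}".format(type_names=type_names)
    format_error = None
except IndexError as exc:
    format_error = exc

# % leaves the regex quantifier braces alone.
percent_pattern = "%(type_names)s [a-z]{2}" % {"type_names": type_names}

# With .format, the quantifier braces must be doubled to escape them.
escaped_pattern = "{type_names} [a-z]{{2}}".format(type_names=type_names)

print(percent_pattern)  # triangle|square [a-z]{2}
match = re.fullmatch(percent_pattern, "square ab")
```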

2015-04-09 20:41:40
