How do I find a string between two substrings ('123STRINGabc' -> 'STRING')?
My current approach is like this:
>>> start = 'asdf=5;'
>>> end = '123jasd'
>>> s = 'asdf=5;iwantthis123jasd'
>>> print((s.split(start))[1].split(end)[0])
iwantthis
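However, this seems very inefficient and un-Pythonic. What is a better way to do something like this?
Forgot to mention:
The string might not start and end with start and end. There may be more characters before and after them.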
You can simply use this code or copy the function below. It all fits neatly on one line.
def substring(whole, sub1, sub2):
    return whole[whole.index(sub1) : whole.index(sub2)]
Suppose you run the function as follows.
print(substring("5+(5*2)+2", "(", ")"))
You will get this output:
(5*2
rather than
5*2
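If you want both delimiters included in the output, the code must look like this.
    return whole[whole.index(sub1) : whole.index(sub2) + 1]
But if you do not want the delimiters in the output, the +1 must go on the first value instead. Note that the +1 offset assumes each delimiter is a single character.
    return whole[whole.index(sub1) + 1 : whole.index(sub2)]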
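The `+ 1` offsets above only fit single-character delimiters, and `whole.index(sub2)` searches from the start of the string rather than from the start delimiter. Here is a minimal sketch that handles longer delimiters, assuming you want the text strictly between the first `sub1` and the next `sub2` after it. The name `between` is made up for this illustration and is not part of the answer above.

```python
def between(whole, sub1, sub2):
    """Return the text between the first sub1 and the next sub2 after it.

    Raises ValueError (from str.index) if either delimiter is missing.
    """
    # Skip past the whole start delimiter, not just one character.
    start = whole.index(sub1) + len(sub1)
    # Look for the end delimiter only after the start delimiter.
    end = whole.index(sub2, start)
    return whole[start:end]


print(between("5+(5*2)+2", "(", ")"))
print(between("asdf=5;iwantthis123jasd", "asdf=5;", "123jasd"))
```

Because the second search starts after the first delimiter, this also works when both delimiters are the same string.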
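These solutions assume the start string and the end string are different. Here is a solution I use for an entire file when the start and end markers are the same, assuming the file is read using readlines():
def extractstring(line, flag='$'):
    if flag in line:  # '$' is the default flag
        dex1 = line.index(flag)
        subline = line[dex1 + 1:]  # skip the opening flag (+1), keep the rest of the line
        dex2 = subline.index(flag)  # raises ValueError if there is no closing flag
        string = subline[0:dex2].strip()  # does not include last flag, strip whitespace
        return string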
Example:
lines = ['asdf 1qr3 qtqay 45q at $A NEWT?$ asdfa afeasd',
         'afafoaltat $I GOT BETTER!$ derpity derp derp']
for line in lines:
    string = extractstring(line, flag='$')
    print(string)
Gives:
A NEWT?
I GOT BETTER!
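As an alternative for the same-marker case, here is a small sketch built on `str.partition`, which returns `None` instead of raising when a marker is missing. The function name `extract_flagged` is hypothetical, and returning `None` for a missing marker is an assumption about the desired behaviour.

```python
def extract_flagged(line, flag="$"):
    """Return the stripped text between the first two flags, or None."""
    # partition() splits at the first occurrence; the middle item is empty
    # when the separator was not found.
    _, found_open, rest = line.partition(flag)
    if not found_open:
        return None
    inner, found_close, _ = rest.partition(flag)
    if not found_close:
        return None
    return inner.strip()


for line in ["asdf 1qr3 qtqay 45q at $A NEWT?$ asdfa afeasd",
             "afafoaltat $I GOT BETTER!$ derpity derp derp"]:
    print(extract_flagged(line))
```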
Here is a function I wrote that returns a list of the string(s) found between string1 and string2.
def GetListOfSubstrings(stringSubject, string1, string2):
    MyList = []
    intstart = 0
    strlength = len(stringSubject)
    continueloop = 1
    while intstart < strlength and continueloop == 1:
        intindex1 = stringSubject.find(string1, intstart)
        if intindex1 != -1:  # The substring was found, let's proceed
            intindex1 = intindex1 + len(string1)
            intindex2 = stringSubject.find(string2, intindex1)
            if intindex2 != -1:
                subsequence = stringSubject[intindex1:intindex2]
                MyList.append(subsequence)
                intstart = intindex2 + len(string2)
            else:
                continueloop = 0
        else:
            continueloop = 0
    return MyList
# Usage example
mystring = "s123y123o123pp123y6"
List = GetListOfSubstrings(mystring, "1", "y68")
for x in range(0, len(List)):
    print(List[x])
output: (empty, because "y68" does not occur in mystring)
mystring = "s123y123o123pp123y6"
List = GetListOfSubstrings(mystring, "1", "3")
for x in range(0, len(List)):
    print(List[x])
output:
2
2
2
2
mystring = "s123y123o123pp123y6"
List = GetListOfSubstrings(mystring, "1", "y")
for x in range(0, len(List)):
    print(List[x])
output:
23
23o123pp123
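For comparison, the same list can probably be produced with `re.findall` and a non-greedy group. This is only a sketch: the name `list_between` is made up here, and it assumes both delimiters are non-empty literal strings (hence `re.escape`).

```python
import re


def list_between(subject, start, end):
    """Return every non-overlapping piece of text found between start and end."""
    # re.escape treats the delimiters as literal text, not regex syntax.
    # (.*?) is non-greedy, so each match stops at the nearest end delimiter.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # DOTALL lets a match span line breaks, like the str.find loop does.
    return re.findall(pattern, subject, flags=re.DOTALL)


mystring = "s123y123o123pp123y6"
print(list_between(mystring, "1", "3"))
print(list_between(mystring, "1", "y"))
print(list_between(mystring, "1", "y68"))
```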
Building on Nikolaus Gradwohl's answer, I needed to get the version number (i.e., 0.0.2) that sits between 'ui:' and '-' in the file content below (filename: docker-compose.yml):
version: '3.1'
services:
  ui:
    image: repo-pkg.dev.io:21/website/ui:0.0.2-QA1
    #network_mode: host
    ports:
      - 443:9999
    ulimits:
      nofile:test
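This is how it worked for me (Python script):
import re

# Read the whole compose file into one string.
with open('docker-compose.yml', 'r') as f:
    lines = f.read()
# Greedy match: captures from 'ui:' up to the last '-' on the same line.
result = re.search('ui:(.*)-', lines)
print(result.group(1))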
Result:
0.0.2
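One caveat with the greedy `(.*)` is that it runs to the last hyphen on the line, so a tag with two hyphens would yield more than the version. A hedged variant that stops at the first hyphen is sketched below. It uses an inline string instead of the file so it can run on its own, and the names `compose_text` and `version` are placeholders for this illustration.

```python
import re

# Inline stand-in for the contents of docker-compose.yml.
compose_text = """\
version: '3.1'
services:
  ui:
    image: repo-pkg.dev.io:21/website/ui:0.0.2-QA1
"""

# [^-\n]+ captures characters after 'ui:' up to, but not including,
# the first '-' on the same line.
match = re.search(r"ui:([^-\n]+)-", compose_text)
# re.search returns None when nothing matches, so guard before .group().
version = match.group(1) if match else None
print(version)
```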