A regular expression defines a matching pattern for strings.
See the regular expression documentation for the full syntax.
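Example 1: suppose we want to check that a string is 6 to 20 characters long and consists only of digits, letters, and underscores.
Without regular expressions, we would need code like this: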
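def is_true(num):
    # The length must be between 6 and 20 characters, inclusive.
    if 6 <= len(num) <= 20:
        for x in num:
            # Reject any character that is not an ASCII digit, letter, or underscore.
            if not ('0' <= x <= '9' or 'A' <= x <= 'Z'
                    or 'a' <= x <= 'z' or x == '_'):
                return False
        return True
    return False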
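With a regular expression, the same check becomes:
import re

username = input('username = ')
pattern1 = re.compile(r'^[0-9a-zA-Z_]{6,20}$')
m1 = pattern1.match(username)  # a match object if the name is valid, otherwise None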
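To compare the two approaches directly, the regular-expression check can be wrapped in a function. The following is a minimal sketch; the names `USERNAME_PATTERN` and `is_valid_username` are hypothetical and not part of the original notes. It assumes `re.fullmatch` is an acceptable substitute for `match` with `^...$`; the difference is that `$` also matches just before a trailing newline, while `fullmatch` does not allow one.

```python
import re

# Hypothetical names for illustration; the character class and length
# limits are the ones used in the example above.
USERNAME_PATTERN = re.compile(r'[0-9a-zA-Z_]{6,20}')


def is_valid_username(name):
    """Return True if name is 6-20 ASCII letters, digits, or underscores."""
    # fullmatch requires the whole string to match, so ^ and $ are not needed.
    return USERNAME_PATTERN.fullmatch(name) is not None


print(is_valid_username('user_01'))    # True
print(is_valid_username('abc'))        # False: too short
print(is_valid_username('user-name'))  # False: '-' is not allowed
print(is_valid_username('abcdef\n'))   # False: trailing newline is rejected
```

When the input comes from `input()`, as in these notes, there is no trailing newline, so both forms behave the same way.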
Below is a complete example that uses regular expressions:
import re

def main():
    username = input('username = ')
    qqnum = input('qqnum = ')
    pattern1 = re.compile(r'^[0-9a-zA-Z_]{6,20}$')
    m1 = pattern1.match(username)
    if not m1:
        print('Invalid username format')
    # The r prefix marks a raw string, so backslashes are kept as-is.
    # Without r, the pattern would have to be written as '^[1-9]\\d{4,11}$'.
    m2 = re.match(r'^[1-9]\d{4,11}$', qqnum)
    if not m2:
        print('Invalid QQ number format')
    if m1 and m2:
        print('Match succeeded')

if __name__ == '__main__':
    main()
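The point about raw strings can be checked directly. This short sketch (the variable names `raw` and `escaped` are illustrative) shows that the raw and the escaped spellings produce the same pattern text, and that a raw string keeps the backslash as an ordinary character.

```python
import re

raw = r'^[1-9]\d{4,11}$'        # raw string: the backslash is kept literally
escaped = '^[1-9]\\d{4,11}$'    # normal string: the backslash must be doubled
print(raw == escaped)           # True

# In a raw string, \n is two characters (backslash and n), not a newline.
print(len(r'\n'))               # 2
print(len('\n'))                # 1

# Both spellings behave identically when used as a pattern.
print(bool(re.match(raw, '10001')))       # True
print(bool(re.match(escaped, '012345')))  # False: must not start with 0
```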
Using the map function to convert the matched strings to integers:
from re import findall

def main():
    content = 'dsada4564sa4ds6a4d6sa'
    mylist = findall(r'\d+', content)  # ['4564', '4', '6', '4', '6']
    mylist = list(map(int, mylist))    # convert each digit string to an int
    print(sum(mylist) / len(mylist))   # 916.8

if __name__ == '__main__':
    main()
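In Python 3, `map` returns a lazy iterator rather than a list, which is why the example wraps it in `list(...)`. As a rough illustration (the variable names here are made up for the sketch), a list comprehension gives the same result:

```python
import re

content = 'dsada4564sa4ds6a4d6sa'
digit_runs = re.findall(r'\d+', content)

# map() is lazy: nothing is converted until the iterator is consumed.
numbers_from_map = list(map(int, digit_runs))

# An equivalent list comprehension.
numbers_from_comprehension = [int(s) for s in digit_runs]

print(numbers_from_map)                                # [4564, 4, 6, 4, 6]
print(numbers_from_map == numbers_from_comprehension)  # True
print(sum(numbers_from_map) / len(numbers_from_map))   # 916.8
```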
Examples
1. Validating mobile phone numbers:
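import re

def main():
    # The lookbehind (?<=\D) and lookahead (?=\D) require a non-digit character
    # on each side, so an 11-digit slice of a longer digit run is not matched.
    pattern = re.compile(r'(?<=\D)1([38][0-9]|[5][0-35-9]|[4][57]|[7][678])\d{8}(?=\D)')
    sentence = '大1558454564564架上15502344653读看18883362572洒金18883362572ljl13402311456ll'
    for temp in pattern.finditer(sentence):
        # The outputs below are for the first match only.
        # Python 3.7+ prints <re.Match object; ...>; older versions print <_sre.SRE_Match object; ...>.
        print(temp)  # <re.Match object; span=(16, 27), match='15502344653'>
        print(temp.group())  # 15502344653
        print(temp.span())  # (16, 27)

if __name__ == '__main__':
    main()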
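One limitation of the pattern above is that `(?<=\D)` and `(?=\D)` each need an actual non-digit character to be present, so a number at the very start or end of the text is not found. A possible variant, sketched here as an assumption rather than as the original author's method, uses the negative lookarounds `(?<!\d)` and `(?!\d)` instead, which only require that no digit is adjacent. The names `PHONE_PATTERN` and `find_phone_numbers` are hypothetical.

```python
import re

# Same prefix rules as the pattern above, but with negative lookarounds and a
# non-capturing group. This is an illustrative variant, not the original pattern.
PHONE_PATTERN = re.compile(
    r'(?<!\d)1(?:[38][0-9]|5[0-35-9]|4[57]|7[678])\d{8}(?!\d)'
)


def find_phone_numbers(text):
    """Return every 11-digit number in text that fits the prefix rules."""
    return [m.group() for m in PHONE_PATTERN.finditer(text)]


# Numbers at the start and end of the string are now found.
print(find_phone_numbers('15502344653 and 18883362572'))
# ['15502344653', '18883362572']

# A 13-digit run is still rejected.
print(find_phone_numbers('id 1558454564564'))  # []
```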
2. Replacing inappropriate words in a string (with splitting and greedy versus lazy matching):
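import re

def main():
    sentence = '我操你大爷的日你二爷干你三舅姥爷Fuck你姑父'
    # Replace each listed character or word with '*', ignoring case.
    pure = re.sub('[操艹草日干顶妈]|fuck|shit', '*',
                  sentence, flags=re.IGNORECASE)
    print(pure)
    # Split on whitespace, ',', '!' and '.'; adjacent or trailing delimiters
    # leave empty strings in the result.
    sentence = 'You go your way, I will go mine!'
    mylist = re.split(r'[\s,!.]', sentence)
    print(mylist)
    # Greedy matching: .* takes as much as possible, so 'aabahdhsjb' is matched.
    sentence = 'aabahdhsjb'
    m = re.match(r'a.*b', sentence)
    print(m)
    # Lazy matching: .*? takes as little as possible, so only 'aab' is matched.
    sentence = 'aabahdhsjb'
    m = re.match(r'a.*?b', sentence)
    print(m)

if __name__ == '__main__':
    main()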
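The difference between greedy and lazy matching is easier to see when the matched text is printed with `group()`. The sketch below also shows one way to avoid the empty strings that `re.split` produces: adding `+` to the delimiter class and filtering the result. The variable names and the sample sentence are chosen for illustration.

```python
import re

sentence = 'aabahdhsjb'
greedy = re.match(r'a.*b', sentence).group()   # .* extends to the last 'b'
lazy = re.match(r'a.*?b', sentence).group()    # .*? stops at the first 'b'
print(greedy)  # aabahdhsjb
print(lazy)    # aab

text = 'You go your way, I will go mine!'
# '+' treats a run of delimiters (such as ', ') as a single separator.
pieces = re.split(r'[\s,!.]+', text)
# The trailing '!' still leaves one empty string at the end, so filter it out.
words = [p for p in pieces if p]
print(words)
# ['You', 'go', 'your', 'way', 'I', 'will', 'go', 'mine']
```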
3. Extracting the numbers in a string and computing their average:
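import re

def average(num):
    # Find every run of consecutive digits in the input string.
    pattern = re.compile(r'\d+')
    m = pattern.findall(num)
    total = 0
    for x in m:
        total += int(x)
    # Note: raises ZeroDivisionError if the input contains no digits.
    return total / len(m)

def main():
    n = input('n = ')
    print(average(n))

if __name__ == '__main__':
    main()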