This post gives an overview of checking a whole string against a regex in Python, based on the question and answers below.
I’m trying to check if a string is a number, so the regex “\d+” seemed good. However, that regex also matches “78.46.92.168:8000” for some reason, which I do not want. Here is a little bit of code:
import re

class Foo():
    _rex = re.compile(r"\d+")

    def bar(self, string):
        # _rex is a class attribute, so it is reached through self
        m = self._rex.match(string)
        if m is not None:
            doStuff()
And doStuff() is called when the IP address is entered. I’m kind of confused: how does “.” or “:” match “\d”?
Answer #1:
\d+ matches one or more digits, and re.match() only requires the match to begin at the start of the string, not to reach its end, so it matches the leading 78 and succeeds.
Use ^\d+$ instead.
Or, even better: "78.46.92.168:8000".isdigit()
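To make the difference concrete, here is a minimal sketch comparing the unanchored pattern, the anchored pattern, and isdigit() on the string from the question. The variable names are only illustrative. The last two lines show one caveat worth knowing: isdigit() accepts some non-ASCII characters (such as the superscript two) that \d does not, so the two checks are not strictly interchangeable.

```python
import re

ip = "78.46.92.168:8000"

# Unanchored: re.match() is satisfied by the leading run of digits.
unanchored = re.match(r"\d+", ip)

# Anchored at both ends: the whole string must consist of digits.
anchored = re.match(r"^\d+$", ip)

# str.isdigit() agrees with the anchored regex for these ASCII inputs.
ip_is_digit = ip.isdigit()
port_is_digit = "8000".isdigit()

# Caveat: isdigit() is True for the superscript two, while \d rejects it.
superscript_isdigit = "\u00b2".isdigit()
superscript_regex = re.match(r"^\d+$", "\u00b2")
```

Here unanchored.group() is "78", while anchored is None.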
Answer #2:
re.match() always matches from the start of the string (unlike re.search()) but allows the match to end before the end of the string.
Therefore, you need an anchor: _rex = re.compile(r"\d+$") would work.
To be more explicit, you could also use _rex = re.compile(r"^\d+$") (where the ^ is redundant with match()), or drop match() altogether and call _rex.search(string) with the ^\d+$ pattern.
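The following sketch illustrates why the end anchor alone is enough for match() but not for search(). The names end_anchored, fully_anchored, and results are made up for this example.

```python
import re

ip = "78.46.92.168:8000"
number = "8000"

end_anchored = re.compile(r"\d+$")      # relies on match() to pin the start
fully_anchored = re.compile(r"^\d+$")   # safe with search() as well

results = {
    # match() starts at index 0, and "78" is not followed by the end: no match
    "match_ip": end_anchored.match(ip),
    "match_number": end_anchored.match(number),
    # search() scans forward, so \d+$ alone still finds the trailing "8000"
    "search_ip_end_only": end_anchored.search(ip),
    # with ^ added, search() behaves like the anchored match()
    "search_ip_full": fully_anchored.search(ip),
    "search_number_full": fully_anchored.search(number),
}
```

The "search_ip_end_only" entry is the case to watch for: without ^, search() accepts the IP address because it ends in digits.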
Answer #3:
There are a couple of options in Python to match an entire input with a regex.
Python 2 and 3
In Python 2 and 3, you may use
re.match(r'\d+$', s)  # re.match anchors the match at the start of the string, so $ is what remains to add
or, to avoid matching before the final \n in the string:
re.match(r'\d+\Z', s)  # \Z will only match at the very end of the string
Or the same as above with the re.search method, which requires the ^ / \A start-of-string anchor because it does not anchor the match at the start of the string:
re.search(r'^\d+$', s)
re.search(r'\A\d+\Z', s)
Note that \A is an unambiguous string start anchor; its behavior cannot be redefined with any modifiers (re.M / re.MULTILINE can only redefine the ^ and $ behavior).
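As a small illustration of the anchors discussed above, this sketch compares ^...$ with \A...\Z on a string with a trailing newline and on a multi-line string under re.MULTILINE. The sample strings and variable names are assumptions chosen for the example.

```python
import re

with_newline = "1234\n"
multi_line = "abc\n1234\nxyz"

# $ also matches just before a single trailing newline, so this succeeds.
dollar = re.search(r"^\d+$", with_newline)

# \Z matches only at the very end of the string, so this fails.
strict = re.search(r"\A\d+\Z", with_newline)

# Under re.MULTILINE, ^ and $ match at every line boundary...
dollar_multi = re.search(r"^\d+$", multi_line, re.MULTILINE)

# ...while \A and \Z keep meaning start and end of the whole string.
strict_multi = re.search(r"\A\d+\Z", multi_line, re.MULTILINE)
```

Only the \A...\Z form rejects both inputs, which is why it is the stricter choice when the input may contain newlines.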
Python 3
All the cases described in the above section still apply, plus one more useful method, re.fullmatch (added in Python 3.4; also present in the PyPI regex module):
If the whole string matches the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
So, after you compile the regex, just use the appropriate method:
_rex = re.compile(r"\d+")
if _rex.fullmatch(s):
    doStuff()
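Applied to the class from the question, this could look roughly like the sketch below. It assumes Python 3.4 or later, and it returns a boolean instead of calling doStuff(), which is not defined in the post, so that the behaviour is easy to check.

```python
import re


class Foo:
    # Compiled once at class level; fullmatch() makes anchors unnecessary.
    _rex = re.compile(r"\d+")

    def bar(self, string):
        # True only if the entire string consists of one or more digits.
        return self._rex.fullmatch(string) is not None


foo = Foo()
is_number = foo.bar("8000")
is_ip_number = foo.bar("78.46.92.168:8000")
is_newline_number = foo.bar("1234\n")
is_empty_number = foo.bar("")
```

With fullmatch(), a trailing newline is rejected as well, so there is no need to choose between $ and \Z.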
Answer #4:
\Z matches the end of the string, while $ matches the end of the string or just before the newline at the end of the string, and exhibits different behaviour in re.MULTILINE. See the syntax documentation for detailed information.
>>> s="1234\n"
>>> re.search(r"^\d+\Z",s)
>>> s="1234"
>>> re.search(r"^\d+\Z",s)
<_sre.SRE_Match object at 0xb762ed40>
Answer #5:
Change it from \d+ to ^\d+$.