Question:
Regular Expression to match cross platform newline characters
My program can accept data that has newline characters of \n, \r\n or \r (e.g. Unix, PC or Mac styles).
What is the best way to construct a regular expression that will match whatever the encoding is?
Alternatively, I could use universal_newline support on input, but now I’m interested to see what the regex would be.
Answer #1:
The regex I use when I want to be precise is "\r\n?|\n".
When I’m not concerned about consistency or empty lines, I use "[\r\n]+"; I imagine it makes my programs somewhere in the order of 0.2% faster.
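To make the difference between the two patterns concrete, here is a minimal Python sketch. It assumes the patterns are "\r\n?|\n" and "[\r\n]+" with the backslashes in place, and it uses a made-up sample string (the variable names are illustrative only). The precise pattern treats every line ending as its own separator, so empty lines survive; the loose pattern collapses any run of CR/LF characters into one separator, so empty lines vanish.

```python
import re

# Hypothetical sample mixing Unix (\n), Windows (\r\n), and old Mac (\r)
# line endings, with one empty line between "c" and "d".
text = "a\nb\r\nc\r\rd"

# Precise: \r optionally followed by \n, or a lone \n.
# A Windows \r\n is consumed as a single newline.
precise = re.compile(r"\r\n?|\n")

# Loose: any run of \r and \n characters counts as one separator.
loose = re.compile(r"[\r\n]+")

precise_lines = precise.split(text)
loose_lines = loose.split(text)

print(precise_lines)  # the empty line is preserved as ''
print(loose_lines)    # the empty line is swallowed
```

If empty lines carry meaning in the input (for example, as paragraph breaks), the precise pattern is the safer choice.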
Answer #2:
The pattern can be simplified to \r?\n for a little performance gain, as you probably don’t have to deal with the old Mac style (Mac OS 9 has been unsupported since February 2002).
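The trade-off of the simplified pattern is easy to check. The sketch below assumes the pattern is "\r?\n" and uses made-up sample strings: Unix and Windows endings split as expected, while text that uses only bare \r (old Mac style) is not split at all.

```python
import re

# Simplified pattern: an optional \r followed by \n.
simplified = re.compile(r"\r?\n")

# Hypothetical inputs for illustration.
unix_and_windows = "a\nb\r\nc"
old_mac = "a\rb\rc"

modern_lines = simplified.split(unix_and_windows)
mac_lines = simplified.split(old_mac)

print(modern_lines)  # split on both \n and \r\n
print(mac_lines)     # bare \r is not matched, so the text stays in one piece
```

So the simplification is only safe if the input is known never to contain bare \r line endings.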