"[^"\r\n]*" matches a single-line string that does not allow the quote character to appear inside the string. Using the negated character class is more efficient than using a lazy dot. "[^"]*" allows the string to span across multiple lines.
"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*" matches a single-line string in which the quote character can appear if it is escaped by a backslash. Though this regular expression may seem more complicated than it needs to be, it is much faster than simpler solutions which can cause a whole lot of backtracking in case a double quote appears somewhere all by itself rather than part of a string. "[^"\\]*(?:\\.[^"\\]*)*" allows the string to span multiple lines.
You can adapt the above regexes to match any sequence delimited by two (possibly different) characters. If we use b for the starting character, e and the end, and x as the escape character, the version without escape becomes b[^e\r\n]*e, and the version with escape becomes b[^ex\r\n]*(?:x.[^ex\r\n]*)*e.

Post a Comment