Regex Guru

Sunday, 29 June 2008

Jeff Atwood on Regular Expressions

Filed under: Links — Jan Goyvaerts @ 15:31

Last Friday Jeff Atwood made a case for judicious use of regular expressions in the article Regular Expressions: Now You Have Two Problems on his Coding Horror blog.
Nitpick: In free-spacing mode (RegexOptions.IgnorePatternWhitespace in .NET), the # starts a comment all by itself, which runs to the end of the line. # comment is three [...]
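The free-spacing behavior can be tried out in other flavors too. Below is a minimal sketch in Python rather than .NET, assuming that Python's re.VERBOSE flag is a fair stand-in for the .NET option. The patterns and sample strings are made up for illustration.

```python
import re

# Free-spacing (verbose) mode: whitespace in the pattern is ignored and
# an unescaped "#" starts a comment that runs to the end of the line.
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)

# To match a literal "#" in this mode, it has to be escaped
# (or placed inside a character class).
hash_tag = re.compile(r"""
    \#        # a literal hash sign
    (\w+)     # the tag name
""", re.VERBOSE)

match = date_pattern.search("Posted on 2008-06-29")
date_parts = match.groups()
tag_names = hash_tag.findall("#regex and #dotnet")
```

The comments after each token never take part in the match. Only the escaped \# in the second pattern is treated as a literal character.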

Tuesday, 27 May 2008

Writing Offline

Filed under: About Regex Guru — Jan Goyvaerts @ 12:56

I’m co-writing a book on regular expressions. This blog will likely be quiet until we’re done writing the book.

Thursday, 8 May 2008

Follow Up with Adequate Testing

Filed under: Regex Trouble — Jan Goyvaerts @ 15:05

The regular expression from the Do Follow plugin is dedicated to a single purpose. Repurposing it for your own code will expose shortcomings that don’t matter for the plugin, but may matter for what you’re trying to do. Never copy-and-paste a regex without testing it.

No Follow The Lazy Dot

Filed under: Regex Trouble — Jan Goyvaerts @ 8:31

The popular Do Follow WordPress plugin uses a rather inefficient regular expression for its job. Here’s how to improve it.

Wednesday, 23 April 2008

PCRE Library for MySQL

Filed under: Regex Libraries — Jan Goyvaerts @ 11:24

A RegexBuddy user pointed me to LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library.
MySQL’s built-in regular expression support uses the POSIX ERE flavor. By today’s standards, that flavor offers limited regex functionality. PCRE on the other hand offers all the goodies from Perl and [...]

Tuesday, 15 April 2008

Watch Out for Zero-Length Matches

Filed under: Regex Trouble — Jan Goyvaerts @ 14:51

Zero-length matches are often an unintended result of mistakenly making everything optional in a regular expression. Sometimes they can be useful. In browsers like Firefox, zero-length matches can cause your JavaScript code to loop forever on regex.exec().
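To see where zero-length matches come from, here is a small sketch in Python, with a made-up pattern for decimal numbers in which every token is optional. Python's finditer() steps past empty matches on its own, so this only illustrates the cause, not the JavaScript looping problem itself.

```python
import re

# Every token in this pattern is optional, so it can match the empty string.
optional_number = re.compile(r"\d*\.?\d*")

text = "a 3.14 b"
matches = [m.group() for m in optional_number.finditer(text)]
empty_count = sum(1 for m in matches if m == "")

# Requiring at least one digit removes the zero-length matches.
strict_number = re.compile(r"\d+(?:\.\d+)?|\.\d+")
strict_matches = strict_number.findall(text)
```

The first pattern finds the number, but also reports empty matches at positions where no number starts. The second pattern only succeeds where there is at least one digit to consume.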

Tuesday, 8 April 2008

Unintended Backtracking Can Bite You

Filed under: Regex Trouble — Jan Goyvaerts @ 17:01

Backtracking occurs when the regular expression engine encounters a regex token that does not match the next character in the string. The regex engine will then back up part of what it matched so far, to try different alternatives and/or repetitions. Understanding this process will make all the difference between guessing and understanding [...]
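A short sketch in Python of how backtracking shapes the result, using a made-up example with quoted strings. The greedy dot first runs to the end of the line and then backs up to the last quote. The lazy dot expands one character at a time. The negated character class cannot run past a quote in the first place.

```python
import re

text = 'say "one" and "two"'

# Greedy: .* grabs the rest of the line, then backtracks to the final quote.
greedy = re.findall(r'".*"', text)

# Lazy: .*? matches as little as possible, extending only when the
# closing quote fails to match at the current position.
lazy = re.findall(r'".*?"', text)

# Negated character class: [^"]* can never run past a quote, so the
# engine has no reason to back up on this input.
negated = re.findall(r'"[^"]*"', text)
```

Here the greedy pattern returns one match spanning both quoted strings, while the other two return each quoted string separately.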

Friday, 4 April 2008

Escape Characters Only When Necessary

Filed under: Regex Philosophy — Jan Goyvaerts @ 12:35

The general rule is to escape a character only if it really has to be escaped.

Thursday, 3 April 2008

wxRegEx class in wxWidgets

Filed under: Regex Code — Jan Goyvaerts @ 16:24

The wxRegEx class in the wxWidgets library encapsulates the Advanced Regular Expressions engine developed for Tcl. I’ve added a page of detailed documentation for this class to www.regexp.info. RegexBuddy now includes a template for generating C++ source code snippets using wxRegEx.

Wednesday, 26 March 2008

No One-on-One Advice

Filed under: About Regex Guru — Jan Goyvaerts @ 9:32

I’m sorry, but this blog really isn’t a place for personal advice on regular expressions. Suggestions for topics that I should write about are very welcome.

Friday, 21 March 2008

preg_replace_callback

Filed under: Regex Code — Jan Goyvaerts @ 10:48

PHP’s preg_replace_callback() function allows you to do a search-and-replace using a dynamically generated replacement text which can be different for each regex match.
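The same idea is sketched below in Python rather than PHP: re.sub() accepts a function in place of a replacement string and calls it once per match. The function name double_number and the sample text are made up for illustration.

```python
import re

def double_number(match):
    """Build the replacement text for one match: the matched number, doubled."""
    return str(int(match.group()) * 2)

# The callback runs once per match, so each replacement can differ.
result = re.sub(r"\d+", double_number, "3 apples and 12 pears")
```

A fixed replacement string could not do this, because the new text depends on the value of each individual match.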

Tuesday, 18 March 2008

If You Do It Differently, Document It Clearly

Filed under: The Guru's Kitchen — Jan Goyvaerts @ 18:19

People get used to established standards. Even bad ones. If you come up with something better, make sure to explain it clearly, or brace yourself for lots of complaints that you’re not following the old ways.
