Book Review: Teach Yourself Regular Expressions in 10 Minutes by Ben Forta

"So you've got a problem, and you want to use Regular Expressions to solve it. Now you've got two problems." - Scott Hanselman

Although numerous languages (including .NET languages like C# and VB.NET, as well as JavaScript) and tools (editors like VS.NET and EditPlus) support Regular Expressions, the topic daunts many developers because of its cryptic, hieroglyphic syntax.

When I came across the book SAMS Teach Yourself Regular Expressions in 10 Minutes, I was a little put off by the boastful and exaggerated title. The title does make the subject less daunting, but it is a marketing gimmick: the author reveals in the preface that the book is a tutorial organized into a series of 10 easy-to-follow 10-minute lessons.

I took more than 10 minutes to read through each of the 10 chapters and the appendices, and found the book insightful and educational. The author follows a hands-on approach and an informal style to guide the reader gently through the arcane details of Regular Expressions. There are numerous examples (including an appendix on Regular Expression Solutions to Common Problems), and the detailed analysis of each pattern helps developers build the reasoning they need to write their own patterns independently.

The book does a good job of familiarizing the reader with Regular Expression terminology like Ranges, Metacharacters (including Class metacharacters), Repeating Matches (Interval, Greedy, Lazy), Position Matching, Subexpressions, Lookahead and Lookbehind, and Embedding Conditions, and it also alerts the reader to the pitfalls involved. For instance, the range [A-z] (unlike [A-Z], which matches all uppercase characters from A to Z, or [a-z], which matches all lowercase characters from a to z) matches every character between ASCII A and ASCII z. That span also includes characters such as [ and ^, which fall between Z and a in the ASCII table, making the range dangerous to use.
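The [A-z] pitfall is easy to check for yourself. The snippet below is a minimal sketch using Python's built-in re module (my choice for illustration, not something taken from the book); the variable names are made up for this example.

```python
import re

# The characters that sit between 'Z' (code 90) and 'a' (code 97) in ASCII.
between = "".join(chr(code) for code in range(ord("Z") + 1, ord("a")))

loose = re.compile(r"[A-z]")      # the risky range discussed above
strict = re.compile(r"[A-Za-z]")  # letters only

# Which of the in-between punctuation characters does each set accept?
leaked = [ch for ch in between if loose.fullmatch(ch)]
accepted_by_strict = [ch for ch in between if strict.fullmatch(ch)]

print(leaked)              # ['[', '\\', ']', '^', '_', '`']
print(accepted_by_strict)  # []
```

Spelling out both ranges as [A-Za-z] is the usual way to match letters only.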

The author, Ben Forta, covers the finer details well, such as when he points out that metacharacters like - (hyphen) and ^ (caret) have multiple uses. The hyphen is a metacharacter only when used between [ and ]; outside of a set, - is a literal and matches only -. The caret negates a set only if it is inside the set (enclosed within [ and ]) and is the first character after the opening [. Outside of a set, at the beginning of a pattern, ^ matches the start of the string.
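The hyphen and caret rules above can be seen side by side in a few one-liners. Again this is a small sketch in Python's re module, assumed here only for illustration; the sample strings and variable names are hypothetical, and other engines may differ in details.

```python
import re

# Inside a set, '-' between two characters defines a range.
range_hyphen = re.findall(r"[a-c]", "a-b-d")

# Outside a set, '-' is a literal and matches only '-'.
literal_hyphen = re.findall(r"-", "a-b-d")

# '^' as the first character after '[' negates the set.
negated_set = re.findall(r"[^0-9]", "a1b2")

# '^' anywhere else inside a set is just a literal caret.
literal_caret = re.findall(r"[0-9^]", "2^3")

# Outside a set, at the start of the pattern, '^' anchors to the start of the string.
anchor_caret = re.findall(r"^ab", "abab")

print(range_hyphen)    # ['a', 'b']
print(literal_hyphen)  # ['-', '-']
print(negated_set)     # ['a', 'b']
print(literal_caret)   # ['2', '^', '3']
print(anchor_caret)    # ['ab']
```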

Not all regular expression implementations are the same. An appendix highlights the differences in Regular Expression syntax and features between .NET languages, Visual Studio.NET and JavaScript, among other languages.

A Regular Expression Tester, in versions for various languages, is available for download from the book's home page. Content with the freely downloadable Regular Expression Designer that I have been using for a long time, I have yet to try it.

While there are numerous websites on Regular Expressions to learn from (my favorite being RegExLib.com), I recommend this book as a good starting point for understanding Regular Expressions well and becoming productive quickly.

Comments

  1. Thanks for the review, I'm glad you found the book useful and informative.

    --- Ben
