Understanding and Solving Regular Expression Matching

Regular expressions, colloquially known as regex, are a foundational part of modern computing. They're a staple in string pattern matching, offering a robust way to recognise sequences of characters. LeetCode's Regular Expression Matching problem explores the mechanics of two special characters in regex: . and *.
This is considered a 'hard' coding problem, and you would be incredibly unlucky to get presented with this during an interview. Nevertheless, a little knowledge will help, so let's unravel this challenge.
Understanding the Problem
Problem Statement
Given an input string (s) and a pattern (p), implement regular expression matching with support for . and *.
.matches any single character,*matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Examples
s = "aa",p = "a". Returnsfalse.s = "aa",p = "a*"Returnstrue.
This is a unique challenge as we are essentially being asked to replicate a basic regex engine. It can be tackled with a recursive approach, dynamic programming, or using a backtracking strategy.
Solution: Recursive Approach
A recursive solution breaks down the problem into simpler versions of itself. At every step, we can match a piece of the string against a piece of the pattern until we either have a match, or we otherwise run out of string and/or pattern.
A Solution Using TypeScript
const isMatch = (s: string, p: string): boolean => { // Base case if (!p.length) return !s.length; // Check if the first characters match (or if pattern has a '.') const firstMatch = s.length && (s[0] === p[0] || p[0] === '.'); // Check for a '*' at the second position in the pattern if (p.length >= 2 && p[1] === '*') { return ( // Try not using '*' (i.e., use it to represent zero occurrence) isMatch(s, p.substring(2)) || // Use '*' to represent one or more occurrence (firstMatch && isMatch(s.substring(1), p)) ); } return firstMatch && isMatch(s.substring(1), p.substring(1));};How It Works
Base Case:
If the patternpis empty, then the input stringsshould also be empty for a successful match.First Character Match:
Next, we check if the first characters of the string and pattern match. They match if they are the same or if the pattern has a..Handling
*: If the second character in the pattern is*, we have two choices:- Assume
*represents zero occurrences of the preceding element and match the string with the remaining pattern. - Assume
*represents one occurrence (or more) and move to the next character in the string but use the same pattern. - If there is no
*, we just move to the next characters in both the string and the pattern.
Walking the Solution Step‑By‑Step
Take s = "aab" and p = "c*a*b".
- We notice that the second character in
pis*, so we have two choices. - First, we try using it as zero occurrence, so we match
switha*b. This doesn't work since the first characters don't match. - Next, we consider
*as one or more occurrence. Now,sbecomes"aab"andpremains"cab". - We continue this way, using recursion, until we find a successful match or exhaust our options.
Wrapping up
The Regular Expression Matching problem offers a sneak peek into the inner workings of regex engines. Whilst the recursive solution elucidated above provides a clear approach, it isn't the most optimal.
Dynamic programming can offer a more efficient solution by avoiding redundant computations, but it becomes much more complicated and ‑ perhaps ‑ is a tale for another time.
In essence, understanding and solving this problem equips web developers with deeper insights into string manipulations and pattern recognition.