A regular expression that can be used as a basic filter to detect invalid names.
- I use it in an Indian context; it may not be useful elsewhere.
- Use it with the ‘i’ RegEx flag to ignore case.
- Filters out spelling mistakes & gibberish entries, but not every invalid entry!
- Similarly useful for other English nouns, not just names.
- Useful for quick frontend validation in scenarios where users may type gibberish to get past a mandatory ‘name’ field.
^(?!(?:(?:([a-z]) *\1(?: *\1)*)|(?:.*?(?:(?:(?:^|[^d])([a-z])\2\2)|(?:d([a-df-z])\3\3)).*)|(?:.*?([a-z]{3,})\4\4).*|(?:.*(?:^|[^a-z])[^aeiou \.]{4,}(?:$|[^a-z]).*))$)(?:[a-z]+\.? ){0,2}[a-z]+$
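A minimal sketch of how the pattern might be wired into a frontend check, assuming plain JavaScript. The names `INVALID_NAME_FILTER` and `isPlausibleName` are made up for illustration, and trimming the input first is an assumption, not part of the pattern.

```javascript
// The pattern from above, compiled with the 'i' flag so that case is ignored.
const INVALID_NAME_FILTER = /^(?!(?:(?:([a-z]) *\1(?: *\1)*)|(?:.*?(?:(?:(?:^|[^d])([a-z])\2\2)|(?:d([a-df-z])\3\3)).*)|(?:.*?([a-z]{3,})\4\4).*|(?:.*(?:^|[^a-z])[^aeiou \.]{4,}(?:$|[^a-z]).*))$)(?:[a-z]+\.? ){0,2}[a-z]+$/i;

// Hypothetical helper: returns true when the entry passes the basic filter.
function isPlausibleName(input) {
  // Assumption: surrounding whitespace is trimmed before testing,
  // because the pattern itself rejects leading and trailing spaces.
  return INVALID_NAME_FILTER.test(input.trim());
}

// Example usage
const samples = ["Amit Kumar", "A. K. Sharma", "qwrty", "aaa", "abcabcabc"];
for (const sample of samples) {
  console.log(sample, "->", isPlausibleName(sample) ? "ok" : "rejected");
}
```

Here "Amit Kumar" and "A. K. Sharma" pass, while "qwrty", "aaa" and "abcabcabc" are rejected. Gibberish that happens to contain vowels, such as "asdfgh", still passes, so this is only a first line of defence.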
Logic
- Allows up to three words (e.g. first, middle & last name).
- Allows abbreviations (ending in ‘.’) in all but the last word.
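- Rejects repetition: entries made of a single repeated letter (‘aa’, ‘a a a’) and any sequence of three or more letters repeated three times in a row (‘abcabcabc’).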
- Uses rules of pronounceable English syllables to filter bad entries (based on my limited research):
  - A consonant cannot be repeated more than twice consecutively.
  - A vowel cannot be repeated more than twice consecutively.
    - Exception: ‘e’ can be repeated thrice for certain Indian names, for example, those ending with ‘deee’.
  - … (more to be updated soon)
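As a rough sketch, the single pattern can be split into one small expression per rule, which also makes it possible to say why an entry was rejected. This is one reading of the pattern, not its documented logic: all names below are illustrative, the vowel-less word check is taken from the fourth branch of the pattern and is not described in the list above, and the split is only meant to match the original for single-line input.

```javascript
// Each constant mirrors one branch of the negative lookahead in the full pattern.
// All names are illustrative and not part of the original pattern.

// Whole entry is one letter repeated, optionally spaced out: "aa", "k k k".
const SINGLE_LETTER_REPEATED = /^([a-z]) *\1(?: *\1)*$/i;

// Any letter three times in a row, except "eee" directly after "d" ("deee").
const TRIPLE_LETTER = /(?:(?:^|[^d])([a-z])\1\1)|(?:d([a-df-z])\2\2)/i;

// A chunk of three or more letters repeated three times: "abcabcabc".
const TRIPLE_CHUNK = /([a-z]{3,})\1\1/i;

// A whole word of four or more characters with no vowel: "qwrty".
const VOWELLESS_WORD = /(?:^|[^a-z])[^aeiou \.]{4,}(?:$|[^a-z])/i;

// Shape of an accepted entry: one to three words separated by single spaces,
// where every word but the last may end in ".".
const NAME_SHAPE = /^(?:[a-z]+\.? ){0,2}[a-z]+$/i;

// Returns a short reason string for a rejected entry, or null if it passes.
function explainRejection(input) {
  if (SINGLE_LETTER_REPEATED.test(input)) return "single letter repeated";
  if (TRIPLE_LETTER.test(input)) return "letter repeated three times";
  if (TRIPLE_CHUNK.test(input)) return "chunk repeated three times";
  if (VOWELLESS_WORD.test(input)) return "word without vowels";
  if (!NAME_SHAPE.test(input)) return "not one to three words of letters";
  return null;
}

console.log(explainRejection("Raaaj"));      // "letter repeated three times"
console.log(explainRejection("Sandeee"));    // null ("eee" after "d" is allowed)
console.log(explainRejection("Amit Kumar")); // null
```

Keeping the checks separate costs a few extra regex passes, but it lets a form show a specific hint instead of a generic "invalid name" message.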
Note:
I wrote this a long time ago by analyzing invalid entries made in a fin-tech app used by agents to enter data on behalf of customers. I will update the exact logic and describe the parts of the regex soon!
Test it here & get sample code in JavaScript: