Step-by-Step Guide to Validating a Username and US & Ethiopian Phone Numbers with PHP preg_match

PHP preg_match examples that can be used in real-life projects are hard to come by. Here I try to explain regex in PHP using actual code I have used to validate usernames, emails, phone numbers, etc. However, as usual, I suggest checking the basics of regex on the official PHP website. Also, for simple tasks such as checking whether a string contains a given substring, we should prefer basic PHP string functions like strpos over regex.
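As a small illustration of that last point, here is a minimal sketch of a substring check done with strpos instead of a regex. The $email value is a made-up input used only for this example.

```php
<?php
// Hypothetical input, used only for illustration.
$email = "user@example.com";

// strpos returns the position of the first occurrence, or false if there is none.
// Compare with !== false, because a match at position 0 is also "falsy".
$hasAtSign = strpos($email, '@') !== false;

// A match at the very start of the string returns position 0, not false.
$positionOfU = strpos($email, 'u');

var_dump($hasAtSign);
var_dump($positionOfU);
```

The strict comparison matters: a loose check such as if (strpos(...)) would treat a match at position 0 as "not found".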

PHP preg_match example to validate user name

A typical username might have the following requirements.

  • It must start with a letter.
  • It should have a reasonable length. For example, at least 4 and at most 25 characters.
  • It can only contain letters, numbers, and the underscore character.
  • It cannot end with an underscore.

Considering the above requirements, let’s start writing a preg_match pattern that will validate a username. First, we will take a very easy approach, and then make our pattern better as we learn more about regexes.

The first requirement is to make our pattern start with a letter. To accomplish that, we can use the caret character (^), which has a special meaning in regex.

When we use the caret at the start of a regex pattern, we are telling PHP to match only strings that start with whatever follows it. Hence, a pattern like the one below will match strings starting with PHP.

   $input = "PHP is great";
   preg_match('/^PHP/', $input);//Returns 1, which means a match

Similarly, we can use the dollar character ($) to enforce endings in a regex pattern. That means,

   $input = "PHP is great";
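   preg_match('/^PHP$/', $input);//Returns 0, which means no match

The above code returns 0 because, with both anchors in place, the pattern only matches a string that is exactly ‘PHP’, and our input goes on and ends with ‘great’. To check only the ending, we can drop the caret and use the pattern /great$/ instead.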
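To make the behaviour of the two anchors concrete, here is a short sketch that runs a few patterns against the same input. The variable names are only for illustration.

```php
<?php
$input = "PHP is great";

// preg_match returns 1 on a match, 0 on no match, and false on error.
$startsWithPhp = preg_match('/^PHP/', $input);   // 1: the string starts with "PHP"
$endsWithGreat = preg_match('/great$/', $input); // 1: the string ends with "great"
$isExactlyPhp  = preg_match('/^PHP$/', $input);  // 0: the string is longer than "PHP"
$exactMatch    = preg_match('/^PHP$/', 'PHP');   // 1: the whole string is "PHP"

var_dump($startsWithPhp, $endsWithGreat, $isExactlyPhp, $exactMatch);
```

Because the return value is an integer, comparing it with === 1 is a safe way to turn it into a boolean.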

This is great and all, but it still does not answer the first requirement, because we can’t list every letter of the alphabet and tell PHP to enforce that.

What can we do?

Character classes to specify a range

Instead of listing every letter, regex gives us character classes with ranges. Ranges are great when we want to specify the whole alphabet, both lowercase and uppercase, or every digit. For example:

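$input = "abc";
preg_match('/^[a-zA-Z]+$/', $input);//Returns 1. The + means one or more of the preceding class.

The above code snippet is a good illustration of ranges. It checks that the input is made up only of letters, either lowercase or capital. Without the +, the class [a-zA-Z] matches exactly one character, so /^[a-zA-Z]$/ would only accept a single letter. Similarly, we can use the range [0-9] to check our input for digits.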
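A quick sketch of how ranges behave, using a few made-up inputs. Note that a character class on its own always stands for exactly one character.

```php
<?php
// A class with ranges matches ONE character out of the listed set.
$startsWithLetter = preg_match('/^[a-zA-Z]/', 'abc2020'); // 1: first character is a letter
$startsWithDigit  = preg_match('/^[a-zA-Z]/', '2020abc'); // 0: first character is a digit
$digitFirst       = preg_match('/^[0-9]/', '2020abc');    // 1: first character is a digit
$singleLetterOnly = preg_match('/^[a-zA-Z]$/', 'abc');    // 0: only a one-letter string fits
$oneLetter        = preg_match('/^[a-zA-Z]$/', 'a');      // 1

var_dump($startsWithLetter, $startsWithDigit, $digitFirst, $singleLetterOnly, $oneLetter);
```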

Awesome. Now we are getting there. We are clearly checking if our user input starts and ends with letters.

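The next requirement is the length of the username. To check the length we use the format {n} in regex, where n is the exact number of repetitions allowed for the preceding item.

Furthermore, we can specify a minimum and a maximum inside the braces like this: {n,m}. Do not put a space after the comma, because some PCRE versions then treat the braces as literal text instead of a quantifier.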

So, we are ready to handle the second requirement.

   $input = "abcdefg";
   preg_match('/^[a-zA-Z]{4,25}$/', $input);//Returns 1: seven letters, which is within 4 to 25.
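To see the length limits at work, here is a small sketch that probes the boundaries with made-up inputs. str_repeat is used only to build strings of a known length.

```php
<?php
// {4} means exactly four repetitions of the preceding class.
$exactlyFour  = preg_match('/^[a-z]{4}$/', 'abcd'); // 1
$tooShort     = preg_match('/^[a-z]{4}$/', 'abc');  // 0

// {4,25} means at least 4 and at most 25 repetitions.
$atUpperLimit = preg_match('/^[a-z]{4,25}$/', str_repeat('a', 25)); // 1
$overLimit    = preg_match('/^[a-z]{4,25}$/', str_repeat('a', 26)); // 0

var_dump($exactlyFour, $tooShort, $atUpperLimit, $overLimit);
```

The anchors matter here: without ^ and $, a 26-letter string would still contain a run of 4 to 25 letters and the pattern would match.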

Alright, so we are at the third requirement. And it states that our userName can only contain letters, numbers, and underscore. I think at this point we are capable of doing this without introducing any new concept. We can use ranges like this, [0-9a-zA-Z_].

Finally, we need to forbid an underscore at the end. For now, we will stick with the range syntax, and our final regex with the preg_match function looks like this:

   function validateUserName($userName) {
     if(preg_match('/^[a-zA-Z][0-9a-zA-Z_]{2,23}[0-9a-zA-Z]$/',  $userName)) {
       return true;
     }
     return false;
   }
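A quick way to gain confidence in a pattern like this is to run it against inputs that should pass and inputs that should fail. The sketch below repeats the same pattern under a hypothetical function name, isValidUserName, so that it can run on its own. The sample usernames are made up.

```php
<?php
// Hypothetical helper for this example; it uses the same pattern as above.
function isValidUserName($userName) {
    return preg_match('/^[a-zA-Z][0-9a-zA-Z_]{2,23}[0-9a-zA-Z]$/', $userName) === 1;
}

// Made-up inputs mapped to the result we expect from the requirements.
$cases = [
    'abe_2020'           => true,  // letters, digits and underscore, 8 characters
    'abcd'               => true,  // shortest allowed length
    'abc'                => false, // too short
    '2abebe'             => false, // starts with a digit
    'abebe_'             => false, // ends with an underscore
    'abe be'             => false, // contains a space
    str_repeat('a', 26)  => false, // too long
];

foreach ($cases as $userName => $expected) {
    $actual = isValidUserName((string) $userName);
    echo $userName, ': ', $actual === $expected ? 'ok' : 'UNEXPECTED', "\n";
}
```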

Recap of username regex

Let’s recap. Our regex has three groups.

  • ^[a-zA-Z] The username can only start with a letter, either lowercase or capital.
  • [0-9a-zA-Z_]{2,23} The middle part can be 2 to 23 characters long. Why? Because the first and last characters are matched by the other two groups, so I subtracted those two characters from both the minimum (4) and the maximum (25).
  • [0-9a-zA-Z]$ The username can only end with a digit or a letter.
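The same three groups can also be written out with the x flag, which tells PCRE to ignore whitespace in the pattern and to treat everything from # to the end of the line as a comment. This is just an alternative layout of the same regex, sketched here for readability.

```php
<?php
// The x flag allows whitespace and # comments inside the pattern.
// Avoid the delimiter (/) and single quotes inside the comments.
$pattern = '/
    ^[a-zA-Z]            # first character: a letter
    [0-9a-zA-Z_]{2,23}   # middle: 2 to 23 letters, digits or underscores
    [0-9a-zA-Z]          # last character: a letter or a digit
    $                    # end of the string
/x';

$valid   = preg_match($pattern, 'abe_2020'); // 1
$invalid = preg_match($pattern, 'abebe_');   // 0: ends with an underscore

var_dump($valid, $invalid);
```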

Making our username regex shorter and better.

Right now, our regex works just fine. But it could be better and shorter.

First, we will remove all the ranges for capital letters. That is, if we have [a-z] we don’t need to add A-Z, because we can use the i flag, which in regex means “make my pattern case-insensitive”.

Next, we can use the \w character class instead of [0-9a-zA-Z_].

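Finally, the caret has a second meaning. Inside square brackets it means “not”, so [^_] matches any single character that is not an underscore. That is tempting for our third group, but it is too permissive for validation: [^_] would also accept a space, a dash, or any other symbol as the last character. Since the i flag already covers capital letters, I will write the last group as [a-z0-9] instead.

Here is our improved and final regex to validate the username as per our requirements.

   function validateUserName($userName) {
     // preg_match returns 1 on a match and 0 on no match
     return preg_match('/^[a-z]\w{2,23}[a-z0-9]$/i', $userName) === 1;
   }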

As you can imagine, there are many ways to accomplish what we did here. So I encourage you to try to come up with your own ways of validating a username.
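One variation worth knowing about concerns the end anchor. By default, $ also matches just before a newline at the very end of the string, so an input with a trailing newline can slip through a validation pattern. The sketch below shows two common ways to close that gap, the D flag and the \z anchor. The input is made up, and a simple letters-only pattern stands in for the username regex.

```php
<?php
// Made-up input with a trailing newline (note the double quotes).
$withNewline = "abebe\n";

$defaultDollar = preg_match('/^[a-z]+$/', $withNewline);  // 1: $ matches before the final newline
$dollarEndOnly = preg_match('/^[a-z]+$/D', $withNewline); // 0: D makes $ match only at the very end
$absoluteEnd   = preg_match('/^[a-z]+\z/', $withNewline); // 0: \z matches only at the very end

var_dump($defaultDollar, $dollarEndOnly, $absoluteEnd);
```

Trimming the input before validating it is another option, depending on how strict the form needs to be.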

PHP preg_match to validate US and Ethiopian Phone Numbers

HTML5 forms support different input types, and one of them is the number type. You can use it to restrict users to entering only numbers.

The problem is when you want to let your users format the phone number, for example by letting them type dash characters. In those cases, we can use the character classes and quantifiers we saw above to validate formatted US and Ethiopian phone numbers like this:

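$phoneNumber = "0911-223344";
// The ^ and $ anchors make sure nothing else surrounds the number.
preg_match('/^[0-9]{4}-[0-9]{6}$/', $phoneNumber);//Simple regex to validate an Ethiopian phone number. Returns 1.
preg_match('/^[0-9]{3}-[0-9]{3}-[0-9]{4}$/', $phoneNumber);//Simple regex to validate a US phone number. Returns 0 here, because the input is not in the US format.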

I can simplify these patterns by replacing the character class [0-9] with a simple \d, which matches any digit.
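Here is a minimal sketch of both patterns rewritten with \d and wrapped in helper functions. The function names and the sample numbers are hypothetical. I have also anchored the patterns with ^ and $ so that extra characters around the number are rejected, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helpers for this example.
function isEthiopianPhone(string $phone): bool {
    // Four digits, a dash, then six digits, e.g. 0911-223344.
    return preg_match('/^\d{4}-\d{6}$/', $phone) === 1;
}

function isUsPhone(string $phone): bool {
    // Three digits, a dash, three digits, a dash, then four digits.
    return preg_match('/^\d{3}-\d{3}-\d{4}$/', $phone) === 1;
}

var_dump(isEthiopianPhone('0911-223344')); // bool(true)
var_dump(isUsPhone('202-555-0143'));       // bool(true)
var_dump(isUsPhone('0911-223344'));        // bool(false)
```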

Validating Ethiopian phone number with a country code

But what if the dash between the numbers is optional, or the number starts with +251, which is the country code?

To accomplish this task, we need to introduce ourselves to subpatterns.

Subpatterns enable us to group and validate a small part of our main pattern. For instance, we can group the start of our phone number to be +251 or any country code. Therefore, in our particular case we need the start to be either +251 or 09, which, to my knowledge, are the common phone number patterns in Ethiopia.

Now, my pattern should start with +251. But to match a literal plus sign (+) I need to escape it, because + has its own meaning in regex (one or more of the preceding item). Escaping special characters is done with the backslash (\).

Hence my start will be ^\+251, which checks that a phone number actually starts with +251. However, numbers might start with 09 too. To include that, we need the alternation character, which is the pipe (|).

Using the pipe, the start of my phone number will be ^((\+251|0)\d{3}). This forces the input to start with either +251 or 0, followed by three digits.
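The sketch below checks just this prefix part against a few made-up inputs. It also uses preg_quote, a built-in PHP function that adds the backslashes for us, as one possible way to avoid escaping by hand.

```php
<?php
// preg_quote escapes regex special characters in a literal string.
$escaped = preg_quote('+251', '/'); // gives \+251

// Either +251 or 0, followed by three digits, at the start of the string.
$prefixPattern = '/^(' . $escaped . '|0)\d{3}/';

$withCountryCode = preg_match($prefixPattern, '+251911'); // 1
$withLeadingZero = preg_match($prefixPattern, '0911');    // 1
$missingPlus     = preg_match($prefixPattern, '251911');  // 0: neither +251 nor 0 at the start

var_dump($escaped, $withCountryCode, $withLeadingZero, $missingPlus);
```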

After the initial +251 or 0 followed by three digits, we usually have a dash followed by six digits. So I can write (^(\+251|0)\d{3})-?\d{6}.

Note: the ? right after the dash marks it as optional. In other words, the dash can appear either 0 or 1 times.
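A tiny sketch of the optional dash on its own, using a simplified pattern without the country code part. The inputs are made up.

```php
<?php
// -? means the dash may appear zero times or one time.
$pattern = '/^\d{4}-?\d{6}$/';

$withDash    = preg_match($pattern, '0911-223344');  // 1: one dash
$withoutDash = preg_match($pattern, '0911223344');   // 1: no dash
$twoDashes   = preg_match($pattern, '0911--223344'); // 0: two dashes

var_dump($withDash, $withoutDash, $twoDashes);
```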

Finally, our Ethiopian phone number validation regex looks like this.

$phoneNumber = "+251911-223344";
preg_match('/^(\+251|0)\d{3}-?\d{6}$/', $phoneNumber);//Simple regex to validate an Ethiopian phone number. Returns 1.
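Subpatterns are also useful for pulling the parts of a number out after it has been validated. The sketch below is one possible variation, not the only way to do it: it adds an end anchor ($) so that extra trailing digits are rejected, and it passes a third argument, $matches, which preg_match fills with the text captured by each pair of parentheses. The sample numbers are made up.

```php
<?php
// Three capturing groups: prefix, next three digits, last six digits.
$pattern = '/^(\+251|0)(\d{3})-?(\d{6})$/';

$isMatch = preg_match($pattern, '+251911-223344', $matches);
// $matches[0] is the whole match, $matches[1] to $matches[3] are the groups.
var_dump($isMatch, $matches);

// A few more made-up inputs and the results we expect.
$cases = [
    '0911223344'    => 1, // leading 0, no dash
    '+251911223344' => 1, // country code, no dash
    '0911-22334'    => 0, // only five digits after the dash
    '251911223344'  => 0, // country code without the plus sign
    '0911-2233445'  => 0, // one digit too many
];

foreach ($cases as $phone => $expected) {
    $actual = preg_match($pattern, (string) $phone);
    echo $phone, ': ', $actual === $expected ? 'ok' : 'UNEXPECTED', "\n";
}
```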