Regular expressions are a powerful tool in any language. Here I discuss how to use regular expressions in JavaScript.
In JavaScript, a regular expression can be defined in two ways.
var regex = /hello/gi; // i is for ignore case. g is for global.
var regex = new RegExp("hello", "ig");
If I define a regular expression using RegExp, I need to add an extra escape character in certain cases.
var regex = /hello_\w*/gi;
var regex = new RegExp("hello_\\w*", "ig"); // notice the extra backslash before \w
When I define a regular expression using RegExp, the pattern is a string, so \w must be written as \\w. Otherwise the string literal drops the backslash and the w is taken literally.
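To see what goes wrong, here is a minimal sketch that compares the two spellings. The variable names are only for illustration; the point is that the string literal drops the backslash before the pattern ever reaches RegExp.

```javascript
// "\w" inside an ordinary string literal is just "w", so the character class is lost.
var wrong = new RegExp("hello_\w*", "i");
// "\\w" produces a real backslash followed by w, which RegExp reads as \w.
var right = new RegExp("hello_\\w*", "i");

var wrongSource = wrong.source; // hello_w*
var rightSource = right.source; // hello_\w*

var wrongMatch = "hello_you".match(wrong)[0]; // "hello_" (zero literal w characters)
var rightMatch = "hello_you".match(right)[0]; // "hello_you"

console.log(wrongSource, rightSource, wrongMatch, rightMatch);
```

Checking the source property is a quick way to confirm which pattern the engine actually received.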
The test method checks whether a match is found. It returns true or false.
var regex = /hello/gi;
var text = "hello_you";
var bool = regex.test(text); // true
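One thing worth knowing when test is combined with the g flag: a global regex remembers where the last match ended in its lastIndex property, so calling test repeatedly on the same regex object can alternate between true and false. A small sketch of that behaviour, with illustrative variable names:

```javascript
var regex = /hello/g;
var text = "hello_you";

var first = regex.test(text);          // true, the match ends at index 5
var indexAfterFirst = regex.lastIndex; // 5
var second = regex.test(text);         // false, the search resumed at index 5
var indexAfterSecond = regex.lastIndex; // 0, reset after the failed search
var third = regex.test(text);          // true again

console.log(first, second, third);
```

If only a yes/no answer is needed, dropping the g flag avoids this state entirely.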
The exec method searches the text for a match. It returns an array if a match is found. Otherwise it returns null.
var regex = /hello_\w*/gi;
var text = "hello_you";
var matches = regex.exec(text);
console.log(matches[0]); //=> hello_you
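The array returned by exec holds more than the matched text. As a rough sketch (the capture group and the sample strings are added here for illustration), element 0 is the full match, later elements are the captured groups, and a failed search gives null:

```javascript
var regex = /hello_(\w*)/i;
var result = regex.exec("say hello_you");

var fullMatch = result[0];  // "hello_you"
var captured = result[1];   // "you", the first captured group
var position = result.index; // 4, where the match starts in the text

var noMatch = regex.exec("goodbye"); // null, so check before indexing

console.log(fullMatch, captured, position, noMatch);
```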
The match method acts exactly like the exec method if no g flag is passed. When the global flag is turned on, match returns an Array containing all the matches.
Note that with exec the syntax was regex.exec(text), while with match the syntax is text.match(regex).
var regex = /hello_\w*/i;
var text = "hello_you and hello_me";
var matches = text.match(regex);
console.log(matches); //=> ['hello_you']
Now with the global flag turned on.
var regex = /hello_\w*/gi;
var text = "hello_you and hello_me";
var matches = text.match(regex);
console.log(matches); //=> ['hello_you', 'hello_me']
Once again, neither exec nor match without the g flag gets all the matching values from a string. If you want all the matching values, you need to iterate through the text. Here is an example.
Get both the bug numbers in the following case.
var matches = [];
var match; // declared here so it does not leak as an implicit global
var regex = /#(\d+)/gi;
var text = "I fixed bugs #1234 and #5678";
while ((match = regex.exec(text))) {
  matches.push(match[1]);
}
console.log(matches); // ['1234', '5678']
Note that the above code relies on the global flag g. Without it, exec starts from the beginning of the text on every call and the loop runs forever.
var matches = [];
var regex = /#(\d+)/gi;
var text = "I fixed bugs #1234 and #5678";
matches = text.match(regex);
console.log(matches); // ['#1234', '#5678']
In the above case match is used instead of exec. Since match with the global flag returns all the matches, there is no need to iterate in a loop.
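With the g flag, match returns the whole matched text (such as #1234) rather than the captured digits. One way to get the captured groups without a manual loop is String.prototype.matchAll, assuming an ES2020 or newer environment. A minimal sketch, with variable names chosen for illustration:

```javascript
var text = "I fixed bugs #1234 and #5678";

// match with g: full matches only, captured groups are dropped.
var withHash = text.match(/#(\d+)/g); // ['#1234', '#5678']

// matchAll yields one exec-style result per match, so the groups stay available.
// It throws a TypeError if the regex does not have the g flag.
var numbers = Array.from(text.matchAll(/#(\d+)/g), function (m) {
  return m[1];
}); // ['1234', '5678']

console.log(withHash, numbers);
```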
When a match is made, an array is returned. That array has two additional properties, index and input.
var regex = /#(\d+)/i;
var text = "I fixed bugs #1234 and #5678";
var match = text.match(regex);
console.log(match.index); // 13
console.log(match.input); // I fixed bugs #1234 and #5678
The replace method accepts either a regular expression or a string as its first argument.
var text = "I fixed bugs #1234 and #5678";
var output = text.replace("bugs", "defects");
console.log(output); // I fixed defects #1234 and #5678
Example of using a function to replace text.
var text = "I fixed bugs #1234 and #5678";
var output = text.replace(/\d+/g, function (match) {
  return match * 2;
});
console.log(output); // I fixed bugs #2468 and #11356
Another case.
// Requirement: change every "like" within <b> </b> to "love".
var text = " I like JavaScript. <b> I like JavaScript</b> ";
var output = text.replace(/<b>.*?<\/b>/g, function (match) {
  return match.replace(/like/g, "love");
});
console.log(output); // I like JavaScript. <b> I love JavaScript</b>
Example of using special variables in the replacement string.
$& - the matched substring.
$` - the portion of the string that precedes the matched substring.
$' - the portion of the string that follows the matched substring.
$n - $1, $2 etc., where the number refers to the captured group. Numbering starts at 1.
var regex = /(\w+)\s(\w+)/;
var text = "John Smith";
var output = text.replace(regex, "$2, $1");
console.log(output); // Smith, John
var regex = /JavaScript/;
var text = "I think JavaScript is awesome";
var output = text.replace(regex, "before:$` after:$' full:$&");
// $` is "I think " with its trailing space, hence the two spaces before "after:".
console.log(output); // I think before:I think  after: is awesome full:JavaScript is awesome
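Two related replacement patterns are handy alongside the ones listed above: $$ inserts a literal dollar sign, and $<name> refers to a named captured group (named groups assume an ES2018 or newer engine). A short sketch; the sample strings and group names are made up for illustration:

```javascript
// "$$" produces a literal "$", followed here by the first captured group.
var price = "cost: 5".replace(/(\d+)/, "$$$1"); // "cost: $5"

// Named groups can be referenced by name instead of by position.
var swapped = "John Smith".replace(
  /(?<first>\w+)\s(?<last>\w+)/,
  "$<last>, $<first>"
); // "Smith, John"

console.log(price, swapped);
```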
When a function is passed to replace, it also receives the captured groups as parameters. Here is an example.
var regex = /#(\d*)(.*)@(\w*)/;
var text = "I fixed bug #1234 and tweeted to @javascript";
text.replace(regex, function (_, a, b, c) {
  console.log(_); // #1234 and tweeted to @javascript
  console.log(a); // 1234
  console.log(b); //  and tweeted to
  console.log(c); // javascript
});
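As you can see, the very first argument to the function is the fully matched text. The captured groups follow as subsequent arguments. With the global flag the function is called once for every match, so the same strategy can be used to collect a captured group from each match.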
var bugs = [];
var regex = /#(\d+)/g;
var text = "I fixed bugs #1234 and #5678";
text.replace(regex, function (_, f) {
  bugs.push(f);
});
console.log(bugs); // ["1234", "5678"]
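After the captured groups, the replacement function also receives the offset of the match and the whole input string. The sketch below records the offset of each bug number; the found array and the id property are names picked for illustration. It returns the match unchanged, because whatever the function returns becomes the replacement text.

```javascript
var text = "I fixed bugs #1234 and #5678";
var found = [];

var unchanged = text.replace(/#(\d+)/g, function (match, id, offset) {
  found.push({ id: id, offset: offset });
  return match; // keep the original text; returning nothing would insert "undefined"
});

console.log(found);     // [ { id: '1234', offset: 13 }, { id: '5678', offset: 23 } ]
console.log(unchanged); // I fixed bugs #1234 and #5678
```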
The split method accepts either a string or a regular expression.
An example of split using a string.
var text = "Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec";
var output = text.split(",");
console.log(output); // ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
An example of split using a regular expression.
var text = "Harry Trump ;Fred Barney; Helen Rigby ; Bill Abel ;Chris Hand ";
var regex = /\s*;\s*/;
var output = text.split(regex);
console.log(output); // ["Harry Trump", "Fred Barney", "Helen Rigby", "Bill Abel", "Chris Hand "]
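Notice that the last name above keeps its trailing space, because the separator only consumes whitespace around a semicolon. A small sketch of one way to deal with that, plus a second behaviour of split worth knowing: when the separator regex contains a captured group, the captured text is included in the result. The second sample string is made up for illustration.

```javascript
var text = "Harry Trump ;Fred Barney; Helen Rigby ; Bill Abel ;Chris Hand ";

// Trim the ends first so no name carries leading or trailing whitespace.
var names = text.trim().split(/\s*;\s*/);
// ["Harry Trump", "Fred Barney", "Helen Rigby", "Bill Abel", "Chris Hand"]

// A captured group in the separator keeps the separators in the output.
var parts = "a1b2c".split(/(\d)/); // ["a", "1", "b", "2", "c"]

console.log(names, parts);
```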
The requirement given to me states that I should strictly look for the words java, ruby or rails within word boundaries. This can be done like this.
var text = "java";
var regex = /\bjava\b|\bruby\b|\brails\b/;
text.match(regex);
The above code works. However, notice the duplication of \b in every alternative. It can be refactored to the version given below.
var text = "rails";
var regex = /\b(java|ruby|rails)\b/;
text.match(regex);
The above code works and there is no duplication. However, in this case I am asking the regular expression engine to create a captured group which I will not be using. Regex engines need to do extra work to keep track of captured groups. It would be nice if I could tell the regex engine not to capture this into a group because I will not be using it.
?: is a special symbol that tells the regex engine to create a non-capturing group. The above code can be refactored into the version given below.
var text = "rails";
var regex = /\b(?:java|ruby|rails)\b/;
text.match(regex);
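The difference between the two groupings shows up in the array that match returns. A minimal sketch comparing them (variable names are illustrative), which also confirms that the word boundaries reject a longer word such as javascript:

```javascript
var capturing = "rails".match(/\b(java|ruby|rails)\b/);
var nonCapturing = "rails".match(/\b(?:java|ruby|rails)\b/);

var capturingLength = capturing.length;       // 2: the full match plus one captured group
var nonCapturingLength = nonCapturing.length; // 1: the full match only

// "java" is followed by a word character here, so \b fails and nothing matches.
var insideWord = "javascript".match(/\b(?:java|ruby|rails)\b/); // null

console.log(capturingLength, nonCapturingLength, insideWord);
```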
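var text = "#container a.filter(.top).filter(.bottom).filter(.middle)";
// (?=\)) is a lookahead: a ")" must follow the match but is not included in it.
var matches = text.match(/^[^.]*|\.[^.]*(?=\))/g);
console.log(matches); // ["#container a", ".top", ".bottom", ".middle"]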
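The (?=...) construct used above is a positive lookahead: it requires the given text to follow the match without making it part of the match. A minimal sketch of the difference on a shortened, made-up input:

```javascript
var text = "filter(.top)";

// Matching the ")" directly makes it part of the result.
var withParen = text.match(/\.[^.]*\)/)[0]; // ".top)"

// The lookahead only checks that ")" comes next and leaves it out.
var withLookahead = text.match(/\.[^.]*(?=\))/)[0]; // ".top"

console.log(withParen, withLookahead);
```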