It took me a long time to figure out that gsub expects regular expression syntax. I’m new to R, which is why I didn’t realize this. Below is a copy of the table that I found at http://www.endmemo.com/program/R/gsub.php; I wish I had found it when I was first researching gsub.

I’m posting it here to make it easier to find for people who are searching for gsub examples. Toward the bottom of this post there are also some examples to help you get started.

 

Syntax       Description
\\d          Digit: 0, 1, 2, ..., 9
\\D          Not a digit
\\s          Whitespace (space, tab, newline)
\\S          Not whitespace
\\w          Word character (letter, digit, or underscore)
\\W          Not a word character
\\t          Tab
\\n          New line
^            Beginning of the string
$            End of the string
\\           Escapes special characters (doubled inside an R string), e.g. "\\\\" matches "\" and "\\+" matches "+"
|            Alternation, e.g. "(e|d)n" matches "en" and "dn"
.            Any single character (with perl = TRUE, any character except a newline)
[ab]         a or b
[^ab]        Any character except a and b
[0-9]        Any digit
[A-Z]        Any uppercase letter A to Z
[a-z]        Any lowercase letter a to z
[A-Za-z]     Any uppercase or lowercase letter ([A-z] also matches [ \ ] ^ _ ` so avoid it)
i+           i at least one time
i*           i zero or more times
i?           i zero or one time
i{n}         i occurs n times in sequence
i{n1,n2}     i occurs n1 to n2 times in sequence
i{n1,n2}?    Non-greedy match: as few repetitions as possible
i{n,}        i occurs >= n times
[:alnum:]    Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]    Alphabetic characters: [:lower:] and [:upper:]
[:blank:]    Blank characters: space and tab
[:cntrl:]    Control characters
[:digit:]    Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]    Graphical characters: [:alnum:] and [:punct:]
[:lower:]    Lower-case letters in the current locale
[:print:]    Printable characters: [:alnum:], [:punct:] and space
[:punct:]    Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]    Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]    Upper-case letters in the current locale
[:xdigit:]   Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f

The [:name:] classes only work inside a bracket expression, so in a pattern they are written with double brackets, e.g. "[[:alnum:]]".
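A few of the table entries are easier to understand when you see them run. The snippet below is only a small sketch with made-up strings (the variable names and sample text are my own, not from the endmemo page). It shows a POSIX class inside a bracket expression, the "+" quantifier, and the difference between a greedy and a non-greedy match.

```r
x <- "Order #42: 3 apples, 17 pears!"

# POSIX classes go inside a bracket expression, hence the double brackets.
no_punct <- gsub("[[:punct:]]", "", x)
print(no_punct)   # "Order 42 3 apples 17 pears"

# "+" matches a whole run of digits, so each number is replaced once.
masked <- gsub("[[:digit:]]+", "N", x)
print(masked)     # "Order #N: N apples, N pears!"

# Greedy ".*" runs from the first "<" to the last ">";
# non-greedy ".*?" stops at the nearest ">".
tagged <- "<b>bold</b> and <i>italic</i>"
greedy <- gsub("<.*>", "", tagged)
lazy   <- gsub("<.*?>", "", tagged)
print(greedy)     # ""
print(lazy)       # "bold and italic"
```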

 

Here are a bunch of R gsub examples. I hope they help. Feel free to ask questions in the comments, and to recommend improvements to my code.

Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

Remove the time stamp "\\d:\\d\\d" at the start of the text "^" with gsub in R and replace it with the text "null time"

> Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub("^\\d:\\d\\d", "null time", Message)

> print(Message)

[1] "null time that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."
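The pattern above only matches a one-digit hour. If your data might also contain times like 12:05, a quantifier from the table can cover both cases. This is a sketch with example strings I made up; it also shows that gsub works on a whole vector at once.

```r
# Hypothetical sample data, not from the original post.
times <- c("3:30 that monkey!", "12:05 lunch", "no time here 4:15")

# \\d{1,2} allows one or two hour digits; ^ keeps the match at the start only.
cleaned <- gsub("^\\d{1,2}:\\d{2}", "null time", times)
print(cleaned)
# [1] "null time that monkey!" "null time lunch"        "no time here 4:15"
```

Because the pattern is anchored with "^" it can match at most once per string, so sub() would give the same result here.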

 

Replace a specific number with text using gsub in R

> Message <- "3:30 that monkey! bought 8 bananas, and > 623 grapes."

> Message <- gsub("\\s8\\s", "--EIGHT--", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey! bought--EIGHT--bananas, and > 623 grapes."

 

Replace every single-digit number that is preceded AND followed by a space "\\s" with text using gsub in R

> Message <- "3:30 that monkey! bought 8 bananas, 2 oranges and > 623 grapes."

> Message <- gsub("\\s\\d\\s", " This was a Number ", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey! bought This was a Number bananas, This was a Number oranges and > 623 grapes."
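Notice that 623 was left alone, because "\\d" matches exactly one digit. To catch numbers of any length you can add "+", and if you want to keep the number in the output you can wrap it in parentheses and refer to it as "\\1" in the replacement. This is just a sketch; the angle brackets are an arbitrary marker I picked for illustration.

```r
Message <- "3:30 that monkey! bought 8 bananas, 2 oranges and > 623 grapes."

# \\d+ matches one or more digits, so 623 is now replaced too.
any_length <- gsub("\\s\\d+\\s", " NUM ", Message)
print(any_length)
# [1] "3:30 that monkey! bought NUM bananas, NUM oranges and > NUM grapes."

# (\\d+) captures the digits; \\1 puts them back inside the replacement.
wrapped <- gsub("\\s(\\d+)\\s", " <\\1> ", Message)
print(wrapped)
# [1] "3:30 that monkey! bought <8> bananas, <2> oranges and > <623> grapes."
```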

 

Remove an "!" that is followed by a space "\\s" and replace it with the text " WHAT? "

> Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub("!\\s", " WHAT? ", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey WHAT? bought 8 > bananas, 1 orange, and > 623 grapes."

 

Remove all digits using gsub in R

> Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub("\\d", "", Message, perl = FALSE)

> print(Message)

[1] ": that monkey! bought  > bananas,  orange, and >  grapes."
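Removing the digits leaves double spaces behind where the numbers used to be. One way to tidy that up, sketched here as a second gsub call (the intermediate variable names are my own), is to collapse every run of whitespace into a single space.

```r
Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

# Step 1: strip every digit, as in the example above.
no_digits <- gsub("\\d", "", Message)

# Step 2: \\s+ matches a run of one or more whitespace characters.
collapsed <- gsub("\\s+", " ", no_digits)
print(collapsed)
# [1] ": that monkey! bought > bananas, orange, and > grapes."
```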

 

Remove a comma "," and replace it with nothing "" using gsub

> Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub(",", "", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey! bought 8 > bananas 1 orange and > 623 grapes."
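A comma has no special meaning in a regular expression, so it can be used as-is. A period is different: "." means "any character", so it has to be escaped or matched literally. Here is a small sketch with a made-up string showing what goes wrong and two ways around it.

```r
# Hypothetical sample string, not from the original post.
price <- "1.250.000 grapes."

# Unescaped "." matches every character, so everything is removed.
all_gone <- gsub(".", "", price)
print(all_gone)      # ""

# Escaping the period matches only literal periods.
dots_escaped <- gsub("\\.", "", price)
print(dots_escaped)  # "1250000 grapes"

# fixed = TRUE turns off regular expressions and matches the text literally.
dots_fixed <- gsub(".", "", price, fixed = TRUE)
print(dots_fixed)    # "1250000 grapes"
```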

 

Remove an arrow ">" that is preceded by and followed by a space "\\s" and replace it with a single space " "

> Message <- "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub("\\s>\\s", " ", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey! bought 8 bananas, 1 orange, and 623 grapes."

 

Remove a backslash "\" that is preceded by and followed by a space "\\s" and replace it with a single space " " using the gsub function in R. A literal backslash is written "\\" inside an R string, and "\\\\" inside a pattern.

> Message <- "3:30 that monkey! bought \\ 8 > bananas, 1 orange, and > 623 grapes."

> Message <- gsub("\\s\\\\\\s", " ", Message, perl = FALSE)

> print(Message)

[1] "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."
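Backslashes are the hardest case because they get escaped twice: once for the R string and once for the regular expression. If counting backslashes gets confusing, fixed = TRUE may be easier, since the pattern is then matched literally. The sketch below compares both approaches; the variable names are my own.

```r
# "\\" in R source is a single backslash character in the string.
Message <- "3:30 that monkey! bought \\ 8 > bananas, 1 orange, and > 623 grapes."
print(nchar("\\"))   # 1

# Regex version: \\s, then \\\\ for a literal backslash, then \\s.
via_regex <- gsub("\\s\\\\\\s", " ", Message)

# Literal version: space, one backslash, space; no regex escaping needed.
via_fixed <- gsub(" \\ ", " ", Message, fixed = TRUE)

print(via_regex)
# [1] "3:30 that monkey! bought 8 > bananas, 1 orange, and > 623 grapes."
print(identical(via_regex, via_fixed))   # TRUE
```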

 

 

Some great resources for more info:

 

http://astrostatistics.psu.edu/su07/R/html/base/html/grep.html

https://stackoverflow.com/questions/35655485/replace-with-space-using-gsub-in-r

https://rpubs.com/alucas/160142

 
