In this post we will see how to delete repeated words. There is a human tendency to write fast and and when we review our writing we find repeated words side by side. If you look closely, I wrote "and" twice. This happens because the mind processes ahead of the word we are actually writing. It is hard to read an entire file for duplicate words when the file is big enough that we end up skimming, and skimming can even cause us to skip some words. A better procedure is to use tools such as sed and Perl/Python with the help of regular expressions.

I have a file abc.txt with the following data.

cat abc.txt

Output:
This is is how it works buddy
What else else you want

Remove repeated words with sed as given below.

sed -ri 's/(.* )\1/\1/g' abc.txt
cat abc.txt

Output:
This is how it works buddy
What else you want

Let me explain the sed command we used.

The -r option enables Extended Regular Expressions, which allow grouping with () parentheses.

The -i option writes the changes into the original file in place. Be careful with this option, as you cannot get your original file back once it is modified.

(.* ) matches any group of characters ending in a space, and it must be followed by the same set of characters, which is represented by \1. This concept is called a back reference, where \1 stores the first set of characters enclosed in the first...
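Since the post mentions Python as an alternative to sed, here is a minimal sketch of the same back-reference idea using Python's re module. It is not the author's method: the function name remove_repeated_words and the word-bounded pattern are assumptions made for illustration. Unlike (.* )\1, this pattern only matches whole words, so text such as "the theme" is left alone.

```python
import re

# Assumed pattern (not from the original post): a whole word, followed by
# one or more repetitions of whitespace plus that same word.
# \1 is the back reference to the first captured group.
REPEATED_WORD = re.compile(r"\b(\w+)(\s+\1\b)+")


def remove_repeated_words(text: str) -> str:
    """Collapse runs of an immediately repeated word into a single word."""
    return REPEATED_WORD.sub(r"\1", text)


if __name__ == "__main__":
    sample = "This is is how it works buddy\nWhat else else you want"
    print(remove_repeated_words(sample))
```

Because \s+ also matches newlines, a word repeated across a line break is collapsed as well; replace \s+ with a plain space in the pattern if that is not wanted.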
My name is Surendra Kumar Anne. I hail from Vijayawada, which is the cultural capital of the south Indian state of Andhra Pradesh. I am a Linux evangelist who believes in hard work, a down-to-earth person who likes to share knowledge with others, loves dogs, and likes photography. At present I work at Bank of America as Sr. Analyst Systems and Administration. You can contact me at surendra (@) linuxnix dot com.