I have searched a lot but could not find a solution. I know how to remove all tags using sed,
but I need to remove only those HTML tags that are empty or contain just tabs or spaces, and also remove tags explicitly. For example:
<p></p> or <p> </p>
I used the following command to remove all the HTML tags. It works properly, but I don't want to remove all tags.
sed -e 's/<[^>]*>//g' myfile.html
The same command is used here. Kindly help me out.
Avinash Raj :
You could use the below sed command to remove only the empty tags.

sed 's/<[^\/][^<>]*> *<\/[^<>]*>//g' file

Through Perl,

perl -pe 's/<([^<>]*)>\s*<\/\1>//g' file
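
The sed expression above matches only spaces between the opening and closing tag, while the question also mentions tabs. A minimal sketch of one way to cover both, assuming a sed that supports POSIX character classes, is to swap the ` *` for `[[:space:]]*`. The helper name `strip_empty_tags` and the sample input are made up for illustration.

```shell
#!/bin/sh
# strip_empty_tags is a hypothetical helper name, not part of sed.
# It deletes an opening tag that is followed, after optional spaces or tabs,
# by a closing tag. Like the sed command in the answer, it does not check
# that the two tag names are the same.
strip_empty_tags() {
    sed 's/<[^\/][^<>]*>[[:space:]]*<\/[^<>]*>//g' "$@"
}

# Made-up sample: an empty tag, a space-only tag, a tab-only tag, a tag with text.
printf '<p></p><p> </p><p>\t</p><p>text</p>\n' | strip_empty_tags
# prints: <p>text</p>
```

Since sed works line by line, a tag pair split across two lines is not matched. A single pass can also leave a parent that has just become empty, for example `<div><p></p></div>` turns into `<div></div>`, so the command may need to be run again on its own output.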
2014-10-22T05:41:25