Linux / UNIX Tech Support Forum
Thread: Perl / Shell Regex to clean and remove all HTML tags (Coding in General forum, Development/Scripting category)
Hello,
I have the text below and need to extract all domain names ending in .ro. I tried this Perl script:
Code:
#!/usr/bin/perl -wnl
$_=~ s/\Q>\E/ /g and /.+ *\Q.ro\E/i and print $&;
The input text:
Code:
<td ALIGN=CENTER style="padding:.75pt;height:1">
<center>
<p><font size="2">92</font></p>
</center>
</td>
<td ALIGN=CENTER style="padding:.75pt;height:1">
<center>
<p><font size="2">domain1.ro<br>
domain2.com.ro<br>
domain3.ro<br>
domain4.ro</font></p>
</center>
</td>
Cosmin
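How about a sed one-liner? This strips every HTML tag; when a tag is still open at the end of a line, it pulls in the next line and tries again:
Code:
sed -e :a -e 's/<[^>]*>//g;/</N;//ba' file.html
To keep only the lines that mention .ro, filter with grep first (note the escaped dot, so that it matches a literal "." rather than any character):
Code:
grep '\.ro' file.html | sed -e :a -e 's/<[^>]*>//g;/</N;//ba'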
__________________
Vivek Gite, Linux Evangelist
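The one-liners above leave the surrounding text on each line, so here is a rough sketch of how the individual .ro names could be pulled out, one per line. The function name extract_ro_domains is made up for illustration, and the sketch assumes each tag opens and closes on the same line, as in the sample from the question. It turns every tag into a space, splits the result on whitespace, and keeps only tokens that look like a host name ending in .ro:

```shell
#!/bin/sh
# extract_ro_domains: hypothetical helper, not part of the original thread.
# Reads HTML from the named files (or stdin) and prints one .ro domain per line.
# Assumption: no tag spans more than one line.
extract_ro_domains() {
    # 1. Replace each tag with a space so adjacent words stay separated.
    # 2. Put every whitespace-separated token on its own line.
    # 3. Keep tokens made of letters, digits, dots and hyphens that end in ".ro".
    sed -e 's/<[^>]*>/ /g' "$@" |
        tr -s ' \t' '\n' |
        grep -Ei '^[a-z0-9][a-z0-9.-]*\.ro$'
}

# Usage with the table cell from the question:
extract_ro_domains <<'EOF'
<td ALIGN=CENTER style="padding:.75pt;height:1">
<center>
<p><font size="2">92</font></p>
</center>
</td>
<td ALIGN=CENTER style="padding:.75pt;height:1">
<center>
<p><font size="2">domain1.ro<br>
domain2.com.ro<br>
domain3.ro<br>
domain4.ro</font></p>
</center>
</td>
EOF
# prints:
# domain1.ro
# domain2.com.ro
# domain3.ro
# domain4.ro
```

If the real file has tags broken across lines, running it through the looping sed command above first and piping the result into the tr and grep stages should cover that case.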