nixCraft Linux Forum

awk scripting help

#1 | tears | 14-11-2009, 08:50 AM

Hi. I need a little assistance writing an awk script on Linux that reads a file that looks something like this:

>description 1-blah blah blah
ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE
>description 2-blah blah blah
SDFGHTYUIOPQNMZHJKGCDVBDKDJLDJLJDLDJLJLJLJDJDJJLD
YTRQWUATSHNZMKSLDOEBDLDOSLSPWGSSHABAHAHAKLKAA
HSTDFDDVDGDGD
etc

The script should first read the line of the description (or heading separators):

>description 1-blah blah blah

and then read the random sequence of letters below it:

ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE

I need a script that reads the sequence of letters under each description heading and adds 1 to a counter if the letter "J" is missing from the sequence. If the sequence has at least one J, it should skip to the next sequence.

Can anyone help?

Thank You.

Last edited by tears; 14-11-2009 at 08:52 AM.
#2 | jaysunn | 15-11-2009, 12:21 AM

Not sure I understand you completely; however, try this:

J is used as the delimiter, and the length of each piece between the J's is printed. As I said, not sure if this is what you want.
Code:
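#!/usr/bin/perl
# Print each header line, then split every sequence line on 'J'
# and print each piece followed by its length.

use strict;
use warnings;

# Three-argument open with a lexical filehandle.
open my $seq_fh, '<', 'seq.dat'
  or die "can't open file: $!";

while (my $line = <$seq_fh>)
{
   chomp $line;

   if ( $line =~ m/description/ )
   {
      # Header line: print it unchanged.
      print "$line\n";
   }
   else
   {
      # Sequence line: 'J' acts as the delimiter.
      foreach my $seq (split /J/, $line)
      {
         my $seq_len = length($seq);
         print "$seq $seq_len\n";
      }
   }
}

close $seq_fh
  or die "can't close file: $!";
exit;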

Code:
[root@radio5 ~]# ./perl.pl 
>description 1-blah blah blah
ASDFGH 6
KLMNPTRWQCVNM 13
KLOTYUQBEBDB 12
DB 2
DB 2
DB 2
D 1
 0
K 1
H 1
H 1
HDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE 39
>description 2-blah blah blah
SDFGHTYUIOPQNMZH 16
KGCDVBDKD 9
LD 2
L 1
DLD 3
L 1
L 1
L 1
D 1
D 1
 0
LD 2
YTRQWUATSHNZMKSLDOEBDLDOSLSPWGSSHABAHAHAKLKAA 45
HSTDFDDVDGDGD 13
[root@radio5 ~]#

Jaysunn
__________________
Have a look at what I have been working on
http://www.shellasaurus.com

Last edited by jaysunn; 15-11-2009 at 12:25 AM.
#3 | tears | 15-11-2009, 12:46 AM
clarification of script question

Thanks, Jaysunn. I think my earlier example had a mistake in it.
Maybe it will be easier for me to give a direct example of what I need:

For the sequence:
>description 1-blah blah blah
ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE

We would NOT count that sequence because the letter J is present.

For the sequence:
>description 2-blah blah blah
SDFGHTYUIOPQNMZHPKGCDVBDKDPLDPLPDLDPLPLPLPDPDPPLD
YTRQWUATSHNZMKSLDOEBDLDOSLSPWGSSHABAHAHAKLKAA
HSTDFDDVDGDGD

We would count that sequence since there is no J present.

So the total number of sequences without a J is 1 (in this example),
in a file containing these two sequences.

Does your script do that?

Last edited by tears; 16-11-2009 at 10:46 AM.
#4 | jaysunn | 15-11-2009, 06:02 PM

This was written by another expert. Not me.

Try this. And make sure you change infile to the name of your input file.


Code:
awk '/^>/{if(s==1)t++;s=1;next} /J/{s=0} END{if(s==1)t++;print "J-less sequences: "t}' infile
This is sweet code that I am still trying to understand.
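
For anyone else trying to follow it, here is the same logic spread over several lines with comments. This is only a sketch, not the original author's code: the wrapper function name count_jless and the variable names j_free and total are made up for readability, and the "total + 0" is an addition so that a file with no J-less sequences prints 0 instead of an empty value.

```shell
#!/bin/sh
# count_jless FILE: count the ">"-headed records whose sequence lines
# contain no "J". The function name is hypothetical.
count_jless() {
    awk '
        # Header line: the previous record (if any) is finished.
        /^>/ {
            if (j_free == 1) total++   # previous record had no J
            j_free = 1                 # assume no J until one is seen
            next
        }
        # Any sequence line with a J disqualifies the current record.
        /J/ { j_free = 0 }
        # The last record has no following header, so check it here.
        END {
            if (j_free == 1) total++
            print "J-less sequences: " (total + 0)
        }
    ' "$1"
}
```

Called as "count_jless infile", it should behave like the one-liner: the flag is set at every header, cleared by any line containing J, and checked when the next header (or the end of the file) is reached.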


Jaysunn
#5 | jaysunn | 15-11-2009, 08:31 PM

Actually,

The above code did not produce what the O/P required. Here is another program that should work.
Again, this is not my code, and I thank "ahmad.diab" and "scrutinizer" for the assistance. I learned from this as well.

Code:
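gawk '
# Header line: remember it and start its count at 0.
/^>/{ t=$0 ; a[t]=0 ; next }
# Sequence line containing J: count it under the current header.
/J/{ a[t]++ }
# Print the per-header counts (for-in order is unspecified in awk).
END{ for ( i in a ) { printf "J exist in %s sequence(s) of header %s\n",a[i],i } }
'  input_file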
Output
Code:
J exist in 2 sequence(s) of header >description 1-blah blah blah
J exist in 1 sequence(s) of header >description 2-blah blah blah
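
Note that this program counts the lines that contain a J under each header, not the J's themselves. If a count of the letters is wanted instead, something along these lines might do. It is a sketch and does not come from the helpers credited above: it relies on gsub() returning the number of substitutions it makes, and the function name count_j_chars and the variable names are invented for the example. Headers are printed in file order, which "for ( i in a )" does not guarantee.

```shell
#!/bin/sh
# count_j_chars FILE: for each ">" header, print how many times the
# letter J occurs in the sequence lines below it (hypothetical helper).
count_j_chars() {
    awk '
        # Header line: print the finished record, then start a new one.
        /^>/ {
            if (hdr != "") printf "%d J in %s\n", n, hdr
            hdr = $0
            n = 0
            next
        }
        # gsub returns the number of replacements made on this line.
        { n += gsub(/J/, "J") }
        # The last record is printed at end of input.
        END { if (hdr != "") printf "%d J in %s\n", n, hdr }
    ' "$1"
}
```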

Jaysunn
__________________
Have a look at what I have been working on
http://www.shellasaurus.com
#6 | tears | 16-11-2009, 10:31 AM
It worked.

Hi Jaysunn,

The first code below worked well:
awk '/^>/{if(s==1)t++;s=1;next} /J/{s=0} END{if(s==1)t++;print "J-less sequences: "t}' infile

It is exactly what I needed. This site is awesome! I thank everyone who helped. Does the site take a donation?

Maybe you could explain the code a little and I can learn too.

The second code also worked for what it does, but I just needed a total count of J-less sequences. Is there a way to print out the headers for those sequences without a J?

Best,
Tears
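
On the last question, printing the headers of the sequences without a J: nobody in the thread posted code for that, so the following is only a sketch built on the same idea as the first one-liner. It remembers the current header and prints it once the record turns out to be J-free. The function names list_jless and report, and the variable names, are assumptions made for the example.

```shell
#!/bin/sh
# list_jless FILE: print the header of every record whose sequence
# lines contain no "J", followed by the total (hypothetical helper).
list_jless() {
    awk '
        # Print the pending header if its record had no J.
        function report() {
            if (hdr != "" && j_free) { print hdr; total++ }
        }
        # Header line: settle the previous record, start a new one.
        /^>/ { report(); hdr = $0; j_free = 1; next }
        # A J anywhere in the sequence lines disqualifies the record.
        /J/  { j_free = 0 }
        END  { report(); print "J-less sequences: " (total + 0) }
    ' "$1"
}
```

Because the header rule ends with "next", a J that happens to appear in a header line itself is not counted against the sequence.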

Last edited by tears; 16-11-2009 at 10:48 AM.
#7 | jaysunn | 16-11-2009, 05:56 PM

Hey,

Great news that it worked for you.


Quote:
Does the site take a donation?
Hmm. I will be sure to ask the admin about this. However, I answer questions for the knowledge that I gain, and it makes me feel good to help others.

I work for thanks. You know, that little thumbs-up icon that you can click to add to someone's reputation.
HEHE,

Jaysunn
All times are GMT +5.5.

