Linux / UNIX Tech Support Forum
This is a discussion on awk scripting help within the Shell scripting forums, part of the Development/Scripting category.
Hi. I need a little assistance writing an awk script on Linux that reads a file that looks something like this:

>description 1-blah blah blah
ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE
>description 2-blah blah blah
SDFGHTYUIOPQNMZHJKGCDVBDKDJLDJLJDLDJLJLJLJDJDJJLD
YTRQWUATSHNZMKSLDOEBDLDOSLSPWGSSHABAHAHAKLKAA
HSTDFDDVDGDGD
etc

The script should first read the description line (the heading separator):

>description 1-blah blah blah

and then read the random sequence of letters below it:

ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE

I need a script that reads the sequence of letters under each description heading and adds 1 to a count if the letter "J" is missing from the sequence. If the sequence has at least one J, it should skip to the next sequence.

Can anyone help? Thank you.

Last edited by tears; 14-11-2009 at 08:52 AM.
Thanks Jaysunn. I think my earlier example had a mistake in it.

Maybe it will be easier for me to directly give an example of what I need. For the sequence:

>description 1-blah blah blah
ASDFGHJKLMNPTRWQCVNMJKLOTYUQBEBDBJDBJDBJDBJDJJK
HJHJHDTYREWQDAXZCDBNDMMSKSKHGFSHSCVAIOPQTRE

we would NOT count that sequence because the letter J is present. For the sequence:

>description 2-blah blah blah
SDFGHTYUIOPQNMZHPKGCDVBDKDPLDPLPDLDPLPLPLPDPDPPLD
YTRQWUATSHNZMKSLDOEBDLDOSLSPWGSSHABAHAHAKLKAA
HSTDFDDVDGDGD

we would count that sequence since there is no J present. So the total number of sequences without a J is 1 (in this example), in a file containing these two sequences. Does your script do that?

Last edited by tears; 16-11-2009 at 10:46 AM.
This was written by another expert. Not me.
Try this. And make sure you change infile to the name of your input file. Code:
awk '/^>/{if(s==1)t++;s=1;next} /J/{s=0} END{if(s==1)t++;print "J-less sequences: "t}' infile
Jaysunn
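For anyone wondering how the one-liner above works, here is the same logic spread over several lines with comments. This is only a sketch: the wrapper function name count_jless is made up for illustration, and the (t + 0) in the print statement is a small addition so that a file with no J-less sequences prints 0 instead of an empty value.

```shell
# count_jless is a hypothetical wrapper name; the awk program is the
# one-liner from the post above, expanded and commented.
count_jless() {
    awk '
        /^>/ {                  # header line: a new sequence starts here
            if (s == 1) t++     # previous sequence finished with no J seen
            s = 1               # assume the new sequence is J-less for now
            next                # do not test the header text itself for J
        }
        /J/ { s = 0 }           # a sequence line containing J clears the flag
        END {
            if (s == 1) t++     # account for the last sequence in the file
            # (t + 0) is an addition to the original: forces a numeric 0
            print "J-less sequences: " (t + 0)
        }
    ' "$1"
}
```

Called as count_jless infile, it should behave like the one-liner. The flag s only exists once the first header has been read, so lines before the first ">" are ignored.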
Actually,
The code above did not produce what the O/P required. Here is another program that should work. Again, this is not my code, and I thank "ahmad.diab" and "scrutinizer" for the assistance. I learned from this as well. Code:
gawk '
/^>/{ t=$0 ; a[t]=0 ;next}
/J/{a[t]++ }
END{ for ( i in a ) { printf "J exist in %s sequence(s) of header %s\n",a[i],i } }
' input_file
Code:
J exist in 2 sequence(s) of header >description 1-blah blah blah
J exist in 1 sequence(s) of header >description 2-blah blah blah
Jaysunn
Hi Jaysunn,
The first code below worked well:

awk '/^>/{if(s==1)t++;s=1;next} /J/{s=0} END{if(s==1)t++;print "J-less sequences: "t}' infile

It is exactly what I needed. This site is awesome! I thank everyone who helped. Does the site take a donation? Maybe you could explain the code a little and I can learn too.

The second code also worked for what it does, but I just needed a total count of J-less sequences. Is there a way to print out the headers for those sequences without a J?

Best,
Tears

Last edited by tears; 16-11-2009 at 10:48 AM.
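On printing the headers of the J-less sequences: one possible approach, sketched here and not taken from the thread, is to remember the current header and print it once its sequence ends without a J having been seen. The function name list_jless and the variable names h and j are hypothetical.

```shell
# list_jless is a hypothetical helper, not code posted in the thread.
list_jless() {
    awk '
        /^>/ {
            if (h != "" && !j) print h   # previous sequence had no J
            h = $0                       # remember the new header
            j = 0                        # no J seen yet under this header
            next
        }
        /J/ { j = 1 }                    # mark the current sequence as having a J
        END {
            if (h != "" && !j) print h   # handle the last sequence in the file
        }
    ' "$1"
}
```

Called as list_jless infile on the corrected two-sequence example, this should print only the line >description 2-blah blah blah. Headers come out in file order, which a for (i in a) loop over an awk array does not guarantee.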
Hey,
Great news that it worked for you.
Quote:
Does the site take a donation?
I work for thanks. You know that little thumbs-up icon that you can click to add to someone's reputation. HEHE,
Jaysunn