Thread: Making Report By Extracting data from Multiple files

  1. #1
    Junior Member
    Join Date
    Dec 2012
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    0

    I have a text file for every day, containing data in two columns. The first column's data varies from file to file,
    but the second column's values can be the same in all files.

    1st file  2nd file  3rd file  4th file  5th file
    13|200    32|200    45|200    21|200    43|200
    45|400    53|400    22|410    75|400    46|345
    34|345    75|345    45|345    45|410    98|410

    I want to produce a report like this using a script:

    200  400  345  410
    13   45   34
    32   53   75
    45        45   22
    21   75        45
    43        46   98


    Thanks

  2. #2
    Senior Member
    Join Date
    Aug 2011
    Posts
    367
    Thanks
    0
    Thanked 55 Times in 51 Posts
    Rep Power
    7

    Explain the logic behind the desired output.

    Is it «transposing the matrix/file»? awk can do this; search a bit: it's a frequently asked question.
    «A problem clearly stated is a problem half solved.»

  3. #3
    Junior Member

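    I want to make a monthly report based on the daily occurrence of errors.
    I have tried the following code:

    # try every code from 200 to 999 and collect all matches in one report
    for ((x=200; x<=999; x++))
    do
    # -v passes the shell variable into awk; == compares (a single = would assign)
    find . -name "File-2012120*" -type f | xargs awk -F "|" -v x="$x" '$2 == x {print $1 "|" x}'
    done > report.txt

    But it displays the data in a single column.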

  4. #4
    Senior Member
    Join Date
    Aug 2011
    Posts
    367
    Thanks
    0
    Thanked 55 Times in 51 Posts
    Rep Power
    7

    Default

    let's say you've got a file named 2012120-1 (we'll start with one file)
    in the format
    Code:
    13|200
    45|400
    34|345
    you want to test if the second field is more than 200 and less than 999; then write the first field on a line, and the second field on the next one.

    is that it?

    you need to state precisely what you want done before writing any code!

    what is the hierarchy of the current directory? are all log files directly in it, or could there be some in sub-directories? is `find' really needed?

  5. #5
    Junior Member

    The loop checks for occurrences of 200 in all files, to display them under the column named "200".
    If a value is missing (for example, 400 is missing in the 3rd file), the report should show a blank or empty space there.

    1st file
    13|200
    45|400
    34|345

    2nd file
    32|200
    53|400
    75|345


    will become (it's a conversion of columns into rows):
    200 400 345
    13 45 34
    32 53 75


    There are no subdirectories, so there is no need to use find. I think two loops will be needed for this task:
    one outer loop and one inner loop.
    Last edited by Ahsanyousaf; 10th December 2012 at 06:16 PM.

  6. #6
    Senior Member

    it's a much more complicated problem than you seem to think (or that I first thought)

    I can't yet see a quick solution, but I'm sure awk can be used to do such things (as can perl or python).

    it implies, imho, a multidimensional (2D) array, plus a one-dimensional one, with filenames as the index for the output rows.
    the 1D array stores the "headers": the second field
    the 2D array stores the "data" (the first field), indexed by FILENAME and the second field
    indices have to be sorted, because awk arrays are hashed and come back in no particular order.
    ...
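
    A minimal sketch of just the indexing idea, not a full solution. awk simulates a 2D array by joining the two subscripts with SUBSEP, so D[FILENAME,$2] is really one string key, and "(f, h) in D" tests for a cell without creating it. The helper name "cell" and the file names day1 and day2 are made up for illustration.

```shell
#!/bin/sh
# cell FILE HEADER FILES...: print the first-field value stored for (FILE, HEADER).
cell() {
    want_file=$1
    want_hdr=$2
    shift 2
    awk -F'|' -v f="$want_file" -v h="$want_hdr" '
        { D[FILENAME, $2] = $1 }             # the key is really FILENAME SUBSEP $2
        END {
            if ((f, h) in D) print D[f, h]   # the membership test does not create the entry
            else             print "(empty)"
        }
    ' "$@"
}

# Example with two made-up files in a scratch directory:
dir=$(mktemp -d)
printf '13|200\n45|400\n' > "$dir/day1"
printf '32|200\n'         > "$dir/day2"
cell "$dir/day2" 200 "$dir/day1" "$dir/day2"
cell "$dir/day2" 400 "$dir/day1" "$dir/day2"
rm -rf "$dir"
```

    The second call asks for a code that day2 does not contain, which is the "blank cell" case the report has to handle.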

    I'm sorry, at the moment, it's still vague for me, and I can't help more.

  7. #7
    Is that all you got? rockdalinux
    Join Date
    May 2005
    Location
    Planet Vegeta
    Posts
    983
    Thanks
    27
    Thanked 70 Times in 61 Posts
    Rep Power
    19

    A quick and dirty solution. Go into the directory where all the files are, say /foo/bar:
    Code:
    cd /foo/bar
    and run the following script, which can live anywhere (say /home/you/bin/dirty-report-writer.bash):
    Code:
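    #!/bin/bash
    # Not award-winning code.
    # Works on every file in the current directory.
    n=$(cut -d'|' -f2 * | sort -u)      # distinct second-field values
    for i in $n
    do 
        file="/tmp/$i.file.$$.tmp"
        for f in *
        do
            # match the code only as the second field, not anywhere on the line
            grep "|$i\$" "$f" | cut -d'|' -f1 >>"$file"
        done
    done
    
    ## header 
    echo
    for h in $n
    do
        echo -ne "$h\t"
    done
    echo
    
    ## paste it (only this run's temporary files)
    paste /tmp/*.file.$$.tmp
    
    ## send it to devnull
    rm -f /tmp/*.file.$$.tmp
    This is what I got: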
    Code:
    200	345	400	410	
    13	43	45	22
    32	75	53	45
    45	45	75	98
    21	46		
    43
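
    In the output above the columns have different lengths: paste lines values up by position, and a file with no entry for a code contributes nothing to that column. A possible tweak, sketched on the assumption that every input line is value|code and that blank cells are acceptable, is to write exactly one line per input file into each column file, empty when the code is absent. The function name padded_report and the use of mktemp -d are choices made for this sketch, not part of the script above.

```shell
#!/bin/sh
# padded_report DIR: print a header row of codes, then one row per file in DIR.
# The body runs in a subshell (parentheses) so its variables and "set --" stay local.
padded_report() (
    dir=$1
    tmp=$(mktemp -d) || exit 1
    codes=$(cut -d'|' -f2 "$dir"/* | sort -u)

    set --                                  # collects the column files in header order
    for code in $codes; do
        for f in "$dir"/*; do
            # Exactly one line per file: the value, or an empty line if the code is absent.
            awk -F'|' -v c="$code" '$2 == c { v = $1 } END { print v }' "$f"
        done > "$tmp/$code"
        set -- "$@" "$tmp/$code"
    done

    printf '%s\n' $codes | paste -s -       # header row, tab-separated
    paste "$@"                              # data rows, same column order as the header
    rm -rf "$tmp"
)
```

    Called as padded_report /foo/bar, this should give every row as many tab-separated cells as there are codes, so a value always sits under its own header. If a code occurs twice in one file, this sketch keeps only the last value.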
    Rocky Jr.
    What's wrong? I hope I am not making you uncomfortable...

    Never send a boy to do a mans job.

  8. #8
    Senior Member

    Code:
    #!/usr/bin/gawk -f
    
    BEGIN{
       FS="|"
    }
    
    {
       !(H[$2])H[$2]++
       !(F[FILENAME])F[FILENAME]++
       D[FILENAME,$2]=$1
    }
    END{
       for(h in H){
          r="1"
          R[r] = R[r] ? R[r]"\t"h : h
          for(f in F){
             r++
             R[r] = R[r] ? R[r]"\t"D[f,h] : D[f,h]
          }
       }
       nb=length(R)
       for(i=1;i<=nb;i++)print R[i]
    }
    Code:
    $ ./myScript file*
    400     410     200     345
    45              13      34
    53              32      75
    22      45      45
    75      45      21
    98      43      46
    of course I won't comment this script: it's easy to read and to tell what it does.
    the headers can be sorted using gawk's asorti(), but I'm going to let you do that.

  9. #9
    Senior Member

    I'm re-reading my last post, and I realize my script doesn't print correct output: the script is wrong!

    edit:
    I got it: it was because of empty values that didn't show up.

    here's what seems to work at the moment
    Code:
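    #!/usr/bin/gawk -f
    
    BEGIN{
       FS="|"
    }
    {
       H[$2]                # remember each header (second field)
       F[FILENAME]          # remember each input file
       D[FILENAME,$2]=$1    # data cell for this file and header
    }
    END{
       # NB: "for (x in array)" visits keys in no guaranteed order
       for(h in H){
          r = 1             # row 1 holds the headers
          R[r] = (r in R) ? R[r]"\t"h : h
          for(f in F){
             r++            # one output row per file
             d = ((f,h) in D) ? D[f,h] : " "
             R[r] = (r in R) ? R[r]"\t"d : d
          }
       }
       nb=length(R)
       for(i=1;i<=nb;i++)print R[i]
    }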
    Code:
    ./myscript logFiles*
    400     410     200     345
    45              13      34
    53              32      75
            22      45      45
    75      45      21       
            98      43      46
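
    Header and row order in the script above come from "for (h in H)" and "for (f in F)", whose iteration order awk leaves unspecified. One way around that, sketched here on the assumption that the codes are numeric and that plain POSIX awk is preferred (a small insertion sort stands in for gawk's asorti()): remember files and codes in arrival order, then sort the codes before printing. The wrapper name pivot_report is hypothetical.

```shell
#!/bin/sh
# pivot_report FILE...: header row of sorted codes, then one tab-separated row per file.
pivot_report() {
    awk -F'|' '
        FNR == 1      { files[++nf] = FILENAME }       # files in command-line order
        NF < 2        { next }                         # skip blank or malformed lines
        !($2 in seen) { seen[$2]; codes[++nc] = $2 }   # record each code once
                      { D[FILENAME, $2] = $1 }         # cell for this file and code
        END {
            # Insertion sort of the codes, compared as numbers.
            for (i = 2; i <= nc; i++) {
                c = codes[i]
                for (j = i - 1; j >= 1 && codes[j] + 0 > c + 0; j--)
                    codes[j + 1] = codes[j]
                codes[j + 1] = c
            }
            line = codes[1]
            for (j = 2; j <= nc; j++) line = line "\t" codes[j]
            print line
            for (i = 1; i <= nf; i++) {
                f = files[i]
                line = ((f, codes[1]) in D) ? D[f, codes[1]] : ""
                for (j = 2; j <= nc; j++)
                    line = line "\t" (((f, codes[j]) in D) ? D[f, codes[j]] : "")
                print line
            }
        }
    ' "$@"
}
```

    Called as pivot_report logFiles*, it prints one row per file in the order the shell expands the glob, with an empty cell where a file has no entry for a code. Completely empty files get no row in this sketch.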
    Last edited by Watael; 12th December 2012 at 09:39 AM.