Linux / UNIX Tech Support Forum
This is a discussion on Script to count unique IPs in Apache access log within the Getting started tutorials forums, part of the Linux Getting Started category; Thought this was cool. We needed a shell script to count the unique IPs in an Apache access log that ...
Good effort, but here's a more efficient approach, using just awk:
Code:
awk '{!a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' file
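For anyone reading along, here is a commented sketch of the same idea wrapped in a shell function. The function name `count_ips` and the optional threshold argument are made up for illustration, and it assumes the client IP is the first space-separated field of each log line. The `!` from the one-liner is dropped because its result is thrown away anyway; only the `++` matters. The order of `for (ip in hits)` is unspecified in awk, so the output is piped through `sort` to get a stable listing.

```shell
# count_ips FILE [MIN]: print "hits ip" for every client IP that appears
# more than MIN times in FILE (MIN defaults to 10, as in the one-liner).
count_ips() {
    awk -v min="${2:-10}" '
        { hits[$1]++ }                  # first field is assumed to be the client IP
        END {
            for (ip in hits)            # iteration order is unspecified
                if (hits[ip] > min)
                    print hits[ip], ip
        }
    ' "$1" | sort -rn                   # busiest clients first
}
```

Usage would look like `count_ips access_log 10`, where `access_log` stands for whatever your log file is called.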
Quote:
The loop may be slower than calling separate commands that don't hold as much info in memory.
Last edited by cfajohnson; 28-12-2009 at 12:33 PM.
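For comparison, a sketch of what the "separate commands" version could look like. The function name `count_ips_pipeline` and the threshold argument are hypothetical; the final awk stage is only there to apply the same "more than 10 hits" filter as the awk one-liner, so that the two outputs can be compared. Here `sort` does the heavy lifting, and `uniq -c` only has to remember one line at a time.

```shell
# count_ips_pipeline FILE [MIN]: same report as the awk version, but built
# from cut, sort and uniq. MIN defaults to 10.
count_ips_pipeline() {
    cut -d ' ' -f 1 "$1" |              # keep only the first field (client IP)
        sort |                          # group identical IPs together
        uniq -c |                       # prefix each distinct IP with its count
        awk -v min="${2:-10}" '$1 > min { print $1, $2 }'
}
```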
sorry, what do you mean? which loop are you referring to?
OK, so how do you propose to test this hypothesis that "explicit loops" may be slower than "implicit ones"?
Hmm,
Code:
[root@forums2 ~]# time cut -d ' ' -f 1 "$FILE" | sort | uniq -c
6585 10.4.20.236
1 173.10.18.115
1 187.61.17.37
4 217.24.240.68
14 41.223.30.22
3 61.160.216.63
159051 67.72.16.xxx
6613 67.72.16.xxx
159047 67.72.16.xxx
10 75.148.211.109
2 78.138.151.126
real 0m6.954s
user 0m6.952s
sys 0m0.055s
[root@forums2 ~]# time awk '{!a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' $FILE
159070 67.72.16.xxx
14 41.223.30.22
159074 67.72.16.xxx
6586 10.4.20.xxx
6614 67.72.16.xxx
real 0m0.214s
user 0m0.201s
sys 0m0.014s
[root@forums2 ~]#
Jaysunn
Last edited by jaysunn; 28-12-2009 at 10:44 PM.
I've not tested this, but it is *possible* that the file contents were cached by the kernel after the first run. Can you run both commands on two different hosts with the same data file and post the results back?
__________________
Vivek Gite, Linux Evangelist
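One way to take caching out of the picture on a single host is sketched below. It assumes bash (for the `time` keyword), and the helper names `snapshot_log` and `bench` are invented for this example. The per-IP counts in the two timed runs above differ slightly (6585 vs 6586, for instance), which suggests the log may still have been growing, so the sketch first copies the log and then reads the copy once before each timed run so that every candidate starts with the file already in the page cache.

```shell
# snapshot_log FILE: copy FILE to a temporary file and print the copy's path,
# so every candidate command reads exactly the same lines.
snapshot_log() {
    snap=$(mktemp) || return 1
    cp "$1" "$snap" && printf '%s\n' "$snap"
}

# bench FILE CMD [ARGS...]: read FILE once to warm the page cache, then time
# "CMD ARGS... FILE" with its output discarded.
bench() {
    file=$1; shift
    cat "$file" > /dev/null             # warm the cache before timing
    time "$@" "$file" > /dev/null
}

# Example session ($FILE is your access log):
#   snap=$(snapshot_log "$FILE")
#   bench "$snap" awk '{a[$1]++} END{for(i in a) if (a[i] > 10) print a[i], i}'
#   bench "$snap" sh -c 'cut -d " " -f 1 "$1" | sort | uniq -c' sh
#   rm -f "$snap"
```

Running each `bench` line a few times and in both orders should show whether the gap holds up once both commands are reading from cache.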