nixCraft Linux Forum

Parse XML file and store data in array in shell scripting

#1 (permalink)
02-01-2008, 11:16 AM
Nishanthhampali
Junior Member
Join Date: Jan 2008
My distro: Debian
Posts: 3

Hello,
I have the XML file in the format

<Users>
<Host>
<hostAddress>180.144.226.47</hostAddress>
<userName>pwdfe</userName>
<password>hjitre</password>
<instanceCount>2</instanceCount>
</Host>
<Host>
<hostAddress>180.144.226.87</hostAddress>
<userName>trrrer</userName>
<password>jhjjhhj</password>
<instanceCount>3</instanceCount>
</Host>
<Host>
<hostAddress>180.455.226.87</hostAddress>
<userName>wewqw</userName>
<password>dfsdfd</password>
<instanceCount>3</instanceCount>
</Host>
</Users>

I need to read this XML file from a shell script and store the values of the tags hostAddress, userName, password, and instanceCount in separate arrays, one array per tag.

Please help me out in solving this.
#2 (permalink)
02-11-2008, 01:31 PM
agn
Member
Join Date: Feb 2008
My distro: OpenBSD/FreeBSD/Debian/Fedora/RHEL
Posts: 69

Code:
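# Tag names are case-sensitive: they must match the XML exactly.
for tag in hostAddress userName password instanceCount
do
    # Keep the lines for this tag, strip tabs, then drop the surrounding markup.
    grep "<$tag>" in.xml | tr -d '\t' | sed 's/^<.*>\([^<].*\)<.*>$/\1/'
done
Something like the above might help. I don't use bash, so I don't know how arrays are populated.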
#3 (permalink)
02-13-2008, 08:55 PM
manishkochar
Junior Member
Join Date: Jun 2007
My distro: Debian
Posts: 2

Quote:
Originally Posted by agn View Post
Code:
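# Tag names are case-sensitive: they must match the XML exactly.
for tag in hostAddress userName password instanceCount
do
    # Keep the lines for this tag, strip tabs, then drop the surrounding markup.
    grep "<$tag>" in.xml | tr -d '\t' | sed 's/^<.*>\([^<].*\)<.*>$/\1/'
done
Something like the above might help. I don't use bash, so I don't know how arrays are populated.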
The sed expression was the most complex part; stuffing things into an array is easy.

Code:
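#!/bin/bash

for tag in hostAddress userName password instanceCount
do
    # One value per line: keep the lines for this tag, strip tabs, drop the markup.
    OUT=$(grep "<$tag>" in.xml | tr -d '\t' | sed 's/^<.*>\([^<].*\)<.*>$/\1/')

    # This is what I call the eval_trick, difficult to explain in words.
    # The unquoted command substitution joins the lines of OUT with spaces,
    # and eval then runs the assignment hostAddress="...", userName="...", etc.
    # Only use this on a trusted in.xml: eval executes whatever it is handed.
    eval ${tag}=`echo -ne \""${OUT}"\"`
done

# So let's stuff the obtained results into 4 different arrays.
# The expansions are left unquoted on purpose, so that each
# space-separated value becomes its own array element.

H_ARRAY=( ${hostAddress} )
U_ARRAY=( ${userName} )
P_ARRAY=( ${password} )
I_ARRAY=( ${instanceCount} )

# Ok, time to announce success, let's print out each of the arrays

echo "${H_ARRAY[@]}"
echo "${U_ARRAY[@]}"
echo "${P_ARRAY[@]}"
echo "${I_ARRAY[@]}"

# For the benefit of agn -
# We can now refer to each unique element of the array like this -

echo "${H_ARRAY[0]}"

# The above prints the first item in array H_ARRAY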
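
For comparison, the same four arrays can probably be filled without eval. The following is only a minimal sketch under stated assumptions: it needs bash 4 or later (for mapfile), it expects every element to sit alone on its own line as in the sample file, and read_tag and load_hosts are hypothetical helper names that do not come from this thread.

```shell
#!/bin/bash
# Sketch: fill one array per tag without eval.
# ASSUMPTIONS: bash 4+ (mapfile); each <tag>value</tag> is alone on its line.

# read_tag TAG FILE (hypothetical helper)
# Prints the text of every <TAG>...</TAG> line, one value per output line.
read_tag() {
    sed -n "s:^[[:space:]]*<$1>\([^<]*\)</$1>[[:space:]]*\$:\1:p" "$2"
}

# load_hosts FILE (hypothetical helper)
# Fills the global arrays H_ARRAY, U_ARRAY, P_ARRAY and I_ARRAY.
# mapfile -t stores each input line as one element, newline removed.
load_hosts() {
    mapfile -t H_ARRAY < <(read_tag hostAddress   "$1")
    mapfile -t U_ARRAY < <(read_tag userName      "$1")
    mapfile -t P_ARRAY < <(read_tag password      "$1")
    mapfile -t I_ARRAY < <(read_tag instanceCount "$1")
}
```

Usage would look like `load_hosts in.xml` followed by `echo "${H_ARRAY[0]}"`. Because nothing is passed through eval, values are stored as plain text and are never executed.
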
I chanced upon this thread because I am trying to do a similar project.
The specs look rather challenging for my poor knowledge of sed.
So let's see if agn can crack this one too!

I want to create a list of web sites that definitely contain pornographic or adult content that's not suitable for kids at school.
I can see that dmoz offers its data in an XML format.
I also noticed that the XML file contains descriptive information about each web site.

Now this is what I want to do:
A shell script in which I specify the look_up_string (via PCRE, of course).
Based on the look_up_string, I want to collect the names of the matching web sites in a file. I don't want the whole URL; just the hostname is enough.
I will later add these hostnames to my hosts file, to ensure effective blocking of these sites.

Could anybody help on this?
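
One possible starting point for the request above, as a rough sketch only. It assumes the dump consists of `<ExternalPage about="URL">` records, each with a `<d:Description>` child on its own line; that layout is an assumption and should be checked against the real file. It also swaps PCRE for awk's extended regular expressions, matched case-insensitively by lowercasing each description line, so the pattern has to be written in lowercase. match_hosts is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the hostnames of pages whose description matches a pattern.
# ASSUMPTIONS: records look like <ExternalPage about="URL"> with a
# <d:Description> line inside; the pattern is a lowercase awk ERE, not PCRE.

# match_hosts PATTERN FILE (hypothetical helper)
match_hosts() {
    local pattern=$1 file=$2
    awk -v pat="$pattern" '
        # Remember the URL of the record we are currently inside.
        /<ExternalPage / {
            url = ""
            if (match($0, /about="[^"]*"/)) {
                # Skip the 7 characters of about=" and the closing quote.
                url = substr($0, RSTART + 7, RLENGTH - 8)
            }
        }
        # When the description matches, reduce the URL to its hostname.
        /<d:Description>/ && url != "" && tolower($0) ~ pat {
            host = url
            sub("^[a-zA-Z]+://", "", host)   # drop the scheme
            sub("[/:?#].*$", "", host)       # drop port, path, query
            print host
        }
    ' "$file" | sort -u
}
```

A call such as `match_hosts 'look_up_string' dump.xml > hosts.txt` (both file names are placeholders) would write one hostname per line, with duplicates removed by `sort -u`.
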
#4 (permalink)
02-13-2008, 09:42 PM
agn
Member
Join Date: Feb 2008
My distro: OpenBSD/FreeBSD/Debian/Fedora/RHEL
Posts: 69

I'm not an expert in sed. Sed is a really weird tool, but the power it contains is awesome.

Regexes are not a good tool for parsing HTML/XML data. You should use an XML parser. Right tool for the right job.
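
As an illustration of the parser route for the file from the first post, here is a minimal sketch that assumes xmllint (part of libxml2) is installed and that the file has the `<Users>`/`<Host>` layout shown there. load_hosts_xpath is a hypothetical helper name; only the `--xpath` option with the XPath functions `count()` and `string()` is relied on.

```shell
#!/bin/bash
# Sketch: read the Host fields with an XML parser instead of regexes.
# ASSUMPTIONS: xmllint (libxml2) is installed; layout is <Users><Host>...

# load_hosts_xpath FILE (hypothetical helper)
# Fills the global arrays H_ARRAY, U_ARRAY, P_ARRAY and I_ARRAY.
load_hosts_xpath() {
    local file=$1 count i
    command -v xmllint >/dev/null || { echo "xmllint not found" >&2; return 1; }

    # count() yields the number of <Host> elements; fails on malformed XML.
    count=$(xmllint --xpath 'count(/Users/Host)' "$file") || return 1

    H_ARRAY=() U_ARRAY=() P_ARRAY=() I_ARRAY=()
    for ((i = 1; i <= count; i++)); do
        # string() returns the text of the selected element for Host number i.
        H_ARRAY+=("$(xmllint --xpath "string(/Users/Host[$i]/hostAddress)" "$file")")
        U_ARRAY+=("$(xmllint --xpath "string(/Users/Host[$i]/userName)" "$file")")
        P_ARRAY+=("$(xmllint --xpath "string(/Users/Host[$i]/password)" "$file")")
        I_ARRAY+=("$(xmllint --xpath "string(/Users/Host[$i]/instanceCount)" "$file")")
    done
}
```

This runs xmllint several times per host, which is slow for large files but keeps the sketch short. Unlike the grep and sed pipeline, it does not depend on how the XML is indented or split across lines.
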
#5 (permalink)
02-14-2008, 09:44 AM
manishkochar
Junior Member
Join Date: Jun 2007
My distro: Debian
Posts: 2

Quote:
Originally Posted by agn View Post
I'm not an expert in sed. Sed is a really weird tool, but the power it contains is awesome.

Regexes are not a good tool for parsing HTML/XML data. You should use an XML parser. Right tool for the right job.
Any links you would like to share?
I've tried everything listed on freshmeat / sourceforge for "dmoz" parsing.
Sadly, none of them really work, and all are poorly documented.
Besides, I feel hopeless when it comes to Perl.
#6 (permalink)
02-14-2008, 11:57 AM
agn
Member
Join Date: Feb 2008
My distro: OpenBSD/FreeBSD/Debian/Fedora/RHEL
Posts: 69

I'm a total newbie to XML parsing, but XML::Simple[1] looks easy.

[1] XML::Simple - Easy API to maintain XML (esp config file - search.cpan.org

Last edited by agn; 02-14-2008 at 12:07 PM.

All times are GMT +5.5.