nixCraft Linux Forum

nixCraft

Linux Tech Support Forum

Script to strip all slack from php, html, js

#1
05-02-2008, 01:09 AM
meowing (Junior Member)

Hi,

I've been trying to find a script that I can run in my www document root to strip all spaces/empty tabs from all *.php, *.html, *.js files.
It seems like something many before me would have wanted, but I couldn't find any, not one example.

So I'll kick this off myself: with regular expressions and perl one should be able to get that done, right?

Example input, file.php, with messy indentation that mixes tabs and spaces:
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
  <title>Upgrading to Pivot 1.40: checking for changes..</title>
  </head>
<body>
<h1>Upgrading to blabla for changes..</h1>
<p><em>Version 1.6</em></p>
 <p>From blabla other things. </p>
  <p>This simple script checks if you've made the required change backup it first. You never know when you might need it. </p>

<p><strong>Note:</strong> Make sure this file is in your <tt>pivot/</tt> folder. Otherwise it'll give incorrect results!</p>

<?php


if (!file_exists(dirname(__FILE__)."/pv_core.php")) {
	die("<strong>This script should be placed in your Pivot directory.</strong>");
}

include_once("pvlib.php");

load_all_templates();

echo "<h2>Getting rid of miscellaneous old unused files/directories</h2>";

$filestocheck = array(
	"./includes/calendar.php"
);

foreach ($filestocheck as $filetocheck) {
	if (filepresent($filetocheck)) {
		warn("The file <tt>$filetocheck</tt> is still present. You should remove it.");
	} else {
		pass("The file <tt>$filetocheck</tt> is not present.");
	}
}
Which should be replaced by:
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><title>Upgrading to Pivot 1.40: checking for changes..</title></head><body><h1>Upgrading to blabla for changes..</h1>
<p><em>Version 1.6</em></p>
<p>From blabla other things. </p>
<p>This simple script checks if you've made the required change backup it first. You never know when you might need it. </p>
<p><strong>Note:</strong> Make sure this file is in your <tt>pivot/</tt> folder. Otherwise it'll give incorrect results!</p>
<?php
if (!file_exists(dirname(__FILE__)."/pv_core.php")) {
die("<strong>This script should be placed in your Pivot directory.</strong>");
}
include_once("pvlib.php");
load_all_templates();
echo "<h2>Getting rid of miscellaneous old unused files/directories</h2>";
$filestocheck = array(
"./includes/calendar.php"
);
foreach ($filestocheck as $filetocheck) {
if (filepresent($filetocheck)) {
warn("The file <tt>$filetocheck</tt> is still present. You should remove it.");
} else {
pass("The file <tt>$filetocheck</tt> is not present.");
}
}
It should
- remove all empty lines that are not enclosed as text
(so searching for linebreak linebreak should replace it with a single linebreak)
- remove trailing and indenting spaces
(so searching for space|tab linebreak should replace it with linebreak,
and linebreak space|tab should be replaced with just linebreak)
- make all occurrences of double spaces and/or tabs disappear;
those never make sense in any html either way, except when
enclosed in a PRE/TT tag
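
The rules listed above could be sketched roughly like this. It is only a sketch under a few assumptions: the function name strip_slack is made up for illustration, the opening <pre> tag and the closing </pre> tag are lowercase and sit on different lines, and runs of blanks are collapsed to a single space rather than removed entirely. sed cannot tell markup from PHP or JS string literals, so blanks inside quoted strings get collapsed as well.

```shell
#!/bin/sh
# strip_slack: hypothetical helper; reads text on stdin, writes the result to stdout.
# Every line from one containing <pre> through one containing </pre> is left untouched.
strip_slack() {
	sed '/<pre[ >]/,/<\/pre>/!{
		s/^[[:blank:]]*//
		s/[[:blank:]]*$//
		s/[[:blank:]][[:blank:]]*/ /g
		/^$/d
	}'
}
```

It would be used as a filter, for example strip_slack < file.php > file.stripped.php, so the original file is never modified.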

Is there anybody out there that did this already? If so, let me know.
Would be great to have a script like this, to get rid of all that slack in production websites.
#2
05-02-2008, 02:21 AM
rockdalinux (Contributors)

You can always use sed to remove all blank lines and all leading and trailing spaces and tabs (the strip comes first, so that lines holding only whitespace are deleted too):
Code:
sed 's/^[ \t]*//;s/[ \t]*$//;/^$/d' filename
sed 's/^[ \t]*//;s/[ \t]*$//;/^$/d' filename > new.filename.txt
Here is your output for php:
PHP Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Upgrading to Pivot 1.40: checking for changes..</title>
</head>
<body>
<h1>Upgrading to blabla for changes..</h1>
<p><em>Version 1.6</em></p>
<p>From blabla other things. </p>
<p>This simple script checks if you've made the required change backup it first. You never know when you might need it. </p>
<p><strong>Note:</strong> Make sure this file is in your <tt>pivot/</tt> folder. Otherwise it'll give incorrect results!</p>
<?php
if (!file_exists(dirname(__FILE__)."/pv_core.php")) {
die("<strong>This script should be placed in your Pivot directory.</strong>");
}
include_once("pvlib.php");
load_all_templates();
echo "<h2>Getting rid of miscellaneous old unused files/directories</h2>";
$filestocheck = array(
"./includes/calendar.php"
);
foreach ($filestocheck as $filetocheck) {
if (filepresent($filetocheck)) {
warn("The file <tt>$filetocheck</tt> is still present. You should remove it.");
} else {
pass("The file <tt>$filetocheck</tt> is not present.");
}
}
__________________
Rocky Jr.
You may have my body & soul, but you will never touch my pride!

If you have knowledge, let others light their candles at it.

Certified to work on HP-UX / Sun Solaris / RedHat

Last edited by rockdalinux; 05-02-2008 at 02:23 AM.
#3
05-03-2008, 01:30 AM
meowing (Junior Member)

SED, that's a great hint!
I found a nice script to strip html comments too;
http://sed.sourceforge.net/grabbag/s...l_comments.sed
but I'm not really sure how to apply that.

How do I make it work on an entire web document root for example?
(assuming I have a backup of the entire dir)
If you have an example shell command for this, that would really help me.
Also, wouldn't it be great to combine this with the idea mentioned above?
(sed 's/^[ \t]*//;s/[ \t]*$//;/^$/d' and maybe even more)

Thanks in advance if anybody finds the time to respond;
meanwhile I'll read some more sed docs to see if I can get
this idea off the ground.
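
One way to run the sed expression over a whole document root is to let find pick the files. The sketch below makes some assumptions: strip_tree is a made-up name, the extension list is only an example, and file names are assumed not to contain newlines. Each original is kept next to the stripped file with a .backup suffix.

```shell
#!/bin/sh
# strip_tree: hypothetical helper that applies the sed expression to every
# matching file below a directory, keeping a .backup copy of each original.
strip_tree() {
	root="$1"
	find "$root" -type f \( -name '*.php' -o -name '*.html' -o -name '*.js' \) |
	while IFS= read -r f; do
		# Keep the first backup only, so a second run cannot overwrite the original
		[ -f "$f.backup" ] || cp -p "$f" "$f.backup"
		tmp="$f.tmp.$$"
		# Write the result back with cat so the file keeps its owner and permissions
		sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//;/^$/d' "$f" > "$tmp" &&
			cat "$tmp" > "$f"
		rm -f "$tmp"
	done
}
```

Calling strip_tree /path/to/docroot would then process the whole tree; the path is a placeholder. Running it on a copy first is the safer route.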
#4
05-03-2008, 01:40 AM
rockdalinux (Contributors)

Here is a simple idea:

First back up each existing file as .backup,
then replace and update the file in place.

Code:
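#!/bin/bash
# Strip leading/trailing whitespace and blank lines from every file in a directory
DIR="$1"
if [ $# -eq 0 ]; then
	echo "Usage: $(basename "$0") dir"
	exit 1
fi

for f in "$DIR"/*
do
	# Only process regular files, and never the backups themselves
	[ -f "$f" ] || continue
	case "$f" in *.backup) continue ;; esac
	# Keep a one-time backup of the original
	if [ ! -f "$f.backup" ]; then
		/bin/cp "$f" "$f.backup"
	fi
	out="/tmp/out.$$.tmp"
	# Strip first, then delete, so whitespace-only lines are removed too
	sed 's/^[ \t]*//;s/[ \t]*$//;/^$/d' "$f" > "$out"
	/bin/mv "$out" "$f"
done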
Try the code in a dummy setup first; once it works, move it to production.
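
A dummy run could look something like the sketch below, which never touches the real directory. The name preview_strip is hypothetical, mktemp -d is assumed to be available (it is on Linux and the BSDs, but it is not in POSIX), and like the script above it only handles the files directly inside the directory.

```shell
#!/bin/sh
# preview_strip: hypothetical helper. Copies a directory to a scratch location,
# strips the copy, and prints a recursive diff so the changes can be reviewed
# before anything in the real tree is modified.
preview_strip() {
	src="$1"
	scratch=$(mktemp -d) || return 1
	cp -R "$src/." "$scratch/"
	for f in "$scratch"/*; do
		[ -f "$f" ] || continue
		sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//;/^$/d' "$f" > "$f.new" &&
			mv "$f.new" "$f"
	done
	# diff exits with status 1 when the trees differ; that is expected here
	diff -r "$src" "$scratch"
	status=$?
	rm -rf "$scratch"
	return "$status"
}
```

An empty diff means the stripping would change nothing; otherwise the output shows exactly which lines would be altered or dropped.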
#5
05-03-2008, 02:37 AM
meowing (Junior Member)

Let's see if I get this..
Code:
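#!/bin/bash
# Strip leading/trailing whitespace and blank lines from every file in a directory
DIR="$1"
if [ $# -eq 0 ]; then
	echo "Usage: $(basename "$0") dir"
	exit 1
fi

for f in "$DIR"/*
do
	# Only process regular files, and never the backups themselves
	[ -f "$f" ] || continue
	case "$f" in *.backup) continue ;; esac
	# Keep a one-time backup of the original
	if [ ! -f "$f.backup" ]; then
		/bin/cp "$f" "$f.backup"
	fi
	out="/tmp/out.$$.tmp"
	# Strip first, then delete, so whitespace-only lines are removed too
	sed 's/^[ \t]*//;s/[ \t]*$//;/^$/d' "$f" > "$out"
	/bin/mv "$out" "$f"
done
Do you mean the .backup suffix should then become .php and/or .html etc.?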

I'd like to apply this to all .css, .js, .htm*, .php* files.

I used to do this with perl, but I lost the script. It was something like:

Code:
#!/bin/sh
# In-place substitution; old_string, new_string and file are placeholders
perl -pi -e 's|old_string|new_string|g' file
but I don't really remember much of it. It could work entirely with regular expressions.
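
For what it's worth, the same stripping can probably be done with perl's in-place mode. This is only a sketch: perl_strip is a made-up name, and the -i.backup switch saves each original with a .backup suffix but, unlike the bash script earlier in the thread, it overwrites an existing .backup on a second run.

```shell
#!/bin/sh
# perl_strip: hypothetical helper. Edits the given files in place with perl,
# saving each original next to it with a .backup suffix (the -i.backup switch).
perl_strip() {
	# Strip leading blanks, strip trailing blanks, then drop lines left empty
	perl -pi.backup -e 's/^[ \t]+//; s/[ \t]+$//; $_ = "" if /^$/' "$@"
}
```

It takes any number of file names, so a call such as perl_strip *.php *.html *.js *.css would cover the extensions mentioned above in the current directory.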

Last edited by meowing; 05-03-2008 at 03:19 AM.