Vim as XML Editor: More Setup
The tools listed in this chapter are less essential than those in the previous chapter, and many users can do without them.
Ruby
Ruby is a very nice object-oriented programming language from Japan. Some scripts in this howto are written in Ruby, so I recommend installing it. Alternatively, you could translate the scripts to your favourite language.
Check whether Ruby is already installed:
$ ruby -v
If this prints a version number, Ruby is available; otherwise install it.
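The same check can be scripted, for instance at the top of a script that depends on Ruby. This is only a minimal sketch; the function name check_tool is made up for illustration.

```shell
#!/bin/sh
# check_tool NAME: report whether NAME can be found on the PATH.
# (check_tool is a hypothetical helper name, not part of any tool.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found, install it first" >&2
        return 1
    fi
}

# Print the version only when ruby is actually available.
if check_tool ruby; then
    ruby -v
fi
```

The same function works for the other tools in this chapter, for example check_tool tidy.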
On Linux
After downloading and unpacking the archive, read the section "How to compile and install" in the README.
$ mkdir del/compile/ruby
$ cd del/compile/ruby
$ wget [...snipped...].tar.gz
$ md5sum --check
5d52c7d0e6a6eb6e3bc68d77e794898e *ruby-1.8.1.tar.gz
ruby-1.8.1.tar.gz: OK
$ tar -xzf ruby-1.8.1.tar.gz
$ cd ruby-1.8.1/
$ mkdir /home/tobi/bulk/run/ruby
$ mkdir /home/tobi/bulk/run/ruby/1_8_1
$ autoconf
$ ./configure --prefix=/home/tobi/bulk/run/ruby/1_8_1
$ make
$ make test
test succeeded
$ make install
$ ed
a
#!/usr/bin/env sh
${HOME}/bulk/run/ruby/1_8_1/bin/ruby "$@"
.
w /home/tobi/data/commands/ruby_1.8.1
60
q
$ chmod 700 ~/data/commands/ruby_1.8.1
$ ruby_1.8.1 -v
ruby 1.8.1 (2003-12-25) [i686-linux]
$ ruby_1.8.1 test/runner.rb
$ ed
a
#!/usr/bin/env sh
${HOME}/bulk/run/ruby/1_8_1/bin/irb "$@"
.
w /home/tobi/data/commands/irb_1.8.1
59
q
$ chmod 700 ~/data/commands/irb_1.8.1
$ irb_1.8.1
irb(main):001:0> puts 6
6
=> nil
irb(main):002:0> puts 6
6
=> nil
With the Ruby that came with my distro, readline doesn't work; [up] results in ^[[A. With the Ruby I installed, IRB works (although I have to hit [escape] before entering [up]).
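The two ed sessions above each create a small wrapper script in ~/data/commands. The same can probably be done non-interactively with a here-document, as the install_xmlstar script later in this chapter does. A minimal sketch, in which the function name make_wrapper is an assumption of mine:

```shell
#!/bin/sh
# make_wrapper TARGET WRAPPER: write a wrapper script to WRAPPER that
# runs TARGET and passes all arguments through unchanged.
# (make_wrapper is a hypothetical helper name.)
make_wrapper() {
    target=$1
    wrapper=$2
    # The here-document delimiter is unquoted, so ${target} is expanded
    # now, while \$@ is written literally and expanded only when the
    # wrapper itself runs.
    cat > "$wrapper" << EOF
#!/usr/bin/env sh
"${target}" "\$@"
EOF
    chmod 700 "$wrapper"
}

# Example with the paths from the transcript above:
# make_wrapper "${HOME}/bulk/run/ruby/1_8_1/bin/ruby" "${HOME}/data/commands/ruby_1.8.1"
# make_wrapper "${HOME}/bulk/run/ruby/1_8_1/bin/irb"  "${HOME}/data/commands/irb_1.8.1"
```

Unlike the ed version, the target path is fixed when the wrapper is written, so the wrapper has to be regenerated if the installation is moved.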
On Windows
Sorry, there isn't any info regarding Windows.
XMLStarlet
"XMLStarlet is a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents and files using simple set of shell commands in similar way it is done for plain text files using UNIX grep, sed, awk, diff, patch, join, etc commands."
On Linux
install_xmlstar
#!/bin/bash -x
# This is just an example you could use as basis for your script.
# (do not run it without having revised and adjusted it)
# The --with-[...]-src paths must point to the libxml and libxslt
# sources.
# The sources are available after install_libxml finished, for
# example.
# Set the version numbers below.
# Be online, then do
# tobi ~/del $ ~/data/run/install_xmlstar
# this doesn't really make sense ...
av_command="antivir -rs -z"
my_home=/home/tobi
# Refuse to run for any other user or home directory.
if [ "$HOME" != "$my_home" ]; then
    exit 1
fi
if [ "$(whoami)" != 'tobi' ]; then
    exit 1
fi
# set these:
ver_xmlstar=0.8.1
ver_libxml=2.6.5
ver_libxslt=1.1.2
run_top=${HOME}/bulk/run/xmlstar
run=${run_top}/${ver_xmlstar}
compile=${HOME}/del/compile_libxml
command=${HOME}/data/commands/xmlstar
# Never install over an existing version directory.
if [ -d "$run" ]; then
    echo "${run} exists, exiting"
    exit 1
fi
mkdir -p "$run"
cd "$compile" || exit 1
######################################################################
url_xmlstar="[... snipped URL ...]\
xmlstarlet-${ver_xmlstar}.tar.gz"
file_xmlstar=`basename ${url_xmlstar}`
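# Download and virus-scan the archive unless it is already there.
if [ ! -f "download/$file_xmlstar" ]; then
    cd download || exit 1
    wget "$url_xmlstar"
    # $av_command is left unquoted on purpose so its options are split.
    $av_command "$file_xmlstar"
    if [ $? -ne 0 ]; then
        exit 1
    fi
    cd ../
fi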
tar -xzf download/${file_xmlstar}
cd xmlstarlet-${ver_xmlstar}
./configure --prefix=${run} \
    --with-libxml-src=${compile}/libxml2-${ver_libxml} \
    --with-libxslt-src=${compile}/libxslt-${ver_libxslt}
make
make tests
make install
######################################################################
# if [ ! -f $command ]; then
cat > $command << EOF
#!/usr/bin/env sh
# may get overwritten
${run}/bin/xml "\$@"
EOF
chmod 700 $command
# fi
xmlstar --version
On Windows
Download and unpack the Windows archive ([...]version-win32.zip).
I added the directory containing xml.exe to the system path. This makes the system path longer and requires a restart, but the alternative, a batch file wrapper, supports only up to nine arguments (%1 to %9), which is often not enough when using XMLStarlet.
I think that xml is a confusingly generic name for a command, so I renamed it to xmlstar by renaming xml.exe to xmlstar.exe.
Try it out
Caution
Whenever you filter your data through a tool it can get corrupted.
If something went wrong you can use u to undo the filtering.
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>foo</title>
  </head>
  <body>
    <div style="text-align:center">
      <p id="foo" style="color:green" class="blammo">foo
      </p>
    </div>
  </body>
</html>
Then do
:%!xmlstar ed --delete //@style
You should get something like this:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>foo</title>
  </head>
  <body>
    <div>
      <p id="foo" class="blammo">foo
      </p>
    </div>
  </body>
</html>
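The same filter can also be run outside of Vim, which is handy for trying an expression on a copy of the file first. The following is a rough sketch; strip_style and the XMLSTAR variable are names I made up, and the default command name xmlstar is the wrapper set up earlier in this chapter.

```shell
#!/bin/sh
# strip_style FILE: print FILE with every style attribute removed.
# XMLSTAR names the XMLStarlet command. "xmlstar" is the wrapper name
# used in this howto; other installations often call it "xml" or
# "xmlstarlet". (strip_style and XMLSTAR are hypothetical names.)
strip_style() {
    "${XMLSTAR:-xmlstar}" ed --delete '//@style' "$1"
}

# Example: write to a new file and leave the original untouched.
# strip_style page.xhtml > page.nostyle.xhtml
```

The expression //@style matches here even though the document declares a default namespace, because unprefixed attributes are in no namespace. Selecting elements such as p would, as far as I know, additionally need a namespace binding via the -N option of xmlstar ed.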
Tidy
Sometimes I receive HTML files generated by Microsoft Word, and they are often very bloated. Tidy can make them around five times smaller and can help turn them into valid XHTML. The result is not guaranteed to be good code in terms of semantics and structure, but the files become much easier to work with.
On Linux
$ tidy -help
bash: tidy: command not found
$ cd bulk/run/
$ mkdir tidy && cd tidy
$ wget [... snipped URL ...].tgz
$ md5sum --check
476326c3d44292108111841a42bd27f6 *tidy_linux_x86.tgz
tidy_linux_x86.tgz: OK
$ tar -xzf tidy_linux_x86.tgz
$ ed
a
#!/usr/bin/env sh
${HOME}/bulk/run/tidy/bin/tidy "$@"
.
w /home/tobi/data/commands/tidy
54
q
$ chmod 700 ~/data/commands/tidy
$ tidy -v
HTML Tidy for Linux/x86 released on 1st November 2003
$
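Both download transcripts in this chapter verify the archive by typing the expected line into md5sum --check. The check can presumably be done without typing as well. This sketch assumes an md5sum command such as the one in GNU coreutils; verify_md5 is a made-up name, and -c is the short form of --check.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED: succeed only if FILE has the MD5 sum EXPECTED.
# (verify_md5 is a hypothetical helper name.)
verify_md5() {
    # md5sum -c reads lines of the form "SUM *FILE" from standard input,
    # the same format that is typed by hand in the transcripts.
    echo "$2 *$1" | md5sum -c >/dev/null 2>&1
}

# Example with the checksum from the transcript above:
# verify_md5 tidy_linux_x86.tgz 476326c3d44292108111841a42bd27f6 \
#     && tar -xzf tidy_linux_x86.tgz
```

Chaining the unpack step with && means the archive is only unpacked when the checksum matches.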
On Windows
tidy.bat
@echo off
\path\to\tidy.exe -config /path/to/tidyrc.txt -f /log/errors/here/tidyerrs.txt %1 %2 %3 %4 %5 %6 %7 %8 %9
Settings
word-2000: yes
clean: yes
doctype: strict
bare: yes
drop-font-tags: yes
drop-proprietary-attributes: yes
enclose-block-text: yes
escape-cdata: yes
logical-emphasis: yes
output-xhtml: yes
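On Linux, the settings above can be passed to Tidy in the same way as in tidy.bat, using the -config and -f options. The following is only a sketch: tidy_word, TIDY and TIDYRC are names I made up, and the default config location ~/tidyrc.txt is an assumption.

```shell
#!/bin/sh
# tidy_word IN OUT: clean up the HTML file IN and write the result to OUT.
# Messages from Tidy go to OUT.errors.txt.
# TIDY and TIDYRC are hypothetical variables for this sketch; TIDYRC
# should point to a file containing the settings listed above.
tidy_word() {
    "${TIDY:-tidy}" -config "${TIDYRC:-$HOME/tidyrc.txt}" \
        -f "$2.errors.txt" "$1" > "$2"
    status=$?
    # Tidy exits with 0 if all went well, 1 if there were warnings,
    # and 2 if there were errors. Only errors are treated as failure.
    if [ "$status" -ge 2 ]; then
        echo "tidy reported errors, see $2.errors.txt" >&2
        return "$status"
    fi
    return 0
}

# Example:
# tidy_word from_word.html cleaned.xhtml
```

Word exports nearly always produce warnings, so treating exit status 1 as success seems the more practical choice here.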