Sunday, 29 September 2013

Using XPath in bash with not()

This is a follow-up to a previous question about using XPath in bash.
I have a set of XML files, most of which encode relations to other files:
<file>
<fileId>xyz123</fileId>
<fileContents>Blah blah Blah</fileContents>
<relatedFiles>
<otherFile
href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&amp;id=1234'>
<title>Some resource</title>
</otherFile>
<otherFile
href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&amp;id=4321'>
<title>Some other resource</title>
</otherFile>
</relatedFiles>
</file>
The answer to the previous question helped me process the majority of
these files successfully. However, there are some files in the set that do
not include any relatedFiles/otherFile elements. I want to be able to
process those files separately and move them into an "other" folder. I
thought I could do this with an XPath not() function, but I get a "command
not found" error for that line when I run the script.
#!/bin/bash
mkdir other
for f in *.xml; do
    fid=$(xpath -e '//fileId/text()' "$f" 2>/dev/null)
    for uid in $(xpath -e '//otherFile/@href' "$f" 2>/dev/null | awk -F= '{gsub(/"/,"",$0); print $4}'); do
        echo "Moving $f to ${fid:3}_${uid}.xml"
        cp "$f" "${fid:3}_${uid}.xml"
    done
    if $(xpath -e 'not(//otherFile)' "$f" 2>/dev/null); then
        echo "Moving $f to other/${fid:3}.xml"
        cp "$f" "other/${fid:3}.xml"
    fi
    rm "$f"
done
How can I filter out files that do not contain certain elements using
XPath in bash? Thanks in advance.
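
For reference, here is one possible direction, offered as a sketch under assumptions rather than a confirmed fix. In the shell, `if $(...)` substitutes whatever the command prints and then tries to run that text as a command, which would explain the "command not found" error. The sketch below avoids executing the output: it captures the result of the same //otherFile/@href query used earlier in the script and checks whether it is blank. The helper names is_blank and has_related are hypothetical, and the sketch assumes the xpath tool writes nothing but whitespace to stdout when no node matches.

```shell
#!/bin/bash
# Sketch only: decide from the captured output instead of executing it.

# is_blank STRING
# Succeeds when STRING is empty or contains only whitespace.
is_blank() {
    case $1 in
        *[![:space:]]*) return 1 ;;  # at least one non-whitespace character
        *)              return 0 ;;
    esac
}

# has_related FILE
# Succeeds when the href query prints something for FILE.
# Assumption: same xpath tool and flags as in the script above, and
# empty stdout when no otherFile element exists.
has_related() {
    hrefs=$(xpath -e '//otherFile/@href' "$1" 2>/dev/null)
    if is_blank "$hrefs"; then
        return 1
    fi
    return 0
}
```

Inside the loop, the failing test could then be written as `if ! has_related "$f"; then cp "$f" "other/${fid:3}.xml"; fi`, so the shell branches on the function's exit status and never runs the tool's output as a command.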
