Tuesday, February 23, 2010

Released xmlsh 1.0.2 and marklogic 1.0

I have released an update to xmlsh (1.0.2) and to the marklogic extension (1.0).
These include support for the new help command and consolidated usage.
I am still looking for volunteers to help add to the online help content. Right now only the synopsis is available for most commands, along with a link to the web site for details. I would like to add all the options to the built-in help data ...

Many major bug fixes and improvements.

Wednesday, February 17, 2010

New help command

I'm just about to release 1.0.2, which includes a rudimentary "help" command.
I expect to expand on this greatly, but for now it prints out the URL on http://www.xmlsh.org associated with the command, and on browser-enabled systems it launches the default browser (using the Java Desktop class) pointing to the associated help page.

This works for all builtin, internal, and supplementary commands as well as extension modules.

I went back and forth literally dozens of times in my mind about how best to do this. I almost implemented it using Java annotations which would have been really cool ...
but got stuck on how to cleanly annotate script-based commands. Some commands are actually .xsh scripts, not .java classes, so these would need an ancillary annotation method.
I finally tossed that whole idea (and implementation) and settled on a help.xml file per package. Right now this just contains the command name and URL, but I expect to enhance it to provide usage text (text based only) and maybe short help text for each command.
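
As a rough illustration of the "one help.xml per package" idea, here is a minimal Python sketch of a name-to-URL lookup. The XML layout (a `help` root with `command` elements carrying `name` and `url` attributes), the example.org URLs, and the function names are all assumptions made up for this sketch; the real help.xml format and xmlsh's Java implementation may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical help.xml content: element names, attribute names and URLs
# are invented for illustration and are not the real xmlsh format.
HELP_XML = """\
<help>
  <command name="xcat" url="http://example.org/help/xcat"/>
  <command name="xls" url="http://example.org/help/xls"/>
</help>
"""


def load_help_index(xml_text):
    """Parse help XML text into a dict mapping command name to help URL."""
    root = ET.fromstring(xml_text)
    return {cmd.get("name"): cmd.get("url") for cmd in root.iter("command")}


def help_url(index, command):
    """Return the help URL for a command, or None if it is not listed."""
    return index.get(command)


index = load_help_index(HELP_XML)
print(help_url(index, "xcat"))
print(help_url(index, "nosuchcommand"))
```

A real help command would print the URL and then hand it to the platform browser launcher. Keeping the index as plain data is what lets script-based commands be covered the same way as class-based ones.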

I don't expect to ever provide fully formatted help in pure text form, because the state of the art for formatted documents has evolved past plain text. Gone are the days of nroff and -man. They still exist ... but I don't want to move back to that technology, especially since it's impossible to write a good terminal app in Java (it lacks the basics for console IO such as unbuffered character reading, clearing screens, and cursor positioning).

I want the best of both worlds! A DocBook-sourced help system that can generate HTML, PDF and TEXT, yes, plain TEXT! But alas ... it's even against my stated philosophy to go back to text. Can't win for losing :)

Also coming is a very simplistic "more" command, the best that can be done without unbuffered console IO. (You have to hit ENTER to read characters in Java, a 10-year-old bug that Sun refuses to believe is important.)
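
To make the limitation concrete, here is a minimal Python sketch of a pager that can only read whole lines, so the user has to press ENTER between pages. The function name `more`, its parameters, and the prompt text are assumptions for this sketch, not the actual xmlsh command.

```python
import sys


def more(lines, page_size=20, read_line=input, write=sys.stdout.write):
    """Show lines one page at a time using only line-buffered input.

    Between pages the reader must press ENTER to continue, or type q and
    ENTER to stop. Returns the number of lines actually shown.
    """
    lines = list(lines)
    shown = 0
    while shown < len(lines):
        for line in lines[shown:shown + page_size]:
            write(line.rstrip("\n") + "\n")
        shown = min(shown + page_size, len(lines))
        # Only prompt when there is more to show.
        if shown < len(lines):
            if read_line("--More-- ").strip().lower() == "q":
                break
    return shown


# Non-interactive demo: a scripted reader that always "presses ENTER".
more(["line %d" % i for i in range(5)], page_size=2, read_line=lambda prompt: "")
```

With unbuffered input a single keypress could advance the page. With line-buffered input the prompt-and-ENTER loop above is about as far as it goes.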

Sunday, February 14, 2010

PATH and XPATH

Thanks to a suggestion from a new user, I realized I had not documented how the XPATH environment variable is implemented. I've corrected that now by documenting it, but doing so exposed some problems. Namely, XPATH cannot be set to more than one directory in the environment prior to calling xmlsh. XPATH is an XDM sequence variable (like an "array"), whereas PATH is a ":" or ";" separated string. I think this makes much more sense, but it leads to compatibility issues.

The worst part is that you can't preset XPATH before calling xmlsh! I never thought this through completely, but thanks to user feedback I am doing so now.
Also PATH and XPATH are treated differently, which is, well, inconsistent to say the least.

What I've decided to do is this.

On startup, xmlsh will read both PATH and XPATH and parse them into an XDM sequence according to the OS path separator (";" or ":") and then from then on they will remain sequence variables in xmlsh. You can operate on them like any other sequence. Directory separators will be converted to "/" (as is already done).

When calling subprocesses, both variables will be re-serialized as a single string using the same algorithm in reverse. The result is that subprocesses will see the same kind of single string that was passed into xmlsh, joined with the OS path separator and using native OS directory separators.

This way PATH and XPATH can be treated the same, and both can be initialized prior to invoking xmlsh using normal OS environment settings.
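
The round trip described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `parse_path` and `serialize_path` are invented here, a Python list stands in for an XDM sequence, and the real xmlsh code is Java and may handle edge cases (empty entries, paths that already contain "/") differently.

```python
import os


def parse_path(value, path_sep=os.pathsep, dir_sep=os.sep):
    """Split an OS path string into a list of directories.

    Directory separators are normalized to "/", as the post describes.
    An empty or missing value yields an empty list.
    """
    if not value:
        return []
    return [entry.replace(dir_sep, "/") for entry in value.split(path_sep)]


def serialize_path(entries, path_sep=os.pathsep, dir_sep=os.sep):
    """Join a list of directories back into a single native OS path string."""
    return path_sep.join(entry.replace("/", dir_sep) for entry in entries)


# Windows-style example, with separators passed explicitly so the demo
# behaves the same on any platform.
seq = parse_path(r"C:\bin;C:\tools", path_sep=";", dir_sep="\\")
seq = seq + ["/mydir"]  # the sequence-style append, like PATH+=/mydir
print(seq)
print(serialize_path(seq, path_sep=";", dir_sep="\\"))
```

One consequence of normalizing to "/" is that the round trip is not guaranteed to be byte-for-byte identical on Windows if the original value already mixed "/" and "\".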

The one problem is that this may break existing code in xmlsh which attempts to change the PATH variable using
PATH="$PATH;/mydir"

Since PATH will become a sequence variable, this syntax won't produce the desired result. Instead you'll need to do sequence operations like
PATH+=/mydir
or
PATH=($PATH /mydir)
or
PATH=<[ $PATH , "/mydir" ]>

It's possible I could try to hack this by parsing all string assignments to PATH, but I'm not excited about introducing that hack.

Suggestions, anyone? Will this break any of your existing code?
I'm relying on the presumption that PATH will have been set prior to invoking xmlsh in most (all?) existing scripts.

Friday, February 5, 2010

xmlsh Phone home !

With 1.0 I'm focusing more on refinements, performance and usability than feature enhancements.
It is critical to know how much xmlsh is being used 'in the wild' and what features are being used.
Unfortunately I have no idea. SourceForge posts the number of downloads, but I don't even know if those are new users or the same userbase downloading each new version.

Calabash (Norm Walsh's XProc implementation) has a "Phone Home" feature, enabled by default.
This keeps track of what steps were run and on exit 'phones home' (posts an XML file to a server) with the statistics. This is critical information to be able to analyze usage and focus optimizations.

I have not implemented such a thing in xmlsh. I am a bit shy of doing so because I don't want to offend people, and even though the data is anonymous it gives the impression of being an invasive thing to do. Plus there is a performance impact from both measuring the data (slight) and sending the results (more).

However, Norm has told me no one has yet complained about his "Phone Home" feature. And from it he has collected valuable stats which he can use to improve the product for everyone.

Any opinions on this? Certainly it should be an optional feature. But if I made it "opt in" instead of "opt out" I suspect no one would bother to turn it on.

Maybe on first use only, xmlsh could prompt for the option and indicate how to turn it off in the future.

Any ideas, anyone, on how to gather statistics for the good of all users without being invasive?


Obsolete $_ syntax in <[ expr ]>

Is it too late to revoke a syntax feature?
I've determined in hindsight that the $_ variable exposed in the <[ xquery expr ]> is not right.
This concatenates all positional parameters into a single sequence. If any of the positional parameters is a sequence of more than one item, then the result loses information. For example

set <[ 1,2,3 ]> 4 5
echo $#
echo <[ $_[2] ]>

produces

3
2


Very non-obvious. The fact that $1 is a sequence has been lost, and $_ is a concatenation of 1,2,3,4,5.


To fix this I now predeclare distinct variables $_1 $_2 ... for all positional parameters.
This preserves the sequences in the positional parameters.

$ echo <[ $_1 ]>
1 2 3


This means you can access all positional parameters within the <[ ]> expression with no loss of fidelity. I'd like to get rid of $_ as it is now redundant. However someone might be using it ... so I'm keeping it in for now.
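
The difference between the flattened $_ and the per-parameter $_1, $_2 ... variables can be modeled with ordinary lists. The Python sketch below is only an analogy: Python lists stand in for XDM sequences, and the helper names `flatten` and `bind_positional` are invented for illustration rather than taken from xmlsh.

```python
def flatten(params):
    """Concatenate all positional parameters into one flat list (like $_)."""
    return [item for seq in params for item in seq]


def bind_positional(params):
    """Bind each positional parameter to its own name (like $_1, $_2, ...)."""
    return {"_%d" % i: list(seq) for i, seq in enumerate(params, start=1)}


# Models: set <[ 1,2,3 ]> 4 5
positional = [[1, 2, 3], [4], [5]]

flat = flatten(positional)
named = bind_positional(positional)

print(len(positional))  # number of positional parameters, like $#
print(flat[2 - 1])      # XPath positions start at 1, so $_[2] is the second item
print(named["_1"])      # the first parameter, still a three-item sequence
```

Once the parameters are flattened there is no way to tell where $1 ended and $2 began, which is exactly the information the separate bindings keep.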

A side effect of this, however, is that there is no way to access $# within the <[ ]> or to iterate over all positional parameters in one query. This is a fundamental limit of XQuery and XDM, which do not allow nested sequences. It also means that if you assign a global variable _1, it is overwritten by the positional parameter $1 within the <[ ]> expression.

So is it too late now that I've published "1.0" of xmlsh? Even if there are efficiency and usability issues? Would anyone be affected?

I have no idea because I have no metrics of how much xmlsh is being used ....
which gets to my next post ...