Sunday, April 5, 2009

Named ports

The latest version of xmlsh has support for "Named Ports".
The original incentive was to map more closely to xproc port semantics, even though I may end up not using them for xproc. I decided to include them anyway because I think this is one area where the unix shells are deficient, or too tied to legacy decisions.

In unix, processes access environment-supplied streams by file number (0, 1, 2, ...).
By default there are 3 standard ones (stdin, stdout, stderr), corresponding to 0, 1, 2.
The unix shells give access to these via the standard IO redirection: cmd > file, cmd < file, etc. If you want access to anything except the first 2 you have to use a numeric modifier, e.g. cmd 2>err . You can do this with any file descriptor, like cmd 30>file 40<file2 . I've always thought that this was a bit of a hack. It's certainly very closely tied to the OS. Ported versions of shells often can't support this syntax for anything but the predefined 3. Java runtimes have no direct portable way to do this either. Even if they did, it wouldn't work in xmlsh because shell instances are not run as separate sub-processes. It could be done in xmlsh by having the shell keep a mapping table of the file numbers it redirected, so that internal commands could look them up.

Instead I chose to go with named ports, like xproc uses. I've completed the plumbing to support this but am not yet happy with the syntax. The part I'm happy with is this: for any command you can augment a redirection with (port). For example:
cmd (output)>file1 (alternate)>file2 (input)<file3

Inside cmd, the XCommand (via XEnvironment) exposes these as named ports, and you can get at them via getInput(name) or getOutput(name).
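To make the idea concrete, here is a minimal Java sketch of a name-to-stream table of the kind a command could query. PortTable, bindInput, bindOutput and copy are hypothetical names made up for this illustration; only the getInput(name)/getOutput(name) shape comes from the description above, and this is not xmlsh's actual XEnvironment code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the environment a command sees; NOT xmlsh's real XEnvironment.
final class PortTable {
    private final Map<String, InputStream> inputs = new HashMap<>();
    private final Map<String, OutputStream> outputs = new HashMap<>();

    // What a shell might do when it sees "(name)<source".
    void bindInput(String name, InputStream in) { inputs.put(name, in); }

    // What a shell might do when it sees "(name)>target".
    void bindOutput(String name, OutputStream out) { outputs.put(name, out); }

    // Return the stream bound to the port, or null if nothing was redirected to it.
    InputStream getInput(String name) { return inputs.get(name); }
    OutputStream getOutput(String name) { return outputs.get(name); }

    // A trivial "command body": copy one named input port to one named output port.
    static void copy(PortTable env, String from, String to) {
        InputStream in = env.getInput(from);
        OutputStream out = env.getOutput(to);
        if (in == null || out == null) {
            throw new IllegalStateException("unbound port: " + (in == null ? from : to));
        }
        try {
            in.transferTo(out);   // requires Java 9 or later
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Roughly models: cmd (input)<file3 (output)>file1, using in-memory streams for files.
        PortTable env = new PortTable();
        ByteArrayOutputStream file1 = new ByteArrayOutputStream();
        env.bindInput("input", new ByteArrayInputStream("<doc/>".getBytes(StandardCharsets.UTF_8)));
        env.bindOutput("output", file1);
        copy(env, "input", "output");
        System.out.println(new String(file1.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because the table is keyed by name rather than by file number, nothing in it depends on OS file descriptors, which is what lets the approach work inside a single JVM.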

But suppose you want to pass the name of a port to the command as an argument and not have it confused with the name of a file. As an experiment I've added general support for port naming, so that any command which wants a stream can be passed a "filename" and can access files, URLs, variables, ports and expressions interchangeably. For example (and largely for testing) I have implemented xidentity to support optional filenames, so you can do

xidentity file
xidentity http://url
xidentity {variable}
xidentity $xml_variable
xidentity <[ <doc/> ]>


and now with ports you can do

xidentity "(in)" (in)<file




Note the "(in)" is quoted. By the time xidentity gets it, it's simply the string (in), but to get it through the parser you have to quote it. I find this less than ideal. But if I remove ( from being a magic token then it breaks all sorts of things, including the ability to distinguish between arguments and IO redirection.
(i.e. in "cmd (in)<file", does cmd get 1 arg or 0?)
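The ambiguity can be shown with a toy word classifier in Java. This is only a sketch under simplifying assumptions: words are split on whitespace, a word wrapped in double quotes is a literal argument, and PortWords with its parse method is a hypothetical helper, not xmlsh's real parser or grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy classifier for shell words; an illustration only, not xmlsh's grammar.
final class PortWords {
    // Matches (name) followed by < or > and a non-empty target, e.g. (in)<file
    private static final Pattern REDIRECT =
        Pattern.compile("\\(([A-Za-z_][A-Za-z0-9_]*)\\)([<>])(\\S+)");

    final List<String> args = new ArrayList<>();       // command name plus plain arguments
    final List<String> redirects = new ArrayList<>();  // entries such as "in<file"

    static PortWords parse(String line) {
        PortWords result = new PortWords();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        for (String word : trimmed.split("\\s+")) {
            Matcher m = REDIRECT.matcher(word);
            if (m.matches()) {
                // "(in)<file" is a redirection, so the command never sees it as an argument.
                result.redirects.add(m.group(1) + m.group(2) + m.group(3));
            } else if (word.length() >= 2 && word.startsWith("\"") && word.endsWith("\"")) {
                // Quoting hides the "(" from the parser; the command receives the bare text.
                result.args.add(word.substring(1, word.length() - 1));
            } else if (word.startsWith("(")) {
                // An unquoted "(" stays reserved, so a bare (in) is rejected here.
                throw new IllegalArgumentException("unquoted '(' is reserved: " + word);
            } else {
                result.args.add(word);
            }
        }
        return result;
    }

    public static void main(String[] argv) {
        PortWords a = parse("cmd (in)<file");
        System.out.println(a.args + " " + a.redirects);
        PortWords b = parse("xidentity \"(in)\" (in)<file");
        System.out.println(b.args + " " + b.redirects);
    }
}
```

In this sketch "cmd (in)<file" yields no arguments beyond the command name, while the quoted "(in)" comes through as an ordinary argument. Keeping ( reserved is what makes that distinction possible.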

However, the ability to pass in port names in the same context as filenames or URIs is very compelling. Alternative suggestions welcome. Maybe an extended URI scheme, like port:name? e.g.

xidentity port:in (in)<file
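One way a command might tell these argument forms apart is sketched below in Java. ArgumentSources, Kind, classify and portName are hypothetical names for illustration, and the rule "anything with scheme:// is a URL, anything else is a file" is an assumption of this sketch, not a description of what xmlsh does.

```java
// Sketch of dispatching on a "filename" argument; an assumption-laden illustration only.
final class ArgumentSources {
    enum Kind { PORT, URL, FILE }

    private static final String PORT_PREFIX = "port:";

    // Decide what kind of source a command-line argument names.
    static Kind classify(String arg) {
        if (arg.startsWith(PORT_PREFIX)) {
            return Kind.PORT;                       // e.g. port:in
        }
        if (arg.matches("[A-Za-z][A-Za-z0-9+.-]*://.*")) {
            return Kind.URL;                        // e.g. http://url
        }
        return Kind.FILE;                           // everything else is treated as a path
    }

    // Extract the port name from a port: argument, e.g. "port:in" gives "in".
    static String portName(String arg) {
        if (classify(arg) != Kind.PORT) {
            throw new IllegalArgumentException("not a port argument: " + arg);
        }
        return arg.substring(PORT_PREFIX.length());
    }

    public static void main(String[] args) {
        for (String arg : new String[] { "file", "http://url", "port:in" }) {
            System.out.println(arg + " -> " + classify(arg));
        }
        System.out.println(portName("port:in"));
    }
}
```

Since port:in contains no parenthesis, it would pass through the parser as a plain word with no quoting, which is the appeal of the scheme approach. The cost is that a file literally named port:in would need some other spelling, such as ./port:in.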

Suggestions and comments welcome

1 comment:

  1. I've removed the literal (port) syntax for command arguments. This no longer works

    xidentity (in) (in)<file.xml

    Named ports still exist, but I don't like overloading the shell redirection syntax with literal filenames. I'm thinking instead about using my alternate idea of a URI scheme like port:portname

    example
    xidentity port:in (in)<file

    Comments welcome !

