diff --git a/README b/README new file mode 100644 index 0000000..7383739 --- /dev/null +++ b/README @@ -0,0 +1,298 @@ +* Overview + +This package provides a set of C functions for splitting a string into +words. The splitting process is highly configurable and allows for +considerable flexibility. The default splitting rules are similar to +those used in Bourne shell. The splitting process includes tilde +expansion, variable expansion, quote removal, command substitution, +and path expansion. Each of these phases can be turned off by the caller. + +The following code fragment shows the basic usage: + + /* This variable controls the splitting */ + wordsplit_t ws; + int rc; + + /* Provide variable definitions */ + ws.ws_env = (const char **) environ; + /* Provide a function for expanding commands */ + ws.ws_command = runcom; + /* Split input_string into words */ + rc = wordsplit(input_string, &ws, + WRDSF_QUOTE /* Handle both single and + double quoted strings as words. */ + | WRDSF_SQUEEZE_DELIMS /* Compress adjacent delimiters */ + | WRDSF_PATHEXPAND /* Expand pathnames */ + | WRDSF_SHOWERR); /* Show errors */ + if (rc == 0) { + /* Success. The resulting words are returned in the NULL-terminated + array ws.ws_wordv. Number of words is in ws.ws_wordc */ + } + /* Reclaim the allocated memory */ + wordsplit_free(&ws); + +For a detailed discussion, please see the man page wordsplit.3 included +in the package. + +* Description + +The package is designed as a drop-in facility for use in larger +programs. It consists of the following files: + + wordsplit.h - Interface header. + wordsplit.c - Main source file. + wordsplit.3 - Documentation. + +For most uses, you will need only these three. The remaining files +are for building the autotest-based testsuite: + + wsp.c - Auxiliary test program. + wordsplit.at - The source for the testsuite. + +* Incorporating wordsplit into your project + +The project is designed to be used as a git submodule. 
First, select +the location DIR for the wordsplit directory within your project. Then +add the submodule: + + git submodule add git://git.gnu.org.ua/wordsplit.git DIR + +The rest is quite straightforward: you need to add wordsplit.c to your +sources and add both wordsplit.c and wordsplit.h to the distributed files. + +There are two methods of doing so: direct incorporation and +incorporation via VPATH. The discussion below will describe both +methods based on the assumption that your project is using GNU +autotools framework. If you are using plain makefiles, these +instructions are easy to convert to such use as well. + +** Direct incorporation + +Add the subdir-objects option to the invocation of AM_INIT_AUTOMAKE macro in +your configure.ac: + + AM_INIT_AUTOMAKE([subdir-objects]) + +In your Makefile.am, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h +to the sources and -Iwordsplit to the cpp flags. For example: + + program_SOURCES = main.c \ + wordsplit/wordsplit.c \ + wordsplit/wordsplit.h + AM_CPPFLAGS = -I$(srcdir)/wordsplit + +You can also put wordsplit.h in the noinst_HEADERS variable, if you like: + + program_SOURCES = main.c \ + wordsplit/wordsplit.c + noinst_HEADERS = wordsplit/wordsplit.h + AM_CPPFLAGS = -I$(srcdir)/wordsplit + +If you are building an installable library and wish to make wordsplit functions +available, install wordsplit.h to $(pkgincludedir), e.g. + + lib_LTLIBRARIES = libmy.la + libmy_la_SOURCES = main.c \ + wordsplit/wordsplit.c + AM_CPPFLAGS = -I$(srcdir)/wordsplit + pkginclude_HEADERS = wordsplit/wordsplit.h + +** Vpath-based incorporation + +Modify the VPATH variable in your Makefile.am: + + VPATH += $(srcdir)/wordsplit + +Notice the use of "+=": it is necessary for the vpath builds to work. + +Add wordsplit.o to the name_LIBADD or name_LDADD variable, depending on +the nature of the object being built. 
+ +Modify AM_CPPFLAGS as shown in the previous section: + + AM_CPPFLAGS = -I$(srcdir)/wordsplit + +Add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to the EXTRA_DIST +variable. + +An example Makefile.am: + + program_SOURCES = main.c + LDADD = wordsplit.o + noinst_HEADERS = wordsplit/wordsplit.h + VPATH += $(srcdir)/wordsplit + EXTRA_DIST = wordsplit/wordsplit.c wordsplit/wordsplit.h + +* The testsuite + +The package contains two files for building the testsuite: wsp.c, +which is used to build the auxiliary binary wsp, and wordsplit.at, +which is translated by GNU autotest into a testsuite shell script. + +The discussion below is for those who wish to include wordsplit +testsuite into their project. It assumes the following layout of the +hosting project: + + lib/ + Directory holding the library that incorporates wordsplit.o. + This discussion assumes the library name is libmy.a + lib/wordsplit + Wordsplit sources. + +The testsuite will be built in lib. + +** Additional files + +Three additional files are necessary for the testsuite: atlocal.in, +wordsplit-version.h, and package.m4. + +The file atlocal.in is a simple shell script that sets the PATH +environment variable for the testsuite. It contains just one line: + + PATH=$srcdir/wordsplit:$PATH + +The file wordsplit-version.h provides the version definition for the +test program wsp.c. 
Use the following script to create it: + + version=$(cd wordsplit; git describe) + cat > wordsplit-version.h < package.m4 < + Git: +[2] Grecs - a library for parsing structured configuration files + Home: + Git: +[3] GNU Dico - a dictionary server + Home: + Git: +[4] GNU Direvent - filesystem event watching daemon + Home: + Git: +[5] GNU Mailutils - a general-purpose mail package + Home: + Git: +[6] GNU Rush - a restricted user shell for remote access + Home: + Git: +[7] fileserv - simple http server for serving static files + Home: + Git: +[8] vmod-dbrw - Database-driven rewrite rules for Varnish Cache + Home: + Git: + +* Bug reporting + +Please send bug reports, questions, suggestions and criticism to +. When sending bug reports, please make sure to provide +the following information: + + 1. Wordsplit invocation flags. + 2. Input string. + 3. Produced output. + 4. Expected output. + +* Copying + +Copyright (C) 2009-2019 Sergey Poznyakoff + +Permission is granted to anyone to make or distribute verbatim copies +of this document as received, in any medium, provided that the +copyright notice and this permission notice are preserved, +thus giving the recipient permission to redistribute in turn. + +Permission is granted to distribute modified versions +of this document, or of portions of it, under the above conditions, +provided also that they carry prominent notices stating who last +changed them. + +Local Variables: +mode: outline +paragraph-separate: "[ ]*$" +version-control: never +End: + diff --git a/bootstrap b/bootstrap deleted file mode 100755 index 4cc4178..0000000 --- a/bootstrap +++ /dev/null @@ -1,162 +0,0 @@ -#! 
/bin/sh -cd $(dirname $0) -version=$(git describe) - -function genfiles() { - cat > wordsplit-version.h < package.m4 < atlocal.in - - -function mk_testsuite() { - sed -e 's|MODDIR|$moddir|' <<\EOF -# ################## -# Testsuite -# ################## -EXTRA_DIST = testsuite wordsplit.at package.m4 -DISTCLEANFILES = atconfig -MAINTAINERCLEANFILES = Makefile.in $(TESTSUITE) - -TESTSUITE = $(srcdir)/testsuite -M4=m4 -AUTOTEST = $(AUTOM4TE) --language=autotest -$(TESTSUITE): wordsplit.at - $(AM_V_GEN)$(AUTOTEST) -I $(srcdir) wordsplit.at -o $(TESTSUITE).tmp - $(AM_V_at)mv $(TESTSUITE).tmp $(TESTSUITE) - -atconfig: $(top_builddir)/config.status - cd $(top_builddir) && ./config.status MODDIR/$@ - -clean-local: - @test ! -f $(TESTSUITE) || $(SHELL) $(TESTSUITE) --clean - -check-local: atconfig atlocal $(TESTSUITE) - @$(SHELL) $(TESTSUITE) - -noinst_PROGRAMS = wsp -wsp_SOURCES = wsp.c wordsplit-version.h -EOF - echo "wsp_LDADD = $1" -} - -function common_notice() { - cat < Makefile.am - mk_atlocal - common_notice -} - -function mk_shared() { - (cat < Makefile.am - mk_atlocal - common_notice -} - -function mk_static() { - (cat < Makefile.am - mk_atlocal - common_notice -} - -function mk_embedded() { - (mk_testsuite wordsplit.o - echo "AM_CPPFLAGS = " - )> Makefile.am - mk_atlocal - cat <&2 - exit 1 -fi - -moddir=$2 - -case $1 in - installed|shared|static|standalone|embedded) - genfiles - mk_$1 - ;; - clean) - rm -f Makefile.am package.m4 wordsplit-version.h atlocal.in - ;; - *) - usage - ;; -esac - - diff --git a/wordsplit.3 b/wordsplit.3 index 400c2ee..139c73e 100644 --- a/wordsplit.3 +++ b/wordsplit.3 @@ -14,7 +14,7 @@ .\" You should have received a copy of the GNU General Public License .\" along with wordsplit. If not, see . 
.\" -.TH WORDSPLIT 3 "July 7, 2019" "WORDSPLIT" "Wordsplit User Reference" +.TH WORDSPLIT 3 "July 9, 2019" "WORDSPLIT" "Wordsplit User Reference" .SH NAME wordsplit \- split string into words .SH SYNOPSIS @@ -62,7 +62,10 @@ The function .B wordsplit_free_words frees only the memory allocated for elements of .I ws_wordv -and initializes +after which it resets +.I ws_wordv to +.B NULL +and .I ws_wordc to zero. .PP @@ -73,15 +76,17 @@ wordsplit_t ws; int rc; if (wordsplit(s, &ws, WRDSF_DEFFLAGS)) { - wordsplit_perror(&ws); - return; -} -for (i = 0; i < ws.ws_wordc; i++) { - /* do something with ws.ws_wordv[i] */ + for (i = 0; i < ws.ws_wordc; i++) { + /* do something with ws.ws_wordv[i] */ + } } wordsplit_free(&ws); .EE .PP +Notice, that \fBwordsplit_free\fR must be called after each invocation +of \fBwordsplit\fR or \fBwordsplit_len\fR, even if it resulted in +error. +.PP The function .B wordsplit_getwords returns in \fIwordv\fR an array of words, and in \fIwordc\fR the number @@ -135,49 +140,37 @@ wordsplit_free(&ws); .EE .SH OPTIONS The number of flags is limited to 32 (the width of \fBuint32_t\fR data -type) and each bit is occupied by a corresponding flag. However, the -number of features \fBwordsplit\fR provides required still -more. Additional features can be requested by setting a corresponding -\fIoption bit\fR in the \fBws_option\fR field of the \fBstruct -wordsplit\fR argument. To inform wordsplit functions that this field -is initialized the \fBWRDSF_OPTIONS\fR flag must be set. +type). By the time of this writing each bit is already occupied by a +corresponding flag. However, the number of features \fBwordsplit\fR +provides requires still more. Additional features can be requested by +setting a corresponding \fIoption bit\fR in the \fBws_option\fR field +of the \fBstruct wordsplit\fR argument. To inform wordsplit functions +that this field is initialized the \fBWRDSF_OPTIONS\fR flag must be set. .PP Option symbolic names begin with \fBWRDSO_\fR. 
They are discussed in detail in the subsequent chapters. .SH EXPANSION Expansion is performed on the input after it has been split into -words. There are several kinds of expansion, which of them are -performed is controlled by appropriate bits set in the \fIflags\fR -argument. Whatever expansion kinds are enabled, they are always run -in the same order as described in this section. +words. The kinds of expansion to be performed are controlled by the +appropriate bits set in the \fIflags\fR argument. Whatever expansion +kinds are enabled, they are always run in the order described in this +section. .SS Whitespace trimming Whitespace trimming removes any leading and trailing whitespace from the initial word array. It is enabled by the .B WRDSF_WS -flag. Whitespace trimming is needed only if you redefine -word delimiters (\fIws_delim\fR member) so that they don't contain -whitespace characters (\fB\(dq \\t\\n\(dq\fR). -.SS Tilde expansion -Tilde expansion is enabled if the -.B WRDSF_PATHEXPAND -bit is set. It expands all words that begin with an unquoted tilde -character (`\fB~\fR'). If tilde is followed immediately by a slash, -it is replaced with the home directory of the current user (as -determined by his \fBpasswd\fR entry). A tilde alone is handled the -same way. Otherwise, the characters between the tilde and first slash -character (or end of string, if it doesn't contain any) are treated as -a login name. and are replaced (along with the tilde itself) with the -home directory of that user. If there is no user with such login -name, the word is left unchanged. +flag. Whitespace trimming is enabled automatically if the word +delimiters (\fIws_delim\fR member) contain whitespace characters +(\fB\(dq \\t\\n\(dq\fR), which is the default. .SS Variable expansion Variable expansion replaces each occurrence of .BI $ NAME or .BI ${ NAME } -with the value of the variable \fINAME\fR. It is enabled if the -flag \fBWRDSF_NOVAR\fR is not set. 
The caller is responsible for -supplying the table of available variables. Two mechanisms are -provided: environment array and a callback function. +with the value of the variable \fINAME\fR. It is enabled by default +and can be disabled by setting the \fBWRDSF_NOVAR\fR flag. The caller +is responsible for supplying the table of available variables. Two +mechanisms are provided: an environment array and a callback function. .PP Environment array is a \fBNULL\fR-terminated array of variables, stored in the \fIws_env\fR member. The \fBWRDSF_ENV\fR flag must be @@ -204,8 +197,8 @@ function itself shall be defined as int getvar (char **ret, const char *var, size_t len, void *clos); .EE .PP -The function shall look up for the variable identified by the first -\fIlen\fR bytes of the string \fIvar\fR. If such variable is found, +The function shall look up the variable identified by the first +\fIlen\fR bytes of the string \fIvar\fR. If the variable is found, the function shall store a copy of its value (allocated using \fBmalloc\fR(3)) in the memory location pointed to by \fBret\fR, and return \fBWRDSE_OK\fR. If the variable is not found, the function shall @@ -216,7 +209,7 @@ If \fIws_getvar\fR returns .BR WRDSE_USERERR , it must store the pointer to the error description string in .BR *ret . -In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR) , the +In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR), the data returned in \fBret\fR must be allocated using .BR malloc (3). .PP @@ -225,10 +218,11 @@ If both and .I ws_getvar are used, the variable is first looked up in -.IR ws_env , -and if not found there, the +.IR ws_env . +If it is not found there, the .I ws_getvar -function is called. +callback is invoked. +This order is reversed if the \fBWRDSO_GETVARPREF\fR option is set. .PP During variable expansion, the forms below cause .B wordsplit @@ -255,14 +249,61 @@ Otherwise, the value of \fIvariable\fR is substituted. 
.BI ${ variable :+ word } .BR "Use Alternate Value" . If \fIvariable\fR is null or unset, nothing is substituted, otherwise the -expansion of \fIword\fR is substituted. +expansion of \fIword\fR is substituted. +.PP +Unless the above forms are used, a reference to an undefined variable +expands to empty string. Three flags affect this behavior. If the +\fBWRDSF_UNDEF\fR flag is set, expanding undefined variable triggers +a \fBWRDSE_UNDEF\fR error. If the \fBWRDSF_WARNUNDEF\fR flag is set, +a non-fatal warning is emitted for each undefined variable. Finally, +if the \fBWRDSF_KEEPUNDEF\fR flag is set, references to undefined +variables are left unexpanded. +.PP +If two or three of these flags are set simultaneously, the behavior is +undefined. +.SS Positional argument expansion +\fIPositional arguments\fR are special parameters that can be +referenced in the input string by their ordinal number. The numbering +begins at \fB0\fR. The syntax for referencing positional arguments is +the same as for the variables, except that argument index is used +instead of the variable name. If the index is between 0 and 9, the +\fB$\fIN\fR form is acceptable. Otherwise, the index must be enclosed +in curly braces: \fB${\fIN\fB}\fR. +.PP +During argument expansion, references to positional arguments are +replaced with the corresponding values. +.PP +Argument expansion is requested by the \fBWRDSO_PARAMV\fR option bit. +The NULL-terminated array of variables shall be supplied in the +.I ws_paramv +member. The +.I ws_paramc +member shall be initialized to the number of elements in +.IR ws_paramv . +.PP +Setting the \fBWRDSO_PARAM_NEGIDX\fR option together with +\fBWRDSO_PARAMV\fR enables negative positional argument references. +A negative reference has the form \fB${-\fIN\fB}\fR. It is expanded +to the value of the argument with index \fB\fIws_paramc\fR \- \fIN\fR. .SS Quote removal -Quote removal translates unquoted escape sequences into corresponding bytes. 
-An escape sequence is a backslash followed by one or more characters. By -default, each sequence \fB\\\fIC\fR appearing in unquoted words is -replaced with the character \fIC\fR. In doubly-quoted strings, two -backslash sequences are recognized: \fB\\\\\fR translates to a single -backslash, and \fB\\\(dq\fR translates to a double-quote. +During quote removal, single or double quotes surrounding a sequence +of characters are removed and the sequence itself is treated as a +single word. Characters within single quotes are treated verbatim. +Characters within double quotes undergo variable expansion and +backslash interpretation (see below). +.PP +Recognition of single quoted strings is enabled by the +\fBWRDSF_SQUOTE\fR flag. Recognition of double quotes is enabled by +the \fBWRDSF_DQUOTE\fR flag. The macro \fBWRDSF_QUOTE\fR enables both. +.SS Backslash interpretation +Backslash interpretation translates unquoted +.I escape sequences +into corresponding characters. An escape sequence is a backslash followed +by one or more characters. By default, each sequence \fB\\\fIC\fR +appearing in unquoted words is replaced with the character \fIC\fR. In +doubly-quoted strings, two backslash sequences are recognized: +\fB\\\\\fR translates to a single backslash, and \fB\\\(dq\fR +translates to a double-quote. .PP Two flags are provided to modify this behavior. If .I WRDSF_CESCAPES @@ -292,16 +333,16 @@ The \fBWRDSF_ESCAPE\fR flag allows the caller to customize escape sequences. If it is set, the \fBws_escape\fR member must be initialized. This member provides escape tables for unquoted words (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each -table is a string consisting of even number of charactes. In each +table is a string consisting of an even number of characters. In each pair of characters, the first one is a character that can appear after backslash, and the following one is its translation. 
For example, the above table of C escapes is represented as -\fB\(dqa\\ab\\bf\\fn\\nr\\rt\\tv\\v\(dq\fR. +\fB\(dq\\\\\\\\"\\"a\\ab\\bf\\fn\\nr\\rt\\tv\\v\(dq\fR. .PP It is valid to initialize \fBws_escape\fR elements to zero. In this case, no backslash translation occurs. .PP -The handling of octal and hex escapes is controlled by the following +Interpretation of octal and hex escapes is controlled by the following bits in \fBws_options\fR: .TP .B WRDSO_BSKEEP_WORD @@ -357,9 +398,9 @@ The substitution function should be defined as follows: void *\fIclos\fB);\fR .RE .PP -First \fIlen\fR bytes of \fIcmd\fR contain the command invocation as -it appeared between -.BR $( and ), +On input, the first \fIlen\fR bytes of \fIcmd\fR contain the command +invocation as it appeared between +.BR $( " and " ), with all expansions performed. .PP The \fIargv\fR parameter contains the command @@ -381,11 +422,27 @@ is returned, a pointer to the error description string must be stored in When \fBWRDSE_OK\fR or \fBWRDSE_USERERR\fR is returned, the data stored in \fB*ret\fR must be allocated using .BR malloc (3). -.SS Pathname expansion -Pathname expansion is performed if the \fBWRDSF_PATHEXPAND\fR flag is -set. Each unquoted word is scanned for characters -.BR * , ? ", and " [ . -If one of these appears, the word is considered a \fIpattern\fR (in +.SS Tilde and pathname expansion +Both expansions are performed if the +.B WRDSF_PATHEXPAND +flag is set. +.PP +.I Tilde expansion +affects any word that begins with an unquoted tilde +character (\fB~\fR). If the tilde is followed immediately by a slash, +it is replaced with the home directory of the current user (as +determined by their \fBpasswd\fR entry). A tilde alone is handled the +same way. Otherwise, the characters between the tilde and first slash +character (or end of string, if it doesn't contain any) are treated as +a login name and are replaced (along with the tilde itself) with the +home directory of that user. 
If there is no user with such login +name, the word is left unchanged. +.PP +During +.I pathname expansion +each unquoted word is scanned for characters +.BR * ", " ? ", and " [ . +If any of these appears, the word is considered a \fIpattern\fR (in the sense of .BR glob (3)) and is replaced with an alphabetically sorted list of file names matching the @@ -429,9 +486,9 @@ the last word. For example, if the input to the above fragment were The data type \fBwordsplit_t\fR has three members that contain output data upon return from \fBwordsplit\fR or \fBwordsplit_len\fR, and a number of members that the caller can initialize on input in -order to customize the function behavior. Each its member has a -corresponding flag bit, which must be set in the \fIflags\fR argument -in order to instruct the \fBwordsplit\fR function to use it. +order to customize the function behavior. For each input member there +is a corresponding flag bit, which must be set in the \fIflags\fR argument +in order to instruct the \fBwordsplit\fR function to use the member. .SS OUTPUT .TP .BI size_t " ws_wordc" @@ -441,17 +498,6 @@ from \fBwordsplit\fR. .BI "char ** " ws_wordv Array of resulting words. Accessible upon successful return from \fBwordsplit\fR. -.TP -.BI "size_t " ws_wordi -Total number of words processed. This field is intended for use with -.B WRDSF_INCREMENTAL -flag. If that flag is not set, the following relation holds: -.BR "ws_wordi == ws_wordc - ws_offs" . -.TP -.BI "int " ws_errno -Error code, if the invocation of \fBwordsplit\fR or -\fBwordsplit_len\fR failed. This is the same value as returned from -the function in that case. .PP The caller should not attempt to free or reallocate \fIws_wordv\fR or any elements thereof, nor to modify \fIws_wordc\fR. @@ -463,6 +509,52 @@ the caller should use It is more effective than copying the contents of .I ws_wordv manually. +.TP +.BI "size_t " ws_wordi +Total number of words processed. 
This field is intended for use with +.B WRDSF_INCREMENTAL +flag. If that flag is not set, the following relation holds: +.BR "ws_wordi == ws_wordc - ws_offs" . +.TP +.BI "int " ws_errno +Error code, if the invocation of \fBwordsplit\fR or +\fBwordsplit_len\fR failed. This is the same value as returned from +the function in that case. +.TP +.BI "char *" ws_errctx +On error, context in which the error occurred. For +.BR WRDSE_UNDEF , +it is the name of the undefined variable. For +.B WRDSE_GLOBERR +- the pattern that caused error. +.sp +The caller should treat this member as +.BR "const char *" . +.PP +The following members are used if the variable expansion was requested +and the input string contained an +.B Assign Default Values +form (\fB${\fIvariable\fB:=\fIword\fB}\fR). +.TP +.BI "char **" ws_envbuf +Modified environment. It follows the same arrangement as \fIws_env\fR +on input (see the \fBWRDSF_ENV_KV\fR flag). If \fIws_env\fR was NULL (or +\fBWRDSF_ENV\fR was not set), but the \fIws_getvar\fR callback was +used, the \fIws_envbuf\fR array will contain only the modified variables. +.TP +.BI "size_t " ws_envidx +Number of entries in +.IR ws_envbuf . +.PP +If positional parameters were used (see the \fBWRDSO_PARAMV\fR option) +and any of them were modified during processing, the following two +members supply the modified parameter array. +.TP +.BI "char ** " ws_parambuf +Array of positional parameters. +.TP +.BI "size_t " ws_paramidx +Number of positional parameters. .SS INPUT .TP .BI "size_t " ws_offs @@ -569,12 +661,12 @@ one containing variable name, and the next one with its value. .TP .BI "int (*" ws_getvar ") (char **ret, const char *var, size_t len, void *clos)" -Points to the function that will be used during variable expansion to -look up for the value of the environment variable named \fBvar\fR. +Points to the function that will be used during variable expansion for +environment variable lookups. 
This function is used if the variable expansion is enabled (i.e. the .B WRDSF_NOVAR flag is not set), and the \fBWRDSF_GETVAR\fR flag is set. - +.sp If both .B WRDSF_ENV and @@ -583,14 +675,15 @@ are set, the variable is first looked up in the .I ws_env array and, if not found there, .I ws_getvar -is called. - +is called. If the \fBWRDSO_GETVARPREF\fR option is set, this order is +reversed. +.sp The name of the variable is specified by the first \fIlen\fR bytes of the string \fIvar\fR. The \fIclos\fR parameter supplies the user-specific data (see below the description of \fIws_closure\fR member) and the \fBret\fR parameter points to the memory location where output data is to be stored. On success, the function must -store ther a pointer to the string with the value of the variable and +store there a pointer to the string with the value of the variable and return 0. On error, it must return one of the error codes described in the section .BR "ERROR CODES" . @@ -598,7 +691,7 @@ If \fIws_getvar\fR returns .BR WRDSE_USERERR , it must store the pointer to the error description string in .BR *ret . -In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR) , the +In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR), the data returned in \fBret\fR must be allocated using .BR malloc (3). .TP @@ -629,7 +722,7 @@ If \fIws_command\fR returns .BR WRDSE_USERERR , it must store the pointer to the error description string in .BR *ret . -In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR) , the +In any case (whether returning \fB0\fR or \fBWRDSE_USERERR\fR), the data returned in \fBret\fR must be allocated using .BR malloc (3). @@ -639,6 +732,17 @@ command substitution disabled. The \fIclos\fR parameter supplies user-specific data (see the description of \fIws_closure\fR member). +.PP +The following two members are consulted if the \fBWRDSO_PARAMV\fR +option is set. They provide an array of positional parameters. 
+.TP +.BI "char const **" ws_paramv +Positional parameters. These are accessible in the input string using +the notation \fB$\fIN\fR or \fB${\fIN\fB}\fR, where \fIN\fR is the +0-based parameter number. +.TP +.BI "size_t " ws_paramc +Number of positional parameters. .SH FLAGS The following macros are defined for use in the \fBflags\fR argument. .TP @@ -657,7 +761,7 @@ delimiter, replace \fBC\fR escapes appearing in the input string with the corresponding characters. .TP .B WRDSF_APPEND -Append the words found to the array resulting from a previous call to +Append the resulting words to the array left from a previous call to \fBwordsplit\fR. .TP .B WRDSF_DOOFFS @@ -671,7 +775,9 @@ These are not counted in the returned .IR ws_wordc . .TP .B WRDSF_NOCMD -Don't do command substitution. +Don't do command substitution. The \fBWRDSO_NOCMDSPLIT\fR option set +together with this flag prevents splitting command invocations +into separate words (see the \fBOPTIONS\fR section). .TP .B WRDSF_REUSE The parameter \fIws\fR resulted from a previous call to @@ -686,7 +792,9 @@ Print errors using Consider it an error if an undefined variable is expanded. .TP .B WRDSF_NOVAR -Don't do variable expansion. +Don't do variable expansion. The \fBWRDSO_NOVARSPLIT\fR option set +together with this flag prevents variable references from being split +into separate words (see the \fBOPTIONS\fR section). .TP .B WRDSF_ENOMEMABRT Abort on @@ -721,7 +829,8 @@ Return delimiters. .TP .B WRDSF_SED_EXPR Treat -.BR sed (1) expressions as words. +.BR sed (1) +expressions as words. .TP .B WRDSF_DELIM .I ws_delim @@ -792,8 +901,7 @@ See the section for a detailed discussion. .TP .B WRDSF_PATHEXPAND -Perform pathname and tilde expansion. If this flag is set, the -\fIws_options\fR member must also be initialized. See the +Perform pathname and tilde expansion. See the subsection .B "Pathname expansion" for details. @@ -822,32 +930,60 @@ metacharacters. 
.PP .TP .B WRDSO_BSKEEP_WORD -Quote removal: when an unrecognized escape sequence is encountered in a word, -preserve it on output. If that bit is not set, the backslash is -removed from such sequences. +Backslash interpretation: when an unrecognized escape sequence is +encountered in a word, preserve it on output. If that bit is not set, +the backslash is removed from such sequences. .TP .B WRDSO_OESC_WORD -Quote removal: handle octal escapes in words. +Backslash interpretation: handle octal escapes in words. .TP .B WRDSO_XESC_WORD -Quote removal: handle hex escapes in words. +Backslash interpretation: handle hex escapes in words. .TP .B WRDSO_BSKEEP_QUOTE -Quote removal: when an unrecognized escape sequence is encountered in -a doubly-quoted string, preserve it on output. If that bit is not -set, the backslash is removed from such sequences. +Backslash interpretation: when an unrecognized escape sequence is +encountered in a doubly-quoted string, preserve it on output. If that +bit is not set, the backslash is removed from such sequences. .TP .B WRDSO_OESC_QUOTE -Quote removal: handle octal escapes in doubly-quoted strings. +Backslash interpretation: handle octal escapes in doubly-quoted strings. .TP .B WRDSO_XESC_QUOTE -Quote removal: handle hex escapes in doubly-quoted strings. +Backslash interpretation: handle hex escapes in doubly-quoted strings. .TP .B WRDSO_MAXWORDS The \fBws_maxwords\fR member is initialized. This is used to control the number of words returned by a call to \fBwordsplit\fR. For a detailed discussion, refer to the chapter .BR "LIMITING THE NUMBER OF WORDS" . +.TP +.B WRDSO_NOVARSPLIT +When \fBWRDSF_NOVAR\fR is set, don't split variable references, even +if they contain whitespace. E.g. +.B ${VAR:-foo bar} +will be treated as a single word. +.TP +.B WRDSO_NOCMDSPLIT +When \fBWRDSF_NOCMD\fR is set, don't split whatever looks like command +invocation, even if it contains whitespace. E.g. +.B $(command arg) +will be treated as a single word. 
+.TP +.B WRDSO_PARAMV +Positional arguments are supplied in +.I ws_paramv +and +.IR ws_paramc . +See the subsection +.B Positional argument expansion +for a discussion. +.TP +.B WRDSO_PARAM_NEGIDX +Used together with \fBWRDSO_PARAMV\fR, this allows for negative +positional argument references. A negative argument reference has the +form \fB${-\fIN\fB}\fR. It is expanded to the value of the argument +with index \fB\fIws_paramc\fR \- \fIN\fR, i.e. \fIN\fRth if counting +from the end. .SH "ERROR CODES" .TP .BR WRDSE_OK ", " WRDSE_EOF @@ -1015,8 +1151,10 @@ char **shell_parse(char *s) .EE .SH AUTHORS Sergey Poznyakoff +.SH BUGS +Backtick command expansion is not supported. .SH "BUG REPORTS" -Report bugs to . +Report bugs to . .SH COPYRIGHT Copyright \(co 2009-2019 Sergey Poznyakoff .br