diff --git a/README b/README index c08b680..59f3436 100644 --- a/README +++ b/README @@ -1,15 +1,18 @@ * Overview -This package provides a set of C functions for splitting a string into -words. The splitting process is highly configurable and allows for -considerable flexibility. The default splitting rules are similar to -those used in Bourne shell. The splitting process includes tilde -expansion, variable expansion, quote removal, command substitution, -and path expansion. Each of these phases can be turned off by the caller. +This package provides a set of C functions for parsing input strings. +Default parsing rules are similar to those used in Bourne shell. +This includes tilde expansion, variable expansion, quote removal, word +splitting, command substitution, and path expansion. Parsing is +controlled by a number of settings which allow the caller to alter +processing at each of these phases or even to disable any of them. +Thus, wordsplit can be used for parsing inputs in different formats, +from simple character-delimited entries, as in /etc/passwd, and up to +complex shell statements. The following code fragment shows the basic usage: - /* This variable controls the splitting */ + /* This variable controls parsing */ wordsplit_t ws; int rc; @@ -31,7 +34,7 @@ The following code fragment shows the basic usage: /* Reclaim the allocated memory */ wordsplit_free(&ws); -For a detailed discussion, please see the man page wordsplit.3 inluded +For a detailed discussion, please see the man page wordsplit.3 included in the package. * Description @@ -51,21 +54,26 @@ are for building the autotest-based testsuite: * Incorporating wordsplit into your project -The project is designed to be used as a git submodule. First, select -the location DIR for the wordsplit directory within your project. Then -add the submodule: +The project is designed to be used as a git submodule. 
To incorporate +it into your project, first select the location for the wordsplit +directory within your project. Then add the submodule at this +location. The rest is quite straightforward: you need to add +wordsplit.c to your sources and add both wordsplit.c and wordsplit.h +to the distributed files. - git submodule add git://git.gnu.org.ua/wordsplit.git DIR - -The rest is quite straightforward: you need to add wordsplit.c to your -sources and add both wordsplit.c and wordsplit.h to the distributed files. - -There are two methods of doing so: direct incorporation and -incorporation via VPATH. The discussion below will describe both -methods based on the assumption that your project is using GNU -autotools framework. If you are using plain makefiles, these +The following will describe each step in detail. For the rest of this +discussion it is supposed that 'wordsplit' is the name of the location +selected for the submodule. It is also supposed that your project +uses GNU autotools framework. If you are using plain makefiles, these instructions are easy to convert to such use as well. +To add the submodule do: + + git submodule add git://git.gnu.org.ua/wordsplit.git wordsplit + +There are two methods of including the sources to the project: direct +incorporation and incorporation via VPATH. + ** Direct incorporation Add the subdir-objects option to the invocation of AM_INIT_AUTOMAKE macro in @@ -88,8 +96,8 @@ You can also put wordsplit.h in the noinst_HEADERS variable, if you like: noinst_HEADERS = wordsplit/wordsplit.h AM_CPPFLAGS = -I$(srcdir)/wordsplit -If you are building an installable library and wish to make wordsplit functions -available, install wordsplit.h to $(pkgincludedir), e.g. +If you are building an installable library and wish to export the +wordsplit API, install wordsplit.h to $(pkgincludedir), e.g. lib_LTLIBRARIES = libmy.la libmy_la_SOURCES = main.c \ @@ -97,7 +105,7 @@ available, install wordsplit.h to $(pkgincludedir), e.g. 
AM_CPPFLAGS = -I$(srcdir)/wordsplit pkginclude_HEADERS = wordsplit/wordsplit.h -** Vpath-based incorporation +** VPATH-based incorporation Modify the VPATH variable in your Makefile.am: @@ -105,13 +113,13 @@ Modify the VPATH variable in your Makefile.am: Notice the use of "+=": it is necessary for the vpath builds to work. -Define the nodist_program_SOURCES variable: +Add wordsplit.c to the nodist_program_SOURCES variable: nodist_program_SOURCES = wordsplit.c The nodist_ prefix is necessary to prevent Make from trying to distribute this file from the current directory (where it doesn't -exist of course). It will find it using VPATH during compilation. +exist of course). During compilation it will be located using VPATH. Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the @@ -196,7 +204,7 @@ Add the following lines to your configure.ac: ** lib/Makefile.am -The makefile in lib must be modified to build the auxiliary program +The Makefile.am in lib must be modified to build the auxiliary program wsp and create the testsuite script. This is done by the following fragment: @@ -228,17 +236,18 @@ fragment: * History First version of wordsplit appeared in March 2009 as a part of the -Wydawca[1] project. Its main usage there was to assist in -configuration file parsing. The parser subsystem proved to be quite -useful and it soon forked into a separate project - Grecs[2]. This -package had been since used (as a git submodule) in a number of other -projects, such as GNU Dico[3] and Direvent[4], to name a few. +Wydawca[1] project. Its main usage was to assist in configuration +file parsing. The parser subsystem proved to be quite useful and +soon evolved into a separate project - Grecs[2]. This package had been +since used (as a git submodule) in a number of other projects, such as +GNU Dico[3] and Direvent[4], to name a few. 
In 2010 the wordsplit sources were incorporated to the GNU Mailutils[5] package, where they replaced the obsolete argcv module. Mailutils uses its own configuration package, which meant that using Grecs was not expedient. Therefore the sources had been exported from -Grecs and are kept in sync with the changes in it. +Grecs. Since then both Mailutils and Grecs versions are periodically +synchronized. Several other projects, such as GNU Rush[6] and fileserv[7], followed the suite. It was therefore decided that it would be advisable to diff --git a/wordsplit.3 b/wordsplit.3 index 139c73e..e742030 100644 --- a/wordsplit.3 +++ b/wordsplit.3 @@ -333,7 +333,7 @@ The \fBWRDSF_ESCAPE\fR flag allows the caller to customize escape sequences. If it is set, the \fBws_escape\fR member must be initialized. This member provides escape tables for unquoted words (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each -table is a string consisting of an even number of charactes. In each +table is a string consisting of an even number of characters. In each pair of characters, the first one is a character that can appear after backslash, and the following one is its translation. For example, the above table of C escapes is represented as @@ -600,10 +600,10 @@ flag must be set. By default, it's value is \fB\(dq#\(dq\fR. Escape tables for unquoted words (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). These are used to translate escape sequences (\fB\\\fIC\fR) into characters. Each table is a string -consisting of even number of charactes. In each pair of characters, +consisting of even number of characters. In each pair of characters, the first one is a character that can appear after backslash, and the following one is its representation. For example, the string -\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horisontal +\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horizontal tabulation character and \fB\\n\fR into newline. 
.B WRDSF_ESCAPE flag must be set if this member is initialized. @@ -755,8 +755,8 @@ Default flags. This is a shortcut for: WRDSF_SQUEEZE_DELIMS |\ WRDSF_CESCAPES)\fR, -i.e.: disable variable expansion and quote substituton, perform quote -removal, treat any number of consequtive delimiters as a single +i.e.: disable variable expansion and quote substitution, perform quote +removal, treat any number of consecutive delimiters as a single delimiter, replace \fBC\fR escapes appearing in the input string with the corresponding characters. .TP @@ -807,7 +807,7 @@ flag is set, and error code is returned. If this flag is set, the function is called instead. This function is not supposed to return. .TP .B WRDSF_WS -Trim off any leading and trailind whitespace from the returned +Trim off any leading and trailing whitespace from the returned words. This flag is useful if the \fIws_delim\fR member does not contain whitespace characters. .TP @@ -1007,7 +1007,7 @@ Undefined variable. This error is returned only if the \fBWRDSF_UNDEF\fR flag is set. .TP .B WRDSE_NOINPUT -Input exhausted. This is not acually an error. This code is returned +Input exhausted. This is not actually an error. This code is returned if \fBwordsplit\fR (or \fBwordsplit_len\fR) is invoked in incremental mode and encounters end of input string. See the section .BR "INCREMENTAL MODE" .