Improve docs

Sergey Poznyakoff 2019-07-10 09:54:32 +03:00
parent 5742ab5a03
commit d36275fe9a
2 changed files with 48 additions and 39 deletions

73
README

@@ -1,15 +1,18 @@
* Overview
This package provides a set of C functions for splitting a string into
words. The splitting process is highly configurable and allows for
considerable flexibility. The default splitting rules are similar to
those used in Bourne shell. The splitting process includes tilde
expansion, variable expansion, quote removal, command substitution,
and path expansion. Each of these phases can be turned off by the caller.
This package provides a set of C functions for parsing input strings.
Default parsing rules are similar to those used in Bourne shell.
This includes tilde expansion, variable expansion, quote removal, word
splitting, command substitution, and path expansion. Parsing is
controlled by a number of settings which allow the caller to alter
processing at each of these phases or even to disable any of them.
Thus, wordsplit can be used for parsing inputs in different formats,
from simple character-delimited entries, as in /etc/passwd, and up to
complex shell statements.
The following code fragment shows the basic usage:
/* This variable controls the splitting */
/* This variable controls parsing */
wordsplit_t ws;
int rc;
@@ -31,7 +34,7 @@ The following code fragment shows the basic usage:
/* Reclaim the allocated memory */
wordsplit_free(&ws);
For a detailed discussion, please see the man page wordsplit.3 inluded
For a detailed discussion, please see the man page wordsplit.3 included
in the package.
* Description
@@ -51,21 +54,26 @@ are for building the autotest-based testsuite:
* Incorporating wordsplit into your project
The project is designed to be used as a git submodule. First, select
the location DIR for the wordsplit directory within your project. Then
add the submodule:
The project is designed to be used as a git submodule. To incorporate
it into your project, first select the location for the wordsplit
directory within your project. Then add the submodule at this
location. The rest is quite straightforward: you need to add
wordsplit.c to your sources and add both wordsplit.c and wordsplit.h
to the distributed files.
git submodule add git://git.gnu.org.ua/wordsplit.git DIR
The rest is quite straightforward: you need to add wordsplit.c to your
sources and add both wordsplit.c and wordsplit.h to the distributed files.
There are two methods of doing so: direct incorporation and
incorporation via VPATH. The discussion below will describe both
methods based on the assumption that your project is using GNU
autotools framework. If you are using plain makefiles, these
The following will describe each step in detail. For the rest of this
discussion it is supposed that 'wordsplit' is the name of the location
selected for the submodule. It is also supposed that your project
uses GNU autotools framework. If you are using plain makefiles, these
instructions are easy to convert to such use as well.
To add the submodule do:
git submodule add git://git.gnu.org.ua/wordsplit.git wordsplit
There are two methods of including the sources to the project: direct
incorporation and incorporation via VPATH.
** Direct incorporation
Add the subdir-objects option to the invocation of AM_INIT_AUTOMAKE macro in
@@ -88,8 +96,8 @@ You can also put wordsplit.h in the noinst_HEADERS variable, if you like:
noinst_HEADERS = wordsplit/wordsplit.h
AM_CPPFLAGS = -I$(srcdir)/wordsplit
If you are building an installable library and wish to make wordsplit functions
available, install wordsplit.h to $(pkgincludedir), e.g.
If you are building an installable library and wish to export the
wordsplit API, install wordsplit.h to $(pkgincludedir), e.g.
lib_LTLIBRARIES = libmy.la
libmy_la_SOURCES = main.c \
@@ -97,7 +105,7 @@ available, install wordsplit.h to $(pkgincludedir), e.g.
AM_CPPFLAGS = -I$(srcdir)/wordsplit
pkginclude_HEADERS = wordsplit/wordsplit.h
** Vpath-based incorporation
** VPATH-based incorporation
Modify the VPATH variable in your Makefile.am:
@@ -105,13 +113,13 @@ Modify the VPATH variable in your Makefile.am:
Notice the use of "+=": it is necessary for the vpath builds to work.
Define the nodist_program_SOURCES variable:
Add wordsplit.c to the nodist_program_SOURCES variable:
nodist_program_SOURCES = wordsplit.c
The nodist_ prefix is necessary to prevent Make from trying to
distribute this file from the current directory (where it doesn't
exist of course). It will find it using VPATH during compilation.
exist of course). During compilation it will be located using VPATH.
Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to
the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the
@@ -196,7 +204,7 @@ Add the following lines to your configure.ac:
** lib/Makefile.am
The makefile in lib must be modified to build the auxiliary program
The Makefile.am in lib must be modified to build the auxiliary program
wsp and create the testsuite script. This is done by the following
fragment:
@@ -228,17 +236,18 @@ fragment:
* History
First version of wordsplit appeared in March 2009 as a part of the
Wydawca[1] project. Its main usage there was to assist in
configuration file parsing. The parser subsystem proved to be quite
useful and it soon forked into a separate project - Grecs[2]. This
package had been since used (as a git submodule) in a number of other
projects, such as GNU Dico[3] and Direvent[4], to name a few.
Wydawca[1] project. Its main usage was to assist in configuration
file parsing. The parser subsystem proved to be quite useful and
soon evolved into a separate project - Grecs[2]. This package had been
since used (as a git submodule) in a number of other projects, such as
GNU Dico[3] and Direvent[4], to name a few.
In 2010 the wordsplit sources were incorporated to the GNU
Mailutils[5] package, where they replaced the obsolete argcv module.
Mailutils uses its own configuration package, which meant that using
Grecs was not expedient. Therefore the sources had been exported from
Grecs and are kept in sync with the changes in it.
Grecs. Since then both Mailutils and Grecs versions are periodically
synchronized.
Several other projects, such as GNU Rush[6] and fileserv[7], followed
suit. It was therefore decided that it would be advisable to


@@ -333,7 +333,7 @@ The \fBWRDSF_ESCAPE\fR flag allows the caller to customize escape
sequences. If it is set, the \fBws_escape\fR member must be
initialized. This member provides escape tables for unquoted words
(\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each
table is a string consisting of an even number of charactes. In each
table is a string consisting of an even number of characters. In each
pair of characters, the first one is a character that can appear after
backslash, and the following one is its translation. For example, the
above table of C escapes is represented as
@@ -600,10 +600,10 @@ flag must be set. By default, it's value is \fB\(dq#\(dq\fR.
Escape tables for unquoted words (\fBws_escape[0]\fR) and quoted
strings (\fBws_escape[1]\fR). These are used to translate escape
sequences (\fB\\\fIC\fR) into characters. Each table is a string
consisting of even number of charactes. In each pair of characters,
consisting of even number of characters. In each pair of characters,
the first one is a character that can appear after backslash, and the
following one is its representation. For example, the string
\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horisontal
\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horizontal
tabulation character and \fB\\n\fR into newline.
.B WRDSF_ESCAPE
flag must be set if this member is initialized.
@@ -755,8 +755,8 @@ Default flags. This is a shortcut for:
WRDSF_SQUEEZE_DELIMS |\
WRDSF_CESCAPES)\fR,
i.e.: disable variable expansion and quote substituton, perform quote
removal, treat any number of consequtive delimiters as a single
i.e.: disable variable expansion and quote substitution, perform quote
removal, treat any number of consecutive delimiters as a single
delimiter, replace \fBC\fR escapes appearing in the input string with
the corresponding characters.
.TP
@@ -807,7 +807,7 @@ flag is set, and error code is returned. If this flag is set, the
function is called instead. This function is not supposed to return.
.TP
.B WRDSF_WS
Trim off any leading and trailind whitespace from the returned
Trim off any leading and trailing whitespace from the returned
words. This flag is useful if the \fIws_delim\fR member does not
contain whitespace characters.
.TP
@@ -1007,7 +1007,7 @@ Undefined variable. This error is returned only if the
\fBWRDSF_UNDEF\fR flag is set.
.TP
.B WRDSE_NOINPUT
Input exhausted. This is not acually an error. This code is returned
Input exhausted. This is not actually an error. This code is returned
if \fBwordsplit\fR (or \fBwordsplit_len\fR) is invoked in incremental
mode and encounters end of input string. See the section
.BR "INCREMENTAL MODE" .