mirror of
git://git.gnu.org.ua/wordsplit.git
synced 2025-04-25 16:19:54 +03:00
Improve docs
parent 5742ab5a03
commit d36275fe9a
2 changed files with 48 additions and 39 deletions
73
README
@@ -1,15 +1,18 @@
 * Overview

-This package provides a set of C functions for splitting a string into
-words. The splitting process is highly configurable and allows for
-considerable flexibility. The default splitting rules are similar to
-those used in Bourne shell. The splitting process includes tilde
-expansion, variable expansion, quote removal, command substitution,
-and path expansion. Each of these phases can be turned off by the caller.
+This package provides a set of C functions for parsing input strings.
+Default parsing rules are are similar to those used in Bourne shell.
+This includes tilde expansion, variable expansion, quote removal, word
+splitting, command substitution, and path expansion. Parsing is
+controlled by a number of settings which allow the caller to alter
+processing at each of these phases or even to disable any of them.
+Thus, wordsplit can be used for parsing inputs in different formats,
+from simple character-delimited entries, as in /etc/passwd, and up to
+complex shell statements.

 The following code fragment shows the basic usage:

-/* This variable controls the splitting */
+/* This variable controls parsing */
 wordsplit_t ws;
 int rc;

@@ -31,7 +34,7 @@ The following code fragment shows the basic usage:
 /* Reclaim the allocated memory */
 wordsplit_free(&ws);

-For a detailed discussion, please see the man page wordsplit.3 inluded
+For a detailed discussion, please see the man page wordsplit.3 included
 in the package.

 * Description
@@ -51,21 +54,26 @@ are for building the autotest-based testsuite:

 * Incorporating wordsplit into your project

-The project is designed to be used as a git submodule. First, select
-the location DIR for the wordsplit directory within your project. Then
-add the submodule:
+The project is designed to be used as a git submodule. To incorporate
+it into your project, first select the location for the wordsplit
+directory within your project. Then add the submodule at this
+location. The rest is quite straightforward: you need to add
+wordsplit.c to your sources and add both wordsplit.c and wordsplit.h
+to the distributed files.

-git submodule add git://git.gnu.org.ua/wordsplit.git DIR
-
-The rest is quite straightforward: you need to add wordsplit.c to your
-sources and add both wordsplit.c and wordsplit.h to the distributed files.
-
-There are two methods of doing so: direct incorporation and
-incorporation via VPATH. The discussion below will describe both
-methods based on the assumption that your project is using GNU
-autotools framework. If you are using plain makefiles, these
+The following will describe each step in detail. For the rest of this
+discussion it is supposed that 'wordsplit' is the name of the location
+selected for the submodule. It is also supposed that your project
+uses GNU autotools framework. If you are using plain makefiles, these
 instructions are easy to convert to such use as well.

+To add the submodule do:
+
+git submodule add git://git.gnu.org.ua/wordsplit.git wordsplit
+
+There are two methods of including the sources to the project: direct
+incorporation and incorporation via VPATH.
+
 ** Direct incorporation

 Add the subdir-objects option to the invocation of AM_INIT_AUTOMAKE macro in
@@ -88,8 +96,8 @@ You can also put wordsplit.h in the noinst_HEADERS variable, if you like:
 noinst_HEADERS = wordsplit/wordsplit.h
 AM_CPPFLAGS = -I$(srcdir)/wordsplit

-If you are building an installable library and wish to make wordsplit functions
-available, install wordsplit.h to $(pkgincludedir), e.g.
+If you are building an installable library and wish to export the
+wordsplit API, install wordsplit.h to $(pkgincludedir), e.g.

 lib_LTLIBRARIES = libmy.la
 libmy_la_SOURCES = main.c \
@@ -97,7 +105,7 @@ available, install wordsplit.h to $(pkgincludedir), e.g.
 AM_CPPFLAGS = -I$(srcdir)/wordsplit
 pkginclude_HEADERS = wordsplit/wordsplit.h

-** Vpath-based incorporation
+** VPATH-based incorporation

 Modify the VPATH variable in your Makefile.am:

@@ -105,13 +113,13 @@ Modify the VPATH variable in your Makefile.am:

 Notice the use of "+=": it is necessary for the vpath builds to work.

-Define the nodist_program_SOURCES variable:
+Add wordsplit.c to the nodist_program_SOURCES variable:

 nodist_program_SOURCES = wordsplit.c

 The nodist_ prefix is necessary to prevent Make from trying to
 distribute this file from the current directory (where it doesn't
-exist of course). It will find it using VPATH during compilation.
+exist of course). During compilation it will be located using VPATH.

 Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to
 the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the
@@ -196,7 +204,7 @@ Add the following lines to your configure.ac:

 ** lib/Makefile.am

-The makefile in lib must be modified to build the auxiliary program
+The Makefile.am in lib must be modified to build the auxiliary program
 wsp and create the testsuite script. This is done by the following
 fragment:

@@ -228,17 +236,18 @@ fragment:
 * History

 First version of wordsplit appeared in March 2009 as a part of the
-Wydawca[1] project. Its main usage there was to assist in
-configuration file parsing. The parser subsystem proved to be quite
-useful and it soon forked into a separate project - Grecs[2]. This
-package had been since used (as a git submodule) in a number of other
-projects, such as GNU Dico[3] and Direvent[4], to name a few.
+Wydawca[1] project. Its main usage was to assist in configuration
+file parsing. The parser subsystem proved to be quite useful and
+soon evolved into a separate project - Grecs[2]. This package had been
+since used (as a git submodule) in a number of other projects, such as
+GNU Dico[3] and Direvent[4], to name a few.

 In 2010 the wordsplit sources were incorporated to the GNU
 Mailutils[5] package, where they replaced the obsolete argcv module.
 Mailutils uses its own configuration package, which meant that using
 Grecs was not expedient. Therefore the sources had been exported from
-Grecs and are kept in sync with the changes in it.
+Grecs. Since then both Mailutils and Grecs versions are periodically
+synchronized.

 Several other projects, such as GNU Rush[6] and fileserv[7], followed
 the suite. It was therefore decided that it would be advisable to
14
wordsplit.3
@@ -333,7 +333,7 @@ The \fBWRDSF_ESCAPE\fR flag allows the caller to customize escape
 sequences. If it is set, the \fBws_escape\fR member must be
 initialized. This member provides escape tables for unquoted words
 (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each
-table is a string consisting of an even number of charactes. In each
+table is a string consisting of an even number of characters. In each
 pair of characters, the first one is a character that can appear after
 backslash, and the following one is its translation. For example, the
 above table of C escapes is represented as
@@ -600,10 +600,10 @@ flag must be set. By default, it's value is \fB\(dq#\(dq\fR.
 Escape tables for unquoted words (\fBws_escape[0]\fR) and quoted
 strings (\fBws_escape[1]\fR). These are used to translate escape
 sequences (\fB\\\fIC\fR) into characters. Each table is a string
-consisting of even number of charactes. In each pair of characters,
+consisting of even number of characters. In each pair of characters,
 the first one is a character that can appear after backslash, and the
 following one is its representation. For example, the string
-\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horisontal
+\fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horizontal
 tabulation character and \fB\\n\fR into newline.
 .B WRDSF_ESCAPE
 flag must be set if this member is initialized.
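The pair layout of an escape table described in this hunk can be pictured with a minimal C sketch. The function escape_lookup below is a hypothetical helper written for illustration; it is not part of the wordsplit API, and returning the character unchanged when no pair matches is an assumption of the sketch rather than documented library behaviour.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a wordsplit function): find the translation
 * of the character C that followed a backslash.
 *
 * TABLE is a string of character pairs laid out as described for
 * ws_escape: the first character of each pair may appear after a
 * backslash, the second one is what it translates to.
 *
 * Returns the translation, or C itself if TABLE has no pair for it
 * (an assumption made for this sketch).
 */
static int
escape_lookup(const char *table, int c)
{
  size_t i;

  assert(table != NULL);
  /* Walk the table two characters at a time; an incomplete trailing
     pair, if any, is ignored. */
  for (i = 0; table[i] != '\0' && table[i + 1] != '\0'; i += 2)
    if ((unsigned char) table[i] == (unsigned char) c)
      return (unsigned char) table[i + 1];
  return c;
}
```

With the C string literal "t\tn\n" (the table quoted in the text), escape_lookup(table, 't') gives a horizontal tab and escape_lookup(table, 'n') gives a newline.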
@@ -755,8 +755,8 @@ Default flags. This is a shortcut for:
 WRDSF_SQUEEZE_DELIMS |\
 WRDSF_CESCAPES)\fR,

-i.e.: disable variable expansion and quote substituton, perform quote
-removal, treat any number of consequtive delimiters as a single
+i.e.: disable variable expansion and quote substitution, perform quote
+removal, treat any number of consecutive delimiters as a single
 delimiter, replace \fBC\fR escapes appearing in the input string with
 the corresponding characters.
 .TP
@@ -807,7 +807,7 @@ flag is set, and error code is returned. If this flag is set, the
 function is called instead. This function is not supposed to return.
 .TP
 .B WRDSF_WS
-Trim off any leading and trailind whitespace from the returned
+Trim off any leading and trailing whitespace from the returned
 words. This flag is useful if the \fIws_delim\fR member does not
 contain whitespace characters.
 .TP
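What trimming means for a single returned word, as described for WRDSF_WS in this hunk, can be shown with a small standalone C sketch. The function trim_word is a hypothetical helper, not a wordsplit function, and the library's own implementation may well differ. It only illustrates the result one would expect when, for example, ':' is the sole delimiter and the fields are padded with blanks.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not a wordsplit function): strip leading and
 * trailing whitespace from WORD.  The buffer is modified in place:
 * a terminating NUL is written after the last non-blank character.
 * Returns a pointer to the first non-blank character inside WORD.
 */
static char *
trim_word(char *word)
{
  char *end;

  assert(word != NULL);
  /* Skip leading whitespace. */
  while (isspace((unsigned char) *word))
    word++;
  /* Back up over trailing whitespace. */
  end = word + strlen(word);
  while (end > word && isspace((unsigned char) end[-1]))
    end--;
  *end = '\0';
  return word;
}
```

For a field such as "  /bin/sh \t" taken from a colon-delimited line, trim_word returns a pointer to "/bin/sh"; a field consisting only of blanks becomes the empty string.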
@@ -1007,7 +1007,7 @@ Undefined variable. This error is returned only if the
 \fBWRDSF_UNDEF\fR flag is set.
 .TP
 .B WRDSE_NOINPUT
-Input exhausted. This is not acually an error. This code is returned
+Input exhausted. This is not actually an error. This code is returned
 if \fBwordsplit\fR (or \fBwordsplit_len\fR) is invoked in incremental
 mode and encounters end of input string. See the section
 .BR "INCREMENTAL MODE" .