README file for the wordsplit library
See the end of file for copying conditions.

* Overview

This package provides a set of C functions for parsing input strings.
Default parsing rules are similar to those used in Bourne shell.
This includes tilde expansion, variable expansion, quote removal, word
splitting, command substitution, and path expansion.  Parsing is
controlled by a number of settings which allow the caller to alter
processing at each of these phases or even to disable any of them.
Thus, wordsplit can be used for parsing inputs in different formats,
from simple character-delimited entries, as in /etc/passwd, and up to
complex shell statements.

The following code fragment shows the basic usage:

    /* This variable controls parsing */
    wordsplit_t ws;
    int rc;

    /* Provide variable definitions */
    ws.ws_env = (const char **) environ;
    /* Provide a function for expanding commands */
    ws.ws_command = runcom;
    /* Split input_string into words */
    rc = wordsplit(input_string, &ws,
                   WRDSF_QUOTE            /* Handle both single and
                                             double quoted strings as words. */
                   | WRDSF_SQUEEZE_DELIMS /* Compress adjacent delimiters */
                   | WRDSF_PATHEXPAND     /* Expand pathnames */
                   | WRDSF_SHOWERR);      /* Show errors */
    if (rc == 0) {
        /* Success.  The resulting words are returned in the NULL-terminated
           array ws.ws_wordv.  Number of words is in ws.ws_wordc */
    }
    /* Reclaim the allocated memory */
    wordsplit_free(&ws);

For a detailed discussion, please see the man page wordsplit.3 included
in the package.

* Description

The package is designed as a drop-in facility for use in larger
programs.  It consists of the following files:

    wordsplit.h   - Interface header.
    wordsplit.c   - Main source file.
    wordsplit.3   - Documentation.

For most uses, you will need only these three.  The remaining files
are for building the autotest-based testsuite:

    wsp.c         - Auxiliary test program.
    wordsplit.at  - The source for the testsuite.

* Incorporating wordsplit into your project

Wordsplit is designed to be used as a git submodule.  To incorporate
it into your project, first select the location for the wordsplit
directory within your project.  Then add the submodule at this
location.  The rest is quite straightforward: you need to add
wordsplit.c to your sources and add both wordsplit.c and wordsplit.h
to the distributed files.

The following sections describe each step in detail.  For the rest of
this discussion it is assumed that 'wordsplit' is the name of the
location selected for the submodule.  It is also assumed that your
project uses the GNU autotools framework.  If you are using plain
makefiles, these instructions are easy to adapt.

To add the submodule, run:

    git submodule add git://git.gnu.org.ua/wordsplit.git wordsplit

There are two methods of including the sources in the project: direct
incorporation and incorporation via VPATH.

** Direct incorporation

Add the subdir-objects option to the invocation of the AM_INIT_AUTOMAKE
macro in your configure.ac:

    AM_INIT_AUTOMAKE([subdir-objects])

In your Makefile.am, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h
to the sources and -I$(srcdir)/wordsplit to the preprocessor flags.
For example:

    program_SOURCES = main.c \
                      wordsplit/wordsplit.c \
                      wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

You can also put wordsplit.h in the noinst_HEADERS variable, if you like:

    program_SOURCES = main.c \
                      wordsplit/wordsplit.c
    noinst_HEADERS = wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

If you are building an installable library and wish to export the
wordsplit API, install wordsplit.h to $(pkgincludedir), e.g.:

    lib_LTLIBRARIES = libmy.la
    libmy_la_SOURCES = main.c \
                       wordsplit/wordsplit.c
    AM_CPPFLAGS = -I$(srcdir)/wordsplit
    pkginclude_HEADERS = wordsplit/wordsplit.h

** VPATH-based incorporation

Modify the VPATH variable in your Makefile.am:

    VPATH = $(srcdir):$(srcdir)/wordsplit

Add wordsplit.c to the nodist_program_SOURCES variable:

    nodist_program_SOURCES = wordsplit.c

The nodist_ prefix is necessary to prevent Automake from trying to
distribute this file from the current directory (where it doesn't
exist, of course).  During compilation it will be located using VPATH.

Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to
the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the
previous section.

An example Makefile.am:

    program_SOURCES = main.c
    nodist_program_SOURCES = wordsplit.c
    VPATH = $(srcdir):$(srcdir)/wordsplit
    EXTRA_DIST = wordsplit/wordsplit.c wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

It is also possible to use LDADD as shown in the example below:

    program_SOURCES = main.c
    LDADD = wordsplit.o
    VPATH = $(srcdir):$(srcdir)/wordsplit
    EXTRA_DIST = wordsplit/wordsplit.c wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

* The testsuite

The package contains two files for building the testsuite: wsp.c,
which is used to build the auxiliary binary wsp, and wordsplit.at,
which can be included in a GNU autotest-based testsuite source.

The discussion below is for those who wish to include the wordsplit
testsuite in their project.  It assumes that the hosting project
already has an autotest-based testsuite.

** Additional files

To build the auxiliary tool wsp, it is recommended to define the
WORDSPLIT_VERSION macro to the actual version of the wordsplit
package.  There are two ways of doing so.  First, you can define
WORDSPLIT_VERSION in your config.h or pass its definition via the
AM_CPPFLAGS variable.  Second, you can create an additional file,
wordsplit-version.h, containing its definition and define the
HAVE_WORDSPLIT_VERSION_H macro to make sure it is included.  The
following shell fragment can be used to create the header:

    version=$(cd wordsplit; git describe)
    cat > wordsplit-version.h <<EOF
    #define WORDSPLIT_VERSION "$version"
    EOF

This file should be listed in the EXTRA_DIST variable to make sure
it is distributed with the tarball.

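The way a C source file can consume these two macros is sketched
below.  This is only an illustration under stated assumptions, not the
code used by wsp.c: the "unknown" fallback and the helper functions
wordsplit_version_string and print_version_banner are hypothetical
names introduced for this example.  Only WORDSPLIT_VERSION,
HAVE_WORDSPLIT_VERSION_H and wordsplit-version.h come from the
description above.

```c
#include <stdio.h>

/* Pull in the generated header only if the build system announced it.
   HAVE_WORDSPLIT_VERSION_H and wordsplit-version.h are the names used
   in this README. */
#ifdef HAVE_WORDSPLIT_VERSION_H
# include "wordsplit-version.h"
#endif

/* Assumption: if neither config.h, AM_CPPFLAGS nor the generated header
   supplied a version, fall back to a placeholder string. */
#ifndef WORDSPLIT_VERSION
# define WORDSPLIT_VERSION "unknown"
#endif

/* Hypothetical helper: return the version string fixed at compile time. */
const char *
wordsplit_version_string (void)
{
  return WORDSPLIT_VERSION;
}

/* Hypothetical helper: print a one-line version banner to FP.
   Returns the value of fprintf: the number of characters written,
   or a negative value on output error. */
int
print_version_banner (FILE *fp)
{
  return fprintf (fp, "wordsplit %s\n", wordsplit_version_string ());
}
```

A program would typically call print_version_banner (stdout) from its
option handling code.  Compiling the file with
-DWORDSPLIT_VERSION='"some-version"' exercises the first method, and
compiling it with -DHAVE_WORDSPLIT_VERSION_H exercises the second.
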
** testsuite.at

Include the file wordsplit.at in your testsuite.at:

    m4_include(wordsplit.at)

** Makefile.am

In the Makefile.am responsible for creating the testsuite, make sure
that the path to the wordsplit module is passed to the autotest
invocation, so that the above m4_include statement will work.  The
usual make goal to build the testsuite looks as follows:

    $(TESTSUITE): package.m4 $(TESTSUITE_AT)
            $(AM_V_GEN)$(AUTOTEST) \
                  -I $(srcdir) \
                  -I $(top_srcdir)/wordsplit \
                  testsuite.at -o $@.tmp
            $(AM_V_at)mv $@.tmp $@

Then, add the following fragment to build the auxiliary files:

    # ###########################
    # Wordsplit testsuite
    # ###########################
    EXTRA_DIST += wordsplit-version.h
    $(srcdir)/wordsplit-version.h: $(top_srcdir)/configure.ac
            $(AM_V_GEN){ \
              if test -e $(top_srcdir)/wordsplit/.git; then \
                   wsversion=$$(cd $(top_srcdir)/wordsplit; git describe); \
              else \
                   wsversion="unknown"; \
              fi; \
              echo "#define WORDSPLIT_VERSION \"$$wsversion\""; } > \
                  $(srcdir)/wordsplit-version.h

    AM_CPPFLAGS += -DHAVE_WORDSPLIT_VERSION_H
    noinst_PROGRAMS += wsp
    wsp_SOURCES =
    nodist_wsp_SOURCES = wsp.c
    wsp.o: $(srcdir)/wordsplit-version.h
    VPATH = $(srcdir):$(top_srcdir)/wordsplit

* History

The first version of wordsplit appeared in March 2009 as part of the
Wydawca[1] project.  Its main usage was to assist in configuration
file parsing.  The parser subsystem proved to be quite useful and
soon evolved into a separate project - Grecs[2].  This package has
since been used (as a git submodule) in a number of other projects,
such as GNU Dico[3] and Direvent[4], to name a few.

In 2010 wordsplit sources were incorporated into the GNU Mailutils[5]
package, where they replaced the obsolete argcv module.  Mailutils
uses its own configuration package, which meant that using Grecs was
not expedient.  Therefore the sources were exported from Grecs.
Since then, the Mailutils and Grecs versions have been periodically
synchronized.

Several other projects, such as GNU Rush[6] and fileserv[7], followed
suit.  It was therefore decided that it would be advisable to
have wordsplit as a separate package which could be easily included in
another project without incurring unnecessary overhead.

By the end of July 2019, all the packages mentioned above had switched
to using wordsplit as a submodule.

* References

[1] Wydawca - an automatic release submission daemon
    Home: <http://puszcza.gnu.org.ua/software/wydawca>
    Git: <http://git.gnu.org.ua/cgit/wydawca.git>
[2] Grecs - a library for parsing structured configuration files
    Home: <https://puszcza.gnu.org.ua/projects/grecs>
    Git: <http://git.gnu.org.ua/cgit/grecs.git>
[3] GNU Dico - a dictionary server
    Home: <https://puszcza.gnu.org.ua/projects/dico>
    Git: <http://git.gnu.org.ua/cgit/dico.git>
[4] GNU Direvent - filesystem event watching daemon
    Home: <http://puszcza.gnu.org.ua/software/direvent>
    Git: <http://git.gnu.org.ua/cgit/direvent.git>
[5] GNU Mailutils - a general-purpose mail package
    Home: <http://mailutils.org>
    Git: <http://git.savannah.gnu.org/cgit/mailutils.git>
[6] GNU Rush - a restricted user shell for remote access
    Home: <http://puszcza.gnu.org.ua/software/rush>
    Git: <http://git.gnu.org.ua/cgit/rush.git>
[7] fileserv - simple http server for serving static files
    Home: <https://puszcza.gnu.org.ua/projects/fileserv>
    Git: <http://git.gnu.org.ua/cgit/fileserv.git>
[8] vmod-dbrw - Database-driven rewrite rules for Varnish Cache
    Home: <http://puszcza.gnu.org.ua/software/vmod-dbrw>
    Git: <http://git.gnu.org.ua/cgit/vmod-dbrw.git>

* Bug reporting

Please send bug reports, questions, suggestions and criticism to
<gray@gnu.org>.  When sending bug reports, please make sure to provide
the following information:

  1. Wordsplit invocation flags.
  2. Input string.
  3. Produced output.
  4. Expected output.

* Copying

Copyright (C) 2009-2025 Sergey Poznyakoff

Permission is granted to anyone to make or distribute verbatim copies
of this document as received, in any medium, provided that the
copyright notice and this permission notice are preserved,
thus giving the recipient permission to redistribute in turn.

Permission is granted to distribute modified versions
of this document, or of portions of it, under the above conditions,
provided also that they carry prominent notices stating who last
changed them.

Local Variables:
mode: outline
paragraph-separate: "[ ]*$"
version-control: never
End: