* Overview

This package provides a set of C functions for splitting a string into
words.  The splitting process is highly configurable and allows for
considerable flexibility.  The default splitting rules are similar to
those used in Bourne shell.  The splitting process includes tilde
expansion, variable expansion, quote removal, command substitution,
and path expansion.  Each of these phases can be turned off by the caller.

The following code fragment shows the basic usage:

    /* This variable controls the splitting */
    wordsplit_t ws;
    int rc;

    /* Provide variable definitions */
    ws.ws_env = (const char **) environ;
    /* Provide a function for expanding commands */
    ws.ws_command = runcom;
    /* Split input_string into words */
    rc = wordsplit(input_string, &ws,
                   WRDSF_QUOTE            /* Handle both single and
                                             double quoted strings as words. */
                   | WRDSF_SQUEEZE_DELIMS /* Compress adjacent delimiters */
                   | WRDSF_PATHEXPAND     /* Expand pathnames */
                   | WRDSF_SHOWERR);      /* Show errors */
    if (rc == 0) {
        /* Success.  The resulting words are returned in the NULL-terminated
           array ws.ws_wordv.  Number of words is in ws.ws_wordc */
    }
    /* Reclaim the allocated memory */
    wordsplit_free(&ws);

For a detailed discussion, please see the man page wordsplit.3 included
in the package.

* Description

The package is designed as a drop-in facility for use in larger
programs.  It consists of the following files:

  wordsplit.h  -  Interface header.
  wordsplit.c  -  Main source file.
  wordsplit.3  -  Documentation.

For most uses, you will need only these three.  The remaining files
are for building the autotest-based testsuite:

  wsp.c        -  Auxiliary test program.
  wordsplit.at -  The source for the testsuite.

* Incorporating wordsplit into your project

The project is designed to be used as a git submodule.  First, select
the location DIR for the wordsplit directory within your project.  Then
add the submodule:

    git submodule add git://git.gnu.org.ua/wordsplit.git DIR

The rest is quite straightforward: you need to add wordsplit.c to your
sources and add both wordsplit.c and wordsplit.h to the distributed files.

There are two methods of doing so: direct incorporation and
incorporation via VPATH.  The discussion below describes both
methods, assuming that your project uses the GNU autotools
framework.  If you are using plain makefiles, these instructions
are easy to adapt.

** Direct incorporation

Add the subdir-objects option to the invocation of the AM_INIT_AUTOMAKE
macro in your configure.ac:

    AM_INIT_AUTOMAKE([subdir-objects])

In your Makefile.am, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h
to the sources and -I$(srcdir)/wordsplit to the cpp flags.  For example:

    program_SOURCES = main.c \
                      wordsplit/wordsplit.c \
                      wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

You can also put wordsplit.h in the noinst_HEADERS variable, if you like:

    program_SOURCES = main.c \
                      wordsplit/wordsplit.c
    noinst_HEADERS = wordsplit/wordsplit.h
    AM_CPPFLAGS = -I$(srcdir)/wordsplit

If you are building an installable library and wish to make wordsplit functions
available, install wordsplit.h to $(pkgincludedir), e.g.

    lib_LTLIBRARIES = libmy.la
    libmy_la_SOURCES = main.c \
                       wordsplit/wordsplit.c
    AM_CPPFLAGS = -I$(srcdir)/wordsplit
    pkginclude_HEADERS = wordsplit/wordsplit.h

** Vpath-based incorporation

Modify the VPATH variable in your Makefile.am:

    VPATH += $(srcdir)/wordsplit

Notice the use of "+=": it is necessary for the vpath builds to work.

Add wordsplit.o to the name_LIBADD or name_LDADD variable, depending on
the nature of the object being built.

Modify AM_CPPFLAGS as shown in the previous section:

    AM_CPPFLAGS = -I$(srcdir)/wordsplit

Add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to the EXTRA_DIST
variable.

An example Makefile.am:

    program_SOURCES = main.c
    LDADD = wordsplit.o
    noinst_HEADERS = wordsplit/wordsplit.h
    VPATH += $(srcdir)/wordsplit
    EXTRA_DIST = wordsplit/wordsplit.c wordsplit/wordsplit.h

* The testsuite

The package contains two files for building the testsuite: wsp.c,
which is used to build the auxiliary binary wsp, and wordsplit.at,
which is translated by GNU autotest into a testsuite shell script.

The discussion below is for those who wish to include the wordsplit
testsuite in their project.  It assumes the following layout of the
hosting project:

  lib/
    Directory holding the library that incorporates wordsplit.o.
    This discussion assumes the library name is libmy.a
  lib/wordsplit
    Wordsplit sources.

The testsuite will be built in lib.

** Additional files

Three additional files are necessary for the testsuite: atlocal.in,
wordsplit-version.h, and package.m4.

The file atlocal.in is a simple shell script that sets the PATH
environment variable for the testsuite.  It contains just one line:

    PATH=$srcdir/wordsplit:$PATH

The file wordsplit-version.h provides the version definition for the
test program wsp.c.  Use the following script to create it:

    version=$(cd wordsplit; git describe)
    cat > wordsplit-version.h <<EOF
    #define WORDSPLIT_VERSION "$version"
    EOF

The file package.m4 contains the package description, which allows the
testsuite to generate an accurate report.  To create it, use:

    cat > package.m4 <<EOF
    m4_define([AT_PACKAGE_NAME],      [wordsplit])
    m4_define([AT_PACKAGE_TARNAME],   [wordsplit])
    m4_define([AT_PACKAGE_VERSION],   [$version])
    m4_define([AT_PACKAGE_STRING],    [AT_PACKAGE_NAME AT_PACKAGE_VERSION])
    m4_define([AT_PACKAGE_BUGREPORT], [gray@gnu.org])
    EOF

Here, $version is the same variable you used for wordsplit-version.h.

After creating the three files, list them in the EXTRA_DIST variable in
lib/Makefile.am to make sure they will be distributed with the tarball.

** configure.ac

Add the following lines to your configure.ac:

    AM_MISSING_PROG([AUTOM4TE], [autom4te])

    AC_CONFIG_TESTDIR([lib])
    AC_CONFIG_FILES([lib/Makefile lib/atlocal])

** lib/Makefile.am

The makefile in lib must be modified to build the auxiliary program
wsp and create the testsuite script.  This is done by the following
fragment:

    EXTRA_DIST = testsuite wordsplit/wordsplit.at package.m4
    DISTCLEANFILES       = atconfig
    MAINTAINERCLEANFILES = Makefile.in $(TESTSUITE)

    TESTSUITE = $(srcdir)/testsuite
    M4=m4
    AUTOTEST = $(AUTOM4TE) --language=autotest
    $(TESTSUITE): wordsplit/wordsplit.at
            $(AM_V_GEN)$(AUTOTEST) -I $(srcdir) wordsplit/wordsplit.at \
                                   -o $(TESTSUITE).tmp
            $(AM_V_at)mv $(TESTSUITE).tmp $(TESTSUITE)

    noinst_PROGRAMS = wsp
    wsp_SOURCES = wordsplit/wsp.c wordsplit-version.h
    wsp_LDADD = ./libmy.a

    atconfig: $(top_builddir)/config.status
            cd $(top_builddir) && ./config.status $@

    clean-local:
            @test ! -f $(TESTSUITE) || $(SHELL) $(TESTSUITE) --clean

    check-local: atconfig atlocal $(TESTSUITE)
            @$(SHELL) $(TESTSUITE)

* History

The first version of wordsplit appeared in March 2009 as a part of the
Wydawca[1] project, where its main use was to assist in configuration
file parsing.  The parser subsystem proved to be quite useful and was
soon forked into a separate project, Grecs[2].  This package has since
been used (as a git submodule) in a number of other projects, such as
GNU Dico[3] and Direvent[4], to name a few.

In 2010 the wordsplit sources were incorporated into the GNU
Mailutils[5] package, where they replaced the obsolete argcv module.
Mailutils uses its own configuration package, which meant that using
Grecs was not expedient.  Therefore the sources were exported from
Grecs and are kept in sync with the changes in it.

Several other projects, such as GNU Rush[6] and fileserv[7], followed
suit.  It was therefore decided that it would be advisable to have
wordsplit as a separate package which could be easily included in
another project without incurring unnecessary overhead.

Work is currently underway on incorporating it into existing
projects.

* References

[1]  Wydawca - an automatic release submission daemon
     Home: <http://puszcza.gnu.org.ua/software/wydawca>
     Git:  <http://git.gnu.org.ua/cgit/wydawca.git>
[2]  Grecs - a library for parsing structured configuration files
     Home: <https://puszcza.gnu.org.ua/projects/grecs>
     Git:  <http://git.gnu.org.ua/cgit/grecs.git>
[3]  GNU Dico - a dictionary server
     Home: <https://puszcza.gnu.org.ua/projects/dico>
     Git:  <http://git.gnu.org.ua/cgit/dico.git>
[4]  GNU Direvent - filesystem event watching daemon
     Home: <http://puszcza.gnu.org.ua/software/direvent>
     Git:  <http://git.gnu.org.ua/cgit/direvent.git>
[5]  GNU Mailutils - a general-purpose mail package
     Home: <http://mailutils.org>
     Git:  <http://git.savannah.gnu.org/cgit/mailutils.git>
[6]  GNU Rush - a restricted user shell for remote access
     Home: <http://puszcza.gnu.org.ua/software/rush>
     Git:  <http://git.gnu.org.ua/cgit/rush.git>
[7]  fileserv - simple http server for serving static files
     Home: <https://puszcza.gnu.org.ua/projects/fileserv>
     Git:  <http://git.gnu.org.ua/cgit/fileserv.git>
[8]  vmod-dbrw - Database-driven rewrite rules for Varnish Cache
     Home: <http://puszcza.gnu.org.ua/software/vmod-dbrw>
     Git:  <http://git.gnu.org.ua/cgit/vmod-dbrw.git>

* Bug reporting

Please send bug reports, questions, suggestions and criticism to
<gray@gnu.org>.  When sending bug reports, please make sure to provide
the following information:

  1. Wordsplit invocation flags.
  2. Input string.
  3. Produced output.
  4. Expected output.

* Copying

Copyright (C) 2009-2019 Sergey Poznyakoff

Permission is granted to anyone to make or distribute verbatim copies
of this document as received, in any medium, provided that the
copyright notice and this permission notice are preserved,
thus giving the recipient permission to redistribute in turn.

Permission is granted to distribute modified versions
of this document, or of portions of it, under the above conditions,
provided also that they carry prominent notices stating who last
changed them.

Local Variables:
mode: outline
paragraph-separate: "[ ]*$"
version-control: never
End: