Sergey Poznyakoff 4cd8bec42c Improve the docs
2019-07-09 12:21:26 +03:00

* Overview

This package provides a set of C functions for splitting a string into
words.  The splitting process is highly configurable and allows for
considerable flexibility.  The default splitting rules are similar to
those used in Bourne shell.  The splitting process includes tilde
expansion, variable expansion, quote removal, command substitution,
and path expansion.  Each of these phases can be turned off by the caller.

The following code fragment shows the basic usage:

   /* This variable controls the splitting */
   wordsplit_t ws;
   int rc;
   
   /* Provide variable definitions */
   ws.ws_env = (const char **) environ;
   /* Provide a function for expanding commands */
   ws.ws_command = runcom;
   /* Split input_string into words */
   rc = wordsplit(input_string, &ws,
                  WRDSF_QUOTE            /* Handle both single and
                                            double quoted strings as words. */
                  | WRDSF_SQUEEZE_DELIMS /* Compress adjacent delimiters */
                  | WRDSF_PATHEXPAND     /* Expand pathnames */
                  | WRDSF_SHOWERR);      /* Show errors */
   if (rc == 0) {
       /* Success.  The resulting words are returned in the NULL-terminated
          array ws.ws_wordv.  The number of words is in ws.ws_wordc. */
   }
   /* Reclaim the allocated memory */
   wordsplit_free(&ws);

For a detailed discussion, please see the man page wordsplit.3 included
in the package.
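
To give a feel for what quote handling and delimiter squeezing mean,
here is a toy splitter.  It is only an illustrative sketch and is NOT
how wordsplit is implemented: the functions toy_split and toy_free are
hypothetical names invented for this example, and none of the expansion
phases are performed.  Like wordsplit, it returns the words in a
NULL-terminated array together with a word count.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wordsplit: free an array
   returned by toy_split. */
void
toy_free(char **wordv)
{
    if (!wordv)
        return;
    for (size_t i = 0; wordv[i]; i++)
        free(wordv[i]);
    free(wordv);
}

/* Hypothetical helper, not part of wordsplit: split INPUT on
   whitespace, treating a run of whitespace as a single delimiter and
   removing single and double quotes.  On success, returns a
   NULL-terminated array and stores the word count in *WORDC.
   Returns NULL on allocation failure or on an unterminated quote;
   *WORDC is left untouched in that case. */
char **
toy_split(const char *input, size_t *wordc)
{
    size_t cap = 8, n = 0;
    char **wordv = malloc(cap * sizeof(wordv[0]));
    const char *p = input;

    if (!wordv)
        return NULL;

    for (;;) {
        while (isspace((unsigned char) *p))
            p++;                        /* squeeze adjacent delimiters */
        if (*p == '\0')
            break;

        /* A word can never be longer than the rest of the input. */
        char *word = malloc(strlen(p) + 1);
        size_t len = 0;
        if (!word)
            goto fail;

        while (*p && !isspace((unsigned char) *p)) {
            if (*p == '\'' || *p == '"') {
                char q = *p++;          /* drop the opening quote */
                while (*p && *p != q)
                    word[len++] = *p++;
                if (*p != q) {          /* unterminated quote */
                    free(word);
                    goto fail;
                }
                p++;                    /* drop the closing quote */
            } else
                word[len++] = *p++;
        }
        word[len] = '\0';

        if (n + 1 == cap) {             /* keep room for the final NULL */
            char **tmp = realloc(wordv, 2 * cap * sizeof(wordv[0]));
            if (!tmp) {
                free(word);
                goto fail;
            }
            wordv = tmp;
            cap *= 2;
        }
        wordv[n++] = word;
    }
    wordv[n] = NULL;
    *wordc = n;
    return wordv;

fail:
    wordv[n] = NULL;
    toy_free(wordv);
    return NULL;
}
```

Given the input "one  'two three'", this sketch yields two words: one
and two three.  The real library does much more and should be used
through wordsplit() and wordsplit_free() as shown above.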

* Description

The package is designed as a drop-in facility for use in larger
programs.  It consists of the following files:

  wordsplit.h   - Interface header.
  wordsplit.c   - Main source file.
  wordsplit.3   - Documentation.

For most uses, you will need only these three.  The remaining files
are used to build the autotest-based testsuite:

  wsp.c         - Auxiliary test program.
  wordsplit.at  - The source for the testsuite.

* Incorporating wordsplit into your project

The project is designed to be used as a git submodule.  First, select
the location DIR for the wordsplit directory within your project.  Then
add the submodule:

  git submodule add git://git.gnu.org.ua/wordsplit.git DIR

The rest is quite straightforward: you need to add wordsplit.c to your
sources and add both wordsplit.c and wordsplit.h to the distributed files.

There are two methods of doing so: direct incorporation and
incorporation via VPATH.  The discussion below describes both
methods, assuming that your project uses the GNU autotools
framework.  If you use plain makefiles, the instructions are easy
to adapt.

** Direct incorporation

Add the subdir-objects option to the AM_INIT_AUTOMAKE macro in your
configure.ac:

  AM_INIT_AUTOMAKE([subdir-objects])
  
In your Makefile.am, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h
to the sources and -Iwordsplit to the cpp flags.  For example:

  program_SOURCES = main.c \
                    wordsplit/wordsplit.c \
                    wordsplit/wordsplit.h
  AM_CPPFLAGS = -I$(srcdir)/wordsplit

You can also put wordsplit.h in the noinst_HEADERS variable, if you like:

  program_SOURCES = main.c \
                    wordsplit/wordsplit.c
  noinst_HEADERS = wordsplit/wordsplit.h
  AM_CPPFLAGS = -I$(srcdir)/wordsplit

If you are building an installable library and wish to make the wordsplit
functions available, install wordsplit.h to $(pkgincludedir), e.g.

  lib_LTLIBRARIES = libmy.la
  libmy_la_SOURCES = main.c \
                     wordsplit/wordsplit.c
  AM_CPPFLAGS = -I$(srcdir)/wordsplit
  pkginclude_HEADERS = wordsplit/wordsplit.h

** Vpath-based incorporation

Modify the VPATH variable in your Makefile.am:

  VPATH += $(srcdir)/wordsplit

Notice the use of "+=": it is necessary for the vpath builds to work.

Add wordsplit.o to the name_LIBADD or name_LDADD variable, depending on
the nature of the object being built.

Modify AM_CPPFLAGS as shown in the previous section:

  AM_CPPFLAGS = -I$(srcdir)/wordsplit

Add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to the EXTRA_DIST
variable.

An example Makefile.am:

  program_SOURCES = main.c
  LDADD = wordsplit.o
  AM_CPPFLAGS = -I$(srcdir)/wordsplit
  noinst_HEADERS = wordsplit/wordsplit.h
  VPATH += $(srcdir)/wordsplit
  EXTRA_DIST = wordsplit/wordsplit.c wordsplit/wordsplit.h

* The testsuite

The package contains two files for building the testsuite: wsp.c,
which is used to build the auxiliary binary wsp, and wordsplit.at,
which is translated by GNU autotest into a testsuite shell script.

The discussion below is for those who wish to include the wordsplit
testsuite in their project.  It assumes the following layout of the
hosting project:

  lib/
    Directory holding the library that incorporates wordsplit.o.
    This discussion assumes the library name is libmy.a
  lib/wordsplit
    Wordsplit sources.

The testsuite will be built in lib.

** Additional files

Three additional files are necessary for the testsuite: atlocal.in,
wordsplit-version.h, and package.m4.

The file atlocal.in is a simple shell script that sets the PATH
environment variable for the testsuite.  It contains just one line:

  PATH=$srcdir/wordsplit:$PATH

The file wordsplit-version.h provides the version definition for the
test program wsp.c.  Use the following script to create it:

  version=$(cd wordsplit; git describe)
  cat > wordsplit-version.h <<EOF
  #define WORDSPLIT_VERSION "$version"
  EOF
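
A test program can then pick up the macro from the generated header.
The fragment below is a minimal sketch of one way to do that; it is not
taken from wsp.c.  The function format_version is a hypothetical name,
and the fallback definition is an assumption made only so that the
fragment compiles without the generated header.

```c
#include <stdio.h>

/* Normally the macro comes from the generated header:
     #include "wordsplit-version.h"
   The fallback below is an assumption that lets this fragment
   compile on its own. */
#ifndef WORDSPLIT_VERSION
# define WORDSPLIT_VERSION "unknown"
#endif

/* Hypothetical helper: format a version banner into BUF, which is
   SIZE bytes long.  Returns what snprintf returns. */
int
format_version(char *buf, size_t size)
{
    return snprintf(buf, size, "wordsplit %s", WORDSPLIT_VERSION);
}
```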

The file package.m4 contains the package description, which allows
the testsuite to generate an accurate report.  To create it, use:

  cat > package.m4 <<EOF
  m4_define([AT_PACKAGE_NAME],      [wordsplit])
  m4_define([AT_PACKAGE_TARNAME],   [wordsplit])
  m4_define([AT_PACKAGE_VERSION],   [$version])
  m4_define([AT_PACKAGE_STRING],    [AT_PACKAGE_NAME AT_PACKAGE_VERSION])
  m4_define([AT_PACKAGE_BUGREPORT], [gray@gnu.org])
  EOF

Here, $version is the same variable you used for wordsplit-version.h.

After creating the three files, list them in the EXTRA_DIST variable in
lib/Makefile.am to make sure they will be distributed with the tarball.

** configure.ac

Add the following lines to your configure.ac:

  AM_MISSING_PROG([AUTOM4TE], [autom4te])

  AC_CONFIG_TESTDIR([lib])
  AC_CONFIG_FILES([lib/Makefile lib/atlocal])

** lib/Makefile.am

The makefile in lib must be modified to build the auxiliary program
wsp and create the testsuite script.  This is done by the following
fragment:

  EXTRA_DIST = testsuite wordsplit/wordsplit.at package.m4
  DISTCLEANFILES       = atconfig
  MAINTAINERCLEANFILES = Makefile.in $(TESTSUITE)

  TESTSUITE = $(srcdir)/testsuite
  M4=m4
  AUTOTEST = $(AUTOM4TE) --language=autotest
  $(TESTSUITE): wordsplit/wordsplit.at
	$(AM_V_GEN)$(AUTOTEST) -I $(srcdir) wordsplit/wordsplit.at \
	                       -o $(TESTSUITE).tmp
	$(AM_V_at)mv $(TESTSUITE).tmp $(TESTSUITE)

  noinst_PROGRAMS = wsp
  wsp_SOURCES = wordsplit/wsp.c wordsplit-version.h 
  wsp_LDADD = ./libmy.a
  
  atconfig: $(top_builddir)/config.status 
	cd $(top_builddir) && ./config.status $@

  clean-local:
	@test ! -f $(TESTSUITE) || $(SHELL) $(TESTSUITE) --clean

  check-local: atconfig atlocal $(TESTSUITE)
	@$(SHELL) $(TESTSUITE)

* History

The first version of wordsplit appeared in March 2009 as a part of the
Wydawca[1] project.  Its main use there was to assist in
configuration file parsing.  The parser subsystem proved to be quite
useful and it was soon forked into a separate project - Grecs[2].  This
package has since been used (as a git submodule) in a number of other
projects, such as GNU Dico[3] and Direvent[4], to name a few.

In 2010 the wordsplit sources were incorporated into the GNU
Mailutils[5] package, where they replaced the obsolete argcv module.
Mailutils uses its own configuration package, which meant that using
Grecs was not expedient.  Therefore the sources were exported from
Grecs and are kept in sync with the changes in it.

Several other projects, such as GNU Rush[6] and fileserv[7], followed
suit.  It was therefore decided that it would be advisable to
have wordsplit as a separate package which could be easily included in
another project without incurring unnecessary overhead.

Work is currently underway to incorporate it into existing
projects.

* References

[1] Wydawca - an automatic release submission daemon
    Home: <http://puszcza.gnu.org.ua/software/wydawca>
    Git: <http://git.gnu.org.ua/cgit/wydawca.git>
[2] Grecs - a library for parsing structured configuration files
    Home: <https://puszcza.gnu.org.ua/projects/grecs>
    Git: <http://git.gnu.org.ua/cgit/grecs.git>
[3] GNU Dico - a dictionary server
    Home: <https://puszcza.gnu.org.ua/projects/dico>
    Git: <http://git.gnu.org.ua/cgit/dico.git>
[4] GNU Direvent - filesystem event watching daemon
    Home: <http://puszcza.gnu.org.ua/software/direvent>
    Git: <http://git.gnu.org.ua/cgit/direvent.git>
[5] GNU Mailutils - a general-purpose mail package
    Home: <http://mailutils.org>
    Git: <http://git.savannah.gnu.org/cgit/mailutils.git>
[6] GNU Rush - a restricted user shell for remote access
    Home: <http://puszcza.gnu.org.ua/software/rush>
    Git: <http://git.gnu.org.ua/cgit/rush.git>
[7] fileserv - simple http server for serving static files
    Home: <https://puszcza.gnu.org.ua/projects/fileserv>
    Git: <http://git.gnu.org.ua/cgit/fileserv.git>
[8] vmod-dbrw - database-driven rewrite rules for Varnish Cache
    Home: <http://puszcza.gnu.org.ua/software/vmod-dbrw>
    Git: <http://git.gnu.org.ua/cgit/vmod-dbrw.git>

* Bug reporting

Please send bug reports, questions, suggestions and criticism to
<gray@gnu.org>.  When sending bug reports, please make sure to provide
the following information:

  1. Wordsplit invocation flags.
  2. Input string.
  3. Produced output.
  4. Expected output.

* Copying

Copyright (C) 2009-2019 Sergey Poznyakoff

Permission is granted to anyone to make or distribute verbatim copies
of this document as received, in any medium, provided that the
copyright notice and this permission notice are preserved,
thus giving the recipient permission to redistribute in turn.

Permission is granted to distribute modified versions
of this document, or of portions of it, under the above conditions,
provided also that they carry prominent notices stating who last
changed them.

Local Variables:
mode: outline
paragraph-separate: "[ 	]*$"
version-control: never
End: