nws is a Unix CLI that normalizes whitespace in text, offering several modes,
grouped into two categories:
- Whitespace transliteration modes:
Line endings can be changed to be Windows- or Unix-specific, and select
Unicode whitespace and punctuation can be replaced with their closest ASCII
equivalents.
- Whitespace condensing modes:
Trims leading and trailing runs of tabs and spaces, and replaces every other
run of tabs and spaces with a single space. The individual modes in this
category differ only with respect to how multi-line input is treated.
Input can be provided either via filename arguments or via stdin.
Option -i offers in-place updating.
See the examples below, get concise usage information further below, or read the manual.
# Converts a CRLF line-endings file (Windows) to a LF-only file (Unix).
# No output is produced, because the file is updated in-place; a backup
# of the original file is created with suffix '.bak'.
$ nws --mode lf --in-place=.bak from-windows.txt
# Converts a LF-only file (Unix) to a CRLF line-endings file (Windows).
# No output is produced, because the file is updated in-place; since no
# backup suffix is specified, no backup file is created.
$ nws --crlf -i from-unix.txt
# Converts select Unicode whitespace and punctuation chars. to their
# closest ASCII equivalents and sends the output to a different file.
# Note that any other non-ASCII characters are left untouched.
# Helpful for converting code samples that were formatted for display back to
# valid source code.
# IMPORTANT: This only works with properly encoded UTF-8 files.
$ nws --ascii unicode-punct.txt > ascii-punct.txt
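To make the transliteration modes above more concrete, here is a minimal sketch of comparable conversions using standard utilities. The helper names (to_lf, to_crlf, to_ascii_subset) are hypothetical and not part of nws, and the handful of characters mapped in to_ascii_subset is an assumed subset chosen for illustration; the manual is the authority on what --ascii actually translates.

```shell
# Hypothetical helpers for illustration only; they are not part of nws.

# CRLF -> LF: strip one trailing CR from each line.
to_lf() { awk '{ sub(/\r$/, ""); print }'; }

# LF -> CRLF: strip any trailing CR first, then append CRLF,
# so that running the conversion twice yields the same result.
to_crlf() { awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'; }

# Replace a small, assumed subset of Unicode punctuation with ASCII.
# Other non-ASCII characters pass through untouched.
to_ascii_subset() {
  sed -e 's/“/"/g' -e 's/”/"/g' \
      -e "s/‘/'/g" -e "s/’/'/g" \
      -e 's/–/-/g' -e 's/…/.../g'
}

# Inspect the raw bytes to see which line endings are present.
printf 'a\r\nb\r\n' | to_lf | od -An -c
printf 'a\nb\n' | to_crlf | to_crlf | od -An -c

# Curly quotes and the ellipsis become plain ASCII.
printf '%s\n' '“It’s done…”' | to_ascii_subset
```

Stripping a trailing CR before appending CRLF is what makes the sketch safe to run on input that is already CRLF-terminated.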
- Output from the example commands is piped to cat -et to better illustrate
the output; cat -et shows line endings as $ (and control chars. as ^<char>;
e.g., a tab would show as ^I).
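As a quick illustration of that notation, independent of nws, the following pipes a string containing a tab and a CRLF line ending through cat -et:

```shell
# A tab shows as ^I, a carriage return as ^M, and the end of the line as $.
printf 'a\tb\r\n' | cat -et
# prints: a^Ib^M$
```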
# -- Single-input-line normalization (mode option doesn't apply).
> nws <<<' I will be normalized. ' | cat -et
I will be normalized.$
# Ditto, but with a mix of spaces and tabs.
> printf ' I \t\t will be normalized.\t\t\n' | nws | cat -et
I will be normalized.$
# -- Multi-input-line normalizations, using different modes.
# Create demo file.
> cat <<EOF > /tmp/nws-demo
$(printf '\t')
one
two
$(printf '\t')
three
EOF
# Multi-paragraph mode - by default, or with `--mp` or `-m mp` or
# `--mode multi-para`.
# In addition to line-internal normalization,
# folds runs of blank/empty lines into 1 empty line each.
$ nws < /tmp/nws-demo | cat -et
$
one$
two$
$
three$
$
# Single-paragraph mode: `--sp` or `-m sp` or `--mode single-para`
# In addition to line-internal normalization,
# removes all blank/empty lines.
$ nws --sp < /tmp/nws-demo | cat -et
one$
two$
three$
# Flattened-multi-paragraph mode: `--fp` or `-m fp` or `--mode flat-para`
# In addition to line-internal normalization,
# joins paragraph-internal lines with a space each.
$ nws --fp < /tmp/nws-demo | cat -et
$
one two$
$
three$
$
# Single-output-line mode: `--sl` or `-m sl` or `--mode single-line`.
# In addition to line-internal normalization,
# joins all non-empty/non-blank lines with a space each
# to form a single, long output line.
$ nws --sl < /tmp/nws-demo | cat -et
one two three$
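The condensing behavior shown in the examples above can be approximated with awk. This is only a rough sketch under the assumption that whitespace means tabs and spaces; condense_mp, condense_sl, and demo_input are hypothetical names, and none of this is nws's actual implementation.

```shell
# Hypothetical approximations for illustration; not part of nws.

# Line-internal normalization plus folding runs of blank lines into one
# empty line each (roughly what the default multi-paragraph mode shows).
condense_mp() {
  awk '
    {
      gsub(/[ \t]+/, " ")            # collapse runs of tabs and spaces
      sub(/^ /, ""); sub(/ $/, "")   # trim leading and trailing space
      if ($0 == "") { if (!blank) print ""; blank = 1 }
      else          { print; blank = 0 }
    }'
}

# Line-internal normalization plus joining all non-blank lines with a
# space each (roughly what single-line mode shows).
condense_sl() {
  awk '
    {
      gsub(/[ \t]+/, " "); sub(/^ /, ""); sub(/ $/, "")
      if ($0 != "") out = (out == "" ? $0 : out " " $0)
    }
    END { print out }'
}

# Sample input with blank and whitespace-only lines around three words.
demo_input() { printf '\n\t\n  one\n two \t\n\n\t\n three\n\n'; }

demo_input | condense_mp | cat -et
demo_input | condense_sl | cat -et
```

Tracking a single `blank` flag is enough to fold runs of blank lines, since only the first blank line of each run is printed.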
Supported platforms
- When installing from the npm registry: Linux and OSX
- When installing manually: any Unix-like platform with Bash and POSIX-compatible utilities.
Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try curl -L http://git.io/n-install | bash
With Node.js or io.js installed, install the package as follows:
[sudo] npm install nws-cli -g
Note:
- Whether you need sudo depends on how you installed Node.js / io.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
- The -g ensures global installation and is needed to put nws in your system's $PATH.
To install manually instead:
- Download the CLI as nws.
- Make it executable with chmod +x nws.
- Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (OSX) or /usr/bin (Linux).
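The last two manual-installation steps could be scripted roughly as follows. This is a sketch only: install_cli and its parameters src and dest_dir are hypothetical names, and it assumes the CLI has already been downloaded to the path given as the first argument.

```shell
# Hypothetical helper for illustration; not shipped with nws.
# Usage: install_cli <path-to-downloaded-cli> <folder-in-PATH>
install_cli() {
  src=$1
  dest_dir=$2
  # Make the downloaded file executable.
  chmod +x "$src" || return
  # Resolve an absolute path so the symlink works from any directory.
  abs_src="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
  # Symlink it into the target folder under the same name.
  ln -sf "$abs_src" "$dest_dir/$(basename "$src")"
}

# Example (writing to a system folder may require sudo):
# install_cli ./nws /usr/local/bin
```

Symlinking rather than moving keeps the downloaded file in place, which makes it easy to replace with a newer copy later.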
Find concise usage information below; for complete documentation, read the manual online or,
once installed, run man nws (nws --man if installed manually).
$ nws --help
Normalizes whitespace in one of several modes.
nws [-m <mode>] [[-i[<ext>]] file...]
Condensing <mode>s:
All these modes normalize runs of tabs and spaces to a single space
each and trim leading and trailing runs; they only differ with respect to
how multi-line input is processed.
mp (default) multi-paragraph: folds multiple blank lines into one
fp flattened multi-paragraph: normalizes each paragraph to single line
sp single-paragraph: removes all blank lines.
sl single-line: normalizes to single output line
Transliteration <mode>s:
lf translates line endings to LF-only (\n)
crlf translates line endings to CRLF (\r\n)
ascii translates Unicode whitespace and punctuation to ASCII
Alternatively, specify mode values directly as options; e.g., --sp in lieu
of -m sp
Standard options: --help, --man, --version, --home
Copyright (c) 2015-2017 Michael Klement mklement0@gmail.com (http://same2u.net), released under the MIT license.
This project gratefully depends on the following open-source components, according to the terms of their respective licenses.
npm dependencies below have optional suffixes denoting the type of dependency; the absence of a suffix denotes a required run-time dependency: (D) denotes a development-time-only dependency, (O) an optional dependency, and (P) a peer dependency.
Versioning complies with semantic versioning (semver).
- v0.3.4 (2017-09-06):
  - [doc] Clarified that --mode ascii (--ascii) only works with properly encoded UTF-8 files.
- v0.3.3 (2017-09-05):
  - [enhancement] Error message for -i mode improved to reflect the count of input files in case the pre-updating check fails; with potentially batched xargs-mediated invocations, this at least provides a hint that only a given batch failed.
  - [doc] Fixed typo in man page.
- v0.3.2 (2016-12-11):
  - [fix] Mode --crlf is now idempotent with input that is already CRLF-terminated (previously, an extra CR was mistakenly added).
- v0.3.1 (2016-12-10):
  - [doc] Copy-editing in read-me file.
- v0.3.0 (2016-11-13):
  - [BREAKING CHANGE] nws is now file-based: operands are interpreted as filenames, and option -i allows in-place updating. Use stdin to provide strings as input, such as via echo ... | nws ...
  - [enhancement] New transliteration modes added for changing line-ending styles and for translating non-ASCII Unicode whitespace/punctuation to their closest ASCII equivalents.
- v0.2.0 (2015-09-18):
  - [usability improvement] New, mnemonic mode names supersede the old numeric normalization modes (option-arguments for -m); mode names come in both short and long forms; similarly, --mode is now supported as a verbose alternative to -m.
  - [deprecation] The numeric modes (0..3) still work, but should no longer be used and are no longer documented.
  - [doc] nws now has a man page (if manually installed, use nws --man); nws -h now just prints concise usage information.
- v0.1.4 (2015-09-15):
  - [dev] Makefile improvements; various other behind-the-scenes tweaks.
- v0.1.3 (2015-06-13):
  - [doc] Read-me improvements.
- v0.1.2 (2015-06-13):
  - [doc] Read-me improvements.
- v0.1.1 (2015-06-13):
  - [doc] Read-me improvements.
- v0.1.0 (2015-06-13):
  - Initial release.