nws(1) - normalize whitespace

SYNOPSIS

Normalizes whitespace in one of several modes.

nws [-m <mode>] [[-i[<ext>]] file...]

Condensing <mode>s:

All these modes normalize runs of tabs and spaces to a single space
each and trim leading and trailing runs; they differ only with respect to
how multi-line input is processed.

mp   (default) multi-paragraph: folds multiple blank lines into one
fp   flattened multi-paragraph: normalizes each paragraph to a single line
sp   single-paragraph: removes all blank lines
sl   single-line: normalizes to a single output line

Transliteration <mode>s:

lf     translates line endings to LF-only (\n)
crlf   translates line endings to CRLF (\r\n)
ascii  translates Unicode whitespace and punctuation to ASCII

Alternatively, specify mode values directly as options; e.g., --sp in lieu
of -m sp.

Standard options: --help, --man, --version, --home

DESCRIPTION

nws (normalize whitespace) performs whitespace normalization,
offering several modes in two categories:

whitespace-condensing modes:
Trim leading and trailing runs of any mix of tabs and spaces and replace
every other such run with a single space. The individual modes differ only
with respect to how multi-line input is treated.
whitespace-transliteration modes:
Change line endings to be Windows- or Unix-specific, or replace select
Unicode whitespace and punctuation with their closest ASCII equivalents.

Input is provided either from the specified files or via stdin.
Output is sent to stdout by default.
To update files in-place, use the -i option (in which case there will be no
stdout output).

OPTIONS

CONDENSING modes: -m <mode> or --mode <mode> or --<mode>
where <mode> is one of:
- mp, multi-para (default)
  Each run of blank (all-whitespace or empty) lines is replaced with a
  single empty line. Paragraph-internal newlines are preserved, while
  blank lines at the beginning, between paragraphs, and at the end are
  each normalized to a single empty line.
- fp, flat-para
  Like mode mp, except that paragraph-internal newlines are replaced
  with a single space each, so that each paragraph becomes a single
  line, with a single empty line between paragraphs.
- sp, single-para
  Runs of blank (all-whitespace or empty) lines are discarded, resulting
  in a single paragraph of non-blank lines.
- sl, single-line
  Normalization includes newlines too, so that every run of any mix of
  spaces, tabs, and newlines is replaced with a single space,
  resulting in a single, long output line.
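
The rules for the four condensing modes can be approximated with a short awk filter. This is only a sketch of the behavior described above, not how nws is implemented; condense_sketch is a hypothetical helper name, and edge cases may differ from the real utility.

```shell
# condense_sketch is a hypothetical helper, NOT part of nws.
# Usage: condense_sketch <mp|fp|sp|sl> < input   (defaults to mp)
condense_sketch() {
  awk -v mode="${1:-mp}" '
    {
      gsub(/[ \t]+/, " ")            # collapse each run of tabs/spaces to one space
      sub(/^ /, ""); sub(/ $/, "")   # trim what is left at either end
      if ($0 == "") {                # blank line: remember it and decide later
        blank = 1
        next
      }
      if (mode == "sl") {            # single-line: join all text with single spaces
        out = (out == "" ? $0 : out " " $0)
        next
      }
      if (mode == "sp") {            # single-paragraph: blank lines are dropped
        print
        next
      }
      if (blank) {                   # mp/fp: a blank run becomes one empty line
        if (para != "") print para   # fp: flush the paragraph collected so far
        para = ""
        print ""
        blank = 0
      }
      if (mode == "fp") para = (para == "" ? $0 : para " " $0)
      else print                     # mp: keep paragraph-internal newlines
    }
    END {
      if (mode == "sl") {
        if (out != "") print out
      } else {
        if (para != "") print para
        if (blank && mode != "sp") print ""   # trailing blank run -> one empty line
      }
    }
  '
}
```

For instance, `printf '\n\n  one\n two \n\n\n  three\n\n\n' | condense_sketch fp` prints an empty line, `one two`, an empty line, `three`, and a final empty line.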
TRANSLITERATION modes: -m <mode> or --mode <mode> or --<mode>
where <mode> is one of:
- lf
  Translates Windows-style CRLF (\r\n) line endings to Unix-style LF (\n)
  line endings.
- crlf
  Translates Unix-style LF (\n) line endings to Windows-style CRLF (\r\n)
  line endings.
- ascii, ascii-punctuation
  Translates non-ASCII Unicode whitespace and punctuation to the closest
  ASCII equivalents, while leaving other non-ASCII characters untouched.
  This is helpful for source-code samples that have been formatted for display
  with typographic quotes, em dashes, and the like, which usually makes the
  code indigestible to compilers/interpreters.
  IMPORTANT: This mode only works with PROPERLY ENCODED UTF-8 FILES.
  On BSD/macOS systems, an improperly encoded input file will
  result in a 'sed: RE error: illegal byte sequence' error.
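
To make the two line-ending translations concrete, here is a minimal sketch using awk. The function names to_lf_sketch and to_crlf_sketch are hypothetical and not part of nws, and stripping an existing CR before adding CRLF is an assumption of this sketch, not documented nws behavior.

```shell
# to_lf_sketch and to_crlf_sketch are hypothetical helpers, NOT part of nws.

# CRLF -> LF: drop one carriage return at the end of each line, if present.
to_lf_sketch() {
  awk '{ sub(/\r$/, ""); print }'
}

# LF -> CRLF: strip any existing CR first (an assumption of this sketch),
# so that input that already uses CRLF is not turned into CR CR LF.
to_crlf_sketch() {
  awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Both functions read stdin and write stdout, e.g. `to_lf_sketch < from-windows.txt > unix.txt`.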
-i[<backup-suffix>], --in-place[=<backup-suffix>]
Updates the specified files in place; that is, the results of the
normalization are written back to the input file, and no stdout output is
produced. If <backup-suffix> is specified (recommended), a backup copy of each
input file is made first, simply by appending the suffix to the filename.
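
The backup-then-overwrite behavior described for -i can be pictured as follows. This is a sketch under stated assumptions, not the utility's actual code: in_place_sketch is a hypothetical helper, and the use of a temporary file is this sketch's own choice.

```shell
# in_place_sketch is a hypothetical helper, NOT part of nws.
# Usage: in_place_sketch <backup-suffix> <file> <command> [args...]
# Pass an empty string as <backup-suffix> to skip the backup.
in_place_sketch() {
  suffix=$1 file=$2
  shift 2
  tmp=$(mktemp) || return 1
  # Filter into a temporary file first, so a failing command
  # leaves the original file untouched.
  if ! "$@" < "$file" > "$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  # The backup name is simply the original filename plus the suffix.
  if [ -n "$suffix" ]; then
    cp -p "$file" "$file$suffix" || { rm -f "$tmp"; return 1; }
  fi
  cat "$tmp" > "$file"   # write the result back; nothing goes to stdout
  rm -f "$tmp"
}
```

For example, `in_place_sketch .bak notes.txt tr -s ' '` squeezes repeated spaces in notes.txt and leaves the original content in notes.txt.bak.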

STANDARD OPTIONS

All standard options provide information only.

-h, --help
Prints the contents of the synopsis chapter to stdout for quick reference.
--man
Displays this manual page, which is a helpful alternative to using man
if the manual page isn't installed.
--version
Prints version information.
--home
Opens this utility's home page in the system's default web browser.

LICENSE

For license information, bug reports, and more, visit this utility's home page
by running nws --home.

EXAMPLES

The examples use ANSI C-quoted input strings ($'...') for brevity, which
are supported in Bash, Ksh, and Zsh.
Empty output lines are represented by ~.

## CONDENSING EXAMPLES:

# Single-line input - no mode needed.
$ nws <<< $'  one \t\t two  three   '
one two three

# Default: multi-paragraph mode (`-m mp` or `--mode multi-para`)
$ nws <<<$'\n\n  one\n two \n\n\n  three\n\n'
~
one
two
~
three
~

# Single-paragraph mode; `-m sp` is the short equivalent of
# `--mode single-para`.
$ nws -m sp <<<$'\n\n  one\n two \n\n\n  three\n\n'
one
two
three

# Flattened-paragraph mode; note use of shortcut option `--fp` for `-m fp`.
$ nws --fp <<<$'\n\n  one\n two \n\n\n  three\n\n'
~
one two
~
three
~

# Single-line mode
$ nws --sl <<<$'  one two\n  three '
one two three

## TRANSLITERATION EXAMPLES:

# Converts a CRLF line-endings file (Windows) to an LF-only file (Unix).
# No output is produced, because the file is updated in-place; a backup
# of the original file is created with suffix '.bak'.
$ nws --mode lf --in-place=.bak from-windows.txt

# Converts an LF-only file (Unix) to a CRLF line-endings file (Windows).
# No output is produced, because the file is updated in-place; since no
# backup suffix is specified, no backup file is created.
$ nws --crlf -i from-unix.txt

# Converts select Unicode whitespace and punctuation characters to their
# closest ASCII equivalents and sends the output to a different file.
$ nws --ascii unicode-punct.txt > ascii-punct.txt
