Documentation for the functions in string_handling.sh. A general overview is given in the project documentation.
- escape()
- escape_sed_special_characters()
- find_sed_operation_separator()
- find_substring()
- get_absolute_path()
- get_random_string()
- get_sed_extract_expression()
- get_sed_replace_expression()
- get_string_bytelength()
- get_string_bytes()
- is_string_a()
- sanitize_variable_quotes()
- trim()
If the pipes are not documented, the default is:
stdin
: piped input ignored
stdout
: empty
Parameters enclosed in brackets [ ] are optional.
Takes the piped input and escapes the specified character(s) with backslashes.
Special care is taken to disable bash globbing to make sure that affected characters, typically *, can be escaped properly. At the end, the original globbing configuration is restored.
Example:
echo "path/to/file" | escape "/"
prints path\/to\/file
Param. | $1 | character to escape |
Param. | [$2...n] | additional character(s) to escape |
Pipes | stdin | read completely |
Pipes | stdout | stdin where $1...n were escaped |
Status | 0 |
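To illustrate the escaping step, here is a minimal sketch of how a filter like escape() could work. The name escape_sketch and the character-by-character loop are assumptions made for this example, not the code from string_handling.sh. Each argument is expected to be a single character, and because every expansion is quoted the sketch does not need to toggle globbing.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for escape(); not the library's implementation.
escape_sketch() {
    local input output="" i c e
    # Read all of stdin (command substitution drops trailing newlines).
    input=$(cat)
    for (( i = 0; i < ${#input}; i++ )); do
        c=${input:i:1}
        for e in "$@"; do
            # The quoted right-hand side is compared literally, so * or ? are safe.
            if [[ $c == "$e" ]]; then
                output+='\'
                break
            fi
        done
        output+=$c
    done
    printf '%s\n' "$output"
}

echo "path/to/file" | escape_sketch "/"   # prints path\/to\/file
```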
If a string contains a value enclosed in quotes (the quotes are part of the string), this function removes them. It checks for single and double quotes.
Examples:
- Input as parameter:
sanitize_variable_quotes "'quoted value'"
- Piped input:
echo "'quoted value'" | sanitize_variable_quotes
Both print quoted value
Param. | [$1] | string to sanitize, if omitted or empty stdin is read |
Pipes | stdin | if $1 is undefined or empty, read completely |
Pipes | stdout | sanitized $1 or stdin |
Status | 0 |
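The quote removal described for sanitize_variable_quotes() can be sketched as follows. The function name sanitize_quotes_sketch is hypothetical, and the sketch assumes that only one matching pair of enclosing quotes is stripped; the real function may treat nested or unbalanced quotes differently.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for sanitize_variable_quotes(); behaviour is assumed.
sanitize_quotes_sketch() {
    local value=${1:-} first last
    # Fall back to stdin when no (or an empty) parameter is given.
    [ -n "$value" ] || value=$(cat)
    if (( ${#value} >= 2 )); then
        first=${value:0:1}
        last=${value: -1}
        # Strip the pair only if both ends carry the same quote character.
        if [[ $first == "$last" && ( $first == "'" || $first == '"' ) ]]; then
            value=${value:1:${#value}-2}
        fi
    fi
    printf '%s\n' "$value"
}

sanitize_quotes_sketch "'quoted value'"          # prints quoted value
echo '"quoted value"' | sanitize_quotes_sketch   # prints quoted value
```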
Cuts leading and trailing whitespace from either the provided parameter or the piped stdin.
Examples:
- Input as parameter:
trimmed_string=$(trim "$string_to_trim")
- Piped input:
trimmed_string=$(echo "$string_to_trim" | trim)
Param. | [$1] | string to trim, if omitted or empty stdin is read |
Pipes | stdin | if $1 is undefined or empty, read completely |
Pipes | stdout | trimmed $1 or stdin |
Status | 0 |
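A common way to trim in pure bash uses two nested parameter expansions. The sketch below shows that idiom under the name trim_sketch, which is hypothetical; whether trim() itself uses this technique is not stated in this documentation.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for trim(); illustrates the parameter-expansion idiom.
trim_sketch() {
    local value=${1:-}
    # Fall back to stdin when no (or an empty) parameter is given.
    [ -n "$value" ] || value=$(cat)
    # ${value%%[![:space:]]*} is the leading whitespace run; remove it as a prefix.
    value=${value#"${value%%[![:space:]]*}"}
    # ${value##*[![:space:]]} is the trailing whitespace run; remove it as a suffix.
    value=${value%"${value##*[![:space:]]}"}
    printf '%s\n' "$value"
}

trim_sketch "   hello world   "   # prints hello world
```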
Finds the position of the first instance of $2 in $1. If $3 is omitted, the search begins at the beginning of $1; otherwise it begins after $3 characters.
Inspired by this StackOverflow thread
Param. | $1 | string to search in |
Param. | $2 | character/string to find - exact matching is used (bash's matching special characters are disabled) |
Param. | [$3] | search start position inside $1 - if it's omitted, search starts at the beginning |
Pipes | stdout | |
Status | 0 | success, search executed and result written on stdout |
Status | 1 | $1 undefined or empty |
Status | 2 | $2 undefined or empty |
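The search performed by find_substring() can be approximated with a single parameter expansion. In the sketch below, the name find_substring_sketch, the 0-based position, and the value -1 for "not found" are all assumptions; this documentation does not specify what the real function writes to stdout in those cases.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for find_substring(); output conventions are assumed.
find_substring_sketch() {
    local haystack=${1:-} needle=${2:-} start=${3:-0} tail before
    [ -n "$haystack" ] || return 1
    [ -n "$needle" ] || return 2
    # Skip the first $start characters before searching.
    tail=${haystack:start}
    # Quoting "$needle" makes the match literal (no glob interpretation).
    before=${tail%%"$needle"*}
    if [ "$before" = "$tail" ]; then
        echo -1   # assumption: -1 signals that nothing was found
    else
        echo $(( start + ${#before} ))
    fi
}

find_substring_sketch "key=value" "="   # prints 3
find_substring_sketch "a/b/c" "/" 2     # prints 3
```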
Transforms $1 into an absolute filepath if it is relative. Uses $2 as the base directory if it is defined, the current working directory otherwise.
The path $1 and the directory $2 don't have to exist.
Param. | $1 | path to "absolutify" if necessary |
Param. | [$2] | root path - if omitted, the current working directory is used |
Pipes | stdout | the computed absolute path |
Status | 0 |
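Since neither path has to exist, the transformation done by get_absolute_path() can be done with string operations alone. The sketch below, under the hypothetical name absolute_path_sketch, shows one way to do that; it does not resolve . or .. components, and whether the real function does is not documented here.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for get_absolute_path(); purely string based.
absolute_path_sketch() {
    local path=${1:-} base=${2:-$PWD}
    case $path in
        /*) printf '%s\n' "$path" ;;              # already absolute
        *)  printf '%s\n' "${base%/}/$path" ;;    # avoid a doubled slash
    esac
}

absolute_path_sketch "logs/run.log" "/var/app"   # prints /var/app/logs/run.log
absolute_path_sketch "/etc/hosts" "/var/app"     # prints /etc/hosts
```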
Checks if string $1 is of a certain type $2:
Type | Test condition |
---|---|
absolute_filepath | the first non-whitespace character of $1 is a / |
integer | $1 contains only numbers |
The test can be inverted by prefixing the type with a !, e.g. !absolute_filepath.
Example:
is_string_a "$potential_int" "integer" && echo "This is an integer: $potential_int"
Param. | $1 | string to check |
Param. | $2 | test type, see table above; can be inverted with a leading !, e.g. !integer |
Status | 0 | the test was executed and succeeded |
Status | 1 | the test was executed, but failed |
Status | 2 | $1 is empty |
Status | 3 | $2 is empty |
Status | 4 | $2 is unknown |
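The type table and status codes of is_string_a() map naturally onto a case statement. The following sketch uses the hypothetical name is_string_a_sketch and assumes that "only numbers" means one or more decimal digits; it is an illustration of the documented contract, not the library's code.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for is_string_a(); regexes are assumptions.
is_string_a_sketch() {
    local string=${1:-} type=${2:-} invert=0 result=1
    [ -n "$string" ] || return 2
    [ -n "$type" ] || return 3
    # A leading ! inverts the test.
    if [[ $type == '!'* ]]; then
        invert=1
        type=${type:1}
    fi
    case $type in
        absolute_filepath) [[ $string =~ ^[[:space:]]*/ ]] && result=0 ;;
        integer)           [[ $string =~ ^[0-9]+$ ]] && result=0 ;;
        *) return 4 ;;
    esac
    if (( invert )); then
        result=$(( 1 - result ))
    fi
    return "$result"
}

is_string_a_sketch "42" "integer" && echo "42 is an integer"
is_string_a_sketch "tmp/x" "!absolute_filepath" && echo "tmp/x is relative"
```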
Gives the byte length of $1. Characters in the ASCII set are encoded in 1 byte; hence, for strings that contain only ASCII characters, the byte length and the string length are equal. Characters from other sets, e.g. é, à, å, require 2 or more bytes in UTF-8 and lead to a byte length higher than the string length. Uses the C locale internally.
Param. | $1 | string to get the bytelength of |
Pipes | stdout | the bytelength of $1 |
Status | 0 |
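The effect of the C locale on string length can be shown in a few lines. This sketch, under the hypothetical name bytelength_sketch, relies on bash counting bytes instead of characters in ${#...} while LC_ALL is set to C; it is one plausible way to obtain the result described for get_string_bytelength(), not necessarily the one the library uses.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for get_string_bytelength().
bytelength_sketch() {
    # In the C locale every byte counts as one character; "local" limits
    # the locale change to this function.
    local LC_ALL=C
    printf '%s\n' "${#1}"
}

bytelength_sketch "hello"   # prints 5
bytelength_sketch "héllo"   # prints 6 when this script is saved as UTF-8
```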
Computes the byte representation of a string. Non-ASCII characters such as à, é, å, ê are transformed to the octal escape codes of their bytes; é, for example, becomes \303\251. Uses the C locale internally.
Param. | $1 | string to get the byte representation of |
Pipes | stdout | the byte representation of $1 |
Status | 0 |
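One way to obtain an output like \303\251 is to dump the bytes with od and re-emit only the non-ASCII ones as octal escapes. The sketch below does that under the hypothetical name string_bytes_sketch. It assumes that ASCII characters pass through unchanged, which this documentation implies but does not state, and it is not meant for strings containing newlines.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for get_string_bytes(); the od-based approach is assumed.
string_bytes_sketch() {
    local octal out=""
    # od prints one three-digit octal number per input byte.
    for octal in $(printf '%s' "$1" | LC_ALL=C od -An -v -t o1); do
        if (( 8#$octal >= 128 )); then
            out+="\\$octal"              # non-ASCII byte: keep as \ooo
        else
            out+=$(printf "\\$octal")    # ASCII byte: turn back into the character
        fi
    done
    printf '%s\n' "$out"
}

string_bytes_sketch "é"   # prints \303\251 when this script is saved as UTF-8
```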
Generates a sed string extraction expression.
Example:
echo "first|second|third" | sed -e "$(get_sed_extract_expression "|" "before" "first")"
prints first (the sed expression is s/|.*//g).
Param. | $1 | marker |
Param. | $2 | part to extract: can be before or after, with regard to the occurrence of $1 selected with $3 |
Param. | $3 | occurrence: can be first or last |
Pipes | stdout | the sed extraction expression, empty in case of error |
Status | 0 | success, expression was computed and written on stdout |
Status | 1 | the function was unable to find a suitable sed operation separator character |
Status | 2 | the value of $2 and/or $3 is unknown |
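To see what such an extraction expression does, it can be run by hand. The first command below uses the expression quoted in the example for get_sed_extract_expression(). The second is an assumed counterpart for the "after" and "last" combination, written for this illustration; the expression the function really generates for that case is not shown in this documentation.

```shell
#!/usr/bin/env bash
# Expression from the documented example: delete from the first marker onwards.
echo "first|second|third" | sed -e 's/|.*//g'   # prints first

# Assumed counterpart: the greedy .* deletes everything up to the last marker.
echo "first|second|third" | sed -e 's/.*|//'    # prints third
```

Both expressions depend on sed's greedy matching, which is why "before first" and "after last" are the two simplest cases.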
Generates a sed string replacement expression.
Example:
echo "some string" | sed -e "$(get_sed_replace_expression "some" "awesome")"
prints awesome string (the sed expression is s/some/awesome/g).
Param. | $1 | sed match regex/string |
Param. | $2 | sed replace string |
Param. | [$3] | occurrence selection: |
Pipes | stdout | the sed replace expression, empty in case of error |
Status | 0 | success, the expression was computed and written on stdout |
Status | 1 | the function was unable to find a suitable separator character |
Status | 2 | the occurrence selection $3 is unknown |
Provides a sed separator character that occurs in neither $1 nor $2.
Param. | $1 | sed match regex/string |
Param. | [$2] | 2nd sed argument |
Pipes | stdout | the sed operation separator character, empty in case of error |
Status | 0 | found a suitable separator character, written on stdout |
Status | 1 | none of the 23 characters available is suited |
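The separator search of find_sed_operation_separator() can be sketched as a loop over a candidate list. The name find_separator_sketch and the 12 candidates below are assumptions chosen for this example; the 23 characters the real function tries are not listed in this documentation.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for find_sed_operation_separator().
find_separator_sketch() {
    local candidates='/|,;:#@%!=~_' i sep   # assumed candidate list
    for (( i = 0; i < ${#candidates}; i++ )); do
        sep=${candidates:i:1}
        # Accept the first candidate that occurs in neither argument.
        if [[ ${1:-} != *"$sep"* && ${2:-} != *"$sep"* ]]; then
            printf '%s\n' "$sep"
            return 0
        fi
    done
    return 1
}

match='/usr/local' replace='/opt'
sep=$(find_separator_sketch "$match" "$replace") &&
    echo "/usr/local/bin" | sed -e "s${sep}${match}${sep}${replace}${sep}g"   # prints /opt/bin
```

Because / occurs in both arguments, the sketch falls through to |, so the expression becomes s|/usr/local|/opt|g and no escaping of the slashes is needed.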
Adds a backslash to every occurrence of a character which has a special meaning in sed expressions:
. + ? * [ ] ^ $
Param. | $1 | string to escape |
Pipes | stdout | escaped $1 |
Status | 0 |
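Escaping the eight characters listed for escape_sed_special_characters() can itself be done with one sed substitution. The sketch below uses the hypothetical name escape_sed_sketch; it covers exactly the characters listed above and nothing else (the backslash and the separator character, for instance, are left untouched).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for escape_sed_special_characters().
escape_sed_sketch() {
    # Bracket expression: ] must come first to be literal; ^ must not come first.
    # In the replacement, \\ is a literal backslash and & is the matched character.
    printf '%s\n' "$1" | sed -e 's/[].+?*[^$]/\\&/g'
}

escape_sed_sketch 'file.txt'   # prints file\.txt
escape_sed_sketch 'a*b[0]'     # prints a\*b\[0\]
```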
Provides a string composed of $1 alphanumeric characters taken from /dev/urandom.
Important: it's not suited for critical security applications like cryptography. However, it's useful to get unique strings for non-critical use cases, e.g. "run IDs" which may be used to distinguish interleaved log entries from several instances of the same script running in parallel.
Param. | [$1] | length of the random string, defaults to 16 if omitted |
Pipes | stdout | the random string |
Status | 0 | success, the random string is on stdout |
Status | 1 | /dev/urandom doesn't exist |
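A typical way to draw alphanumeric characters from /dev/urandom is to filter the byte stream with tr and cut it with head. The sketch below, under the hypothetical name random_string_sketch, follows the documented defaults and status codes; it assumes a head that supports the -c option, and the real get_random_string() may be implemented differently.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for get_random_string().
random_string_sketch() {
    local length=${1:-16} value
    [ -e /dev/urandom ] || return 1
    # tr drops every byte that is not alphanumeric; head stops after $length bytes.
    value=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length")
    printf '%s\n' "$value"
}

run_id=$(random_string_sketch 8)
echo "[$run_id] job started"
```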