| Commit message | Author | Age | Files | Lines |
| |
This is worth mentioning because POSIX.1-2024 (Issue 8) introduces
pipefail as a standard feature.
https://austingroupbugs.net/view.php?id=789
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
I am unwilling to pander to those that elect to enable either of the
errexit or nounset options. Rather than gloss over the matter, comment
as to the behaviour of gentoo-functions being unspecified in that event.
Further, display a warning for each of those options found to be enabled
at the time of sourcing functions.sh.
It is worth noting that the behaviour of nounset can be selectively
employed with the ${parameter:?} form of parameter expansion. Such is
occasionally useful and does not require library authors to acquiesce
to the cult of the "unofficial strict mode".
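As a rough illustration of that last point, a function can insist on a
non-empty parameter without the nounset option being enabled anywhere.
The greet() function below is hypothetical and is not part of
gentoo-functions.

```shell
# Hypothetical function; it is not part of gentoo-functions.
greet() {
	# Abort with a diagnostic if the first parameter is unset or empty.
	: "${1:?greet: a name is required}"
	printf 'Hello, %s\n' "$1"
}

greet world
```

In a non-interactive shell, a failed ${parameter:?} expansion causes the
shell to exit. Calling the function without arguments in a subshell, as
in ( greet ), therefore terminates only that subshell with a non-zero
status.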
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Doing so protects against the following scenario.
$ IFS=e word=1
$ set -x; test ${word+set}
+ test s t
dash: 2: test: s: unexpected operator
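For contrast, here is a minimal sketch of the same scenario with the
expansion quoted. The demo_quoted() function is hypothetical and exists
only to confine the IFS assignment to a subshell.

```shell
# Hypothetical demonstration; the function body is a subshell so that
# IFS is not altered in the calling shell.
demo_quoted() (
	IFS=e word=1
	# Quoted, the expansion yields the single word "set", whatever the
	# value of IFS happens to be.
	if test "${word+set}"; then
		echo "word is set"
	fi
)

demo_quoted
```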
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Render trim() faster in bash for cases where only the positional
parameters are to be processed e.g. var=$(trim "$var") or
var=${ trim "$var"; }.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Consider the case where IFS consists of a single character whose value
is neither <space>, <tab> nor <newline>. The following example employs
the colon, since it is the character that the whenceforth() function
relies upon during word splitting.
$ bash -c 'IFS=":"; path=":"; set -- $path; echo "$# ${1@Q}"'
1 ''
The result is very much as expected because the colon in path serves as
a terminator for an empty field. Now, let's consider how many fields are
produced in OpenBSD sh as a consequence of word splitting.
$ sh -c 'IFS=":"; path=":"; set -- $path; echo "$#"'
0
For the time being, work around it by having whenceforth() repeat the
field terminator for the affected edge cases, which are two in number.
With this change, the test suite is now able to pass for:
- loksh 7.5
- oksh 7.5
- sh (OpenBSD 7.5)
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The ability to locate and source the modules depends on the
genfun_basedir variable being set correctly. In the case that no modules
can be found, print a useful diagnostic message and ensure that the
return value is non-zero.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
SC2153 is informational in nature and triggers only for environment
variables (all uppercase variables) whose names are similar to others
and for which no explicit assignment can be observed. In the case of
gentoo-functions, it was being raised as a result of KSH_VERSION and
YASH_VERSION being expanded. In other words, it is a nuisance.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
In accordance with the Gentoo style.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Given that the EPOCHREALTIME variable loses its special properties if
unset, to compare two expansions of it to one another ought to be more
robust.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Given that the SRANDOM variable loses its special properties if unset,
to compare two expansions of it to one another ought to be more robust.
Do so up to three times, so as not to be foiled by the unlikely event of
the RNG repeating the same number.
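A check of this kind might be sketched as follows. The has_srandom()
name and the tries variable are assumptions made for illustration; the
actual test in functions.sh may be written differently.

```shell
# Hypothetical sketch; returns 0 if SRANDOM appears to be special.
has_srandom() {
	tries=0
	while [ "$(( tries += 1 ))" -le 3 ]; do
		# A working SRANDOM yields a fresh number upon each expansion,
		# whereas an ordinary (or unset) variable expands identically.
		if [ "${SRANDOM}" != "${SRANDOM}" ]; then
			return 0
		fi
	done
	return 1
}
```

Should SRANDOM have been unset and later assigned an ordinary value,
both expansions will be identical on every attempt and the function
will return 1.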
Further, the prior check was defective because it incorrectly presumed
the minimum required version of bash to be 5.0 rather than 5.1.
Fixes: 5ee035a364bea8d12bc8abfe769014e230a212a6
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The implementation of srandom() that was written with mksh first and
foremost in mind is no longer as slow as it was. I decided to benchmark
30,000 iterations of both of the non-bash implementations with varying
maximal pool sizes. The results are beneath. Note that both "dash/1" and
"mksh/1" refer to the mksh-targeting implementation.
Pool Size  dash/1  dash/2  mksh/1
48 B       6.67s   5.57s   58.84s
64 B       5.39s   4.78s   58.20s
96 B       5.49s   4.36s   58.13s
128 B      5.87s   4.63s   59.94s
160 B      5.93s   5.46s   64.64s
These figures demonstrate that the optimal pool size is roughly between
64 and 96 bytes, and that the performance of both implementations is now
comparable. In addition to testing Linux (6.6) on x86_64 hardware, I
experimented with the pool size on macOS Sonoma (using an Apple M1 CPU)
and found a value of 64 to be close to optimal.
In view of these findings, have _collect_entropy() collect 64 bytes at a
time and remove the marginally faster implementation. That is, the one
that depended on being able to perform arithmetic on a number as high as
2^32-1 without overflowing.
Additionally, increase the maximum number of times that the remaining
implementation tries to find a suitable sequence of hex digits from 2 to
3. Finally, remove the overflow check, for it is no longer required.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Presently, there are two srandom() implementations that do not require
bash, one of which is intended for use with mksh and the other of which
is intended for the various other implementations of sh(1). Both of
these implementations are capable of maintaining an entropy pool, which
markedly enhances performance for repeated invocations of the function.
However, the pool cannot be effectively utilised in cases where the
shell has forked.
$ srandom # initialises the pool
$ srandom # reads from the now-initialised pool
$ ( srandom ) # may fork, rendering the pool rather ineffective
$ ( srandom; srandom ) # ditto, despite the consecutive calls
This commit addresses the discrepancy by keeping track of whether the
pool has been populated on a per-PID basis. Consider the following
benchmark, in which the loop is forced to execute within a subshell
environment.
(
	i=0
	while [ $((i+=1)) -le 30000 ]; do srandom; done >/dev/null
	/bin/true
)
As conducted with mksh 59c on a system with a 2nd generation Intel Xeon,
I obtained the following figures.
BEFORE
real 3m8.857s
user 2m57.276s
sys 0m59.511s
AFTER
real 1m24.047s
user 1m6.435s
sys 0m19.565s
As conducted with dash on the same system, I obtained the following
figures.
BEFORE
real 0m52.056s
user 1m2.913s
sys 0m18.143s
AFTER
real 0m12.887s
user 0m12.521s
sys 0m1.016s
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The slowest of the three srandom() implementations is presently
selected for shells that overflow numbers at the 2^31 mark. A prominent
shell which does so is mksh (even for LP64 architectures).
Recently, one of the other srandom() implementations was accelerated by
having the shell maintain its own entropy pool of up to 512 hex digits
in size. Make it so that the mksh-targeting implementation employs a
similar technique. Consider the following benchmark.
i=0; while [ $((i += 1)) -le 30000 ]; do srandom; done >/dev/null
As conducted with mksh 59c on a system with a 2nd generation Intel Xeon,
I obtained the following figures.
BEFORE
real 0m56.414s
user 0m47.043s
sys 0m24.751s
AFTER
real 0m28.900s
user 0m22.795s
sys 0m6.802s
Note that the performance increase cannot be applied in all situations.
For further details regarding the constraints, refer to commit 866af9c.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The slowest implementation of srandom() runs od(1) and awk(1) within a
command substitution. There, both LC_ALL and LC_CTYPE are overridden but
they should also be exported.
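The distinction between assigning and exporting within a command
substitution can be observed with a short experiment. The snippet below
is only a demonstration and is not taken from functions.sh. LC_ALL is
unset first so that the outcome does not depend on the calling
environment.

```shell
# Assigned but not exported: the child process does not see the value.
a=$(unset LC_ALL; LC_ALL=C; sh -c 'printf %s "${LC_ALL-unset}"')

# Exported: the child process inherits the value.
b=$(unset LC_ALL; export LC_ALL=C; sh -c 'printf %s "${LC_ALL-unset}"')

printf '%s %s\n' "$a" "$b"
```

This prints "unset C". Utilities such as od(1) and awk(1) are affected
in the same way as the sh -c child process shown here.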
For now, export LC_ALL=C exclusively, even though it overrides
LC_MESSAGES, potentially affecting the user's preferred language for
diagnostics. The reason for choosing this course of action is as
follows.
$ uname
Darwin
$ echo "$BASH_VERSION"
5.2.26(1)-release
$ f() { nonexistent; }
$ ( export LC_ALL=; f )
objc[29971]: +[__SwiftNativeNSStringBase initialize] may have been in
progress in another thread when fork() was called.
objc[29971]: +[__SwiftNativeNSStringBase initialize] may have been in
progress in another thread when fork() was called. We cannot safely call
it or ignore it in the fork() child process. Crashing instead. Set a
breakpoint on objc_initializeAfterForkError to debug.
A fix for this is present in the devel branch:
- https://git.savannah.gnu.org/cgit/bash.git/commit/?h=devel&id=b3d8c8a
See, also:
- https://trac.macports.org/ticket/68638
- https://lists.gnu.org/archive/html/bug-bash/2024-05/msg00088.html
Of course, the fix hasn't been backported to an actual release. As such,
I would prefer to play it safe for the time being.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
I normally always do this for local variables that may immediately be
checked for emptiness or non-emptiness, owing to the formally
unspecified behaviour of the local command.
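To illustrate the concern, consider the sketch below. It assumes a shell
that provides a local builtin, as dash and bash do, and the
first_nonempty() function is hypothetical. In dash, a bare "local
result" inherits the value that result has in the surrounding scope,
which is why the empty string is assigned explicitly.

```shell
# Hypothetical function; requires a shell with a "local" builtin.
first_nonempty() {
	# Assign an empty string explicitly. In some shells, a bare
	# "local result" would inherit the caller's value of result.
	local arg result=
	for arg; do
		if [ -z "$result" ] && [ -n "$arg" ]; then
			result=$arg
		fi
	done
	printf '%s\n' "$result"
}

result=stale
first_nonempty '' b c
```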
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
In the case of ksh93, the commonly implemented behaviour of "local" can
be approximated with "typeset". However, to use typeset in this way
requires the use of the function f { ...; } syntax instead of the
POSIX-compatible f() compound-command syntax. As things stand, there is
no sense in allowing for functions.sh to be sourced by ksh93.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The yash shell takes conformance so seriously that it goes as far as to
disable the local builtin in its posixlycorrect mode.
https://magicant.github.io/yash/doc/posix.html
$ yash -o posixlycorrect -c 'f() { local var; }; f'
yash: local: non-portable built-in is not supported in the POSIXly-correct mode
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Presently, there are three implementations of srandom(), one of which is
the preferred implementation for shells other than bash. It is a little
on the slow side as it has to fork and execute both od(1) and tr(1)
every time, just to read 4 bytes. Accelerate it by having the shell
maintain its own entropy pool of up to 512 hex digits in size. Consider
the following benchmark.
i=0; while [ $((i += 1)) -le 30000 ]; do srandom; done >/dev/null
As conducted with dash on a system with a 2nd generation Intel Xeon, I
obtained the following figures.
BEFORE
real 0m49.878s
user 1m1.985s
sys 0m17.035s
AFTER
real 0m12.866s
user 0m12.559s
sys 0m0.962s
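The general idea of a shell-maintained pool can be sketched as follows.
This is not the code in functions.sh; every name is hypothetical, the
pool is 64 bytes (128 hex digits) for brevity, and the bit arithmetic is
merely one way to keep the result within 31 bits.

```shell
# Hypothetical names throughout; this is not the gentoo-functions code.
pool=

# Refill the pool with 64 bytes (128 hex digits) read from /dev/urandom.
refill_pool() {
	pool=$(od -vAn -N64 -tx1 /dev/urandom | tr -d '[:space:]')
	[ "${#pool}" -eq 128 ]
}

# Print a number between 0 and 2147483647, forking only when the pool
# needs to be refilled.
next_number() {
	if [ "${#pool}" -lt 8 ]; then
		refill_pool || return
	fi
	# Split off the first 8 hex digits without running any utilities.
	hex=${pool%"${pool#????????}"}
	pool=${pool#????????}
	first=${hex%"${hex#?}"}
	rest=${hex#?}
	# Keep 3 bits of the first digit and all 28 bits of the rest, so
	# that the result fits in a signed 32-bit integer.
	number=$(( ((0x${first} & 7) << 28) | 0x${rest} ))
	printf '%d\n' "$number"
}

next_number
next_number
```

Only the first call runs od(1) and tr(1). The second call consumes 8 of
the remaining hex digits by way of parameter expansion alone.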
It should be noted that the optimised routine will only be utilised in
cases where the kernel is Linux and the shell has not forked itself.
$ uname
Linux
$ srandom # uses the fast path
$ number=$(srandom) # subshell; probably uses the slow path
$ srandom | { read -r number; } # ditto
Still, there are conceivable use cases for which this optimisation may
prove useful. Below is an example in which it is known in advance that
up to 100 random numbers are required, and where writing them to
temporary storage is not considered to be a risk.
i=0
tmpfile=${TMPDIR:-/tmp}/random-numbers.$$.$(srandom)
while [ $((i += 1)) -le 100 ]; do
	srandom
done > "$tmpfile"
while read -r number; do
	do_something_with "$number"
done < "$tmpfile"
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
The _should_throttle() function gets the best of shellcheck, which
incorrectly reports that there is unreachable code.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
As regards the test(1) utility, the POSIX.1-2024 specification defines
the -nt and -ot primaries as standard features. Given that the
specification in question was only recently published, this would not
normally be an adequate reason for using them in gentoo-functions, in
and of itself. However, I was already aware that these primaries are
commonly implemented and have been so for years.
So, I decided to evaluate a number of shells and see how things stand
now. Here is a list of the ones that I tested:
- ash (busybox 1.36.1)
- dash 0.5.12
- bash 5.2.26
- ksh 93u+
- loksh 7.5
- mksh 59c
- oksh 7.5
- sh (FreeBSD 14.1)
- sh (NetBSD 10.0)
- sh (OpenBSD 7.5)
- yash 2.56.1
Of these, bash, ksh93, loksh, mksh, oksh, OpenBSD sh and yash appear to
conform with the POSIX.1-2024 specification. The remaining four fail to
conform in one particular respect, which is as follows.
$ touch existent
$ set -- existent nonexistent
$ [ "$1" -nt "$2" ]; echo "$?" # should be 0
1
$ [ "$2" -ot "$1" ]; echo "$?" # should be 0
1
To address this, I discerned a reasonably straightforward workaround
that involves testing both whether the file under consideration exists
and whether the variable keeping track of the newest/oldest file has yet
been assigned to.
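A workaround along those lines might look like the following sketch. It
is not the newest() function from gentoo-functions; the newest_sketch()
name and its variables are hypothetical, and it handles positional
parameters only.

```shell
# Hypothetical sketch; not the gentoo-functions implementation.
newest_sketch() {
	newest=
	for path; do
		# Skip nonexistent files, and consult -nt only once a candidate
		# has been chosen, so that -nt never sees a nonexistent operand.
		if [ -e "$path" ] \
			&& { [ -z "$newest" ] || [ "$path" -nt "$newest" ]; }
		then
			newest=$path
		fi
	done
	[ -n "$newest" ] && printf '%s\n' "$newest"
}
```

Because both operands are known to exist by the time that -nt is
evaluated, the divergent behaviour of the four non-conforming shells is
never exercised.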
As far as I am concerned, the coverage is more than adequate for both
primaries to be used by gentoo-functions. As such, this commit adjusts
the following three functions so as to do exactly that.
- is_older_than()
- newest()
- oldest()
It also removes the following functions, since they are no longer used.
- _find0()
- _select_by_mtime()
With this, GNU findutils is no longer a required runtime dependency. Of
course, should a newly introduced feature of gentoo-functions benefit
from the presence of findutils in the future, there is no reason that it
cannot be brought back in that capacity.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
When integer overflow occurs in a non-interactive yash shell, it prints
"yash: arithmetic: overflow" as a diagnostic message before proceeding
to exit. That makes it extremely difficult for the arithmetic in the
_should_throttle() function to be implemented safely for it. For now,
ensure that _update_time() does nothing for yash but return a non-zero
status code. In turn, this disables the rate limiting feature for yash.
Additionally, refrain from running test_update_time() and
test_should_throttle() for yash in test-functions. The former would only
amount to a waste of time and the latter would be guaranteed to fail.
For the record, my testing was performed with yash 2.56.1.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
At the point that the genfun_time variable overflows, guarantee that the
should_throttle() function behaves as if no throttling should occur
rather than proceed to perform arithmetic based on the result of
deducting genfun_last_time from genfun_time.
Further, guarantee that the should_throttle() function behaves as if no
throttling should occur upon the very first occasion that it is called,
provided that the call to update_time() succeeds.
Finally, add a test case.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
For it need not be in the public name space.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
Add the quote_args_bash() function, which will be called from
quote_args() under the appropriate circumstances. It is faster than the
sh implementation, not merely because it takes advantage of the
${parameter@Q} form of parameter expansion, but also because executing
external utilities exacts a greater performance toll for bash than it
does for, say, dash. The difference is appreciable if running the test
suite.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
In the case of some shells - mksh, at least - the maximum value of an
integer is 2147483647. Such is a consequence of implementing integers as
signed int rather than signed long, even though doing so contravenes the
specification.
Reduce the output range of srandom() so as to be between 0 and
2147483647, rather than 0 and 4294967295. A change of this scope would
normally justify incrementing GENFUN_API_LEVEL but I shall not do so on
this occasion. My rationale is that >=gentoo-functions-1.7 has not yet
had enough exposure for srandom() to be in use by other projects.
Additionally, have test-functions test srandom() 10 times instead of 5.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
Also, extend the coverage of the test suite a little further.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
The _select_by_mtime() function is called by both newest() and oldest().
Pathnames may be specified as positional parameters or as NUL-separated
records to be read from the standard input. Unfortunately, the latter
interface does not work at all. Rectify this by checking whether the
number of parameters is greater than 0, rather than greater than or
equal to 0.
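The nature of the defect is easier to see in a reduced form. Since the
number of parameters can never be negative, a test for "greater than or
equal to 0" always succeeds, leaving the standard input branch
unreachable. The emit_records() helper below is hypothetical and merely
shows the corrected form of the test.

```shell
# Hypothetical helper; not the _select_by_mtime() function itself.
emit_records() {
	if [ "$#" -gt 0 ]; then
		# Pathnames were given as parameters; NUL-terminate each one.
		printf '%s\0' "$@"
	else
		# No parameters; pass the NUL-separated records of the standard
		# input through unchanged.
		cat
	fi
}
```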
Also, extend the existing test case in such a way that the interface in
question is tested.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
I overlooked that bash respects the radix character defined by the
locale in the course of synthesizing the value of the EPOCHREALTIME
variable. Set LC_NUMERIC as C to guarantee that the radix character is
considered as U+002E (FULL STOP) within the scope of the bash-specific
function. Doing so also addresses a distinct issue whereby the
invocation of printf was sensitive to the implied value of LC_NUMERIC.
Another way to address this would have been to set LC_ALL as C. I
decided not to because it would decrease the likelihood of the relevant
diagnostic messages being rendered in the user's native language.
Additionally, add a test case.
Closes: https://bugs.gentoo.org/937376
Reported-by: Christian Bricart <christian@bricart.de>
Signed-off-by: Kerin Millar <kfm@plushkava.net>
Signed-off-by: Sam James <sam@gentoo.org>
| |
These two functions are primarily intended to mitigate the appalling use
of eval in projects such as netifrc and openrc. Consider the following
code.
net/iproute2.sh:29: eval netns="\$netns_${IFVAR}"
This could instead be written as:
deref "netns_${IFVAR}" netns
Alternatively, it could be written so as to use a command substitution:
netns=$(deref "netns_${IFVAR}")
Either method would protect against illegal identifier names and
code injection.
Consider, also, the following code.
net/iproute2.sh:185: eval "$x=$1" ; shift ;;
This could instead be written as:
assign "$x" "$1"
As with deref, it would protect against illegal identifier names and
code injection.
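To show how such functions can remain safe while still relying on eval
internally, here is a minimal sketch. It is not the code added by this
commit; the is_identifier() helper is hypothetical, and the bodies of
deref() and assign() are assumptions based solely on the usage shown
above.

```shell
# Hypothetical helper: return 0 if $1 is a valid shell variable name.
is_identifier() {
	case $1 in
		''|[0123456789]*)
			return 1 ;;
		*[!_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ]*)
			return 1 ;;
	esac
}

# Sketch of deref: print the value of the variable named by $1 or, if a
# second name is given, assign the value to the variable named by $2.
deref() {
	is_identifier "$1" || return
	if [ "$#" -ge 2 ]; then
		is_identifier "$2" || return
		eval "$2=\${$1}"
	else
		eval "printf '%s\\n' \"\${$1}\""
	fi
}

# Sketch of assign: assign the value $2 to the variable named by $1.
assign() {
	is_identifier "$1" || return
	eval "$1=\$2"
}
```

Only validated names ever reach eval, and values are never interpolated
into the evaluated string, so neither names nor values can smuggle in
additional commands.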
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Now that POSIX.1-2024 has been ratified, strictly_posix no longer makes
sense as a variable name.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
POSIX.1-2024 (Issue 8) requires the cd builtin to raise an error
when given an empty directory operand. However, various implementations
have yet to catch up. Given that it is a sensible change, let's have the
chdir() function behave accordingly. Further, since doing so renders the
test_chdir_noop test useless, get rid of it. The purpose that the test
served is now subsumed by test_chdir.
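The required behaviour can be approximated in shells that have yet to
catch up with a wrapper such as the one below. The chdir_sketch() name
and the wording of its diagnostic are assumptions; the real chdir()
function may differ in its details.

```shell
# Hypothetical sketch; not the gentoo-functions chdir() function.
chdir_sketch() {
	if [ -z "$1" ]; then
		printf '%s\n' "chdir_sketch: empty directory operand" >&2
		return 1
	fi
	# Ignore CDPATH for the duration of the call.
	CDPATH= cd -- "$1"
}
```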
Closes: https://bugs.gentoo.org/937157
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
A factor of 16 was shown to be faster on average by timing how long it
takes for bash to print a rule 5000 times for all lengths between 40 and
132, inclusive.
Factor  Time       StdDev
8       87.004000  3.961607
16      82.893000  3.971257
Further, 16 remains a factor of 80, which is often the number of columns
that a terminal emulator is initialised with.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Testing the BASH variable for non-emptiness is an inadequate pretext for
activating the bash-optimised code path. Instead, the test would have to
be implemented like so ...
if ! case ${BASH_COMPAT} in 3?|4[012]) false ;; esac && _has_bash 4 3
then
...
fi
Given that hr() is not expected to be called often, and that the sh code
was already improved by employing a divide-by-8 strategy, I don't
consider it to be worth the trouble.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
I'm not yet ready to commit to it being among the core functions for the
inaugural API level.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Reduce the number of loop iterations by initially trying to append
characters 8 at a time.
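The strategy can be sketched as follows. The hr_sketch() function is
hypothetical; it assumes that its first argument is a single ASCII
character and that its second is the intended length of the rule.

```shell
# Hypothetical sketch; not the gentoo-functions hr() function.
hr_sketch() {
	char=${1:--} length=${2:-80} rule=
	chunk=$char$char$char$char$char$char$char$char
	# Append 8 characters at a time for as long as they fit.
	while [ "$(( ${#rule} + 8 ))" -le "$length" ]; do
		rule=$rule$chunk
	done
	# Append any remaining characters one at a time.
	while [ "${#rule}" -lt "$length" ]; do
		rule=$rule$char
	done
	printf '%s\n' "$rule"
}

hr_sketch = 20
```

For a rule of 80 characters, the loops run 10 times in total rather than
80 times.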
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Render hr() faster by eliminating the requirement to fork and execute
any external utilities after having established the intended length of
the rule. Also, use printf -v and string-replacing parameter expansion
where the shell is found to be bash. Doing so helps considerably because
bash is very slow at looping.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Re-implement the contains_all() and contains_any() functions in such a
way that they are faster than their forebears by an order of magnitude.
In order to achieve this level of performance, the value of IFS is no
longer taken into account. Instead, words are always presumed to be
separated by characters matching the [[:space:]] character class.
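One way to match whitespace-separated words without consulting IFS, and
without forking, is to rely on pattern matching alone. The sketch below
is not the implementation introduced by this commit; the function names
are hypothetical and the needles are assumed to be non-empty.

```shell
# Hypothetical sketches; not the gentoo-functions implementations.
contains_any_sketch() {
	haystack=" $1 "
	shift
	for needle; do
		case $haystack in
			*[[:space:]]"$needle"[[:space:]]*) return 0 ;;
		esac
	done
	return 1
}

contains_all_sketch() {
	haystack=" $1 "
	shift
	for needle; do
		case $haystack in
			*[[:space:]]"$needle"[[:space:]]*) ;;
			*) return 1 ;;
		esac
	done
}
```

Padding the haystack with a space on either side allows the first and
last words to be matched by the same pattern as all the others.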
Consider a scenario in which the FEATURES variable is comprised of 33
words.
$ FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs
buildpkg buildpkg-live config-protect-if-modified distlocks ebuild-locks
fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict
network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted
preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms
strict unknown-features-warn unmerge-logs unmerge-orphans userfetch
userpriv usersandbox usersync xattr"
Let's say that the contains_any function is used to search for 10 words,
where only the 10th can be matched and where FEATURES must be scanned in
its entirety exactly 10 times.
$ contains_any "$FEATURES" the quick brown fox jumped over the lazy hen xattr
The following benchmarks show how long it took to call the function
50,000 times consecutively on a system with an Apple M1 CPU for both the
original and new implementations. This is with the dash shell.
contains_any (BEFORE)
real 0m19.135s
user 0m16.781s
sys 0m2.258s
contains_any (AFTER)
real 0m1.571s
user 0m1.497s
sys 0m0.063s
Now let's say that the contains_all function is used to search for 3
words, where all can be matched while requiring for FEATURES to be
scanned in its entirety at least once.
$ contains_all "$FEATURES" assume-digests news xattr
Again, the following benchmarks show how long it took to call the
function 50,000 times consecutively.
contains_all (BEFORE)
real 1m8.052s
user 0m19.363s
sys 0m42.742s
contains_all (AFTER)
real 0m0.689s
user 0m0.627s
sys 0m0.057s
The performance improvements are similarly impressive if using bash.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Coerce the effective character set as being C (US-ASCII) in the course
of executing awk(1). Some implementations are strict and will otherwise
fail in situations where the bytes cannot be decoded.
$ uname -o
Darwin
$ echo "$LC_ALL"
en_GB.UTF-8
$ printf '\200' | awk '/[\001-\037\177-\377]/'
awk: towc: multibyte conversion failure on: ''
In the above case, awk aborts because it has a need to decode the input,
which turns out not to be valid UTF-8. Now, it is rather beyond the
purview of quote_args() to guarantee that its parameters adhere to any
particular character encoding. Fortunately, for it to contend with
strings on a byte-by-byte basis is acceptable.
Refactor the code somewhat. The behaviour has been adjusted so as to be
virtually identical to that of the "${*@Q}" expansion in bash, with the
exception that the ESC character is rendered as $'\e' instead of $'\E'.
Such an exception is necessary for POSIX.1-2024 conformance, wherein
dollar-single-quotes are now a standard feature (see section 2.2.4 of
the Shell Command Language).
Revise the comment preceding the function so as to accurately document
its behaviour.
Finally, add a test case. It works by calling quote_args for every
possible single-byte string before calculating a CRC checksum for the
cumulative output and comparing it against a pre-determined value.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
The POSIX.1-2024 specification was published on 2024/06/14.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
https://austingroupbugs.net/view.php?id=339
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Doing so simplifies the case where /proc/uptime is read. Having one more
digit's worth of accuracy is no bad thing either.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
| |
Also, require for true(1) to be executable in order for it to be deemed
usable.
Signed-off-by: Kerin Millar <kfm@plushkava.net>
|