t: add greplint.pl and convert grep to test_grep#2135
Open
mmontalbo wants to merge 6 commits into
Open
Conversation
5d5366a to
9354adf
Compare
test_grep is a wrapper around grep for test assertions that prints the file contents on failure for easier debugging. It also accepts '!' as its first argument for negation, which preserves the diagnostic output that '! test_grep' would suppress. Despite being widely used (and the preferred replacement for bare grep in assertions), test_grep has no entry in t/README alongside the other documented helpers like test_cmp and test_line_count. Add one. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing them to read from empty stdin instead of the intended file: - t2402: '! grep ...' should read from 'out', matching the grep on the preceding line. - t7507: the closing quote is in the wrong place, making the entire 'diff --git actual' a single pattern with no file argument instead of pattern 'diff --git' and file 'actual'. - t7700: '! grep ...' should read from 'packlist', matching the redirect on the preceding line. Without file arguments these greps always succeed (empty stdin matches nothing), so the assertions were not actually checking anything. All three tests pass with the corrected file arguments, confirming the intended behavior is sound. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
f749f29 to
614b21f
Compare
1f7855c to
937f800
Compare
45e09e3 to
14114d7
Compare
Move chainlint.pl's Lexer, ShellParser, and ScriptParser into a shared module (lib-shell-parser.pl) so other lint tools can reuse the same shell parsing infrastructure. A subsequent commit adds greplint.pl, which needs the same tokenizer to correctly identify command boundaries. ScriptParser's check_test() becomes a no-op in the shared module. chainlint.pl defines ChainlintParser (extending ScriptParser) with the &&-chain check_test() implementation. No functional change: chainlint produces the same output and check-chainlint self-tests pass. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
scan_dqstring's post-loop newline counter re-counts newlines that were already counted during recursive parsing of $() bodies. This happens because scan_dollar returns text containing newlines (from multi-line command substitutions), and the catch-all counter at the end of scan_dqstring counts all of them again. Fix this by counting newlines inline as non-special characters are consumed, and removing the post-loop catch-all. Each newline is now counted exactly once: literal newlines at the inline match, line splices at the backslash handler, and $() newlines by scan_token during the recursive parse. This is a latent bug: any consumer that relies on token line numbers rather than byte offsets would get incorrect results for tokens following a multi-line $() inside a double-quoted string. chainlint is not affected because it annotates the original body text using byte offsets, not token line numbers. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
|
There are issues in commit fd9dc7e:
|
Author
|
/preview |
|
Preview email sent as pull.2135.v2.git.1781318996.gitgitgadget@gmail.com |
Replace bare grep with test_grep in test assertions across the
suite, including sourced test helpers (lib-*.sh, *-tests.sh).
test_grep prints the contents of the file being searched on
failure, making debugging easier than a bare grep which fails
silently.
Only assertion-style greps are converted: grep used as a filter
in pipelines, command substitutions, conditionals, or with
redirected I/O is left as-is with a "# lint-ok" annotation.
Existing '! test_grep' calls are rewritten to 'test_grep !' so
that the diagnostic output is preserved on failure.
The conversion was generated using a grep-assertion linter
(greplint.pl, added in the following commit) to identify bare
grep calls at command position. To reproduce:
# Step 1: mark bare greps that should not be converted
sed -i '/! grep "$m" \.git\/packed-refs/s/$/ # lint-ok: file may not exist (reftable)/' \
t/t1400-update-ref.sh
sed -i '/! grep dirty file3 &&/{/lint-ok/!s/$/ # lint-ok: file may not exist after --quit/}' \
t/t3420-rebase-autostash.sh
sed -i '/grep -vf before commits\.raw/s/$/ # lint-ok: data filter/' \
t/t5326-multi-pack-bitmaps.sh
sed -i '/! grep $d shallow-client\/\.git\/shallow/s/$/ # lint-ok: file may not exist after repack/' \
t/t5537-fetch-shallow.sh
sed -i '/grep -E "^\[0-9a-f\].*|| :/s/$/ # lint-ok: data filter/' \
t/t5702-protocol-v2.sh
sed -i '/! grep gitdir squatting-clone/s/$/ # lint-ok: file may not exist after failed clone/' \
t/t7450-bad-git-dotfiles.sh
# Step 2: reorder pre-existing '! test_grep' to 'test_grep !'
# (must come before steps 3-4 so greplint does not see them)
sed -i 's/! test_grep/test_grep !/' t/t0031-lockfile-pid.sh
sed -i 's/! test_grep/test_grep !/' t/t5300-pack-object.sh
sed -i 's/! test_grep/test_grep !/' t/t5319-multi-pack-index.sh
# Step 3: convert '! grep' -> 'test_grep !'
perl t/greplint.pl t/*.sh 2>&1 | cut -d: -f1,2 |
while IFS=: read f l; do
sed -i "${l}s/! *grep/test_grep !/" "$f"
done
# Step 4: convert remaining 'grep' -> 'test_grep'
perl t/greplint.pl t/*.sh 2>&1 | cut -d: -f1,2 |
while IFS=: read f l; do
sed -i "${l}s/grep/test_grep/" "$f"
done
To verify, run: make -C t test-greplint
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Without a lint guard, bare grep assertions will creep back into tests over time, defeating the previous commit's conversion. Add greplint.pl to catch bare 'grep' used as a test assertion (where 'test_grep' should be used) and '! test_grep' (where 'test_grep !' should be used). greplint.pl reuses the shared shell parser from lib-shell-parser.pl to tokenize test bodies. The Lexer collapses heredocs, command substitutions, and quoted strings into single tokens, so 'grep' appearing inside these contexts is not flagged. A flat walk over the token stream tracks command position and pipeline state to distinguish assertion greps from filter greps. For double-quoted test bodies, a source-line walk counts backslash-continuation lines that the Lexer consumes without emitting into the body text, adjusting the reported line number accordingly. Add test fixtures in greplint/ (modeled on chainlint/) covering detection of bare grep assertions, correct skipping of filters, pipelines, redirects, command substitutions, and lint-ok annotations. Wire into the Makefile as: - test-greplint: runs greplint.pl on $(T) $(THELPERS) $(TPERF) - check-greplint: runs greplint.pl on fixtures, diffs against expected - clean-greplint: removes temp dir Add eol=lf entries in t/.gitattributes for greplint fixtures, matching chainlint, so that check-greplint passes on Windows where core.autocrlf would otherwise cause CRLF mismatches between expected and actual output. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Author
|
/submit |
|
Submitted as pull.2135.v2.git.1781323575.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
test_grep is a wrapper around grep for test assertions that prints
the file contents on failure for easier debugging. Bare grep fails
silently, making it hard to diagnose what went wrong.
This series converts existing bare grep assertions to test_grep and
adds greplint.pl to prevent new ones from being introduced.
Patch 1 documents test_grep in t/README.
Patch 2 fixes three greps missing file arguments (t2402, t7507,
t7700). They were reading empty stdin and passing vacuously.
Patch 3 extracts chainlint's Lexer, ShellParser, and ScriptParser
into a shared module (lib-shell-parser.pl) so greplint.pl can
reuse the same tokenizer. No functional change to chainlint.
Patch 4 fixes a latent line-counting bug in scan_dqstring where
newlines from $() bodies inside double-quoted strings were counted
twice. This does not affect chainlint (which uses byte offsets)
but matters for greplint.pl's line-number reporting.
Patch 5 converts existing assertion greps to test_grep, including
sourced test helpers. Greps used as data filters or on files
that may not exist are left unconverted with lint-ok annotations.
Patch 6 adds greplint.pl with test fixtures (modeled on chainlint/)
and wires it into the Makefile as test-greplint and check-greplint.
Changes since v1:
Dropped lint-style.pl and the --fix mode concept. Replaced with
greplint.pl, which more closely follows chainlint's conventions,
and reuses its parser logic via a shared module.
A regex approach to grep linting was prototyped in an attempt
to reduce the number of patches in the series, but this
approach produced false positives from grep inside heredoc
bodies (e.g. write_script) and cross-line pipelines where
the pipe or redirect is on a different line from the grep.
The shared module's Lexer already collapses these into
single tokens, giving zero false positives with less code
than the regex heuristics would need, which is why it was
retained in the current version.
Reverted incorrect conversions where grep was used as a
data filter inside redirected compound commands, not as a
test assertion.
Known limitation / follow-up:
grep pattern file >/dev/nullandgrep pattern <fileare not converted because greplint.pltreats any redirect as a filter. The former is ambiguous
because >/dev/null becomes dead code under test_grep (which
already suppresses matching-line output). The latter
requires removing the redirect and passing the file as a
positional argument, since test_grep does not support stdin
redirects. Both are left as bare grep. A follow-up series
can address these once a convention is established.
cc: "D. Ben Knoble" ben.knoble@gmail.com, Eric Sunshine sunshine@sunshineco.com