regex - Why does grep show me different output depending on my input file size?
I'm a little puzzled by the output of the grep command: it seems to be truncating results based on the size of the -f file. For instance, consider a 1000-line file of strings, patterns.txt, e.g.:
adkgjwofjdjglkadjglkjasdfahdg
dsklfjsldkfjaghwioeghsdlkjfld
sdkljfsdkljghsdlfhkwhfklshdfo
...
sdklfjsdklfjsdklfjslkjghdfkjj
and a 1 GB queryfile.txt to search for these patterns. When I run
grep -F -o -f patterns.txt queryfile.txt | grep -c adkgjwofjdjglkadjglkjasdfahdg
the command reports 0 matches for the 1st line (adkgjwofjdjglkadjglkjasdfahdg) of patterns.txt, even though there are 35 occurrences of it in queryfile.txt. I verified this by reducing the patterns.txt file to its first 10 lines. Rerunning
grep -F -o -f patterns_reduced-list.txt queryfile.txt | grep -c adkgjwofjdjglkadjglkjasdfahdg
properly reports 35 occurrences of adkgjwofjdjglkadjglkjasdfahdg.
What's happening?
This shouldn't happen unless... the patterns overlap.
Check this example:
echo "xyxx" | grep -o -F yx$'\n'xy
# output: xy
This finds the second pattern (xy) and, because of that, it won't find the first pattern (yx): the only yx in the input shares its y with the xy that was already matched.
echo "xyxx" | grep -o -F yx
# output: yx
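To make the overlap effect easier to see, here is a minimal sketch that runs the same toy input through grep three times: once with both patterns together and once with each pattern alone. It assumes a grep that supports -o (GNU and BSD grep do) and that -F (fixed strings) is the intended flag. The variable names are only illustrative.

```shell
# Toy input in which the two fixed-string patterns overlap.
input="xyxx"

# Both patterns in one grep: -o reports non-overlapping matches from left
# to right, so "xy" at offset 0 uses up the "y" that "yx" would need.
combined=$(printf '%s\n' "$input" | grep -o -F -e yx -e xy)

# Each pattern searched on its own: no other pattern can shadow it.
only_yx=$(printf '%s\n' "$input" | grep -o -F -e yx)
only_xy=$(printf '%s\n' "$input" | grep -o -F -e xy)

echo "combined: $combined"
echo "yx alone: $only_yx"
echo "xy alone: $only_xy"
```

Applied to the question, this suggests that some other line of the full patterns.txt probably matches text overlapping each occurrence of the first pattern. Counting a single pattern reliably would then mean searching for it on its own, at the cost of one pass over queryfile.txt per pattern.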