Here are a few basic examples to start with. Please be aware that none of them is conclusive, either alone or in combination. Adjust the parts that define the search string and the pattern, look at the results, then adjust again; a pattern will most likely emerge that exposes the “odd” occurrences.
Here’s how:
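To search for “base64” in all PHP files under a given public_html directory and display each match together with up to 10 characters before it and up to 30 characters after it, use the command below. Here -r searches recursively, -n prints the line number, -o prints only the matched part, and -E enables extended regular expressions: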
grep --include=\*.php -rnoE '/your/absolute/path/public_html' -e '.{0,10}base64.{0,30}'
The same search for “eval(” looks like this (the parenthesis is escaped because -E would otherwise treat it as a grouping character):
grep --include=\*.php -rnoE '/your/absolute/path/public_html' -e '.{0,10}eval\(.{0,30}'
These commands also return every legitimate use of base64 or eval, so the task is to spot the ones that are not legitimate. Look through the results; the legitimate ones look like this:
ments[] = eval('return '.$this->twig->compile
… but the malicious ones may look like this:
kxjser5h = eval(YKZon('HZtFpwihBZkVvYvb6oPTNv1
… or even like:
RZidK: eval(eval(eval(eval(eval(eval(eval(
The idea behind inspecting these snippets is that malicious code has to use random, encoded names that vary with each instance in order to evade standard scanning methods, whereas legitimate code uses meaningful names that real developers can understand.
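As a rough way to narrow the list before reading it by eye, the two tell-tale shapes above can be turned into patterns. This is only a sketch under assumptions: the function name find_obfuscated is made up for illustration, and the 60-character threshold for an "encoded-looking" quoted string is a guess that will need tuning. Expect both false positives and misses.

```shell
# find_obfuscated DIR (hypothetical helper): print the names of PHP files that contain
#   - nested eval( calls such as eval(eval(...
#   - a quoted literal of 60 or more base64-alphabet characters (threshold is a guess)
find_obfuscated() {
    grep -rlE --include='*.php' \
        -e 'eval\((eval\()+' \
        -e "['\"][A-Za-z0-9+/=]{60,}['\"]" \
        -- "$1"
}

# Usage (path is a placeholder):
# find_obfuscated '/your/absolute/path/public_html'
```

The -l flag prints only file names, so the output is a short list of files worth opening rather than a wall of matches.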
Looking forward to seeing some sort of AI that looks for such stuff with a good success rate 🙂
Some other strings to look for: “goto”, “gzinflate”, “strrev”, “hex2bin”.
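To avoid retyping the command for each of these strings, the same grep call can be wrapped in a small loop. This is a minimal sketch: scan_for is a made-up helper name, the path in the usage line is a placeholder, and each word is passed to grep as a regular expression, so anything with special characters would need escaping.

```shell
# scan_for DIR WORD... (hypothetical helper): for each WORD, print every occurrence
# in the PHP files under DIR, with up to 10 characters before and up to 30 after.
scan_for() {
    dir="$1"
    shift
    for word in "$@"; do
        echo "== $word =="
        # grep exits non-zero when nothing matches; "|| true" keeps the loop going
        grep -rnoE --include='*.php' -e ".{0,10}${word}.{0,30}" -- "$dir" || true
    done
}

# Usage (path is a placeholder):
# scan_for '/your/absolute/path/public_html' goto gzinflate strrev hex2bin
```

As with base64 and eval, these words also show up in legitimate code, so the output still needs to be read with the same eye for random-looking names.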
More on grep with patterns in the grep manual.