Physical File GREP (PFGREP): Fast IBM i Source Code Search

Our 2023 article on searching source physical file members using the QShell grep command showed grep’s potential. In practice, while we found QShell grep to be flexible, we also experienced slow performance and occasional errors.

Now, our own Calvin Buckley has built an improved grep command called pfgrep to search traditional IBM i source physical file members. Quick and reliable, pfgrep is also free and open source.

Installation

Detailed installation instructions are on the pfgrep GitHub page, but here are the three basic installation methods:

  • Download the .rpm file from the GitHub site, then install via yum
  • Seiden Group customers with access to our repositories can install directly from our repos using yum
  • Git clone from GitHub to build from source

Using pfgrep

Example: To find any mention of php (or PHP, or pHP) in physical files in the ALAN library, I could launch a PASE shell (SSH Bash, QShell, CALL QP2TERM) and run this case-insensitive, recursive search:

pfgrep -i -r php /QSYS.LIB/ALAN.LIB/

On my IBM i system, the results include PHP references in CL, RPG, command source, and more, that I wouldn’t have found without such a powerful search tool.
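If you run this kind of search often, a small shell function can save some typing. The following is only a sketch: search_lib is a hypothetical helper name, not part of pfgrep, and it assumes pfgrep is on your PATH. It uses only the -i and -r options shown above and builds the /QSYS.LIB path from a library name.

```shell
# Hypothetical helper (not part of pfgrep): run a case-insensitive,
# recursive search of one library, using only the -i and -r options
# shown in the example above.
search_lib() {
    if [ "$#" -ne 2 ]; then
        echo "usage: search_lib PATTERN LIBRARY" >&2
        return 2
    fi
    pattern=$1
    # Upper-case the library name to match the conventional /QSYS.LIB form.
    lib=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    # Assumes pfgrep is installed and on the PATH.
    pfgrep -i -r "$pattern" "/QSYS.LIB/${lib}.LIB/"
}
```

With the function defined in the current shell, search_lib php alan runs the same command as the example above.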


Performance

The primary motivation for developing pfgrep was to dramatically speed up code searches.

For example, QShell grep needed 26.963 seconds to search all the ILE C and C++ header files for the string ‘Qp01’.

pfgrep on the same system took only 3.098 seconds, roughly 8.7 times faster.

For more examples with PCRE searches, additional command options, and the helper utilities pfzip, pfcat, and pfstat, see the pfgrep GitHub site.

Integration with VS Code for i

Work is underway to integrate pfgrep into Code for i so VS Code users can benefit from its speed and power.

Keep up with the latest in VS Code for i and open source

Come to our free Code for i Fridays meetings and consider a Developer Support contract (VS Code support, pfgrep, PHP, Node, Python, RPG, Git, more) to receive one-on-one mentoring from our team as well as access to Seiden Developer Council meetings.

    Calvin Buckley says:

    I replied earlier, but it seems WordPress ate my comment. 🤬

    I don’t have visibility into how IBM implemented QShell grep, but I suspect much of the difference comes from how each tool reads files. pfgrep tries to do the I/O for reading a file all at once; it might do the encoding conversion or the actual regex match in line-based chunks, but file I/O is the heaviest part because it involves going to disk and making a lot of round trips through syscalls. It’s quite possible that qsh grep simply reads one line at a time as it works.
