Code Search

2019-08-01

There are many options available for searching code. If you are using an IDE, there is a good chance it has something built in. Sadly it's often not very good, or fast, and is never composable. What you want is something that is fast, able to handle complex queries, and that can be composed with other tools. If you are a Unix geek, then you are likely already thinking "grep". That is close to where I am headed, but there is more to the story.

Some time ago I was working on my first larger Python codebase. It was pushing some half a million lines of code and lacked documentation, so I often found myself grepping. I noticed that one of the other engineers was getting results back from his searches much faster than I was. While my searches would often take minutes to complete, his were returning in seconds. That sent me off on a search to understand why, and to find a better tool.

To cut the story short, I found that the main reason my searches were slow was that they were searching everything: every library and dependency in the repository. I could speed things up by building lengthy grep ignore lists, but that felt like a kludge. Then, almost by accident, I discovered a little gem: git grep. Yes, git has a built-in code search tool! It behaves mostly like the good old grep you may be used to, but this one knows what code to ignore because you have already told git that (in your .gitignore file, for example): by default it only searches files that git tracks. It is also multithreaded, and it is available anywhere git is installed.
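As a rough illustration of the paragraph above, here is a minimal sketch of a shell wrapper around git grep. The function name codesearch and the pattern parse_config in the usage comments are made up for this example; the flags (-n, -I, -e, -l, -z) are standard git grep options.

```shell
# codesearch: hypothetical helper name, not something that ships with git.
# Usage: codesearch PATTERN [PATHSPEC...]
# Searches only files tracked by git, so ignored or untracked
# dependencies are skipped without any explicit ignore list.
codesearch() {
    pattern=$1
    shift
    # -n  print line numbers
    # -I  do not match inside binary files
    # -e  treat the next argument as the pattern, even if it starts with "-"
    git grep -n -I -e "$pattern" -- "$@"
}

# Example calls (run from inside a git repository):
#   codesearch 'parse_config'            # search every tracked file
#   codesearch 'def parse_' '*.py'       # limit the search with a pathspec
#
# Because the output is plain text, it composes with other tools, e.g.
#   git grep -l -z 'TODO' | xargs -0 wc -l
```

The output has the familiar file:line:text shape, so anything that already consumes grep output should work with it unchanged.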

However, if your needs are more specialized, there is an entire mini-ecosystem of such tools:

Also, for the true Unix geeks, Awk can be a surprisingly powerful (and fast) search tool.
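To give a flavour of what that can look like, here is a small sketch of Awk used as a search tool. The function name awksearch is invented for this example, and the rule that skips comment lines assumes "#"-style comments, as in Python or shell; adjust it for other languages.

```shell
# awksearch: hypothetical helper name used only for this example.
# Usage: awksearch REGEX FILE...
# Prints file:line:text for every matching line that is not a pure
# "#" comment, a filter plain grep would need a second pass to apply.
# Note: awk -v interprets backslash escapes in the value, so a regex
# containing backslashes needs them doubled.
awksearch() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        /^[ \t]*#/ { next }    # skip lines that contain only a comment
        $0 ~ pat   { print FILENAME ":" FNR ":" $0 }
    ' "$@"
}

# Example call:
#   awksearch '^def ' app.py utils.py
```

Since each rule is an arbitrary condition, the same pattern extends to searches such as "lines matching X but only in files that also mention Y", which is where Awk starts to pull ahead of a chain of greps.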

Happy searching!