From time to time, you might need to search for a string inside a zip archive that contains multiple files on a Linux system. If you have never done this, you might ask: which tools should I use? Are there existing commands you can use instead of writing your own script to unzip the archive and search it?
Fortunately the answer is yes. There are commands like zgrep and zipgrep. What’s the difference between them then?
When I got such a request to grep for a string inside an archive, I didn’t know which commands existed or which one suited which scenario. I first tried zgrep because I wasn’t aware of zipgrep.
Using zgrep to search a zip file doesn’t really work, for two reasons:
- It can only search the first file within the zip file.
- It doesn’t list the file name.
For example, I created two zip files that contain the same two plain text files in different orders: aaa.log contained the string “jli” I was looking for, and bbb.log didn’t.
test_grep1.zip had aaa.log as the first one while test_grep.zip had bbb.log as the first file within the archive.
zip test_grep1.zip aaa.log bbb.log
zip test_grep.zip bbb.log aaa.log
root@jlitest:/var/log# unzip -l test_grep.zip
Archive:  test_grep.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   376908  03-24-2022 20:52   bbb.log
        8  03-24-2022 20:52   aaa.log
---------                     -------
   376916                     2 files
root@jlitest:/var/log# unzip -l test_grep1.zip
Archive:  test_grep1.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        8  03-24-2022 20:52   aaa.log
   376908  03-24-2022 20:52   bbb.log
---------                     -------
   376916                     2 files
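To reproduce the two archives above, a sketch like the following should work. It assumes zip and unzip are installed. The helper name make_test_archives and the filler content of bbb.log are made up for illustration (the original bbb.log was a real 376908-byte log).

```shell
#!/bin/sh
# Hypothetical helper: recreate the two test archives in an (empty) directory.
# Usage: make_test_archives DIR
make_test_archives() {
    (
        cd "$1" || exit 1
        # aaa.log holds the search string twice (8 bytes, as in the listing).
        printf 'jli\njli\n' > aaa.log
        # bbb.log is filler without the search string (assumed content).
        printf 'nothing to see here\n' > bbb.log
        # As the listings above show, members end up in command-line order.
        zip -q test_grep1.zip aaa.log bbb.log
        zip -q test_grep.zip bbb.log aaa.log
    )
}
```

Running make_test_archives "$(mktemp -d)" gives a scratch directory with both archives to experiment on.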
Searching for “jli” in test_grep1.zip, zgrep found it. Searching in test_grep.zip, it found nothing.
root@jlitest:/var/log# zgrep jli test_grep1.zip
jli
jli
root@jlitest:/var/log# zgrep jli test_grep.zip
Then I took a closer look at zgrep. It is actually just a shell script wrapping grep and gzip (using gzip’s “-c” and “-d” options to decompress to stdout). Its own description reads “zgrep -- a wrapper around a grep program that decompresses files as needed”.
root@jlitest:/var/log# which zgrep
alias zgrep='zgrep --color=auto'
/usr/bin/zgrep
No wonder it doesn’t work well with zip files, as the following example shows:
root@jlitest:/var/log# gzip -d -c test_grep1.zip
jli
jli
gzip: test_grep1.zip has more than one entry--rest ignored
gzip stops after the first entry because it expects only one compressed file per archive.
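The core of what zgrep does for each file can be sketched in a couple of lines. This is a simplified stand-in, not the real script (which also handles grep options, several files and other compression formats); the name zgrep_sketch is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: a stripped-down stand-in for zgrep's per-file work.
# Usage: zgrep_sketch PATTERN FILE
zgrep_sketch() {
    pattern=$1
    file=$2
    # gzip -d decompresses and -c writes to stdout; grep filters the stream.
    # On a zip archive, gzip only emits the first entry, as shown above.
    gzip -cd -- "$file" | grep -e "$pattern"
}
```

Because the file name never reaches grep, matches from a single file are printed without any name, which is the second issue noted earlier.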
Then I realized there is another command, zipgrep (a shell script again), which wraps egrep and unzip: exactly what I was looking for.
Its man page says “zipgrep: Use unzip and egrep to search the specified members of a zip archive for a string or pattern.”
root@jlitest:/var/log# zipgrep jli test_grep1.zip
aaa.log:jli
aaa.log:jli
root@jlitest:/var/log# zipgrep jli test_grep.zip
aaa.log:jli
aaa.log:jli
Again, the idea behind zipgrep is to unzip the files in a zip archive to stdout and egrep for the pattern there. It uses unzip’s “-p” option to extract files (only the file data) to a pipe (stdout).
unzip has another option, “-c”, which extracts files to stdout/screen; it is similar to “-p” but prints the name of each file.
So you can imitate zipgrep with a simple pipeline.
Using “-p”, no file name is printed.
root@jlitest:/var/log# unzip -p test_grep1.zip|egrep "extracting|inflating|jli"
jli
jli
Using “-c”, the file name is printed.
root@jlitest:/var/log# unzip -c test_grep1.zip|egrep "extracting|inflating|jli"
extracting: aaa.log
jli
jli
inflating: bbb.log
root@jlitest:/var/log# unzip -c test_grep.zip|egrep "extracting|inflating|jli"
inflating: bbb.log
extracting: aaa.log
jli
jli
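To get the file name on every matching line, as zipgrep prints it, you can loop over the members yourself. The following is a minimal sketch under that assumption, not the real zipgrep script; zipgrep_sketch is a hypothetical name, and member names containing unzip wildcard characters are not handled.

```shell
#!/bin/sh
# Hypothetical helper, not the real zipgrep: search every member of a zip
# archive and prefix each matching line with the member name.
# Usage: zipgrep_sketch PATTERN ARCHIVE
zipgrep_sketch() {
    pattern=$1
    archive=$2
    # unzip -Z1 lists the member names, one per line (zipinfo mode).
    unzip -Z1 "$archive" | while IFS= read -r member; do
        # unzip -p writes only the member's data to stdout.
        # </dev/null keeps unzip from consuming the member list on stdin.
        unzip -p "$archive" "$member" </dev/null |
            grep -E -e "$pattern" |
            while IFS= read -r line; do
                printf '%s:%s\n' "$member" "$line"
            done
    done
}
```

Unlike the “-c” pipeline, this prints nothing for members without a match, so the order of files inside the archive no longer matters.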
Extracting files to stdout is also useful in another scenario: what if you only want to search one specific file within a zip file? zipgrep scans every member unless you list member names after the archive name, and you can get the same effect directly with unzip. For example, if logs.zip contains a file named aaa.log and you only want to search aaa.log for the string “jli”, pass the member name to unzip’s “-p” option, as in the example above, and pipe the output to grep:
root@joetest:~/Service_Tools# unzip -p logs.zip aaa.log|grep jli
jli
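The same approach extends to several members or to a wildcard. Here is a minimal sketch, assuming unzip is installed; grep_members is a hypothetical helper name. Quote any wildcard so that unzip, not the shell, expands it.

```shell
#!/bin/sh
# Hypothetical helper: search only the named members of a zip archive.
# Usage: grep_members PATTERN ARCHIVE MEMBER...
grep_members() {
    pattern=$1
    archive=$2
    shift 2
    # The remaining arguments are member names or unzip wildcards;
    # with none given, unzip -p extracts every member.
    unzip -p "$archive" "$@" </dev/null | grep -e "$pattern"
}
```

For example, grep_members jli logs.zip 'aaa*' searches only the members whose names start with aaa.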
Have fun searching your zip files!