From time to time, you might need to search for a string inside a zip archive that contains multiple files on a Linux system. If you have never done this, you might ask: which tools should I use? Are there existing commands for this, so that I don't have to write my own script to unzip the archive and search it?
Fortunately, the answer is yes. There are commands like zgrep and zipgrep. So what is the difference between them?
When I got a request like this to grep for a string within an archive, I didn't know which command to use or which command suits which scenario. I first tried zgrep because I wasn't aware of zipgrep.
Using zgrep to search a zip file doesn't really work, for two reasons:
- It can only search the first file within the zip file.
- It doesn’t list the file name.
For example, I created two zip files that contain the same two plain text files in different orders: aaa.log contained the string “jli” I was looking for, and bbb.log didn't.
test_grep1.zip had aaa.log as the first file, while test_grep.zip had bbb.log as the first file within the archive.
zip test_grep1.zip aaa.log bbb.log
zip test_grep.zip bbb.log aaa.log
root@jlitest:/var/log# unzip -l test_grep.zip
Archive:  test_grep.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   376908  03-24-2022 20:52   bbb.log
        8  03-24-2022 20:52   aaa.log
---------                     -------
   376916                     2 files
root@jlitest:/var/log# unzip -l test_grep1.zip
Archive:  test_grep1.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        8  03-24-2022 20:52   aaa.log
   376908  03-24-2022 20:52   bbb.log
---------                     -------
   376916                     2 files
When searching for the string “jli”, zgrep found it in test_grep1.zip but not in test_grep.zip.
root@jlitest:/var/log# zgrep jli test_grep1.zip
jli
jli
root@jlitest:/var/log# zgrep jli test_grep.zip
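To reproduce this comparison on your own machine, something like the following sketch should work. The helper name make_test_archives and the filler text in bbb.log are my own assumptions; the only things that matter are that bbb.log does not contain “jli” and that the two archives list the files in opposite orders.

```shell
# Hypothetical helper: recreate the two test archives inside a scratch directory.
# The body runs in a subshell, so the caller's working directory is unchanged.
make_test_archives() (
    cd "$1" || exit 1
    # aaa.log holds the string we search for (8 bytes, as in the listing above).
    printf 'jli\njli\n' > aaa.log
    # bbb.log is filler without the search string (content is an assumption).
    awk 'BEGIN { for (i = 0; i < 1000; i++) print "nothing to see here" }' > bbb.log
    zip -q test_grep1.zip aaa.log bbb.log   # aaa.log becomes the first entry
    zip -q test_grep.zip  bbb.log aaa.log   # bbb.log becomes the first entry
)
```

After running `dir=$(mktemp -d); make_test_archives "$dir"`, you can point zgrep at "$dir/test_grep1.zip" and "$dir/test_grep.zip" and compare the results.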
Then I took a closer look at zgrep. It is actually just a shell script wrapping grep and gzip (using gzip's options “-c” and “-d” to decompress to stdout). As its man page states: “zgrep - a wrapper around a grep program that decompresses files as needed”.
root@jlitest:/var/log# which zgrep
alias zgrep='zgrep --color=auto'
        /usr/bin/zgrep
No wonder it doesn't work well with zip files, as you can see from the following example:
root@jlitest:/var/log# gzip -d -c test_grep1.zip
jli
jli
gzip: test_grep1.zip has more than one entry--rest ignored
gzip stops after the first entry because it expects an archive to hold only one compressed file.
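One way to see this limitation in numbers is to count how many bytes each tool can read from the same archive. This is only a sketch: the function names gzip_visible_bytes and unzip_visible_bytes are made up for illustration.

```shell
# Bytes that gzip decompresses from the archive. For a zip file with several
# entries this covers the first entry only; the warning on stderr is discarded.
gzip_visible_bytes() {
    gzip -dc "$1" 2>/dev/null | wc -c
}

# Bytes that unzip streams to stdout: the data of every entry in the archive.
unzip_visible_bytes() {
    unzip -p "$1" | wc -c
}
```

For an archive with more than one non-empty entry, the first number should come out smaller than the second, which is exactly the part of the archive zgrep never gets to see.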
Then I realized there is another command, zipgrep (a shell script again), which wraps egrep and unzip. That was exactly what I needed for my task.
Its man page says “zipgrep: Use unzip and egrep to search the specified members of a zip archive for a string or pattern.”
root@jlitest:/var/log# zipgrep jli test_grep1.zip
aaa.log:jli
aaa.log:jli
root@jlitest:/var/log# zipgrep jli test_grep.zip
aaa.log:jli
aaa.log:jli
Again, the idea behind zipgrep is to extract the files in a zip archive to stdout and run egrep on that output. It uses unzip's “-p” option to extract files (only the file data) to a pipe (stdout).
unzip has another option, “-c”, which also extracts files to stdout/screen. It is similar to “-p” but prints the name of each file as well.
So you can reproduce what zipgrep does with a few simple commands.
Using “-p”, no file name is printed.
root@jlitest:/var/log# unzip -p test_grep1.zip|egrep "extracting|inflating|jli"
jli
jli
Using “-c”, the file name is printed.
root@jlitest:/var/log# unzip -c test_grep1.zip|egrep "extracting|inflating|jli"
 extracting: aaa.log
jli
jli
  inflating: bbb.log
root@jlitest:/var/log# unzip -c test_grep.zip|egrep "extracting|inflating|jli"
  inflating: bbb.log
 extracting: aaa.log
jli
jli
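Putting these pieces together, a rough zipgrep-like helper might look like the following. It is only a sketch and not how the real zipgrep script is written: zip_search is a hypothetical name, it lists members with unzip's zipinfo mode (“-Z1”), and it assumes member names contain no unzip wildcard characters such as * or [.

```shell
# Hypothetical helper (not the real zipgrep): search every member of a zip
# archive and prefix each matching line with the member name.
# Usage: zip_search PATTERN ARCHIVE
zip_search() (
    pattern=$1
    archive=$2
    # unzip -Z1 prints the member names only, one per line.
    unzip -Z1 "$archive" | while IFS= read -r member; do
        case $member in
            */) continue ;;   # skip directory entries
        esac
        # Stream one member to stdout, filter it, then tag matches with its name.
        # Assumes names without backslashes, which awk -v would reinterpret.
        unzip -p "$archive" "$member" </dev/null |
            grep -E -- "$pattern" |
            awk -v name="$member" '{ print name ":" $0 }'
    done
)
```

Calling `zip_search jli test_grep.zip` should then print matches in the same “name:line” shape that zipgrep uses, at the cost of running unzip once per member.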
Extracting files to stdout is also useful in another scenario: what if you only want to search one specific file within a zip file? In that case you don't want zipgrep's default behavior of scanning every file in the archive. For example, if logs.zip contains a file named aaa.log and you want to search only that file for the string “jli”, you can pass the file name to unzip along with the “-p” option used in the example above:
root@joetest:~/Service_Tools# unzip -p logs.zip aaa.log|grep jli
jli
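If you do this often, the same idea can be wrapped in a small function. This is a sketch with a made-up name, search_member; it relies on unzip accepting one or more member names (or quoted wildcard patterns) after the archive name.

```shell
# Hypothetical helper: search only the named members of a zip archive.
# Usage: search_member PATTERN ARCHIVE MEMBER...
search_member() (
    pattern=$1
    archive=$2
    shift 2
    # The remaining arguments are member names or unzip wildcard patterns.
    # Quote wildcards so that unzip, not the shell, expands them.
    unzip -p "$archive" "$@" | grep -E -- "$pattern"
)
```

For example, `search_member jli logs.zip aaa.log` searches a single file, while `search_member jli logs.zip '*.log'` searches every member whose name ends in .log. Like plain “-p”, it does not print file names.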
Have fun searching your zip files!