After using your Linux system for a couple of months or years, you will find that a lot of dust will start building up on different parts of your OS.
This is especially true regarding your own data and files.
You will have many duplicate files, large files that you no longer use or need, files which you copied to somewhere else but forgot to delete from the original position to free up space… It will eventually be a mess.
On Windows, there are many data cleaning programs like CCleaner and others, but are there any useful alternatives on Linux?
Luckily, the answer is yes, and today we will walk through Czkawka, an open source data cleaning program which can be used for all your system cleaning purposes.
A Swiss Knife For Data Cleaning on Linux
Czkawka is an open source application written in the Rust programming language with the GTK toolkit. It works on Linux, macOS and Windows.
The following features are available in the software:
- Detecting duplicate files anywhere on the system.
- Detecting empty files and folders.
- Detecting big files based on specific criteria.
- Detecting temporary files (Cache, miscellaneous files… etc).
- Detecting image files which are either identical or similar, using a number of different possible algorithms or configurations.
- Detecting video files which are similar (requires ffmpeg).
- Detecting duplicate music files based on their metadata.
- Detecting invalid symbolic links on the system (Links which are no longer working).
- Detecting broken files (Files which can not be opened or rendered).
- Detecting files with bad extensions (Files whose name uses a different extension than the actual data format they contain).
- GTK 3 or GTK 4 user interface, based on what is available on the user’s system.
- Different configurations and options that can be tweaked to meet the specific needs of the user (e.g. file size, modification date, path exclusion and inclusion… etc).
The software provides these features as “tools” (Just tabs, actually) which can be selected and used depending on what the user wants.
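To make one of the checks in the list more concrete, here is a minimal Python sketch of how invalid symbolic links could be detected. It is only an illustration of the idea, not Czkawka's actual implementation, and the function name `broken_symlinks` is made up for this example.

```python
import os


def broken_symlinks(root):
    """Return symbolic links under root whose target no longer exists."""
    broken = []
    for folder, subdirs, names in os.walk(root):
        for name in subdirs + names:
            path = os.path.join(folder, name)
            # islink() inspects the link itself, while exists() follows it,
            # so a link that "is a link" but "does not exist" is dangling.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

Calling `broken_symlinks("/home/user")` (a placeholder path) would return a sorted list of dangling links, which you could then review before removing anything.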
Version 5.0 of the software was released just a few days ago.
The software is also available as a CLI (command line) tool which can be used without a graphical user interface. This is useful for anyone wishing to use the software for automation purposes, for example, or who simply prefers working in the terminal.
Journey to Dust Off Your Partitions
This is the default interface of the program (with version 4.1):
You can choose any tool that you want to run from the left sidebar. You can also choose the files and folders to include or exclude from the section on the top.
Notice that you can also choose the algorithm/method to be used with the current tool that you have chosen. This is because in many tools (Like finding similar files or videos), there isn’t just one way for running the query; instead, there are many different algorithms that can give you different results based on how you set them up.
Once you are done adjusting the settings, you can click the “Search” button to start your query:
As you can see from the picture above, the software clearly displayed the duplicate files on the system, as well as their current location. You can choose to select, delete, link or move these files however you like.
There is also a terminal output part on the bottom of the screen, which shows you some possible error or information messages.
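For readers curious about what a duplicate search involves, the following Python sketch shows one common approach: group files by size first, then compare content hashes only within groups of equal size. This is a generic technique chosen for illustration and is not taken from Czkawka's source code. The names `file_digest` and `find_duplicates` are hypothetical, and skipping empty files is an assumption of this sketch.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root that have identical content."""
    by_size = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        # Assumption: empty files are ignored, and a file with a unique
        # size cannot have a duplicate, so it is never hashed.
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Comparing sizes before hashing keeps the work small, because most files on a disk have a size that no other file shares and never need to be read at all.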
Perhaps one of the most important features of the software is finding big files on the system. You can specify the number of big files that you want to be listed:
Once found, you can move these files or remove them however you like.
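The idea of listing a chosen number of the biggest files can be sketched in a few lines of Python. This is an illustrative approximation rather than the way Czkawka does it, and `largest_files` is a hypothetical helper name.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, biggest first."""
    def sizes():
        for folder, _subdirs, names in os.walk(root):
            for name in names:
                path = os.path.join(folder, name)
                try:
                    if not os.path.islink(path):
                        yield os.path.getsize(path), path
                except OSError:
                    continue  # unreadable or vanished file

    # nlargest keeps only `count` candidates in memory at a time.
    return heapq.nlargest(count, sizes())
```

The `count` parameter plays the same role as the number of big files you ask the tool to list.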
Another great feature of the software is finding similar images and videos wherever they are on your system. This can be done using many different algorithms and hash sizes, which will give you different results when you re-run the query, making it a very handy tool:
Notice how if you click on an image, you can preview it from the right sidebar before deciding what to do with it.
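To give a rough feel for what an image hash with a configurable size looks like, here is a small Python sketch of an "average hash" compared with Hamming distance. This is a generic perceptual hashing technique picked for illustration. The article does not say which algorithms Czkawka offers, so do not read this as its implementation. Images are represented as plain 2D lists of grayscale values to keep the example self-contained, and the function names are made up.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D list of grayscale values (0-255).

    The image is reduced to hash_size x hash_size cells by averaging blocks,
    then each cell becomes one bit: 1 if brighter than the mean of all cells.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(hash_size):
        top = row * height // hash_size
        bottom = max((row + 1) * height // hash_size, top + 1)
        for col in range(hash_size):
            left = col * width // hash_size
            right = max((col + 1) * width // hash_size, left + 1)
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if value > mean else 0 for value in cells]


def hamming_distance(hash_a, hash_b):
    """Count the bit positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))
```

Two images would be treated as similar when the distance between their hashes falls under a chosen threshold. A larger `hash_size` captures more detail and so separates images more strictly, which is why changing the hash size can change the results of a search.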
Finally, you can also see broken files. These are files which could not be opened or rendered using your current system libraries and software, so they may be useless to you and you may want to do something about them:
However, the feature that perhaps most sets Czkawka apart from its peers is its speed. In our tests, even on an old desktop CPU (Ryzen 1600 from 2017, a $50 CPU), almost all of the included tools took only 1 to 10 seconds to finish (on an almost full 256GB NVMe SSD)! With other software (even command line tools like find), the same task may take minutes or hours.
The following comparison, provided by the developers, shows the performance of their software against other programs under the same scenario. But even if you don’t take their word for it, it is indeed blazing fast based on our tests as well.
You can download the latest version of Czkawka from its releases page on GitHub. Just download the corresponding file based on your current operating system.
If you are on Linux, then you have many possible ways to use the software. You can install it as a Snap package (Gtk 3 version, 4.1.0), Flatpak package (Gtk 4 version, 5.0.0) or portable AppImage file.
Keep in mind that the latest version may not work on your Linux system if you don’t have Gtk 4 (Specifically, the libgtk-4-dev package), which is why using an app package format like Flatpak could be a better idea (Because the needed libraries are included).
Additionally, make sure you have ffmpeg installed on your system if you want to use the “similar videos/images” feature in the software.
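If you want to check for this dependency from a script, a quick sketch like the following can tell you whether a program is reachable on your PATH. It uses Python's standard `shutil.which`, and `tool_available` is a hypothetical helper name. It only checks that an executable with that name exists, not that it is a version the software supports.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available("ffmpeg"):
        print("ffmpeg found")
    else:
        print("ffmpeg is missing; install it before using the video tools")
```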
Everything is explained in detail in the installation help file.
The Bottom Line
As you have seen by now, Czkawka is one of the best tools out there for cleaning up data and files. The fact that it is open source and written in Rust with GTK is a huge plus, but as we said, the performance of the software is perhaps the best thing about it.
The interface of the software is also straightforward and easy to use, and any user will be able to use it without having to read tutorials or documentation of any type.
Consider donating to the developer of the software if you like it.