linux - Removing files takes too long

Short version: rm -rf mydir, with mydir (recursively) containing 2.5 million files, takes about 12 hours on a mostly idle machine.

More information: Most of the files being deleted are hard links to files in other directories (the directory being deleted is the oldest backup made by rsnapshot, and rsnapshot itself issues the rm command). So it's mostly directory entries being deleted; the file content that is actually freed isn't much, on the order of some tens of GB.
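To check the claim that the deletion is mostly directory entries, here is a rough Python sketch that counts how many regular files under a directory have more than one hard link. The helper name link_stats is made up for illustration, and the byte count is only an approximation: a file whose extra links all live inside the same tree would also be freed, which this sketch does not detect.

```python
import os
import stat


def link_stats(root):
    """Return (entries, multiply_linked, bytes_unique) for regular files under root.

    entries         -- number of regular-file directory entries seen
    multiply_linked -- entries whose inode has more than one hard link
    bytes_unique    -- total size of files with exactly one link, i.e. data
                       that removing the tree would certainly free
                       (approximation, see the note above)
    """
    entries = 0
    multiply_linked = 0
    bytes_unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are not followed
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            entries += 1
            if st.st_nlink > 1:
                multiply_linked += 1
            else:
                bytes_unique += st.st_size
    return entries, multiply_linked, bytes_unique
```

Calling link_stats("mydir") walks the whole tree, so on a cold cache it will take about as long as the find run.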

I'm far from certain that btrfs is the culprit. I recall backup was also very slow before I started to use btrfs, but I'm not certain that the slowness was in the deletion.

The machine is an Intel Core i5 2.67 GHz with 4 GB RAM. It has two SATA disks: one has the OS and some other stuff, and the backup disk is a 1 TB WDC WD1002FAEX-00Z3A0. The motherboard is an Asus P7P55D.

Edit: The machine runs Debian wheezy with Linux 3.16.3-2~bpo70+1. This is how the filesystem is mounted:

root@thames:~# mount|grep rsnapshot
/dev/sdb1 on /var/backups/rsnapshot type btrfs (rw,relatime,compress=zlib,space_cache)

Edit: Using rsync -a --delete /some/empty/dir mydir takes about 6 hours. That is a significant improvement over rm -rf, but still too long, I think. (Explanation of why rsync is faster than rm: "[M]ost filesystems store their directory structures in a btree format, the order [in] which you delete files is ... important. One needs to avoid rebalancing the btree when you perform the unlink.... rsync -a --delete ... does deletions in-order")
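A minimal Python sketch of the in-order idea from the quote above. It assumes that "in-order" means sorted by name within each directory; whether that matches the on-disk btree key order depends on the filesystem, so treat it as something to experiment with, not a guaranteed speedup. remove_tree_sorted is a hypothetical helper name, and the function really deletes things, so try it on a scratch directory first.

```python
import os


def remove_tree_sorted(root):
    """Recursively remove root, unlinking each directory's entries in sorted
    name order. Returns the number of entries removed, including root itself.
    """
    removed = 0
    # Read the whole directory first, then sort, so the unlink order does
    # not depend on the order in which the filesystem returns entries.
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Real subdirectory: empty it depth-first, then remove it.
            removed += remove_tree_sorted(entry.path)
        else:
            # Regular file, hard link, symlink, etc.: drop the entry only.
            os.unlink(entry.path)
            removed += 1
    os.rmdir(root)
    return removed + 1
```

Symlinks to directories are unlinked, not followed, so nothing outside the tree is touched.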

Edit: I attached another disk which had 2.2 million files (recursively) in a directory, but on XFS. Here are some comparative results:

                  On the XFS disk      On the BTRFS disk
Cached reads[1]       10 GB/s               10 GB/s
Buffered reads[1]     80 MB/s              115 MB/s
Walk tree[2]         11 minutes            43 minutes
rm -rf mydir[3]       7 minutes            12 hours

[1] With hdparm -T /dev/sdX and hdparm -t /dev/sdX.
[2] Time taken to run find mydir -print|wc -l immediately after boot.
[3] On the XFS disk, this was soon after walking the tree with find. On the BTRFS disk it is the old measurement (and I don't think it was with the tree cached).

It appears to be a problem with btrfs.
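For anyone who wants to repeat the tree-walk measurement without find, here is a small Python sketch that counts entries the same way find mydir -print|wc -l does and reports the elapsed time. The function names are made up for illustration. The numbers in the table were taken right after boot; to get comparable cold-cache timings you would need to reboot or otherwise make sure the tree is not cached.

```python
import os
import time


def count_entries(root):
    """Count root plus every file and directory below it."""
    total = 1  # find prints the starting directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories appear in dirnames but are not followed,
        # so each one is counted once.
        total += len(dirnames) + len(filenames)
    return total


def timed_walk(root):
    """Return (entry_count, seconds) for one full walk of root."""
    start = time.monotonic()
    n = count_entries(root)
    return n, time.monotonic() - start
```

For example, n, secs = timed_walk("mydir") gives the entry count and the wall-clock time of the walk.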
