Short version: rm -rf mydir, with mydir (recursively) containing 2.5 million files, takes about 12 hours on a mostly idle machine.
More information: Most of the files being deleted are hard links to files in other directories (the directory being deleted is actually the oldest backup made by rsnapshot; the rm command is actually given by rsnapshot). So it's mostly directory entries being deleted - the file content itself isn't much; it's in the order of some tens of GB.
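To check how much of a snapshot really is just hard links, a rough sketch like the following could be used. The function name count_links is made up for illustration, and the counts are off if any file name contains a newline.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsnapshot): count the regular files under
# a directory, and how many of them have more than one hard link, i.e. are
# shared with some other directory entry.
count_links() {
    dir=$1
    # tr strips the padding that some wc implementations add
    total=$(find "$dir" -type f | wc -l | tr -d ' ')
    # -links +1 matches files whose link count is greater than 1
    shared=$(find "$dir" -type f -links +1 | wc -l | tr -d ' ')
    echo "total=$total shared=$shared"
}
```

Calling count_links mydir prints one line with both counts; if shared is close to total, deleting the tree frees very little data and is almost entirely metadata work.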
I'm far from certain that btrfs is the culprit. I recall backup was also very slow before I started to use btrfs, but I'm not certain that the slowness was in the deletion.
The machine is an Intel Core i5 2.67 GHz with 4 GB RAM. It has two SATA disks: one has the OS and some other stuff, and the backup disk is a 1 TB WDC WD1002FAEX-00Z3A0. The motherboard is an Asus P7P55D.
Edit: The machine runs Debian wheezy with Linux 3.16.3-2~bpo70+1. This is how the filesystem is mounted:

root@thames:~# mount|grep rsnapshot
/dev/sdb1 on /var/backups/rsnapshot type btrfs (rw,relatime,compress=zlib,space_cache)

Edit: Using rsync -a --delete /some/empty/dir/ mydir/ takes about 6 hours. A significant improvement over rm -rf, but still too much, I think. (Explanation of why rsync is faster than rm: "[M]ost filesystems store their directory structures in a btree format, the order [in] which you delete files is ... important. One needs to avoid rebalancing the btree when you perform the unlink.... rsync -a --delete ... does deletions in-order")
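
The idea of deleting in a fixed order can be tried without rsync. The sketch below is only an illustration: delete_sorted is a made-up name, it assumes GNU find, sort and xargs, and sorting by path is a guess at what "in-order" means; nothing in the quote establishes that path order matches the on-disk order of any particular filesystem.

```shell
#!/bin/sh
# Hypothetical helper: unlink everything that is not a directory in sorted
# path order, then remove the emptied directories bottom-up.
# Assumes GNU tools (-print0, sort -z, xargs -0 -r).
delete_sorted() {
    dir=$1
    # NUL-separated names keep odd file names intact through the pipeline
    find "$dir" ! -type d -print0 | sort -z | xargs -0 -r rm -f --
    # -depth lists children before their parents, so rmdir sees empty dirs
    find "$dir" -depth -type d -exec rmdir {} +
}
```

Timing delete_sorted mydir against rm -rf on a comparable copy of the tree would show whether the ordering matters at all on a given filesystem.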

Edit: I attached another disk which had 2.2 million files (recursively) in a directory, but on XFS. Here are some comparative results:

                     On the XFS disk    On the BTRFS disk
Cached reads[1]      10 GB/s            10 GB/s
Buffered reads[1]    80 MB/s            115 MB/s
Walk tree[2]         11 minutes         43 minutes
rm -rf mydir[3]      7 minutes          12 hours

[1] With hdparm -T /dev/sdX and hdparm -t /dev/sdX.
[2] Time taken to run find mydir -print|wc -l immediately after boot.
[3] On the XFS disk, this was soon after walking the tree with find. On the BTRFS disk it is the old measurement (and I don't think it was with the tree cached).
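
Instead of rebooting before each run, a cold-cache tree walk can be approximated by dropping the kernel caches first. This is a sketch under assumptions: time_walk is a made-up name, dropping caches needs root on Linux, and the result is only accurate to the second.

```shell
#!/bin/sh
# Hypothetical helper: time a tree walk, trying to start from cold caches.
time_walk() {
    dir=$1
    sync
    # Writing 3 asks the kernel to drop the page cache plus dentries and
    # inodes. This needs root; if it fails, the walk runs with warm caches.
    { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null ||
        echo "warning: could not drop caches" >&2
    start=$(date +%s)
    count=$(find "$dir" -print | wc -l | tr -d ' ')
    end=$(date +%s)
    echo "entries=$count seconds=$((end - start))"
}
```

Running time_walk mydir as root on each disk gives numbers comparable to the "Walk tree" row, as long as both runs really start cold.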

It appears to be a problem with btrfs.