We collect data from clients via a number of methods, including FTP. We'd been collecting, but not processing, one particular client's files for a while, and they had accumulated to over 200,000 files in a single folder.

That's a fair few files, and we needed to process them and then remove them. Being the go-to Linux guy, I was tasked with sorting through them.

Sadly the version of find on the server doesn't have an option to match files modified before or after a given date, but it can compare against a reference file:

find -not -newer ./mb-001.*****.log.csv -delete

This command finds every file whose modification time is not newer than the reference file's (the reference file itself included) and deletes it, rather quickly too.
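If a suitable reference file doesn't already exist, one can be created with touch and an explicit timestamp, which gives much the same effect as a date option. The cutoff date, path, and -type f filter below are only illustrative:

# create a throwaway file whose mtime is the cutoff (format [[CC]YY]MMDDhhmm)
touch -t 201401311200 /tmp/cutoff

# delete regular files not modified after the cutoff
find . -type f -not -newer /tmp/cutoff -delete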

As a side note, even rm had difficulty deleting all the files in another folder (which had 400,000+ files in it); expanding a glob to that many names blows past the argument-length limit. Thankfully find and xargs allowed me to break it up a little:

find . -type f -print0 | xargs -0 rm -f
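xargs takes care of splitting the null-delimited file list into chunks that fit within the argument-length limit, which is what lets this succeed where a plain rm with a glob can't. The batch size can also be capped explicitly, and the same -delete used earlier avoids spawning rm altogether; the figure of 1000 below is arbitrary:

# cap each rm invocation at 1000 paths
find . -type f -print0 | xargs -0 -n 1000 rm -f

# or let find do the removal itself
find . -type f -delete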

Before deleting the files I'd compressed them all down as a backup; they went from 1.6GB to 26MB, so even the output of ls -l for the directory was larger than the archive.
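One way to do that kind of backup, assuming tar and gzip are available (the archive name and destination are just examples), is along these lines; repetitive CSV log data compresses extremely well, hence the drop in size:

# bundle and gzip the whole folder before deleting anything
tar -czf /backup/client-logs.tar.gz .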