Listing All Subdirectories With File Sizes in Linux

I recently had to add a new site to a repository, and was appalled to see that the site came to 2.2G! This wouldn’t ordinarily be an issue if I or one of my colleagues had developed the site as I’d know which directories to set ignore rules for etc. and be able to whittle it right down very quickly. Not in this case – we were taking on a site for someone else so I’d have to look into which files and folders were causing such hefty disk usage.

To complicate matters, the site in question was programmed using a pattern such as MVC, so I was presented with a huge number of PHP files, text files, subdirectories et al., and seeing where the directories were amongst everything else was a challenge in itself. I decided I needed to use ls to show only directories. Easier said than done! No such flag seems to exist as far as I can see (though I would say that running man ls on my Mac OS X Lion returns a helpful list of what each flag does to ls but not what the corresponding flag actually is, so that was far from helpful).

A quick Google search soon rectified this: I found a forum post about a very similar issue, and here’s one of the examples given to show just directories when using ls:

[sh]$ ls -AF | grep \/[/sh]
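If you’d rather not filter the output of ls at all, find can do the directory test itself. Here’s a rough sketch of that idea. The name list_subdirs is one I’ve made up for illustration, and -mindepth/-maxdepth are extensions rather than POSIX, although as far as I know both GNU find on Linux and the BSD find on OS X support them:

```shell
# list_subdirs is a hypothetical helper name, not a standard command.
# Prints the directories directly under the given path (default: current directory).
list_subdirs() {
    # -mindepth 1 skips the starting directory itself,
    # -maxdepth 1 stops find from descending any further,
    # -type d keeps directories only (hidden ones included, like ls -A).
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d
}
```

Calling list_subdirs with no argument lists the subdirectories of wherever you currently are. Passing a path lists that path’s subdirectories instead.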

Great. Now we’re cooking with gas. All we need to do now is pipe the result through xargs to du -sh, which gives a human-readable size summary for each path it is handed, and we’ll have our code. The finished command should look like so:

[sh]$ ls -AF | grep \/ | xargs du -sh[/sh]
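Since the whole point is to spot the biggest offenders, it can help to sort the result by size. The sketch below wraps the same pipeline in a function and adds sort -h. The name subdirs_by_size is hypothetical, and I’m assuming a sort that understands -h (GNU coreutils does). Like the pipeline above, it still trips over directory names containing spaces:

```shell
# subdirs_by_size is a hypothetical helper name used only for this sketch.
# The body is wrapped in parentheses so it runs in a subshell and the cd
# does not change the caller's working directory.
subdirs_by_size() (
    cd "${1:-.}" || exit 1
    # -A includes hidden entries, -F marks directories with a trailing slash.
    # sort -h orders human-readable sizes, so the largest directory comes last.
    ls -AF | grep / | xargs du -sh | sort -h
)
```

Run it with no argument for the current directory, or give it a path. The largest subdirectory ends up on the last line of the output.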

If you want to go one step further you could create an alias for the command in your bash profile by adding something similar to the below to your profile (usually at ~/.profile or ~/.bash_profile):

[sh]alias subdirsizes='ls -AF | grep \/ | xargs du -sh'[/sh]

And there you have it – a command to help you determine where the large subdirectories are within a folder on Linux!

Update 10 July 2012
As per Moztech’s comment below the above falls over when encountering directories with spaces in their names. We’ll not go into why you have directories with spaces in their names in the first place, but rather just come up with a solution that solves this issue. Piping the output of the grep command to sed as shown below allows us to slip in a regex replace to escape the spaces and stop them being an issue:

[sh]$ ls -AF | grep \/ | sed 's/\ /\\\ /g' | xargs du -sh[/sh]

This has now become quite the mouthful though so I do implore you to alias the command as I did above – it now becomes the following in your bash profile:

[sh]alias subdirsizes="ls -AF | grep \/ | sed 's/\ /\\\ /g' | xargs du -sh"[/sh]

That should settle that one for you nicely – thanks to Moztech for pointing it out!
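One caveat: the sed step only escapes spaces, and xargs also treats quotes in its input specially, so a directory name with an apostrophe in it could still cause trouble. As a possible alternative, here’s a minimal sketch that sidesteps the escaping by letting find hand each directory straight to du. It’s written as a shell function rather than an alias so it can take an optional path. The function body is my own assumption of how you might want it to behave, not something from the forum post:

```shell
# A sketch of a function-based subdirsizes (an alternative to the alias, not a drop-in copy of it).
# Usage: subdirsizes [path]   (defaults to the current directory)
subdirsizes() {
    # find passes each directory name to du as a separate argument,
    # so spaces and quotes in names need no escaping.
    # "{} +" batches as many directories as possible into one du call.
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d -exec du -sh {} +
}
```

If you put this in your profile, leave out the alias of the same name, otherwise the alias will take precedence over the function in interactive shells.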