Next, the ProcessPath() subroutine processes each subdirectory, as callout C shows. The script calls ProcessPath() recursively, this time passing in the path of a subdirectory. The call returns $DirEntry, a reference to a hash that holds the aggregate size of all the files in that subdirectory and in everything below it in the tree. These totals are added to the values in a local copy of the %ThisDir hash, which represents the size of the directory currently being processed.
The code at callout D processes all the files that were discovered in the directory as opposed to the subdirectories. The code retrieves the size of each file and sums them. This information is added to the local copy of the %ThisDir hash. Finally, the code at callout D returns a reference to %ThisDir.
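The recursion described above can be illustrated with a minimal sketch. This is not the Dir_Sizes.pl listing: the names process_path(), %path_size, and the size and files keys are hypothetical stand-ins, and skipping symbolic links is an assumption made here to avoid loops.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the Dir_Sizes.pl source.
# %path_size maps each directory path to a hash of its totals.
my %path_size;

sub process_path {
    my ($path) = @_;
    my %this_dir = ( size => 0, files => 0 );

    # An unreadable directory is counted as empty.
    opendir( my $dh, $path ) or return \%this_dir;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    # Split the entries into subdirectories and plain files.
    my ( @dirs, @files );
    for my $name (@entries) {
        my $full = "$path/$name";
        next if -l $full;    # assumption: ignore symbolic links
        if    ( -d $full ) { push @dirs,  $full }
        elsif ( -f $full ) { push @files, $full }
    }

    # Recurse into each subdirectory and fold its totals into ours.
    for my $dir (@dirs) {
        my $sub = process_path($dir);
        $this_dir{size}  += $sub->{size};
        $this_dir{files} += $sub->{files};
    }

    # Add the files that sit directly in this directory.
    for my $file (@files) {
        $this_dir{size} += ( -s $file ) || 0;
        $this_dir{files}++;
    }

    $path_size{$path} = \%this_dir;
    return \%this_dir;
}

# Scan the directory named on the command line, if one was given.
if (@ARGV) {
    my $total = process_path( $ARGV[0] );
    print "$ARGV[0]: $total->{size} bytes in $total->{files} files\n";
}
```

Because each call returns a reference to its own totals, a parent directory never needs to rescan its children; it only adds up what they report.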
Eventually, the script calls the Report() subroutine (see the last line in callout A), which displays the results of the data it has collected. As callout E shows, the core of this subroutine is a foreach loop. The foreach loop processes each entry in the %PathSize hash, which contains keys representing the path to each directory that has been processed. The hash's values are the same %ThisDir hashes that were created and updated in the code at callout C and callout D.
The directories are processed in descending order of the total size listed in each directory's hash. To accommodate long paths on the display without wrapping to another line, the $ShortPath variable is modified when its length exceeds 64 characters. The modification results in a path that's shortened by replacing the middle of it with an ellipsis (...).
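The sorting and path-shortening steps might look like the following sketch. The helper names shorten_path() and sorted_paths(), the size key, and the sample data are all invented for illustration; the article's script keeps the shortened text in $ShortPath, and its exact truncation rule may differ from the even head-and-tail split assumed here.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the head and tail of a long path and
# replace the middle with "..." so the result is $max characters long.
# Assumes $max is at least 5.
sub shorten_path {
    my ( $path, $max ) = @_;
    $max = 64 unless defined $max;
    return $path if length($path) <= $max;

    my $keep = $max - 3;            # characters left after the ellipsis
    my $head = int( $keep / 2 );
    my $tail = $keep - $head;
    return substr( $path, 0, $head ) . '...' . substr( $path, -$tail );
}

# Hypothetical helper: return directory paths, largest total first.
sub sorted_paths {
    my ($sizes) = @_;
    return sort { $sizes->{$b}{size} <=> $sizes->{$a}{size} } keys %$sizes;
}

# Made-up sample data standing in for the collected results.
my %sample = (
    'C:/Temp'      => { size => 1_024 },
    'C:/Windows'   => { size => 900_000 },
    'C:/Documents' => { size => 45_000 },
);

for my $path ( sorted_paths( \%sample ) ) {
    printf "%-64s %12d\n", shorten_path($path), $sample{$path}{size};
}
```

Keeping both ends of the path preserves the drive or share name and the final directory name, which are usually the parts a reader needs to identify an entry.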
The two remaining subroutines in Dir_Sizes.pl, FormatNumber() and FormatNumberPretty(), are for formatting purposes. FormatNumber() adds commas to large numbers (e.g., 12345 becomes 12,345). FormatNumberPretty() adds memory size suffixes, such as M for megabytes.
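As a rough illustration of these two kinds of formatting, here is a sketch using the hypothetical names add_commas() and pretty_size(). It is not the code from Dir_Sizes.pl. It assumes non-negative integer input for the comma routine, and it assumes 1,024-byte units and one decimal place for the suffix routine, neither of which the article specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: insert thousands separators into an integer.
sub add_commas {
    my ($n) = @_;
    my $text = "$n";
    # Repeatedly split off the last three digits of the leading digit run.
    1 while $text =~ s/^(\d+)(\d{3})/$1,$2/;
    return $text;
}

# Hypothetical helper: scale a byte count and append a size suffix.
# Assumption: binary units (1K = 1,024 bytes).
sub pretty_size {
    my ($bytes) = @_;
    my @suffixes = ( '', 'K', 'M', 'G', 'T' );
    my $i = 0;
    while ( $bytes >= 1024 && $i < $#suffixes ) {
        $bytes /= 1024;
        $i++;
    }
    return $i == 0 ? "$bytes" : sprintf( '%.1f%s', $bytes, $suffixes[$i] );
}

print add_commas(12345), "\n";                # 12,345
print pretty_size( 5 * 1024 * 1024 ), "\n";   # 5.0M
```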
Scan Your Hard Disk
Dir_Sizes.pl has become a useful part of my personal toolkit. The script works well on any file-based drive, such as local drives, CD-ROMs, USB flash drives, and even network shares. Running the script occasionally is a great way to reacquaint yourself with your hard drive. You might be amazed at what you find lurking in its depths.
Dave Roth (rothd@roth.net) is the author of several Win32 Perl extensions, including Win32::AdminMisc, Win32::ODBC, Win32::Daemon, and Win32::Perms. His most recent book is Win32 Perl Programming: The Standard Extensions, 2nd edition (New Riders Publishing/Macmillan Technical Publishing).