
How to Recursively Traverse File Directories with PHP RecursiveDirectoryIterator


By Lon Hosford
You can avoid writing a recursive function to traverse tree structures like your server file system. PHP has included several Iterator classes since version 5.

In this article we will look at the RecursiveDirectoryIterator class. We will build a utility function to use the RecursiveDirectoryIterator class to provide a text listing of the path and file names in one or more directories.

Although we are just displaying paths to the files, you can also access file information such as modification time, inode change time, size, permissions and a variety of other properties through the parent classes of RecursiveDirectoryIterator. Those classes are, in order of inheritance: FilesystemIterator, DirectoryIterator and SplFileInfo. SplFileInfo provides method counterparts to many of the global file functions in PHP, for example isDir, isReadable, isWritable and getRealPath in place of is_dir, is_readable, is_writable and realpath.
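
To make the point about file information concrete, here is a minimal sketch. The function name describe_entries and the array keys are made up for this illustration and are not part of the downloadable source; the methods called on each object are standard SplFileInfo methods.

```php
<?php
// Hypothetical helper (not from the article's source files): collects a few
// SplFileInfo details for every entry directly inside $dir.
function describe_entries($dir)
{
    $rows = array();
    $iter = new RecursiveDirectoryIterator($dir, RecursiveDirectoryIterator::SKIP_DOTS);
    foreach ($iter as $info) {
        // With the default flags each $info is an SplFileInfo object.
        $rows[$info->getFilename()] = array(
            'is_dir'   => $info->isDir(),
            'size'     => $info->isFile() ? $info->getSize() : null,
            'modified' => $info->getMTime(), // Unix timestamp
            'readable' => $info->isReadable(),
            'writable' => $info->isWritable(),
        );
    }
    ksort($rows); // Directory order is not guaranteed, so sort by name.
    return $rows;
}

// Usage: describe the current directory.
print_r(describe_entries('.'));
```

Iterating the RecursiveDirectoryIterator directly like this only visits the top level of the directory. The recursion comes later, from wrapping it in a RecursiveIteratorIterator.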

Source Files: Download

Video Tutorial: https://www.youtube.com/embed/jtqVJ3-95m4

The User Interface – test_debug_dir_list.php

A simple hybrid HTML and PHP script, test_debug_dir_list.php, demonstrates the function. It has five tests that call a custom function built on RecursiveDirectoryIterator.

test_debug_dir_list.php in browser no links chosen
This is the first test, showing all the files and directories in the folder that test_debug_dir_list.php is in.

test_debug_current_dir in browser first test link chosen


The second test shows all the files and directories in the folder that test_debug_dir_list.php is in, plus those in its first level of child directories.

test_debug_current_dir_1 in browser second test link chosen

The third link shows the files and directories of the parent directory of test_debug_dir_list.php.

test_debug_parent_dir in browser third test link chosen

The fourth test link is like the third but includes the first level of child directories of the parent directory.

test_debug_parent_dir_1 in browser fourth test link chosen

The fifth test link goes one child directory level deeper than the fourth.

test_debug_parent_dir_2 in browser fifth test link chosen

Custom Function debug_dir_list Using RecursiveDirectoryIterator

The function parameters on line 17 are the depth levels for recursing file system directories and the starting file system directory.

The default starting directory is '.', the current working directory, which is normally the directory of the running script that includes debug_utils.inc.php, the file containing our function. You can use the standard file system notation to express parent directories and paths.

The depth level is passed unchanged to the RecursiveIteratorIterator class instance created on line 23. A value of negative one recurses all the way down to the lowest level. Be careful with that on a big file system where the second argument is the root or near it. The debug_dir_list default value is zero, which confines the RecursiveIteratorIterator instance to the starting file system directory.

debug_utils.inc.php – parameters

function debug_dir_list($dir_recurse_depth = 0, $dir_list_root = '.'){

Lines 19 to 21 create the RecursiveDirectoryIterator instance. It takes two arguments. The first, on line 20, is the starting directory, and we use the debug_dir_list function’s second parameter without change. The second RecursiveDirectoryIterator parameter is a set of flags. We are using the SKIP_DOTS flag to suppress the single and double dot entries (. and ..).

debug_utils.inc.php – The RecursiveDirectoryIterator Instance

	// Create a recursive file system directory iterator.
	$dir_iter = new RecursiveDirectoryIterator(
		$dir_list_root,
		RecursiveDirectoryIterator::SKIP_DOTS) ;// Skips dot files (. and ..)
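
To see what the SKIP_DOTS flag changes, here is a small sketch. The function name entry_names is an assumption made for this example and does not appear in the downloadable source.

```php
<?php
// Hypothetical helper: file names directly inside $dir, listed with or
// without the SKIP_DOTS flag.
function entry_names($dir, $skip_dots)
{
    // These two are the default flags of RecursiveDirectoryIterator.
    $flags = RecursiveDirectoryIterator::KEY_AS_PATHNAME
           | RecursiveDirectoryIterator::CURRENT_AS_FILEINFO;
    if ($skip_dots) {
        $flags |= RecursiveDirectoryIterator::SKIP_DOTS;
    }
    $names = array();
    foreach (new RecursiveDirectoryIterator($dir, $flags) as $info) {
        $names[] = $info->getFilename();
    }
    sort($names, SORT_STRING);
    return $names;
}

print_r(entry_names('.', false)); // Includes "." and ".."
print_r(entry_names('.', true));  // The dot entries are gone
```

Hidden files such as .htaccess are not affected by the flag. Only the two special entries are skipped.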

Lines 23 to 27 create the RecursiveIteratorIterator class instance. Its first parameter, on line 24, requires a traversable object, and RecursiveDirectoryIterator implements the RecursiveIterator interface to meet that requirement.

The second argument on line 25 for the RecursiveIteratorIterator constructor is called mode. There are three modes that are constants to the class.

  • RecursiveIteratorIterator::LEAVES_ONLY – The default. Lists only leaves in iteration.
  • RecursiveIteratorIterator::SELF_FIRST – Lists leaves and parents in iteration with parents coming first.
  • RecursiveIteratorIterator::CHILD_FIRST – Lists leaves and parents in iteration with children coming before their parents.
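
The difference between the three modes is easiest to see on a tiny tree. The sketch below builds a throwaway directory containing one folder with one file, so the order is not affected by how the file system sorts entries. The function name iteration_order and the temporary directory name are assumptions for this illustration only.

```php
<?php
// Hypothetical helper: sub-paths under $root in the order the given mode
// visits them.
function iteration_order($root, $mode)
{
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
        $mode
    );
    $order = array();
    foreach ($iter as $info) {
        // Path relative to $root, with forward slashes on every platform.
        $order[] = str_replace('\\', '/', $iter->getInnerIterator()->getSubPathname());
    }
    return $order;
}

// Throwaway tree for the demonstration: <temp>/modes_demo_xxx/sub/file.txt
$root = sys_get_temp_dir() . '/modes_demo_' . uniqid();
mkdir($root . '/sub', 0777, true);
file_put_contents($root . '/sub/file.txt', 'x');

print_r(iteration_order($root, RecursiveIteratorIterator::LEAVES_ONLY)); // sub/file.txt
print_r(iteration_order($root, RecursiveIteratorIterator::SELF_FIRST));  // sub, sub/file.txt
print_r(iteration_order($root, RecursiveIteratorIterator::CHILD_FIRST)); // sub/file.txt, sub

// Clean up the throwaway tree.
unlink($root . '/sub/file.txt');
rmdir($root . '/sub');
rmdir($root);
```

CHILD_FIRST is the order you would want when deleting a tree, because every directory is visited only after its contents.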

The RecursiveIteratorIterator constructor’s third argument, on line 26, is called flags. It is optional and currently has only RecursiveIteratorIterator::CATCH_GET_CHILD as a possible value, which makes the iterator ignore exceptions thrown while getting the children of a directory, such as denied file permissions.

Line 29 passes the recursion depth to the RecursiveIteratorIterator class setMaxDepth method. Here too the debug_dir_list function’s parameter is passed unchanged.

debug_utils.inc.php – The RecursiveIteratorIterator Instance

	// Create a recursive iterator.
	$iter = new RecursiveIteratorIterator(
		$dir_iter,
		RecursiveIteratorIterator::SELF_FIRST, // Lists leaves and parents in iteration with parents coming first.
		RecursiveIteratorIterator::CATCH_GET_CHILD // Ignore exceptions such as "Permission denied"
		);
	// Limit how many directory levels are descended.
	$iter->setMaxDepth($dir_recurse_depth);
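
As a rough illustration of how setMaxDepth limits the walk, the following sketch labels every entry with the depth at which it was found. The function name list_with_depth is hypothetical and is not part of the downloadable source.

```php
<?php
// Hypothetical helper: returns "depth: sub-path" strings for everything
// under $root, descending at most $max_depth levels (-1 means no limit).
function list_with_depth($root, $max_depth)
{
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    $iter->setMaxDepth($max_depth);
    $lines = array();
    foreach ($iter as $info) {
        // getDepth() is 0 for entries directly inside $root.
        $sub_path = str_replace('\\', '/', $iter->getInnerIterator()->getSubPathname());
        $lines[] = $iter->getDepth() . ': ' . $sub_path;
    }
    sort($lines, SORT_STRING);
    return $lines;
}

// Usage: the current directory only, then one level of child directories.
print_r(list_with_depth('.', 0));
print_r(list_with_depth('.', 1));
```

With a depth of 0 the child directories themselves are still listed. They are just not opened.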

The rest of the function uses the RecursiveIteratorIterator to iterate over its objects, which have a base class of SplFileInfo. These objects resolve to the path string. As you can see on line 33, they also have the method isDir, which is like the standalone is_dir function. If you explore the SplFileInfo class you can see all the other methods for file system objects. Line 33 is really just adding, for visual purposes, a trailing slash to entries that are directories. As written, that test only applies when $dir_recurse_depth is 0.

Line 34 pushes that $path string, minus its first two characters (the leading ./ of the default starting directory), onto the $paths array which is returned from the debug_dir_list function.

debug_utils.inc.php

	// List of paths found. Initialized so an empty directory returns an empty array.
	$paths = array();
	foreach ($iter as $path => $dir) {
		if ($dir_recurse_depth == 0 && $dir->isDir()) $path .= "/";
		$paths[] = substr($path,2);
	}
	return $paths;
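
The statement that the iterated objects resolve to the path string can be checked with a short sketch. The function name keys_match_values is made up for this example, and it assumes the default flags, where the foreach key is the path name and the value is an SplFileInfo object.

```php
<?php
// Hypothetical check: for each entry, compare the foreach key, the string
// cast of the value, and getPathname(). Returns true when all three agree.
function keys_match_values($root)
{
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($iter as $path => $info) {
        if ($path !== (string) $info || $path !== $info->getPathname()) {
            return false;
        }
    }
    return true;
}

var_dump(keys_match_values('.'));
```

This is why the loop in debug_dir_list can work with the key as a plain string and still call isDir on the value.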

debug_utils.inc.php – full listing

<?php
/**
 *	Utilities to help in debugging.
 *
 *	@author Lon Hosford
 *	@link www.lonhosford.com
*/
/**
 *	Create an array of file and directory names. Requires PHP 5 >= 5.3.1.
 *
 *	Directories have a slash appended when $dir_recurse_depth is 0. Hidden files and folders are included.
 *	@link http://stackoverflow.com/questions/14304935/php-listing-all-directories-and-sub-directories-recursively-in-drop-down-menu This code is based on this Stackoverflow post.
 *	@param int $dir_recurse_depth Recurse depth. 0 for $dir_list_root. Add 1 for each child level. -1 is used for any depth.
 *	@param string $dir_list_root Starting folder path. Files in this directory are included. Default is current directory.
 *	@return string[] List of folders and files found.
 */
function debug_dir_list($dir_recurse_depth = 0, $dir_list_root = '.'){
	// Create a recursive file system directory iterator.
	$dir_iter = new RecursiveDirectoryIterator(
		$dir_list_root,
		RecursiveDirectoryIterator::SKIP_DOTS) ;// Skips dot files (. and ..)
	// Create a recursive iterator.
	$iter = new RecursiveIteratorIterator(
		$dir_iter,
		RecursiveIteratorIterator::SELF_FIRST, // Lists leaves and parents in iteration with parents coming first.
		RecursiveIteratorIterator::CATCH_GET_CHILD // Ignore exceptions such as "Permission denied"
		);
	// Limit how many directory levels are descended.
	$iter->setMaxDepth($dir_recurse_depth);
	// List of paths found. Initialized so an empty directory returns an empty array.
	$paths = array();
	foreach ($iter as $path => $dir) {
		if ($dir_recurse_depth == 0 && $dir->isDir()) $path .= "/";
		$paths[] = substr($path,2);
	}
	return $paths;
}
?>
The User Interface – Exploring the source of test_debug_dir_list.php

In the UI script, line 10 imports our debug_dir_list function.

test_debug_dir_list.php

include_once "debug_utils.inc.php";

Line 32 and lines 35 to 40 provide a URL link back to test_debug_dir_list.php. Lines 35 to 40 add an NVP (Name Value Pair) to the URL query and line 32 omits it. The NVP name is debug-action and the values are current_dir, current_dir_1, parent_dir, parent_dir_1 and parent_dir_2.

test_debug_dir_list.php

<h3>debug_dir_list($dir_recurse_depth = 0, $dir_list_root = '.')</h3>
<ol class = 'mono-space'>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=current_dir';?>">Run</a> - List Current Directory - debug_dir_list()</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=current_dir_1';?>">Run</a> - List Current Directory + Children(1) - debug_dir_list(1)</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir';?>">Run</a> - List Parent Directory  + Children(0) - debug_dir_list(0, "../")</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir_1';?>">Run</a> - List Parent Directory + Children(1) - debug_dir_list(1, "../")</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir_2';?>">Run</a> - List Parent Directory + Children(2) - debug_dir_list(2, "../")</li>
</ol>

Lines 44 to 48 interrogate the superglobal $_GET variable for the debug-action key and set the $get_action variable to either an empty string or the value in $_GET['debug-action']. There is no need to sanitize here as this is for internal development and testing and not a production script.

test_debug_dir_list.php

<?php
if ( !isset($_GET["debug-action"]) ){
	$get_action = "";
}else{
	$get_action = $_GET["debug-action"];
}
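
The article treats this as a private testing script, so it does no sanitizing. If you would rather be cautious, one option is to accept only the five known values and escape what is echoed. This is a minimal sketch of that idea. The function name clean_debug_action is hypothetical and not part of the downloadable source.

```php
<?php
// Hypothetical helper: returns $value only when it is one of the known
// debug-action names, otherwise an empty string.
function clean_debug_action($value)
{
    $allowed = array(
        'current_dir',
        'current_dir_1',
        'parent_dir',
        'parent_dir_1',
        'parent_dir_2',
    );
    return (is_string($value) && in_array($value, $allowed, true)) ? $value : '';
}

// Usage in place of the isset block.
$get_action = clean_debug_action(isset($_GET['debug-action']) ? $_GET['debug-action'] : '');
echo 'debug-action=' . htmlspecialchars($get_action, ENT_QUOTES, 'UTF-8') . "\n";
```

Because unknown values collapse to an empty string, the switch block simply matches no case, which is the same behavior as choosing no link.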

On lines 52 to 68, a switch block handles each of the debug-action values and echoes the result of the debug_dir_list function called with the parameters for that test.

test_debug_dir_list.php

switch($get_action){
	case 'current_dir':
		echo "File list: " . print_r(debug_dir_list(), true);
	break;
	case 'current_dir_1':
		echo "File list: " . print_r(debug_dir_list(1), true);
	break;
	case 'parent_dir':
		echo "File list : " . print_r(debug_dir_list(0, "../"), true);
	break;
	case 'parent_dir_1':
		echo "File list: " . print_r(debug_dir_list(1, "../"), true);
	break;
	case 'parent_dir_2':
		echo "File list: " . print_r(debug_dir_list(2, "../"), true);
	break;
}

Worth a mention are lines 49 to 51. They help debug the debugging. The file name shown by line 49 is helpful when you are bleary-eyed and not sure which testing script you are running. The basename function, passed the magic constant __FILE__, provides the file name of this script. In fact you might want to add a line to display the full value of the magic constant __FILE__. Testing scripts can get spread around and copied when the coding battle to finish gets wild.

Along with that heat-of-the-battle information, it helps to have the PHP version in front of your eyes. Line 50 does that. There is always the situation where a “temporary” server is needed to run some tests and no one bothers to check the PHP version installed. The phpversion function is just the ticket.

Then, to check that this debugging program is receiving the expected debug-action value, we print it on line 51.

test_debug_dir_list.php

echo basename(__FILE__) . "\n";
echo "PHP version: " . phpversion() . "\n";
echo "debug-action=" . $get_action . "\n";

test_debug_dir_list.php – full listing

<?php
/**
 *	The debugging dashboard for testing and development.
 *
 *  @author Lon Hosford
 *  @link www.lonhosford.com
 *	@copyright 2014 Alonzo Hosford
 *  @license GPL
*/
include_once "debug_utils.inc.php";
?>
<!doctype html>
<html>
<head>
	<meta charset="UTF-8">
	<title>Testing and Debugging Dashboard | lonhosford.com</title>
	<style>
	body{ font-family:"Gill Sans", "Gill Sans MT", "Myriad Pro", "DejaVu Sans Condensed", Helvetica, Arial, sans-serif}
	pre {
	 white-space: pre-wrap;       /* css-3 */
	 white-space: -moz-pre-wrap;  /* Mozilla, since 1999 */
	 white-space: -pre-wrap;      /* Opera 4-6 */
	 white-space: -o-pre-wrap;    /* Opera 7 */
	 word-wrap: break-word;       /* Internet Explorer 5.5+ */
	}
	.mono-space{font-family:monospace;}
	table,td {border:solid 1px #000;}
	</style>
</head>
<body>
<h2>Testing and Debugging Dashboard</h2>
<h4><a href="<?php echo $_SERVER['PHP_SELF'];?>"><?php echo basename(__FILE__);?></a></h4>
<h3>debug_dir_list($dir_recurse_depth = 0, $dir_list_root = '.')</h3>
<ol class = 'mono-space'>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=current_dir';?>">Run</a> - List Current Directory - debug_dir_list()</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=current_dir_1';?>">Run</a> - List Current Directory + Children(1) - debug_dir_list(1)</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir';?>">Run</a> - List Parent Directory  + Children(0) - debug_dir_list(0, "../")</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir_1';?>">Run</a> - List Parent Directory + Children(1) - debug_dir_list(1, "../")</li>
	<li><a href="<?php echo $_SERVER['PHP_SELF'] . '?debug-action=parent_dir_2';?>">Run</a> - List Parent Directory + Children(2) - debug_dir_list(2, "../")</li>
</ol>
<hr>
<pre>
<?php
if ( !isset($_GET["debug-action"]) ){
	$get_action = "";
}else{
	$get_action = $_GET["debug-action"];
}
echo basename(__FILE__) . "\n";
echo "PHP version: " . phpversion() . "\n";
echo "debug-action=" . $get_action . "\n";
switch($get_action){
	case 'current_dir':
		echo "File list: " . print_r(debug_dir_list(), true);
	break;
	case 'current_dir_1':
		echo "File list: " . print_r(debug_dir_list(1), true);
	break;
	case 'parent_dir':
		echo "File list : " . print_r(debug_dir_list(0, "../"), true);
	break;
	case 'parent_dir_1':
		echo "File list: " . print_r(debug_dir_list(1, "../"), true);
	break;
	case 'parent_dir_2':
		echo "File list: " . print_r(debug_dir_list(2, "../"), true);
	break;
}
?>
</pre>
</body>
</html>
