Getting Directory Tree – Recursive VS Non-Recursive
I took some time to examine between the performance of using a recursive and non recursive method of retrieving a directory tree. Of-course retrieving a directory tree is possible with out using any functions and still be able to get the full tree with unlimited depth possibilities. I made the non recursive piece of code that outputs an array of each file and folder/subfolder of a given directory and a recursive function that calls it self on each new folder found. And then before an after I took a memory usage survey and time lapse of code execution. The folder that i used for testnig had some random folders and files and a freshly downloaded codeigniter frame work. So there were about 360 items in the generated array of files and folders.
Non-Recursive Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | $tree=array('folder/'); $folders_to_do = array('folder/'); $current_folder = ''; while(!empty($folders_to_do)){ $key_of_current_folder = key($folders_to_do); $current_folder = $folders_to_do[$key_of_current_folder]; $folder = scandir($current_folder); foreach($folder as $item){ if(is_dir($current_folder.$item) and $item != '.' and $item != '..'){ $folders_to_do[] = $current_folder.$item.'/'; $tree[] = $current_folder.$item.'/'; } if(is_file($current_folder.$item)) $tree[] = $current_folder.$item; } unset($folders_to_do[$key_of_current_folder]); } sort($tree); |
The way it works is very simple. It runs only on 2 loops (while & foreach), 1 array (folders_to_do) and 1 string variable (current_folder) to control what we are scanning. The outer loop will always loop it self as long as it has something to scan, that means if the to-do array is not empty it will loop again. Now inside the while loop we have the regular scandir() function and foreach loop. in the foreach loop when a file is found it is added to the main tree array but if a folder is found then it is not only added to the tree array but is also added to the to-do list. Since the to-do list has something new to scan, the while loop will not brake. At the end when there is no more directories to scan the tree array will contain every folder and file found in the initial directory passed. All that is left to do is to sort the array and you get a neat list of each files and folders of a passed directory.
Recursive:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | $tree2 = recursive('folder/'); function recursive($current_folder){ static $tree = array(); $folder = scandir($current_folder); foreach($folder as $item){ if(is_dir($current_folder.$item) and $item != '.' and $item != '..'){ $tree[] = $current_folder.$item.'/'; recursive($current_folder.$item.'/'); } if(is_file($current_folder.$item)) $tree[] = $current_folder.$item; } return $tree; } |
This method is also straight forward. It uses a static array inside the function to hold the tree, optionally you can keep passing the tree as an argument if you do not wish to use a static array or if you simple going to use the function multiple times on the same script. The function it self takes in only 1 string variable which is the folder it is going to process. The foreach loop does same thing basically; if it is a file it only adds to the tree array and if it is a folder it adds to the tree array and calls the function with the folder path as teh argument. The recursion concept is no different then any other. It was only simplified ridiculously.
Result Array:
Array
(
[0] => folder/
[1] => folder/another folder/
[2] => folder/another folder/bar file
[3] => folder/another folder/dddddd/
[4] => folder/another folder/dddddd/lost track/
[5] => folder/another folder/dddddd/lost track/last file
[6] => folder/another folder/hhh/
[7] => folder/another omg folder/
[8] => folder/codeigniter omg/
[9] => folder/codeigniter omg/index.php
...
Both of those codes will produce exactly the same thing but using different amount of memory. Above you can see a steddy array with each folder and file name with full paths.
Performance:
AS I said earlier I had about 360 files and folder combined for testing.
The time in milliseconds took to execute was the same on average of those methods. I refreshed many times and it was always either same or one or the other had few more milliseconds.
The memory usage for the non recursive was 82,512 and for the recursive method it was 116,112. I used the memory_get_usage() function to find those numbers. For each method I took memory usage before the code and after the code. So the results are (after_code – before_code) for first and second methods, just to clarify so no one has questions to how I have come to my conclusion.
82,512 / 116,112 = 0.71 ratio!
All testing was done here: http://www.sandbox.doc776.org/recursion/script.php