Larry Ullman's Book Forums

Large Number Of Files In A Single Directory?



I've just started a new project which involves collating 104 years of magazine articles, currently in PDF format, and making them searchable. I've extracted the content from a DVD they produce and uploaded it to my development server. There's a large number of documents (approximately 44,000) which need to be organised, including PDFs, images, XMPs and THMs.

 

At the moment they are arranged as they were on the DVD: a folder for each volume, with a sub-folder for each issue within that volume. My question is: should I keep this structure, or recurse through it and pull all the PDFs into one directory, or at least into one directory per volume? Are there any pros or cons to organising them either way? The file names are all unique, so moving them into a single directory would not overwrite any files.

 

PS. I'm also using the Zend_Lucene article you wrote to index all the PDFs, Larry, so thanks for that!

 

Thanks


Hey Stuart,

 

I'm glad that Zend_Lucene article was useful for you. Good luck with your project.

 

I would be inclined to keep the files organized as they are, if only because that's the most convenient and easiest option. Putting too many files and directories in one folder can cause performance issues, but that depends upon the OS (and its file system) in use.
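
For what it's worth, keeping the nested layout doesn't stop you from treating the PDFs as one flat list when indexing. Here is a minimal sketch in Python (an assumption on my part; the same idea works in PHP with a recursive directory iterator) that walks the volume/issue folders in place and also reports any file names that would collide if you ever did flatten them. The function names and the path in the usage example are hypothetical.

```python
import os
from collections import defaultdict


def collect_pdfs(root):
    """Walk root recursively and group PDF paths by lower-cased file name.

    Lower-casing the key means names that differ only by case are treated
    as the same name, which matters on case-insensitive file systems.
    """
    by_name = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                by_name[name.lower()].append(os.path.join(dirpath, name))
    return by_name


def duplicate_names(by_name):
    """Return only the names found in more than one place."""
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


if __name__ == "__main__":
    # Hypothetical location of the extracted DVD content.
    pdfs = collect_pdfs("/path/to/archive")
    total = sum(len(paths) for paths in pdfs.values())
    print(f"{total} PDFs found, {len(duplicate_names(pdfs))} clashing names")
```

If the clash count comes back as zero, flattening later would be safe from an overwrite point of view, but nothing here requires it: the indexer can simply be fed the full paths.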


OK, thanks Larry. I think I'll leave them as they are for now, then. I did wonder if there'd be performance issues. Right now the issue appears to be extracting the text from the PDFs, so there's a good chance I'll be creating a Linux install post relatively soon!
