Store images in date folders

Piers · Dec 12, 2011

A post here -> http://chevereto.com/forums/topic1587-store-images-by-date-folders.html makes a lot of sense and I think this should be rolled out in the next update.

Storing files this way makes perfect sense: it makes albums easy to manage and reduces loading time when admins access the site over FTP/SFTP/SCP.

Stepashka and Danny.Domb have got some good code there. Could Rodolfo take a look and let us know what he thinks?

Rodolfo · Dec 13, 2011

I prefer this: http://chevereto.com/forums/topic1512-images-stored-in-multiple-folders.html

I'm a fan of the idea of having a consecutive directory tree, with the help of a DB to filter and manage results. The only advantage of date folders is that you can browse files in a date-based structure on your server... But why?

I mean, you will most likely need to search images by size, date range, IP, user, etc. And for that you need a DB of images with their corresponding relations. I dare anyone to do the same filtering by browsing through the folders on the system.

At this time, I haven't found a good reason to drop the idea of consecutive folders in favor of this alternative.

Rodolfo · Jan 16, 2012

I will keep these in mind:
http://stackoverflow.com/questions/671260/tips-for-managing-a-large-number-of-files
http://stackoverflow.com/questions/446358/storing-a-large-number-of-images

Rodolfo · Jan 21, 2012

Ok, the method implemented will be a mix between the current Chevereto system and the id % values.

There will be a base of 100 folders (00 to 99) in the /images folder, and each one of these folders will have 1000 sub-folders (000 to 999). The images will be stored so that the load is balanced across the folders, which means the dirs will most likely hold about the same number of files.

This new perspective of "a lot of balanced folders" makes it possible to drop the random image names and preserve the original names (but if you have a zillion images, it is most likely that new images will always get a random name).

The folder numbers will be calculated from the file id (DB). This means the balancing needs no DB index or file count: it just maps "12345" to "/45/345", so it is lightweight.
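To make the "12345" to "/45/345" mapping concrete, here is a minimal sketch in Python. It assumes the first level is the id modulo 100 and the second level is the id modulo 1000, which is one reading of the example above. The function name balanced_dir is hypothetical and is not part of Chevereto.

```python
def balanced_dir(image_id: int) -> str:
    """Map a numeric image id to a two-level folder such as '45/345'.

    Assumption: first level = id % 100, second level = id % 1000,
    both zero-padded to match the 00-99 and 000-999 folder names.
    """
    if image_id < 0:
        raise ValueError("image_id must be non-negative")
    top = image_id % 100    # 00 to 99
    sub = image_id % 1000   # 000 to 999
    return f"{top:02d}/{sub:03d}"


print(balanced_dir(12345))  # 45/345
print(balanced_dir(7))      # 07/007
```

One thing to note about this exact reading: the second level already determines the first (345 always lands under 45), so each base folder would only ever use 10 of its 1000 sub-folders. Taking the second level from different digits of the id would spread files over all of them.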

This also allows scalability, because you can map a dir to another HDD or even to another machine in your system. It is very different from the date folder alternative, because in that configuration you can't easily scale things: you don't balance the dirs, you just populate them.

What do you think about it?

Piers · Jan 21, 2012

I like that idea, it makes sense. However, when the user is presented with a direct link, for example http://anony.ws/i/12345.png, will that turn into http://anony.ws/i/100/203/12345.png? Or could you write a .htaccess to deal with that? It's just that I like the short links.

Apart from that, the structure sounds like a great idea.

Danny.Domb · Jan 22, 2012

Well, since all images will be stored inside a database, when you access an image the path will be something similar to:

http://your-website.com/images/XXX

Where XXX is the obfuscated id of the image (ids are numeric values, automatically assigned by the database).

By obfuscated I mean that instead of seeing a numeric value such as 24, you will see something like: As65DSu97
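As an illustration only, one simple way to turn a numeric id into a short alphanumeric token is base-62 encoding. This is an assumption made for the sake of the example. The thread does not say which scheme Chevereto uses, and the names encode_id and decode_id are hypothetical.

```python
import string

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_letters
BASE = len(ALPHABET)


def encode_id(image_id: int) -> str:
    """Encode a non-negative integer id as a base-62 string."""
    if image_id < 0:
        raise ValueError("image_id must be non-negative")
    if image_id == 0:
        return ALPHABET[0]
    chars = []
    while image_id:
        image_id, rem = divmod(image_id, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_id(token: str) -> int:
    """Decode a base-62 string back to the numeric id."""
    value = 0
    for ch in token:
        value = value * BASE + ALPHABET.index(ch)  # ValueError on bad chars
    return value


token = encode_id(123456789)
assert decode_id(token) == 123456789
```

Plain base conversion is reversible by anyone who guesses the alphabet, so it shortens the id more than it hides it. A scheme meant to stop people from walking through ids would also need to mix in something secret.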

Hubertus · Jan 22, 2012

In my previous script, each day had its own folder: img/20100402. Maybe something like images/YYYY/MMDD? It's simple, and we don't need extra database involvement to find an image to delete. Just a simple link. Please try not to use the database where it is not needed.

Edit: then we can keep the original filename, and if it is doubled, add an extra char to the end of the name.
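A rough sketch of the "add an extra char when the name is doubled" idea, assuming the extra character is a counter appended before the extension. The helper dedupe_name and the in-memory set of existing names are hypothetical. A real script would check the target folder instead.

```python
def dedupe_name(filename: str, existing: set[str]) -> str:
    """Return filename unchanged if unused, else append 1, 2, 3... to the stem."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = filename, ""
    candidate = filename
    counter = 0
    while candidate in existing:
        counter += 1
        candidate = f"{stem}{counter}{dot}{ext}"
    return candidate


taken = {"cat.png", "cat1.png"}
print(dedupe_name("dog.png", taken))  # dog.png
print(dedupe_name("cat.png", taken))  # cat2.png
```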

Rodolfo · Jan 22, 2012

The goal will always be to minimize the number of folders and give you the ability to balance the distribution. This implementation means 10,000 folders, which in fact is more than the ext3 system supports (3200), but that can be tricked (and is tricked) on most web servers, and by the way they mostly use ext4 (the limit there is 4 billion).

Second thing, the image name will never be related to the id, time or anything else, because in 2.1 you will be able to change the filename, meaning that we can't rely on the filename. This also means that the rewrite that tonemapped suggests can't be done. We will need either a PHP redirect, or to move that folder to something like "oldfolder" and do the rewrite from /folder/image.ext --> /oldfolder/image.ext

Now, something that I haven't told you is that Chevereto will support multiple storage structures like multiserver, CDN, the balanced storage and, why not, the date folder alternative. There will be a table that stores the storage rules, and in the images table we just record what kind of storage each image uses. That's why there is no need to relate the file path to either the real id or the encrypted id. The system will have these options, and having that in mind is why I came up with this system.

I haven't made up my mind on which is the better alternative between the /id%/%id/ and /datefolder/ methods. For instance, date folders mean an easy way to browse the real folders, and they won't mean a huge number of folders, because this means 365 folders each year (27 years to get 10,000 folders). If you want to balance, just assign each month to a different server or something like that, based on population. It is most likely that no one here will set up 300 servers to do the balancing in the /id%/%id/ alternative, and if you want to balance you go for a CDN and things that are way easier to set up and maintain.

As I said before, I'm observing all the alternatives.
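The "storage rules table plus a storage kind on each image" idea could look roughly like the sketch below. Everything here is an assumption for illustration: the Image fields, the rule names "datefolder" and "balanced", and the resolve_path helper are hypothetical, and a dict stands in for the rules table.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Image:
    image_id: int
    filename: str
    uploaded: date
    storage: str  # which storage rule this image uses


def _datefolder(img: Image) -> str:
    return f"images/{img.uploaded:%Y/%m/%d}/{img.filename}"


def _balanced(img: Image) -> str:
    return f"images/{img.image_id % 100:02d}/{img.image_id % 1000:03d}/{img.filename}"


# Stand-in for the storage rules table: rule name -> path builder.
STORAGE_RULES = {
    "datefolder": _datefolder,
    "balanced": _balanced,
}


def resolve_path(img: Image) -> str:
    """Look up the image's storage rule and build its path."""
    try:
        rule = STORAGE_RULES[img.storage]
    except KeyError:
        raise ValueError(f"unknown storage rule: {img.storage!r}") from None
    return rule(img)


img = Image(12345, "photo.png", date(2012, 1, 22), "datefolder")
print(resolve_path(img))  # images/2012/01/22/photo.png
```

Because the path comes from the rule and not from the filename or id alone, images stored under different rules can coexist, and a new rule (for example a CDN) only needs a new entry in the table.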

Rodolfo · Jan 22, 2012

OK, I have finally made up my mind. We are going to use date folders in the format: YYYY/MM/DD

The images will be stored as filename.ext and the thumbs will be in the same folder but named filename.th.ext. Possible new sizes (like a miniature which keeps proportions) will be stored the same way.
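A minimal sketch of the YYYY/MM/DD layout with the filename.th.ext thumbnail convention. The "images" root folder and the function name date_paths are assumptions for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def date_paths(filename: str, uploaded: date, root: str = "images") -> tuple[str, str]:
    """Return (image_path, thumb_path) inside root/YYYY/MM/DD."""
    folder = PurePosixPath(root) / f"{uploaded:%Y/%m/%d}"
    name = PurePosixPath(filename)
    image = folder / name.name
    # Thumbnail sits next to the image: filename.th.ext
    thumb = folder / f"{name.stem}.th{name.suffix}"
    return str(image), str(thumb)


image, thumb = date_paths("photo.png", date(2012, 1, 22))
print(image)  # images/2012/01/22/photo.png
print(thumb)  # images/2012/01/22/photo.th.png
```

Other sizes could follow the same pattern with a different infix in place of "th".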

Piers · Jan 22, 2012

Sounds like the best idea, although I do like images having random names. Would it still be possible to keep an option for that?

Rodolfo · Jan 22, 2012

Yeah... random names stay, because if you upload sequential images someone can guess them just by doing +1 on each, so yes... random image names are still on.
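For reference, a small sketch of generating a random, non-guessable file name. The length of 5 characters and the helper name random_name are assumptions, and a real uploader would also need to retry if the name already exists in the target folder.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_name(ext: str, length: int = 5) -> str:
    """Build a random alphanumeric stem and attach the given extension."""
    stem = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{stem}.{ext.lstrip('.')}"


print(random_name("png"))
```

Unlike sequential names, there is no +1 step from one name to the next, which is the point made above.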
