Image statistics

evilmoe2

Not needed
Hello,

New idea (I'm surprised nobody asked for it yet): image statistics.

I know it's possible to use AWStats or similar, but for everyone with more than one server that's not very comfortable.
Should have:
  • Access counter (thumb and image separately; per hour, day, week...)
  • Bandwidth counter (per hour, day, week, month...)
  • Last access to the image + delete unused images after x days (for registered users after y days, or never)
Cheers
 
Just use Google Analytics?

Unless you want to display this information to your users, which isn't very hard to do. I've already added a bandwidth counter to my site, and I'm nearly ready to release a chart that shows bandwidth over time, too. This would be handy for me, though!
 
The only way for Chevereto to track what happens to an image when you access it is by serving the image through PHP, which is not recommended. For every single image request you would trigger a PHP process, and that will certainly burn the machine.

The best thing to do is to rely on software that sits at the server layer, before PHP.
 
I know you have to use PHP for every image request, but this is the only way to get easy statistics for every image.
...will certainly burn the machine or something.
I will add a load balancer later anyway, so I will have a lot of machines in different countries delivering images. It's hard for me if I don't get real statistics about which images generate a lot of traffic.
 
You don't understand. You don't use PHP to serve static content because the resource usage is huge.

For every single image request you would ask your server to run PHP, hit the disk with a write request to keep track (either in the database or in a log file), open a new connection, and burn memory. Imagine 100 images (which is very low) and just 5 people accessing your website at the same time: you get 500x more load with this simple example alone. Now imagine that someone embeds your photos in another website; take it for granted that your machine will go down in minutes. And I'm talking about anything above a VPS; on shared hosting don't even think about it.

What everyone does is track usage at the server level, which can't be compared with PHP doing this. Bottom line, you are asking to implement something that 1) is not resource friendly at all and 2) works better by just accessing server logs. If you want to host the images on several servers you only need to keep an eye on them with scripts like AWStats or Webalizer. You can also use programs that read the access_log file.

There could be an access_log reader for Chevereto to collect those stats, but it would only work for images hosted locally, not on external storage.
 
I know PHP should not be used to serve static content, but you don't have to.

Example:
The image URL is currently: http://test.com/path/name.jpg
You can map this URL with mod_rewrite to a PHP script. This script counts the access and updates your DB. After this, the PHP script forwards the request to the real image path: http://test.com/real_path/name.jpg

The only tricky part is securing this method against direct access to the real path. This should not be a huge load increase on a server.
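As a sketch, the rewrite approach described here could look like this (the rewrite rule, paths and function name are made up for illustration; this is not Chevereto code):

```php
<?php
// Hypothetical front script mapped via mod_rewrite, e.g. in .htaccess:
//   RewriteRule ^path/(.+\.jpg)$ serve.php?img=$1 [L]
// It records the hit in a flat log, then hands back the real file path.

function count_and_serve(string $name, string $realDir, string $logFile): ?string
{
    $path = $realDir . '/' . basename($name); // basename() blocks ../ traversal
    if (!is_file($path)) {
        return null; // caller should answer with a 404
    }
    // One line per hit; a cron job can aggregate this into the DB later
    file_put_contents($logFile, date('c') . ' ' . $name . "\n", FILE_APPEND | LOCK_EX);
    return $path;
}

// Front-controller usage (the actual image is streamed with readfile()):
// $path = count_and_serve($_GET['img'] ?? '', __DIR__ . '/real_path', __DIR__ . '/hits.log');
// if ($path === null) { http_response_code(404); exit; }
// header('Content-Type: image/jpeg');
// readfile($path);
```

Protecting /real_path against direct access would still be needed, e.g. by denying it in the server config.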

BTW: I have only dedicated servers.
 
That procedure describes exactly what I told you: it is calling PHP, and just calling PHP in a situation like this will cause issues. Saying "it's just a small PHP script..." is wrong. Just calling PHP for this already consumes several times more resources than serving the image via direct server access.

Reading or rewriting files using PHP is not designed for situations like this, where we are talking about a huge number of requests (hits). It is designed for controlled situations, like when you download Chevereto.zip from your client panel. As you may notice, that is not a request being made several times per second, and it also has cache; it is completely different from serving ALL the images of a website using PHP. And yes, you are serving them using PHP, because that is what is being dispatched, even if the script has two lines. You are asking the machine to start a sub-process.

Any PHP-based procedure for doing this has several pitfalls. I've already told you how this impacts the machine, but if you don't believe me, implement it on your own and let it run for one week. Don't even collect stats, just do the rewrite thing and measure the impact on your machine.
 
I agree with you mostly, but done right this shouldn't be a problem.

I've installed my own hit counter into G\ and it's currently tracking:

Duplicate IPs to establish unique views, locations (using a GeoLocation API), and bandwidth usage per hour, day, week, month and year. Fair enough that the bandwidth usage is all down to logistics and isn't causing any load at all, but the IP tracking and duplicate detection is where I had a problem with database locking. I had 150,000 unique views and the system didn't cripple, which is great, but it did start to struggle. I eliminated this problem by creating a log file, much like the access logs: the PHP script writes to that log and then uses it to update the database every few hours. It's much more efficient and so far, no issues. Since the system was implemented, I've counted more than 500,000 IPs.

Edit: just a note, I flush the table every 36 hours so that if someone were to view it again, it counts as a unique view. I leave 36 hours before an IP is counted again.
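A rough sketch of that log-then-flush scheme (the function names and buffer format are invented; the real implementation isn't posted in this thread):

```php
<?php
// Each request appends "ip timestamp" to a flat buffer file (cheap, no DB
// lock); a cron job later folds the buffer into the database.

function record_hit(string $bufferFile, string $ip, int $time): void
{
    file_put_contents($bufferFile, $ip . ' ' . $time . "\n", FILE_APPEND | LOCK_EX);
}

// Count unique views: an IP only counts again once the window (36 hours
// in the post above) has passed since it was last counted.
function count_unique_views(string $bufferFile, int $window = 36 * 3600): int
{
    $lastCounted = [];
    $unique = 0;
    foreach (file($bufferFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        [$ip, $time] = explode(' ', $line);
        $time = (int) $time;
        if (!isset($lastCounted[$ip]) || $time - $lastCounted[$ip] >= $window) {
            $unique++;
            $lastCounted[$ip] = $time;
        }
    }
    // A real flush would add $unique to the views column in one query
    // and truncate the buffer file afterwards.
    return $unique;
}
```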
 
evilmoe,

If Rodolfo doesn't want to implement this, which I can understand, here's a good start for you if you want to use Memcached. This function stores your views in memory; once it has stored 10 views, it does one query to update the database. You'll need to set the query. Rodolfo, please feel free to use this if you so desire, maybe implement it for those of us who are using Memcached?

If for whatever reason this function fails, or maybe the database has failed you, it will check whether the cached count is less than whatever is in the database and update it where needed. Just pass this function the image ID and the current view count. Let me know if you need any help. Sorry for the formatting, the post here has ruined it and removed my tabs lol

PHP:
function viewcount($pageID, $viewCount)
{
    // Check to see if Memcached is running (the Memcached class, not the older Memcache one)
    if (class_exists('Memcached')) {
        $memcache = new Memcached();
        $memcache->addServer('127.0.0.1', 11211); // assumes a local Memcached instance

        $theKey = 'image-with-ID-' . $pageID;

        // Seed the counter from the database value on a cache miss
        if (!$memcache->get($theKey)) {
            $memcache->set($theKey, $viewCount, 0);
        }

        $count = $memcache->increment($theKey);

        // If the cache fell behind the database, resync it
        if ($count < $viewCount) {
            $memcache->set($theKey, $viewCount, 0);
            $count = $viewCount;
        }

        // Flush to the database once every $interval views
        $interval = 10;
        if ($count % $interval == 0) {
            //Do query
        }

        return $count;
    }

    //No memcache.
    return $viewCount;
}

Edit: I only had 5 minutes to do this, so please don't expect much
 
Yeah, it always depends on how much you hit the thing and how you control DDoS. My main concern with this is not that normal usage will burn the CPU, but that it opens the path for an easy DDoS attack. In the past the demo has been under attack by botnets targeting static content (files). By implementing this counter the resources used by the machine are higher, so the DDoS is easier.

So in your case you ended up with a system that writes to a log file and later sends the data back to the database, and my suggestion was to use the standard server access_log and read it with Chevereto. The access log has everything you need. For example:
Code:
127.0.0.1 - - [17/Jan/2015:16:22:06 -0800] "GET /img/favicon.png HTTP/1.1" 200 11182 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36"

So this is what you have:
  1. The IP (127.0.0.1 is just an example), so you can geolocate it
  2. The time when this happened
  3. The resource being accessed (/img/favicon.png)
  4. The HTTP status (200)
  5. The size transferred (11182)
  6. The referrer (not present in the example)
  7. Browser information
So why do we need a custom handler/writer in PHP anyway? We need a reader of the access_log file, that's all. We don't need to pass static requests through PHP. I like efficient, machine-saving solutions and I think this is the best thing to do.
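For illustration, a combined-format line like the one above can be split into those seven fields with a single regex (just a sketch, not Chevereto code):

```php
<?php
// Parse one line of the Apache/Nginx "combined" access_log format.
// Returns the fields listed above, or null if the line doesn't match.
function parse_access_log_line(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return [
        'ip'       => $m[1],
        'time'     => $m[2],
        'method'   => $m[3],
        'resource' => $m[4],
        'status'   => (int) $m[5],
        'bytes'    => $m[6] === '-' ? 0 : (int) $m[6],
        'referrer' => $m[7],
        'browser'  => $m[8],
    ];
}
```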
 
It's because I plan to extend its functionality; that's why I'm storing it in a database. I have plans for it, and the things I listed are as far as I've got so far.

Also, check my post above, as we both posted again at the same time. It does what the OP wants and would work for people who decide to use Memcached. Please feel free to use it.
 
There are things that you can't do without using PHP, like preventing the display of private images. In those cases the solution could be a /private folder handled with mod_rewrite, so only those files are served using PHP. I don't see other common scenarios where you need to depend on PHP for the purpose of logging image usage.
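A minimal sketch of that /private rewrite setup, assuming Apache (the folder and gate-script names are hypothetical):

```apacheconf
# Deny direct access to the private originals...
RewriteRule ^private/ - [F]
# ...and serve them only through a PHP gate that checks permissions first:
RewriteRule ^p/([^/]+\.jpg)$ private.php?img=$1 [L]
```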

By the way, the idea is to read the access_log and store what we need in the database. It's not my idea to read it on demand.

Which other functions do you have in mind for it?
 
The reason I need to store all the information in the database is the analytics I intend to store and display for certain users, as I'm targeting a specific niche, which I obviously won't go into because that gives away my strategy. I'm only commenting here to come up with a solution for the OP, as the solution I have in place is currently working perfectly for my needs and I have no complaints. It's been tested under heavy strain (1,500 page views per minute) and worked as I'd expect.

The reason for the Memcached approach is that the OP says he has dedicated servers and is obviously not afraid to implement load balancers and other technology, so if he uses Memcached, this solution will work well for him. If you want to put the code in Chevereto, go ahead. Just make it available only to Memcached users, with a setting that says "Count unique views and analytics" or whatever, and only then run the code I supplied. It will satisfy the people who want these analytics, like me and the OP! (Well, not me, as I already have my solution.)

My code was written very quickly as I don't have the time right now, but if you want, I can come up with something better tomorrow when I'm at my work desk!
 
I understand, and my idea is also to store the access data in the database and link it to the user_id owning the content. The only difference is that you are adding one layer to the process (the PHP thing), something that I want to avoid (by reading and processing access_logs).

In the end both methods store the data in the Chevereto database; the thing to seek is which causes less impact on the machine.
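As a sketch of the access_log route, the parsed entries could be folded into per-image totals before a single batched database write (function and column names made up for illustration):

```php
<?php
// Aggregate raw (resource, bytes) tuples, e.g. taken from the access_log,
// into per-image hit and bandwidth totals for one batched DB update.
function aggregate_hits(array $entries): array
{
    $totals = [];
    foreach ($entries as $e) {
        $key = $e['resource'];
        if (!isset($totals[$key])) {
            $totals[$key] = ['hits' => 0, 'bytes' => 0];
        }
        $totals[$key]['hits']++;
        $totals[$key]['bytes'] += $e['bytes'];
    }
    // A real cron job would map each resource to an image_id/user_id and run
    // one "UPDATE ... SET hits = hits + ?, bytes = bytes + ?" per image.
    return $totals;
}
```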
 
Yeah, I plan to revise my method also. I am adding an extra layer, but mine was a "quick fix" for the sheer amount of traffic I was getting. The function I wrote for you up there is pretty good because it stores the count in memory and you can change the interval at which it updates, but it's not for everyone because not everyone uses Memcached.

I recommend doing methods for the popular data-caching/opcode-caching software like Memcached and APC for the people that *are* using them, and then having a fallback method for the people that aren't!
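The fallback idea could be as simple as capability detection (a sketch; the backend labels are made up):

```php
<?php
// Pick the best available counter backend: Memcached, then APC(u),
// then a plain log file that a cron job flushes to the database.
function pick_counter_backend(): string
{
    if (class_exists('Memcached')) {
        return 'memcached';
    }
    if (function_exists('apcu_inc') || function_exists('apc_inc')) {
        return 'apc';
    }
    return 'file';
}
```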
 
PHP 5.5 ships OPcache out of the box. I haven't tested it yet, but it could be a solution. It doesn't mean we don't need a fallback.
 
Thanks. I can program PHP, don't worry.

I'm currently working on a Google Chrome extension for my site (every Chevereto owner can use it for their own site too).

Yeah, it's fairly simple to do. I've created one already; it's on the Google app store. I was going to release a more advanced version for free. Also, the code I provided above offers a solution to what you wanted without the overhead mentioned by both me and Rodolfo. And yes, as you mentioned, PHP 5.5 does come with an opcode cache, but it's not safe to assume everyone is using PHP 5.5.

Good luck with your Chrome plugin and view counter. If you want a more in-depth version of what I created above or want a sneak peek at my bandwidth monitor, please feel free to give me a shout.
 
I'm also interested in this. Processing the access log periodically and constructing advanced image stats is a good idea. This would also give us information about hotlinked images, or better yet those that aren't hotlinked or haven't been accessed in a long time, allowing you to remove very old inactive ones.

I don't know about you, but most of the images hosted on my site are active for a short period after upload and then just take up space. With time this will become a burden.

Looking forward to this feature.
 