Popular Downloads

Dynamic Ranking of Images

A new dynamic ranking system has been developed that looks at the log files of images that have been accessed in full size. These logs are maintained by the viewtrains script, and hence do not account for images accessed directly. Click on this link to see the New Dynamic Ranking. Warning! There are currently over 2500 images in this ranking. While only 25 get shown at one time, you may find browsing through this archive addictive!

Each time this dynamic ranking link is clicked, the rankings are recalculated. The scoring system assigns a vote value to each full-sized image access, and that value decays exponentially with the time since the access. Hence an image that has not been accessed for a while will gradually drop down the rankings, unless it is revisited by accessing the full-sized image again (through the viewtrains interface).

When an image's vote drops below a certain threshold (exact value yet to be finalised), a script (which runs once a day) removes the full-sized image from the catalogue. This is done to keep the catalogue at a reasonable size.

However, while images are removed, the thumbnails are kept, and in such cases, clicking the thumbnail not only votes for the image, but records a request for the full-sized image to be restored. This is done by the same script which removes images.

You vote for an image each time you click on a thumbnail, anywhere in the catalogue. Note that you can only vote once for an image per day - this is to stop people pushing their favourites up the list by repeatedly reloading the viewtrains page!
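As a rough illustration of the once-a-day rule, here is a minimal sketch. It assumes that "you" are identified by IP address and that "per day" means a calendar day; neither detail is spelled out above, and the function name count_votes and the entry layout are hypothetical, not taken from viewtrains.py.

```python
from datetime import datetime

def count_votes(entries):
    """Keep at most one vote per (image, IP address, calendar day).

    entries is an iterable of (image_name, ip_address, timestamp) tuples.
    This layout is an assumption made for the sketch.
    """
    seen = set()
    accepted = []
    for image, ip, when in entries:
        key = (image, ip, when.date())
        if key in seen:
            continue  # repeat click by the same address on the same day
        seen.add(key)
        accepted.append((image, ip, when))
    return accepted
```

With this rule, reloading the page ten times in an afternoon still counts as one vote, while a click the next morning counts again.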

How the Rankings are Computed

As pointed out above, each time a thumbnail is clicked to view the full image, a vote is recorded. However, as you may have guessed, it is not quite that simple! The method used to accumulate votes is effectively a simple digital low-pass filter (a "leaky" running total). If you are still interested, let me explain.

Each click on a thumbnail invokes a CGI script called viewtrains.py. The parameters passed to this script include the image name (such as "3801-5") and the IP address of the machine invoking the script (your computer). These parameters are recorded in a log file, together with the date and time of the request. (Incidentally, it is these parameters that Internet Explorer gets wrong, which prevents users of IE from viewing the full images.)

Each of these log file entries is regarded as a "vote" for the image. However, the vote's value is progressively reduced over time through a mathematical transformation known as exponential decay. Effectively, this decay multiplies the vote's value by a fixed factor, which is less than one, for every unit of time that has elapsed since the original vote (request) was made. Because this factor is less than one, the vote value gets smaller and smaller with every time unit, but never (theoretically at least) becomes exactly zero.

Every vote from every person accessing the images in this way gets recorded. You might have voted for (say) "3801-5" a week ago, and your vote has now decayed from 1.0 when you voted, to 0.627. Meanwhile Charlie Brown voted for the same image 3 days ago, so his vote has decayed only as far as 0.819. Together your votes are worth 0.627 + 0.819 = 1.446.
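The figures in this example can be reproduced with a few lines of Python. This is only a sketch: the time constant of 15 days is an assumption, chosen because it happens to match the example figures (it corresponds to multiplying by about 0.9355 per day), and the name decayed_value is made up for illustration. The constant actually used by the ranking script is not stated here.

```python
import math

# Assumption: inferred from the example figures, not a documented setting.
TIME_CONSTANT_DAYS = 15.0

def decayed_value(age_days, time_constant=TIME_CONSTANT_DAYS):
    """Value of a vote that was worth 1.0 when cast, age_days ago."""
    return math.exp(-age_days / time_constant)

week_old = decayed_value(7)      # about 0.627
three_days = decayed_value(3)    # about 0.819
total = week_old + three_days    # about 1.446
```

A vote never reaches exactly zero under this formula; after 60 days it is still worth about 0.018.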

Now it so happens (and you can prove this mathematically, but I haven't) that all the votes made by different people voting for the same image can be summed together, and all decayed together. Then each time a vote is cast, we add 1.0 to the image's current vote tally, and then continue decaying it towards zero. This has the big advantage that we don't need to keep decaying everyone's different votes, only the tally for each image. We therefore maintain a file containing the name of every image in the system, together with its (single) vote value, and most importantly, the date and time of this set of rankings.
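For the curious, the argument is short. The notation is introduced here for illustration only: lambda is the decay rate, t_1 ... t_n are the times at which votes for one image were cast, t_0 is the time the tally was last computed, and T(t) is the tally at time t.

```latex
% Tally at time t_0, from votes cast at times t_1, ..., t_n (all <= t_0):
T(t_0) = \sum_{i=1}^{n} e^{-\lambda (t_0 - t_i)}

% At any later time t >= t_0, split each exponent at t_0:
T(t) = \sum_{i=1}^{n} e^{-\lambda (t - t_i)}
     = \sum_{i=1}^{n} e^{-\lambda (t - t_0)} \, e^{-\lambda (t_0 - t_i)}
     = e^{-\lambda (t - t_0)} \, T(t_0)

% A new vote cast at time t simply adds 1 to the decayed tally:
T(t) \leftarrow e^{-\lambda (t - t_0)} \, T(t_0) + 1
```

The common factor comes out of the sum because every vote decays at the same rate, which is why only the tally and its timestamp need to be stored.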

This set of rankings is rebuilt from time to time, by decaying all the values by a factor computed from the length of time since they were computed, then adding in the (appropriately decayed) values of votes cast since then. If you look at the data recorded on the rankings pages, you'll see:
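The rebuild step described above might look something like the following sketch. It is not the code behind the rankings page: the function names, the 15-day time constant, the in-memory data layout and the second image name "R761-2" are all assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

TIME_CONSTANT_DAYS = 15.0  # assumed value; the real constant is not given

def decay_factor(start, end, time_constant=TIME_CONSTANT_DAYS):
    """Factor by which a value shrinks between two datetimes."""
    days = (end - start).total_seconds() / 86400.0
    return math.exp(-days / time_constant)

def rebuild_rankings(old_tallies, old_time, new_votes, now):
    """Decay the stored tallies, add votes cast since old_time, and re-sort.

    old_tallies: dict mapping image name to tally as of old_time
    new_votes:   list of (image name, datetime) pairs cast after old_time
    Returns (ranked list of (image, tally), votefactor applied to old tallies).
    """
    votefactor = decay_factor(old_time, now)
    tallies = {image: value * votefactor for image, value in old_tallies.items()}
    for image, cast_at in new_votes:
        # Each new vote started at 1.0 and has decayed since it was cast.
        tallies[image] = tallies.get(image, 0.0) + decay_factor(cast_at, now)
    ranked = sorted(tallies.items(), key=lambda item: item[1], reverse=True)
    return ranked, votefactor

# Usage with made-up data: one stored tally plus two votes since then.
last_run = datetime(2005, 1, 1)
now = last_run + timedelta(days=7)
votes = [("3801-5", last_run + timedelta(days=4)),
         ("R761-2", last_run + timedelta(days=6))]  # hypothetical image name
ranked, votefactor = rebuild_rankings({"3801-5": 1.0}, last_run, votes, now)
```

Because only the votes cast since the last rebuild are read, the cost of each rebuild depends on recent traffic rather than on the whole history of the log.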

votefactor
This is the factor by which old rankings are decayed.
Ranking data input
This is the time it took to read the old rankings and decay them by the votefactor.
Logfile input
This is the time it took to read the votes cast since the last set of rankings was made.
Input analysis and sorting
This is the time it took to add the old and new vote values, and re-sort them into the new order.
Data ranking
This is the time it took to rebuild the web page.

If you happened to have used the previous version of this system, you may have noticed that it was taking up to 19 seconds to read the logfile (containing votes cast back in January 2005!) of over 32,000 entries. Now it takes less than 2 seconds for the combined reads from the rankings and logfiles. (Times are elapsed times, not server CPU times, and hence depend upon server load.)

This page is copyright, and maintained by John Hurst.

Dynamically generated at 20240425:1254 from an XML file modified on 20180703:0447, by index.py version 1.6.5.