Manual for the dipsoc league table generating scripts
_____________________________________________________

Day-to-day usage:

Each game that is to be included in the tables should have its own file.
All the files for the season 2008/2009 should lie in the folder data/2009/
(this and all other locations are relative to the dipsoc/dipstats/ folder).
The convention is to name the file after the term and the week in which the
meeting was held, e.g. M3 for the third week of Michaelmas term, adding a,
b, etc. if there was more than one game.

The name has only two effects. If it begins with an M then the long form of
the link to the game is different, e.g. in the data/2009/ folder the game
M3 will be linked to as 2008-M3, while L3 is linked to as 2009-L3. If the
second character is an X then the game is taken to be an all-day game.

The following example is the file data/2008/M1a:

    Standard 1904 *
    A Ben Ravenhill 5
    E Zhi Qi Soh 8
    F Gayle Goh 6
    G Samuel Siljander 3
    I Matthew Cliffe 3
    R Benny Talbot 8
    T Miguel Rodriguez 0 1904
    Neutral 1
    # SWE neutral at game end
    # not 100% certain of names of G and T

The first line contains the variant name and the last year played. The *
means that the game will not be included in the averages, in this case
because it is a newbie game. (The feature was originally added to allow
all-day games and unbalanced variants to be excluded, but that should now
be done automatically.)

The following lines contain one entry for each power: its initial, the name
of the player, the final SC count and, if the SC count is 0, the year of
elimination. The name of a player must be written in exactly the same way
in every game s/he has played, otherwise the script will not recognise the
entries as belonging to the same player. (A "fake" player name, e.g. NN or
Democracy, can be prefixed with * to prevent the creation of a player
entry.)

The variant information in the script defines the starting year (e.g. 1901
for Standard).
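As an illustration of the file format described above, the following Python
sketch parses the contents of a game file into a simple structure. It is
not the code used in dshist.py: the names parse_game and PowerEntry, and the
assumption that the neutral line begins with the word Neutral (as in the
example), are hypothetical. Years are kept as strings exactly as written.

```python
# Hypothetical sketch of a game-file parser; NOT the code in dshist.py.
from typing import NamedTuple, Optional


class PowerEntry(NamedTuple):
    initial: str               # power initial, e.g. "A"
    player: str                # player name without any leading "*"
    fake: bool                 # True if the name was prefixed with "*"
    centres: int               # final SC count
    eliminated: Optional[str]  # year of elimination as written, else None


def parse_game(text):
    """Parse the contents of one game file into a dict (assumed layout)."""
    # Ignore blank lines and comment lines beginning with "#".
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    # First line: variant name, last year played, optional "*".
    header = lines[0].split()
    excluded = header[-1] == "*"
    if excluded:
        header = header[:-1]
    game = {
        "variant": " ".join(header[:-1]),
        "last_year": header[-1],
        "excluded": excluded,
        "powers": [],
        "neutrals": 0,
    }

    for ln in lines[1:]:
        fields = ln.split()
        if fields[0] == "Neutral":  # assumed keyword, taken from the example
            game["neutrals"] = int(fields[1])
            continue
        initial, rest = fields[0], fields[1:]
        # Names may contain spaces, so read the numbers from the right:
        # an eliminated power ends "0 <year>", any other ends "<SC count>".
        if len(rest) >= 3 and rest[-2] == "0":
            centres, eliminated, name = 0, rest[-1], " ".join(rest[:-2])
        else:
            centres, eliminated, name = int(rest[-1]), None, " ".join(rest[:-1])
        if centres == 0 and eliminated is None:
            raise ValueError("missing year of elimination: " + ln)
        fake = name.startswith("*")
        game["powers"].append(
            PowerEntry(initial, name.lstrip("*"), fake, centres, eliminated))
    return game


SAMPLE = """\
Standard 1904 *
A Ben Ravenhill 5
E Zhi Qi Soh 8
F Gayle Goh 6
G Samuel Siljander 3
I Matthew Cliffe 3
R Benny Talbot 8
T Miguel Rodriguez 0 1904
Neutral 1
# SWE neutral at game end
# not 100% certain of names of G and T
"""

game = parse_game(SAMPLE)
print(game["variant"], game["last_year"], game["excluded"])
print(sum(p.centres for p in game["powers"]) + game["neutrals"])
```

For the example file the SC counts plus the neutral add up to 34, the number
of supply centres in Standard.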
If one cannot remember the starting year then years can be written as +n to
indicate the nth year played.

The program will print an error message if the total SC count is more than
the total available in the variant (but will still score the game). If it
is less than the total SCs in the variant then a warning is printed. If
there were neutral SCs at the end of the game then the warning can be
avoided by adding an extra line specifying the number of neutrals.

Lines which begin with # are comments, and are ignored completely by the
script.

To be included in the tables the game must be listed in the main script,
dshist.py. At the start of the script are two lists called files and years.
years contains the names of the folders to be scored, and files contains
the names of the files in those folders.

Run the script by typing

    python dshist.py

in the command prompt. The script should print a lot of messages about
reading all the files and generating tables, along with the odd warning
message about SCs not adding up for some of the old games. If the script
crashes it is most likely because of an error in the most recently added
files. A typical culprit is forgetting to indicate the year of elimination
for an eliminated power.

The script writes html files to the folder stats/. This folder is also
symlinked from the dipsoc/public_html folder, so that the contents can be
viewed at http://www.dip.soc.ucam.org/stats/

The script may also print a message about

    Unknown first names written to data/names_u

The script uses the players' first names to generate gender statistics. For
this to work you should move the names from names_u to names_m or names_f
(or names_n) as appropriate.

As pointed out above, players' names must be written in exactly the same
way every time for the script to identify them.
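Because a misspelt name silently creates a second player entry, it can be
worth looking for near-identical names before running the script. The
sketch below is one way to do that with Python's standard difflib module.
The function name near_duplicates and the cutoff of 0.85 are illustrative
assumptions, and nothing like it is part of dshist.py.

```python
# Hypothetical helper for spotting likely misspellings of player names.
import difflib


def near_duplicates(names, cutoff=0.85):
    """Return pairs of distinct names that look suspiciously similar."""
    unique = sorted(set(names))
    pairs = []
    for i, name in enumerate(unique):
        # Compare only against later names so each pair is reported once.
        matches = difflib.get_close_matches(
            name, unique[i + 1:], n=5, cutoff=cutoff)
        pairs.extend((name, other) for other in matches)
    return pairs


# Example list; in practice the names would be collected from the game files.
names = ["Ben Ravenhill", "Gayle Goh", "Ben Ravenhil", "Benny Talbot"]
suspects = near_duplicates(names)
for first, second in suspects:
    print("possible duplicate:", first, "/", second)
```

A check of this kind only flags candidates. Whether two spellings really
belong to the same person still has to be decided by hand.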
You can check whether extra player entries have been created by mistake
either from the output of the script, or by looking at the fulltable.html
pages (these include all player entries, while the usual table.html linked
from the main pages will, after the first couple of weeks of term, cull
those who have played only a single game).

Player names containing special symbols can be a bit of an issue.
Currently, all html files are created with the instruction
"charset=ISO-8859-1" in the header, so the data files need to use the
latin-1 encoding for the symbols to display correctly. (In vim you can use
":set fenc" to check the file encoding, and ":set enc" to check the current
encoding.) The file symbols contains some common foreign characters in
latin-1 which you can copy and paste if necessary. PuTTY tends to use
latin-1 by default (it is set under Window -> Translation). If different
files use different encodings then extra player entries may be created for
players whose names contain special symbols.

___________________________________________________________________________

Back-ups:

Make back-ups every now and then. The easiest way is to type

    tar -zcf filename.tar.gz dshist.py readme data/

and copy the resulting file filename.tar.gz somewhere safe. The archive can
be expanded by running

    tar -xf filename.tar.gz

(The filename extensions don't matter, but .tar is conventional for tar
archives, and .gz for gzipped files; the -z option gzips the output.)

Ownership and permissions:

It is important to ensure that the stats files have the correct permission
settings. First, a description of the permissions. Each file or directory
has an owner, a group and permission settings. These can be viewed with the
command ls -l.
E.g., at the time of typing the following are among the contents of
dipstats/:

    drwxrwsr-x 16 jn226 dipsoc  4096 2009-10-30 08:51 data
    -rwxrwxr--  1 jn226 dipsoc 48124 2009-10-30 08:02 dshist.py
    -rw-rw-r--  1 jn226 dipsoc  4554 2009-11-05 10:56 readme

All three are owned by me (username jn226) and belong to the group dipsoc.
The string on the left indicates the permission settings.

The first character in the string shows whether the entry is a directory or
a file (data is a directory, the other two are files).

The next three characters indicate whether the owner (i.e. jn226) has read,
write and execute access to the file (- indicates that I do not). In this
case, dshist.py is an executable script, while readme is not. For a
directory, read access means that the contents can be viewed (with ls),
write access that files in the directory can be created, deleted or
renamed, and execute access that it can be made the current working
directory.

The next three characters show the group permissions, i.e. for users who
are not the owner, but who belong to the group of the file. If you are
reading this then your SRCF account presumably belongs to the dipsoc group
(this is controlled by the SRCF sysadmins). The meaning of the three
characters is as for the owner permissions. Note however the 's' in the
entry for data. This means that in addition to having group execute
permission set, the directory also has the "SGID" bit set (S would indicate
that SGID is set, but execute is not). For a directory, setting SGID
ensures that files created in the directory by default belong to the group
of the directory (here dipsoc) rather than to the default group of the user
who created them, and that new subdirectories inherit the SGID bit. The
permission bits of a new file are not affected by SGID; they are determined
by the creating user's umask (e.g. -rw-rw-r-- with umask 002, or -rw-r--r--
with umask 022).

Finally, the last three characters show permissions for all users who do
not belong to the group. When the directory is visible on the web (like
data/ is, because it is symlinked from dipsoc/public_html/) this includes
web users. If the last r bit is not set (e.g.
a file has -rw-r----- instead of -rw-rw-r--) then a web user typing in the
URL of the file will get a 403 permission denied error.

IMPORTANT: all files in the dipsoc directory should have group read and
write permission, and all directories should in addition have group execute
permission. Otherwise there is all sorts of trouble if several users try to
make changes, or when one user hands over the maintenance to another (in
particular, the dshist.py script will abort if it tries to write to a file
that the current user does not have write permission for). To help ensure
that the files have the correct group and permissions, all directories
should also have SGID set as explained above. Moreover, files and
directories should generally be readable by outside users. Thus the CORRECT
PERMISSIONS are -rw-rw-r-- for files (except executables) and drwxrwsr-x
for directories.

The permissions are changed with the command chmod (but only the *owner*
can change permissions). In particular, before handing over maintenance to
another user you may want to type

    chmod -R g+w *

in dipsoc/ to ensure that the next maintainer has write access to files
owned by you (the -R option applies the command recursively to
subdirectories).

___________________________________________________________________________

Old statistics:

The dipstats/ folder contains two directories called oldstats/ and
olderstats/. The oldstats/ directory contains a parallel copy of the league
tables, using the scoring algorithm that was in use up to 2007. olderstats/
contains pre-2002 league tables constructed manually or with yearly
scripts. Both directories are symlinked from public_html/. The league table
links from the society.html page point to whatever table was used at the
time, i.e. the ones in oldstats/ or olderstats/ for seasons before 2007.
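
Checking permissions:

The permission rules given under "Ownership and permissions" can also be
checked mechanically. The following Python sketch walks a directory tree
and lists every file that lacks group read/write or world read permission,
and every directory that lacks group read/write/execute, SGID or world
read/execute. It is an illustration only: the names missing_bits and audit
are hypothetical, and no such tool exists in dipstats/.

```python
# Hypothetical permission audit; not part of the dipstats scripts.
import os
import stat

# Bits every regular file should have: group read/write and world read.
FILE_REQUIRED = stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH
# Bits every directory should have: group rwx, SGID, world read/execute.
DIR_REQUIRED = stat.S_IRWXG | stat.S_ISGID | stat.S_IROTH | stat.S_IXOTH


def missing_bits(mode):
    """Return the required permission bits that are absent from mode."""
    required = DIR_REQUIRED if stat.S_ISDIR(mode) else FILE_REQUIRED
    return required & ~mode


def audit(root):
    """Return (path, mode string) pairs for entries breaking the rules."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Each directory is checked once, when it is visited as dirpath.
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # the permissions of a symlink itself are irrelevant
            if missing_bits(mode):
                problems.append((path, stat.filemode(mode)))
    return problems


if __name__ == "__main__":
    for path, mode in audit("."):
        print(mode, path)
```

Run from dipsoc/, this prints one line per offending entry in the style of
the first column of ls -l. Only the owner of each listed entry can then fix
it with chmod.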