filePics

John Hurst

Version 2.1.2

20160728:012406

Table of Contents

1 The Main Program
1.1 Imports and Initialization
1.2 Collect the Command Line Options
1.3 Procedure Definitions
1.4 Camera Details: Load and Save
1.5 Process Images
1.6 Save New Camera Details
2 renamePhotos.py
3 Handling an explicit album subdirectory
4 Indices


1. The Main Program

This program deals with all details associated with downloading photographs from John's and Barb's digital cameras. It saves the originals in a dedicated directory, and loads the images, renamed with the date and time of exposure, into the photo albums. Images are saved to a directory rooted at $HOME/Pictures/cameraName, and with a sub-path name of year/month/day, representing the date taken.

Details of the current image number for each camera are stored in an XML format file in the location /home/ajh/etc/camera.xml. This file is the "catalog" file, and the path to it can be changed as the value of the variable catalogName in chunk <initialization 1.3,3.2>.

"filePics.py" 1.1 =
#!/usr/bin/python
# modified 20120903:141111 ajh to allow change of picture and album directory
# extensively modified 20130916:233517 ajh to use exiv2 library (v2.0.0)
# 20141105:150728 1.0.0 ajh first version two number, since adjusted to 2.1.0
<imports 1.2,3.1>
<initialization 1.3,3.2>
<collect the command line options 1.4>
<procedure definitions 1.5,1.6,1.7>
# real work starts here
<collect camera details 1.8>
<process images 1.9,1.14,1.15,1.16,1.17>
<save new camera details 1.19>
#catalogdom.unlink()

1.1 Imports and Initialization

<imports 1.2> =
import EXIF
import os
import re
import rlcompleter
import readline
import shutil
from subprocess import PIPE,Popen
import sys
from xml.dom import Node
import xml.dom.minidom
Chunk referenced in 1.1 2.1
Chunk defined in 1.2,3.1
<initialization 1.3> =
debug=False
noMove=False
saveCatalog=True
catalogName='/home/ajh/etc/camera.xml'
picdir='/home/ajh/Pictures'
albumdir='%s/Albums' % (picdir)
Chunk referenced in 1.1
Chunk defined in 1.3,3.2

The dictionary cameraNodes holds the current state of each camera (its last known image name), derived from the catalog file named by catalogName. The catalog is rewritten at the end of operations (subject to command line options, see <save new camera details 1.19>).

1.2 Collect the Command Line Options

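There are seven command line options:

a
set the album directory (the default is the Albums subdirectory of the picture directory)
c
do not save the catalogue on exit
d
turn on debugging (this also prints, rather than performs, the copy and move operations)
n
do not copy or move any image files; print the commands that would have been run
p
set the picture directory; this also resets the album directory to the Albums subdirectory of the new picture directory
s
define a subdirectory, below the normal date-identified directory, into which the images are to be placed
v
print version information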
<collect the command line options 1.4> =
(opts,args) = getopt.getopt(sys.argv[1:],'a:cdnp:s:v')
forceXmls=recurse=thumbsOnly=False; large=True
for (option,value) in opts:
    if option == '-a':
        albumdir=value
    elif option == '-c':
        saveCatalog=False
    elif option == '-d':
        debug=True
    elif option == '-n':
        noMove=True
    elif option == '-p':
        picdir=value
        albumdir='%s/Albums' % (picdir)
    <get subdirectory option value 3.3>
    elif option == '-v':
        print "<current version 5.1>"
        sys.exit(0)
Chunk referenced in 1.1
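
The option handling above can be tried in isolation. The following is a minimal sketch, written in Python 3 syntax (the program itself is Python 2), of how the option string 'a:cdnp:s:v' is interpreted by getopt. The function name parse_options, the settings dictionary and the sample arguments are hypothetical and are not part of filePics.py; the -v option is left out for brevity.

```python
import getopt

def parse_options(argv):
    """Hypothetical helper: parse filePics-style options into a dict."""
    settings = {
        "debug": False,
        "noMove": False,
        "saveCatalog": True,
        "picdir": "/home/ajh/Pictures",
        "albumdir": "/home/ajh/Pictures/Albums",
        "subDirectory": "",
    }
    # A letter followed by ':' takes a value; the others are plain flags.
    opts, args = getopt.getopt(argv, "a:cdnp:s:v")
    for option, value in opts:
        if option == "-a":
            settings["albumdir"] = value
        elif option == "-c":
            settings["saveCatalog"] = False
        elif option == "-d":
            settings["debug"] = True
        elif option == "-n":
            settings["noMove"] = True
        elif option == "-p":
            settings["picdir"] = value
            # -p also moves the album directory under the new picture directory
            settings["albumdir"] = "%s/Albums" % value
        elif option == "-s":
            settings["subDirectory"] = value
    return settings, args

# Sample invocation with made-up arguments.
settings, rest = parse_options(["-d", "-s", "Wedding", "-p", "/tmp/pics"])
print(settings["albumdir"])
```

Because -p resets the album directory, an -a option given before -p on the command line is overridden; giving -a after -p keeps the explicit album directory.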

1.3 Procedure Definitions

<procedure definitions 1.5> =
def stringValue(node):
    str=""
    for c in node.childNodes:
        if c.nodeType==Node.TEXT_NODE:
            str+=c.data
        else:
            str+=stringValue(c)
    return str
Chunk referenced in 1.1 2.1
Chunk defined in 1.5,1.6,1.7

Return the string value of a node, visiting all children and concatenating the values of any TEXT_NODEs.
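
As an illustration, here is a small sketch of the same traversal in Python 3 syntax, applied to a made-up fragment. The name string_value and the sample XML are hypothetical; only the recursion mirrors the procedure above.

```python
import xml.dom.minidom
from xml.dom import Node

def string_value(node):
    """Concatenate the data of every TEXT_NODE below node, in document order."""
    text = ""
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            text += child.data
        else:
            # descend into elements (and any other node type with children)
            text += string_value(child)
    return text

# Made-up fragment: the text is split across a nested element.
doc = xml.dom.minidom.parseString(
    "<camera><current>IMG_<b>0042</b>.JPG</current></camera>")
result = string_value(doc.documentElement)
print(result)
```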

<procedure definitions 1.6> =
def stripspaces(node):
    if node.nodeType==Node.TEXT_NODE:
        text=node.nodeValue.strip()
        #print "TextNode=(%s=>%s)" % (node.nodeValue,text)
        if text=='':
            parent=node.parentNode
            parent.removeChild(node)
    else:
        # iterate over a copy, since removeChild alters node.childNodes
        for n in list(node.childNodes):
            stripspaces(n)
Chunk referenced in 1.1 2.1
Chunk defined in 1.5,1.6,1.7

Recursively visit all TEXT_NODE descendants of this node. If the string value of a text node vanishes once spaces are stripped from its start and end (no non-blank characters), remove that node from its parent. Text nodes that contain non-blank characters are left unchanged.
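
A minimal sketch of this clean-up in Python 3 syntax, run on a made-up document, is shown below. The name strip_spaces and the sample XML are hypothetical. The sketch iterates over a copy of childNodes, because removing a child while looping over the live list can cause the following sibling to be skipped.

```python
import xml.dom.minidom
from xml.dom import Node

def strip_spaces(node):
    """Remove every text node below node that contains only whitespace."""
    if node.nodeType == Node.TEXT_NODE:
        if node.nodeValue.strip() == "":
            node.parentNode.removeChild(node)
    else:
        # list(...) takes a snapshot, so removals do not disturb the loop
        for child in list(node.childNodes):
            strip_spaces(child)

# Made-up, pretty-printed input with whitespace-only text nodes.
doc = xml.dom.minidom.parseString("<a>\n  <b> x </b>\n  <c>\n  </c>\n</a>")
strip_spaces(doc)
compact = doc.documentElement.toxml()
print(compact)
```

Only the whitespace-only nodes disappear; the text " x " inside b keeps its surrounding spaces.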

<procedure definitions 1.7> =
# names indexed by DOM nodeType (1..12); entries 5 and 11 were missing,
# which shifted every later name by one
nodeLookup=['None', 'ELEMENT_NODE', 'ATTRIBUTE_NODE', 'TEXT_NODE', \
            'CDATA_SECTION_NODE', 'ENTITY_REFERENCE_NODE', 'ENTITY_NODE', \
            'PROCESSING_INSTRUCTION_NODE', 'COMMENT_NODE', 'DOCUMENT_NODE', \
            'DOCUMENT_TYPE_NODE', 'DOCUMENT_FRAGMENT_NODE', 'NOTATION_NODE']
def treeprint(node,level):
    if node.nodeType==Node.TEXT_NODE:
        text=re.sub(r'(\r|\n)','\\\\n',node.nodeValue)
        print "%stext=>%s<=" % (" "*level,text)
    elif node.nodeType==Node.ELEMENT_NODE:
        print "%selem<%s>" % (" "*level,node.tagName)
        for n in node.childNodes:
            treeprint(n,level+1)
    else:
        nodeT=nodeLookup[node.nodeType]
        print "%s%s" % (" "*level,nodeT)
        for n in node.childNodes:
            treeprint(n,level+1)
Chunk referenced in 1.1 2.1
Chunk defined in 1.5,1.6,1.7

1.4 Camera Details: Load and Save

<collect camera details 1.8> =
# get camera.xml as string, no spaces
catfile=open(catalogName,'r')
catstr=''
for l in catfile.readlines():
    l=l.strip()
    catstr+=l
catalogdom=xml.dom.minidom.parseString(catstr)
print "input camera details:\n%s\n" % (catalogdom.toprettyxml(" "))
cameraNodes={}
photocatalog=catalogdom.documentElement
for child in photocatalog.childNodes:
    if child.nodeType!=Node.ELEMENT_NODE: continue
    # should be 'camera' elements
    # print child
    attributes = child.attributes
    for i in range(0,attributes.length):
        attr = attributes.item(i)
        if attr.nodeName=='name':
            thiscamera=attr.nodeValue
            break
        pass
    current=child.getElementsByTagName('current')[0]
    cameraNodes[thiscamera]=(child,current)
    pass
Chunk referenced in 1.1

Get all the currently known cameras and their last known image name. Each value in cameraNodes is a tuple holding the camera element node and its child element current, whose text content is the last image name.
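
To make the catalog layout concrete, the sketch below (Python 3 syntax) parses a made-up catalog and builds the same kind of dictionary. The camera and current element names and the name attribute are taken from the code above; the root element name photocatalog, the camera name D70, the file names and the helper load_cameras are assumptions for illustration only.

```python
import xml.dom.minidom
from xml.dom import Node

# Made-up catalog; the root element name is an assumption.
SAMPLE = """<photocatalog>
  <camera name="SX230"><current>IMG_0042.JPG</current></camera>
  <camera name="D70"><current>DSC_0107.JPG</current></camera>
</photocatalog>"""

def load_cameras(xml_text):
    """Map each camera name to (camera element, current element)."""
    dom = xml.dom.minidom.parseString(xml_text)
    cameras = {}
    for child in dom.documentElement.childNodes:
        if child.nodeType != Node.ELEMENT_NODE:
            continue  # skip whitespace text between camera elements
        name = child.getAttribute("name")
        current = child.getElementsByTagName("current")[0]
        cameras[name] = (child, current)
    return cameras

cameras = load_cameras(SAMPLE)
# The last image name is the data of the text node inside <current>.
last_names = {name: cur.firstChild.data for name, (cam, cur) in cameras.items()}
print(last_names)
```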

1.5 Process Images

Generate the new names for the files in their new locations. The original file is copied into a directory tree with root name equal to the camera name. This is for archival purposes. The tree is traversed using a path of year, month and day (the latter two zero left-filled to two places). Provided that no two images taken on the same model camera on the same day have the same name and number, this should suffice.
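
The archive directory described here can be sketched as follows (Python 3 syntax). The function name archive_dir and the sample values are hypothetical; the format string is the one the program uses for its year/month/day path.

```python
import posixpath

def archive_dir(picdir, camera, yr, mn, day):
    """Return picdir/camera/YYYY/MM/DD, zero-filling month and day."""
    daydir = "%04d/%02d/%02d" % (yr, mn, day)
    return posixpath.join(picdir, camera, daydir)

# Made-up example: an image taken on 8 July 2016 with the SX230.
path = archive_dir("/home/ajh/Pictures", "SX230", 2016, 7, 8)
print(path)
```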

This is not the original strategy, which was to renumber all images with a unique number for the particular camera. The strategy was changed (20090107:221852), since experience showed that the original name is more useful for tracking the file in the event of misadventure, and the loss of uniqueness in the name is adequately compensated by the subdirectory structure.

The file is also copied into the Albums subdirectory, again with a year/month/day subdirectory structure. This image is given a name which uniquely identifies it within the entire Albums substructure. Any clashes that may occur are resolved by appending a "-n" suffix, where n is a unique ordinal number.

With version 1.2.0 and later, an optional subdirectory below this point may be specified. This allows the easy creation of different collections of photos which may be taken on the same day, but by different people, or at different locations. At the moment (version 2.1.1), this subdirectory does not obey the unique naming convention. This will require more work, namely scanning the parent and other subdirectories for the same day to resolve potential name clashes.

<process images 1.9> =
# begin process images 1
<extract and sort image names 1.10>
for image in images:
    (grp,num,typ,file)=image
    if debug:
        print "processing image(grp=%s,num=%s,typ=%s,file=%s)" % (grp,num,typ,file)
    orgname="%s%s" % (grp,num)
    <extract EXIF data 1.11>
    <get date and time data for this image 1.12>
    <get the camera model information 1.13>
Chunk referenced in 1.1
Chunk defined in 1.9,1.14,1.15,1.16,1.17
<extract and sort image names 1.10> =
# begin extract and sort image names
cwd=os.getcwd()
files=os.listdir(cwd)
images=[]
lastnum=0
imageName='(IMG_|DSC_|DSC|S40)'
for file in files:
    res=re.match(imageName+'(.*)\.((JPG)|(jpg)|(AVI))$',file)
    if res:
        grp=res.group(1)
        num=res.group(2)
        typ=res.group(3)
        images.append((grp,num,typ,file))
        if debug:
            print "Found image, name parts are",
            print "group=%s, number=%s, type=%s, file=%s" % (grp,num,typ,file)
def imgcmp(a,b):
    (ag,an,at,af)=a ; (bg,bn,bt,bf)=b
    return cmp(an,bn)
images.sort(imgcmp)
if len(images) == 0:
    print "Could not find any image files. Are we in the right directory?"
    sys.exit(1)
# end extract and sort image names
Chunk referenced in 1.9
<extract EXIF data 1.11> =
args=["/usr/local/bin/exiv2", "-PEgnt",file]
output=Popen(args,stdout=PIPE).communicate()[0]
#print output
output=output.strip()
output=output.split('\n')
tags={}
for l in output:
    if l:
        bits=l.split()
        group=bits[0].strip(' ')
        tag=bits[1].strip(' ')
        val=' '.join(bits[2:])
        val=val.strip()
        tags[tag]=val
srttags=tags.keys()
srttags.sort()
if debug:
    print "Image %s has EXIF tags %s" % (file,tags)
    for tag in srttags:
        val="%s" % (tags[tag])
        if len(val) < 20:
            print "%s = %s" % (tag,val)
exiftags=tags
Chunk referenced in 1.9
<get date and time data for this image 1.12> =
yr=0
# try to extract a datetime for image
datetime=''
if exiftags.has_key('DateTime'):
    datetime=exiftags['DateTime']
elif exiftags.has_key('DateTimeOriginal'):
    datetime=exiftags['DateTimeOriginal']
if datetime:
    datetime="%s" % (datetime)
    if debug: print "image %s has datetime = %s" % (file,datetime)
    res=re.match(r'(\d\d\d\d):(\d\d):(\d\d) (\d\d):(\d\d):(\d\d)',datetime)
    if res:
        yr=int(res.group(1))
        mn=int(res.group(2))
        day=int(res.group(3))
        hr=int(res.group(4))
        min=int(res.group(5))
        sec=int(res.group(6))
if not datetime:
    print "Cannot read date/time information for image %s. Do you want to enter them now?" % (file)
    ok=raw_input("Enter 'y' to confirm: ")
    if re.match('y',ok):
        # must put these in a try to avoid errors buggering up
        try:
            yr=int(raw_input("year: "))
            mn=int(raw_input("month: "))
            day=int(raw_input("day: "))
            hr= int(raw_input("hour: "))
            min=int(raw_input("minute: "))
            sec=int(raw_input("second: "))
        except:
            print "some error"
            continue
    else:
        print "OK, skipping this image"
        continue
    pass
Chunk referenced in 1.9 2.1

We try to extract date and time details for this image, as this determines the name under which the image is filed. Collect this from the exif data, and if no data recovered, ask the user if she wants to enter it manually. Skip the image if not, otherwise read input for the date and time information.
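
The parsing step can be sketched on its own as follows (Python 3 syntax). The regular expression and the album name format are those used by the program; the helper names parse_exif_datetime and album_name and the sample value are hypothetical.

```python
import re

# EXIF date/time values look like "YYYY:MM:DD HH:MM:SS".
DATETIME_RE = re.compile(r'(\d\d\d\d):(\d\d):(\d\d) (\d\d):(\d\d):(\d\d)')

def parse_exif_datetime(value):
    """Return (yr, mn, day, hr, min, sec) as ints, or None if no match."""
    res = DATETIME_RE.match(value)
    if not res:
        return None
    return tuple(int(g) for g in res.groups())

def album_name(parts, typ):
    """Build the date-and-time file name used in the album tree."""
    return "%04d%02d%02d-%02d%02d%02d.%s" % (parts + (typ,))

parts = parse_exif_datetime("2016:07:28 01:24:06")
name = album_name(parts, "JPG")
print(name)
```

A value that does not match the pattern yields None, which a caller could treat in the same way as missing date and time data.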

<get the camera model information 1.13> =
#print "Looking for model"
make=exiftags['Make']
model=exiftags['Model']
matchpat="%s(.*)$" % (make)
res=re.match(matchpat,model)
if res:
    model=res.group(1)
    model=re.sub(' ','',model)
else:
    if debug: print "%s,%s did not match" % (matchpat,model)
    model=re.sub(' ','',model)
if debug: print "got model = %s" % (model)
## check for an alias
realModel=''
for key in cameraNodes.keys():
    if re.search(key,model):
        realModel=model
        model=key
        break
if realModel=='':
    # we found no matching model, must make a new one
    realModel=model
    if debug: print "Starting new camera model at %s" % (model)
    newCamera=catalogdom.createElement('camera')
    newCameraName=catalogdom.createAttribute('name')
    newCamera.setAttributeNode(newCameraName)
    newCamera.setAttribute('name',model)
    photocatalog.appendChild(newCamera)
    newCurrent=catalogdom.createElement('current')
    currentTextNode=catalogdom.createTextNode(file)
    newCurrent.appendChild(currentTextNode)
    newCamera.appendChild(newCurrent)
    # store the 'current' element (not its text node), to match the
    # tuples built in <collect camera details 1.8>; a text node has no
    # children, so the lookup of childNodes[0] below would fail
    cameraNodes[model]=(newCamera,newCurrent)
    if debug: print "new image at %s" % (num)
# now update the current node
# get the camera and current nodes for this model
(camNode,curNode)=cameraNodes[model]
# get the text node (the only child node) of the current node
textNode=curNode.childNodes[0]
# and update its value
textNode.data=file
xmlstring=catalogdom.toprettyxml(" ")
#print xmlstring
Chunk referenced in 1.9

Retrieve the camera model information. This is used to determine where to save the original images for backup purposes.

(20110812:103157) The strategy used is to search for model aliases as defined in the camera file (and hence stored as keys in the cameraNodes dictionary). If there is a substring match between the key and the EXIF model name, then the alias (as defined by the key) is used instead of the model name. The variable realModel is the name of the camera as defined by the maker, and model is the name used for saving the image files (the alias).

For example, the Canon SX230 model has a model name of "CanonPowerShotSX230HS" but I use the alias "SX230". Note that the alias must be a unique substring within the full model name.
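
The alias lookup can be sketched as below (Python 3 syntax). The function name resolve_alias, the alias D70 and the model name NIKOND90 are hypothetical; the matching uses re.search with the key as the pattern, as the program does.

```python
import re

def resolve_alias(model, aliases):
    """Return (name used for filing, maker's model name)."""
    for key in aliases:
        # the key is used as a regular expression, searched anywhere in model
        if re.search(key, model):
            return key, model
    # no alias matched: the model name itself is used for filing
    return model, model

alias, real = resolve_alias("CanonPowerShotSX230HS", ["D70", "SX230"])
print(alias, real)
```

Since the key is treated as a pattern rather than a literal string, an alias containing regular expression metacharacters would be interpreted as such; wrapping the key in re.escape is one way to force a literal match, should that ever matter.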

(20110726:174623) reduced this to a generic routine that automatically creates new directories as it finds new camera models.

<process images 1.14> =
    # begin process images 2
    # (this and the following chunks continue the body of the image loop)
    directory='%s/%s' % (picdir,model)
    daydir="%04d/%02d/%02d" % (yr,mn,day)
    newdir="%s/%s" % (directory,daydir)
    # make sure directory exists
    args="/bin/mkdir -p %s" % (newdir)
    res=Popen(args,shell=True).wait()
    if res:
        print "cannot create directory %s" % (newdir)
        sys.exit(1)
    elif debug:
        print "Created directory %s" % (newdir)
    #
    #new1="%s_%05d.%s" % (grp,num,typ)
    new1=orgname
    new2="%04d%02d%02d-%02d%02d%02d.%s" % (yr,mn,day,hr,min,sec,typ)
    datetime="%04d%02d%02d-%02d%02d%02d" % (yr,mn,day,hr,min,sec)
    new3="%s_%s" % (grp,num)
    archivename="%s-%s.%s" % (new3,datetime,typ)
    daydir="%04d/%02d/%02d" % (yr,mn,day)
    if subDirectory: daydir+="/%s" % subDirectory
    newfile1="%s/%s/%s" % (directory,daydir,archivename)
    directory2=albumdir
    newdir="%s/%s" % (directory2,daydir)
    # make sure directory exists
    args="/bin/mkdir -p %s" % (newdir)
    res=Popen(args,shell=True).wait()
    if res:
        print "cannot create directory %s" % (newdir)
        sys.exit(1)
    elif debug:
        print "Created directory %s" % (newdir)
    newfile2="%s/%s/%s" % (directory2,daydir,new2)
Chunk referenced in 1.1
Chunk defined in 1.9,1.14,1.15,1.16,1.17

The original file is also copied into a similar directory tree rooted in the Pictures/Albums directory. This file has a name that is generated from the date and time (name new2 and path newfile2), and if a file with the same name is found in the subdirectory, a (numeric) suffix is added to uniquely identify the image.
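
The suffix rule can be sketched as follows (Python 3 syntax). The function name unique_album_path and the temporary directory are hypothetical and serve only to demonstrate the rule; the name format follows the program's date-and-time convention.

```python
import os
import tempfile

def unique_album_path(directory, stem, typ):
    """Return a path in directory that does not clash with an existing file."""
    candidate = os.path.join(directory, "%s.%s" % (stem, typ))
    suffix = 1
    while os.path.isfile(candidate):
        # append -1, -2, ... until the name is free
        candidate = os.path.join(directory, "%s-%d.%s" % (stem, suffix, typ))
        suffix += 1
    return candidate

# Demonstration in a scratch directory: the plain name is already taken.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "20160728-012406.JPG"), "w").close()
    second = os.path.basename(unique_album_path(d, "20160728-012406", "JPG"))
print(second)
```

The check is only made within the one directory, which is why images filed into an explicit subdirectory are not covered by it.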

<process images 1.15> =
    # begin process images 3
    <add album.xml as required 1.18>
    # make sure no file overwrites existing images
    suffix=1
    while os.path.isfile(newfile2):
        new2="%04d%02d%02d-%02d%02d%02d-%d.%s" % (yr,mn,day,hr,min,sec,suffix,typ)
        newfile2="%s/%s/%s" % (directory2,daydir,new2)
        suffix+=1
    # let user know this image has been processed
    print "%s:%s->%s->%s" % (model,file,archivename,new2)
    if debug: print "%s copy to %s" % (file,newfile1)
    if debug: print "%s copy to %s" % (file,newfile2)
    # end process images 3
Chunk referenced in 1.1
Chunk defined in 1.9,1.14,1.15,1.16,1.17
<process images 1.16> =
    # begin process images 4
    # copy file to camera-year-month-day directory
    args = "/bin/cp" + ' ' + file + ' ' + newfile1
    if debug or noMove:
        print "(Potential) %s" % (args)
    else:
        res = Popen(args,shell=True).wait()
    # copy file to album-year-month-day directory
    args = "/bin/cp" + ' ' + file + ' ' + newfile2
    if debug or noMove:
        print "(Potential) %s" % (args)
    else:
        res = Popen(args,shell=True).wait()
    # move file to camera directory for backup
    args = "/bin/mv" + ' ' + file + ' ' + directory
    if debug or noMove:
        print "(Potential) %s" % (args)
    else:
        res = Popen(args,shell=True).wait()
    # check image for orientation and rotate as necessary
    # cleanImage now calls exifautotran.sh. Perhaps this should be changed?
    args = "/home/ajh/bin/cleanImage.sh %s" % (newfile2)
    if debug or noMove:
        print "(Potential) %s" % (args)
    else:
        res = Popen(args,shell=True).wait()
    # end process images 4
Chunk referenced in 1.1
Chunk defined in 1.9,1.14,1.15,1.16,1.17
<process images 1.17> =
    # begin process images 5
    # renames=open("%s/renames.txt" % (directory),'a')
    if debug or noMove:
        print "(Potential renames update) %s %s %s" % (file,new1,new2)
    #renames.write("%s %s %s\n" % (file,new1,new2))
    #renames.close()
    pass
    # end process images 5
Chunk referenced in 1.1
Chunk defined in 1.9,1.14,1.15,1.16,1.17
<add album.xml as required 1.18> =
newparent=newdir+"/.."
if not os.path.isdir(newdir):
    print "creating directory %s" % newdir
    os.makedirs(newdir)
while not os.path.isfile(newparent+"/album.xml"):
    newparent=newparent+"/.."
if not os.path.isfile(newdir+"/album.xml"):
    try:
        shutil.copyfile(newparent+"/album.xml",newdir+"/album.xml")
    except:
        print "Unable to open the album.xml file in %s" % (newparent)
        sys.exit(1)
Chunk referenced in 1.15

Check the directory in which this image is to be saved: (a) whether it exists, and (b) whether it has an album.xml. If it doesn't exist, make it, together with any missing parent directories (os.makedirs). If it has no album.xml, find one somewhere up the directory tree and copy that into this directory.
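
A minimal sketch of this check in Python 3 syntax follows. The function names find_album_xml and ensure_album_xml and the scratch directory are hypothetical. Unlike the chunk above, the sketch stops at the filesystem root and reports failure if no album.xml is found, instead of continuing to search.

```python
import os
import shutil
import tempfile

def find_album_xml(start_dir):
    """Return the nearest album.xml in an ancestor of start_dir, or None."""
    current = os.path.abspath(start_dir)
    while True:
        parent = os.path.dirname(current)
        if parent == current:
            return None  # reached the filesystem root without finding one
        candidate = os.path.join(parent, "album.xml")
        if os.path.isfile(candidate):
            return candidate
        current = parent

def ensure_album_xml(newdir):
    """Create newdir if needed and give it an album.xml; True on success."""
    os.makedirs(newdir, exist_ok=True)  # also creates missing parents
    target = os.path.join(newdir, "album.xml")
    if os.path.isfile(target):
        return True
    source = find_album_xml(newdir)
    if source is None:
        return False
    shutil.copyfile(source, target)
    return True

# Demonstration: an album root with album.xml and a new day directory.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "album.xml"), "w") as f:
        f.write("<album/>")
    day = os.path.join(root, "2016", "07", "28")
    copied = ensure_album_xml(day)
    has_copy = os.path.isfile(os.path.join(day, "album.xml"))
print(copied, has_copy)
```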

1.6 Save New Camera Details

<save new camera details 1.19> =
xmlstring=catalogdom.toprettyxml(" ")
if saveCatalog:
    newcat=open(catalogName,'w')
    newcat.write(xmlstring)
    newcat.close()
    print "Updated catalog file %s" % (catalogName)
Chunk referenced in 1.1

Output the updated photocatalog minidom.

2. renamePhotos.py

renamePhotos.py draws upon components of filePics.py to rename photos according to their date and time of exposure. Each image file passed in as a parameter is examined to determine its date and time of exposure, and this is used to rename the image file in accordance with the filePics.py conventions. If the file is not an EXIF file, or the date and time cannot be extracted, a warning message is printed on standard error, and the file is not renamed.

"renamePhotos.py" 2.1 =
#!/usr/local/bin/python
<imports 1.2,3.1>
import getopt
<procedure definitions 1.5,1.6,1.7>
(opts,args)=getopt.getopt(sys.argv[1:],'')
for arg in args:
    #print "Processing %s" % (arg)
    if os.path.isfile(arg):
        imgfld=open(arg,'r')
        exiftags={}
        try:
            exiftags=EXIF.process_file(imgfld)
        except:
            # terminate each warning with a newline so messages do not run together
            sys.stderr.write("%s has no exif data\n" % (arg))
            continue
        <get date and time data for this image 1.12>
        fullpath=os.path.abspath(arg)
        (dir,name)=os.path.split(fullpath)
        (name,ext)=os.path.splitext(name)
        clash=1;disc="";count=0
        while clash:
            newname="%s/%4d%02d%02d-%02d%02d%02d%s%s" % \
                    (dir,yr,mn,day,hr,min,sec,disc,ext)
            if os.path.isfile(newname):
                count+=1
                disc="-%d" % (count)
            else:
                clash=0
        # all clear to move to newname
        print "moving %s to %s" % (arg,newname)
        os.rename(arg,newname)
    else:
        sys.stderr.write("%s cannot be found\n" % (arg))

3. Handling an explicit album subdirectory

This is a bit of an experiment.

<imports 3.1> =
import getopt
Chunk referenced in 1.1 2.1
Chunk defined in 1.2,3.1
<initialization 3.2> =
subDirectory=""
Chunk referenced in 1.1
Chunk defined in 1.3,3.2
<get subdirectory option value 3.3> =
elif option == '-s':
    subDirectory=value
Chunk referenced in 1.4

If there is a subdirectory option, collect the supplied parameter.

4. Indices

File Name Defined in
filePics.py 1.1
renamePhotos.py 2.1
Chunk Name Defined in Used in
add album.xml as required 1.18 1.15
collect camera details 1.8 1.1
collect the command line options 1.4 1.1
current date 5.2
current version 5.1 1.4
extract EXIF data 1.11 1.9
extract and sort image names 1.10 1.9
get date and time data for this image 1.12 1.9, 2.1
get subdirectory option value 3.3 1.4
get the camera model information 1.13 1.9
imports 1.2, 3.1 1.1, 2.1
initialization 1.3, 3.2 1.1
procedure definitions 1.5, 1.6, 1.7 1.1, 2.1
process images 1.9, 1.14, 1.15, 1.16, 1.17 1.1
save new camera details 1.19 1.1
Identifier Defined in Used in
catalogName 1.3
model 1.13 1.13, 1.13, 1.13, 1.13, 1.13
new1 1.14
new2 1.14
newfile1 1.14
newfile2 1.14
realModel 1.13
stringValue 1.5 1.5
stripspaces 1.6 1.6
subDirectory 3.2
20061002:165529 ajh 1.0.0 first literate version
20061003:150821 ajh 1.0.1 fix bug that overwrites images if two taken at the same time
20061202:192114 ajh 1.0.2 add album.xml update when new directories added.
20061204:105335 ajh 1.0.3 improve documentation.
20070106:141016 ajh 1.0.4 fix bug to make all dirs in missing path 'os.makedirs'
20070412:133333 ajh 1.1.0 added renamePhotos.py program
20090708:104018 ajh 1.2.0 add ability to file pics into album subdirectories
20110726:174725 ajh 1.2.1 made camera model extraction generic
20110812:102047 ajh 1.2.2 add search for model aliases in the camera file before creating new model files
20120903:141111 ajh 1.3.0 allow change of picture and album directory
20130916:233517 ajh 2.0.0 edited filePics.py to filePics2.py to use exiv2, xlp file NOT updated
20160109:174814 ajh 2.1.0 add version number
20160109:174814 ajh 2.1.1 migrated back to xlp file, otherwise same as 2.1.0
20160728:012406 ajh 2.1.2 fixed update of camera.xml
<current version 5.1> = 2.1.2
Chunk referenced in 1.4
<current date 5.2> = 20160728:012406