person.py - John's Family Web program

John Hurst

Version 1.0

Abstract

1. Introduction

This program provides cgi support to John Hurst's genealogical web. This web is organized as a collection of XML files, one per person, that define the family relationships between people ultimately related to Angas John Hurst (the author). The program traverses these files, showing variously:

  1. Information about the person themselves, including birth and death dates, spouse and offspring;
  2. Ancestors of a given person; and
  3. Descendants of a given person.

2. The Main Program

"person.py" 1 =
#!/usr/bin/python
from xml.dom.minidom import parse, parseString, Node
import string
import xml.dom.minidom
import os,sys,re,getopt
import urllib
import cgi
import cgitb; cgitb.enable() # do error handling of cgi
#from xml.dom.ext import Print,PrettyPrint

TREEHOME="/Library/WebServer/Documents/ajh/family/tree/persons"
SCRIPT="http://localhost/cgi-bin/ajh/person.py"
SCRIPTNAME=SCRIPT+"?name="

<procedure and class declarations 7,8,9,10,11>
<main body 2,3,4,5,6>

TREEHOME
defines the directory where the person XML files are kept.
SCRIPT
defines the URL of this script.
SCRIPTNAME
defines the URL of this cgi script plus parameters.
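
As a rough illustration of how these three constants are combined elsewhere in the program, the sketch below builds the XML file path and the link URL for one person id. It is written in Python 3 syntax (the listing itself is Python 2), and the helper names person_file and person_url are hypothetical; the program does this string concatenation inline.

```python
from urllib.parse import quote

# Values copied from the main program chunk.
TREEHOME = "/Library/WebServer/Documents/ajh/family/tree/persons"
SCRIPT = "http://localhost/cgi-bin/ajh/person.py"
SCRIPTNAME = SCRIPT + "?name="

def person_file(pid):
    """Hypothetical helper: path of the XML file holding this person."""
    return TREEHOME + "/" + pid + ".xml"

def person_url(pid):
    """Hypothetical helper: URL asking the script to display this person.

    The id is percent-encoded so that a '+' generation suffix is not
    decoded as a space when the query string is read back.
    """
    return SCRIPTNAME + quote(pid)

print(person_file("AngasJohnHurst-0"))
print(person_url("AngasJohnHurst-0"))
```

Quoting matters mainly for ids with a "+n" suffix, since an unencoded "+" in a query string is read back as a space.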
<main body 2> =
print "Content-Type: text/html\n\n";
form = cgi.FieldStorage()
Chunk referenced in 1
Chunk defined in 2,3,4,5,6

Start a new cgi html page, and collect the cgi parameters.

<main body 3> =
if form.has_key("name"):
    pid=form["name"].value
    person=Person(pid)
    person.loadPerson()
    printPerson(person.dom,person,0)
Chunk referenced in 1
Chunk defined in 2,3,4,5,6

If this is a name call, collect the person id, load the person, and call the printPerson routine. Control then falls through the remaining tests and the program exits.

<main body 4> =
if form.has_key("descendants"):
    pid=form["descendants"].value
    person=Person(pid)
    person.loadPerson()
    printDescendants(person,0)
Chunk referenced in 1
Chunk defined in 2,3,4,5,6

If this is a descendants call, collect the person id, load the person, and call the printDescendants routine. Control then falls through the remaining tests and the program exits.

<main body 5> =
if form.has_key("ancestors"):
    pid=form["ancestors"].value
    person=Person(pid)
    person.loadPerson()
    printAncestors(person,0)
Chunk referenced in 1
Chunk defined in 2,3,4,5,6

If this is an ancestors call, collect the person id, load the person, and call the printAncestors routine. Control then falls through the remaining test and the program exits.

<main body 6> =
if form.has_key("match"):
    text=form["match"].value
    printAll(text)
Chunk referenced in 1
Chunk defined in 2,3,4,5,6

If this is a match call, collect the match pattern and call the printAll routine. The program then exits by falling off the end.
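
The four main body tests are independent "if" statements rather than an if/elif chain, so a request carrying more than one parameter would produce more than one page. The following sketch, in Python 3 syntax and using a hypothetical function name requested_actions, only lists which actions a given query string would trigger and in what order; it does not call the print routines.

```python
from urllib.parse import parse_qs

# Parameter names in the order the main body chunks test them.
ACTIONS = ["name", "descendants", "ancestors", "match"]

def requested_actions(query_string):
    """Return (parameter, value) pairs in the order they would be handled."""
    params = parse_qs(query_string)
    return [(key, params[key][0]) for key in ACTIONS if key in params]

print(requested_actions("ancestors=AngasJohnHurst-0"))
# A request with two parameters triggers two actions, "name" first.
print(requested_actions("match=Hurst&name=AngasJohnHurst-0"))
```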

3. Procedure and class declarations

<procedure and class declarations 7> =
def getTextContent(node):
    if node:
        if node.nodeType == Node.ELEMENT_NODE:
            str=""
            for p in node.childNodes:
                if p.nodeType == Node.TEXT_NODE:
                    str = str+p.nodeValue
            return str
        return "*** Not Element Node ***"
    else:
        return ""

def getElement(n,tag):
    #print n.nodeType
    for p in n.childNodes:
        if p.nodeType == Node.ELEMENT_NODE:
            if p.nodeName == tag:
                return p
    return None

def getElements(n,tag):
    ellist=[]
    for p in n.childNodes:
        if p.nodeType == Node.ELEMENT_NODE:
            if p.nodeName == tag:
                ellist.append(p)
    return ellist

def pidExists(pid):
    filename=TREEHOME+"/"+pid+".xml"
    return os.path.isfile(filename)

<PersonClass 12>

def getNameAndEra(person):
    person.loadPerson()
    link=person.name
    if person.loaded:
        thispid=SCRIPTNAME+urllib.quote(person.pid)
        link="<A HREF=\"%s\">%s</A>" % (thispid,link)
    link = link + " " + person.gen
    if person.birthdate or person.deathdate:
        era="("+person.birthdate+" - "+person.deathdate+")"
        link=link+" "+era
    return link
Chunk referenced in 1
Chunk defined in 7,8,9,10,11
<procedure and class declarations 8> =
def printPerson(node,person,indent):
    print "<html>\n"
    ind=""
    for i in range(indent):
        ind = ind+" "
    person.loadPerson()
    if person.loaded:
        print " <head>\n"
        print " <title>%s</title>\n" % (person.name)
        print " </head>\n"
        print " <body>\n"
        print " <h1>%s (%s)</h1>\n" % (person.name,person.gen)
        print " <dl>\n"
        print " <dt>BORN</dt>\n"
        print " <dd>%s %s</dd>\n" % (person.birthdate,person.birthplace)
        if person.birthnotep:
            print " <dd><b>Notes:</b> <i>%s</i></dd>\n" % \
                (getTextContent(person.birthnotep))
        if person.deathdate or person.deathplace:
            print " <dt>DIED</dt>\n"
            print " <dd>%s %s</dd>\n" % (person.deathdate,person.deathplace)
            if person.deathnotep:
                print " <dd><b>Notes:</b> <i>%s</i></dd>\n" % \
                    (getTextContent(person.deathnotep))
        if person.fatherp or person.motherp:
            print " <dt>PARENTS</dt>\n"
            if person.fatherp:
                link=getNameAndEra(person.fatherp)
                print " <dd><i>Father </i>%s</dd>\n" % (link)
            if person.motherp:
                link=getNameAndEra(person.motherp)
                print " <dd><i>Mother </i>%s</dd>\n" % (link)
        if person.elist:
            print " <dt>EDUCATED</dt>\n"
            for (edp,edd,edn) in person.elist:
                print " <dd>%s</dd>\n" % (edp)
        for (mplacep,mdatep,mnotesp,spousep,chlist) in person.mtuple:
            link=getNameAndEra(spousep)
            print " <dt>SPOUSE</dt>\n"
            print " <dd>%s</dd>\n" % (link)
            if mplacep or mdatep:
                print " <dd>m. %s %s</dd>\n" % (getTextContent(mdatep),getTextContent(mplacep))
            if mnotesp:
                print " <dd>Notes: %s</dd>\n" % (getTextContent(mnotesp))
            if chlist:
                print " <dt>CHILDREN</dt>\n"
                for child in chlist:
                    link = getNameAndEra(child)
                    print " <dd>%s</dd>\n" % (link)
        if person.notesp:
            print " <dt>NOTES:</dt>\n"
            print " <dd>%s</dd>\n" % (getTextContent(person.notesp))
        print " </dl>\n"
        print " <p>"
        print " <form action=\"%s\">" % (SCRIPT)
        print " <input type=\"hidden\" name=\"descendants\" value=\"%s\"/>" % (person.pid)
        print " <input type=\"submit\" value=\"Show Descendants of %s\"/>" % (person.name)
        print " </form>"
        print " <form action=\"%s\">" % (SCRIPT)
        print " <input type=\"hidden\" name=\"ancestors\" value=\"%s\"/>" % (person.pid)
        print " <input type=\"submit\" value=\"Show Ancestors of %s\"/>" % (person.name)
        print " </form>"
        printShowMatches(person.pid)
        print " </p>\n"
        print " </body>\n"
    print "</html>\n"
Chunk referenced in 1
Chunk defined in 7,8,9,10,11
<procedure and class declarations 9> =
def scanDesc(person,indent):
    ind=""
    for i in range(indent):
        ind = ind+"...."
    person.loadPerson()
    if person.loaded:
        link=getNameAndEra(person)
        print "%s%s<BR/>\n" % (ind,link)
        for (mplacep,mdatep,mnotesp,spousep,chlist) in person.mtuple:
            for child in chlist:
                scanDesc(child,indent+1)
    else:
        print ind+person.name+"<BR/>\n"

def printDescendants(person,indent):
    print "<html>\n"
    ind=""
    for i in range(indent):
        ind = ind+" "
    person.loadPerson()
    if person.loaded:
        print " <head>\n"
        print " <title>%s</title>\n" % (person.name)
        print " </head>\n"
        print " <body>\n"
        print " <h1>Descendants of %s (%s)</h1>\n" % (person.name,person.gen)
        scanDesc(person,0)
        printShowMatches(person.pid)
        print " </body>\n"
    print "</html>\n"
Chunk referenced in 1
Chunk defined in 7,8,9,10,11
<procedure and class declarations 10> =
def scanAncs(person,indent):
    ind=""
    for i in range(indent):
        ind = ind+"...."
    person.loadPerson()
    if person.loaded:
        link=getNameAndEra(person)
        print "%s%s<BR/>\n" % (ind,link)
        if person.fatherp or person.motherp:
            if person.fatherp:
                scanAncs(person.fatherp,indent+1)
            else:
                print ind+"....(father unknown)<BR/>\n"
            if person.motherp:
                scanAncs(person.motherp,indent+1)
            else:
                print ind+"....(mother unknown)<BR/>\n\n"
    else:
        print ind+person.name+"<BR/>\n"

def printAncestors(person,indent):
    print "<html>\n"
    ind=""
    for i in range(indent):
        ind = ind+" "
    person.loadPerson()
    if person.loaded:
        print " <head>\n"
        print " <title>%s</title>\n" % (person.name)
        print " </head>\n"
        print " <body>\n"
        print " <h1>Ancestors of %s (%s)</h1>\n" % (person.name,person.gen)
        scanAncs(person,0)
        printShowMatches(person.pid)
        print " </body>\n"
    print "</html>\n"
Chunk referenced in 1
Chunk defined in 7,8,9,10,11
<procedure and class declarations 11> =
xmlmatch=re.compile('([A-Za-z]+)((\-|\+)([0-9]+))\.xml')

def printAll(text):
    print "<html>\n"
    print " <head>\n"
    print " <title>All Persons Matching %s</title>\n" % (text)
    print " </head>\n"
    print " <body>\n"
    print " <h1>All Persons Matching '%s'</h1>\n" % (text)
    dirlist=os.listdir(TREEHOME)
    textmatch=re.compile(text)
    count=0
    for filename in dirlist:
        res=xmlmatch.match(filename)
        if res:
            pid=res.group(1)+res.group(2)
            res=textmatch.search(pid)
            #print "pid=%s, text=%s<BR/>" % (pid,text)
            if res:
                person=Person(pid)
                link=getNameAndEra(person)
                print "%s<BR/>\n" % (link)
                count+=1
    if count==0:
        print "<P>No matches found "
        if string.count(text,' ')>0:
            text=string.replace(text,' ','')
            print "(DO NOT use spaces between names: Did you mean to type "
            print "<span style=\"color:red\">%s</span>?)" % text
        print "</P>\n"
    printShowMatches(text)
    print " </body>\n"
    print "</html>\n"

def printShowMatches(text):
    print " <p></p>\n"
    print " <form action=\"%s\">" % (SCRIPT)
    print " <input type=\"submit\" value=\"Show All Matches \"/>"
    print " <input type=\"text\" size=\"30\" name=\"match\" value=\"%s\"/>" % (text)
    print " </form>"
Chunk referenced in 1
Chunk defined in 7,8,9,10,11

4. The Person Class

A Person is a data structure to model the information stored externally in various XML files. The initialization parameter is the person id and generation level, e.g., "AngasJohnHurst-0". Information is not stored in the object until the 'loadPerson' method is called.
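
To make the id convention concrete, here is a small sketch of how a person id can be split into a display name and a generation suffix. It reuses the two regular expressions from the Person class, but it is written in Python 3 syntax, the function name split_pid is hypothetical, and returning an empty suffix for an id without "-n" or "+n" is an assumption of this sketch.

```python
import re

# Same pattern as Person.pidpat: a name stem, then a signed generation.
PIDPAT = re.compile(r'([^-+]*)((-|\+)\d*)')

def split_pid(pid):
    """Split a person id into (display name, generation suffix)."""
    res = PIDPAT.match(pid)
    if res is None:
        # Assumption: ids without a suffix are returned unchanged.
        return pid, ""
    # Insert a space wherever a lowercase letter meets an uppercase one.
    name = re.sub(r'([a-z])([A-Z])', r'\1 \2', res.group(1))
    return name, res.group(2)

print(split_pid("AngasJohnHurst-0"))   # → ('Angas John Hurst', '-0')
```

The name derived this way is only a placeholder: once loadPerson succeeds, it is replaced by the GIVEN and SURNAME text from the XML file.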

<PersonClass 12> =
# A Person is a data structure to model the information stored
# externally in various XML files. The initialization parameter is the
# person id and generation level, e.g., "AngasJohnHurst-0".
# Information is not stored in the object until the 'loadPerson'
# method is called.
class Person:

    pidpat = re.compile('([^-+]*)((-|\+)\d*)')

    def __init__(self,pid):
        self.pid=pid
        res = self.pidpat.match(pid)
        #print pid
        if res == None:
            self.name=pid
            self.gen=""   # no generation suffix in the pid
        else:
            pidname = res.group(1) ; pidgen=res.group(2)
            self.name=re.sub(r'([a-z])([A-Z])',r'\1 \2',pidname,0)
            self.gen=pidgen
        self.dom = None
        self.fatherp=None
        self.motherp=None
        self.mtuple=[]
        self.elist = None
        self.birthdate=""; self.birthplace=""; self.birthnotep=None
        self.deathdate=""; self.deathplace=""; self.deathnotep=None
        self.marrnotesp=None
        self.educated=""
        self.notesp=None
        self.loaded=0

    def loadPerson(self):
        filename=TREEHOME+"/"+self.pid+".xml"
        if not self.loaded:
            if os.path.isfile(filename):
                self.dom = parse(filename) # parse this person's XML file
            else:
                return
            p=getElement(self.dom,'PERSON')
            namep=getElement(p,'NAME')
            givenp=getElement(namep,'GIVEN')
            surnamep=getElement(namep,'SURNAME')
            self.name=getTextContent(givenp)+" "+getTextContent(surnamep)
            birthp=getElement(p,'BIRTH')
            if birthp:
                birthdatep=getElement(birthp,'DATE')
                self.birthdate=getTextContent(birthdatep)
                birthplacep=getElement(birthp,'PLACE')
                self.birthplace=getTextContent(birthplacep)
                self.birthnotep=getElement(birthp,'NOTES')
            deathp=getElement(p,'DEATH')
            if deathp:
                deathdatep=getElement(deathp,'DATE')
                self.deathdate=getTextContent(deathdatep)
                deathplacep=getElement(deathp,'PLACE')
                self.deathplace=getTextContent(deathplacep)
                self.deathnotep=getElement(deathp,'NOTES')
            # Now get parents, but don't yet load them
            fp = getElement(p,'FATHER')
            if fp:
                fpid = fp.getAttribute('PERSON')
                if fpid:
                    self.fatherp=Person(fpid)
            mp = getElement(p,'MOTHER')
            if mp:
                mpid = mp.getAttribute('PERSON')
                if mpid:
                    self.motherp=Person(mpid)
            # Now examine married list. For each marriage build a tuple
            # (mplacep,mdatep,mnotesp,spousep,chlist), where spousep is a
            # spouse person, and chlist is a list of child persons.
            marriedl = getElements(p,'MARRIED')
            tuplel = []
            for m in marriedl:
                mplacep=getElement(m,'PLACE')
                mdatep=getElement(m,'DATE')
                mnotesp=getElement(m,'NOTES')
                sp = getElement(m,'SPOUSE')
                if sp != None:
                    spid = sp.getAttribute('PERSON')
                    spousep=Person(spid)
                    spouse=spousep.name
                # Handle children by this spouse
                mlist = m.childNodes
                chlist = []
                for cn in mlist:
                    if cn.nodeName == 'CHILD':
                        cpid=cn.getAttribute('PERSON')
                        if cpid:
                            cp=Person(cpid)
                            chlist.append(cp)
                tuplel.append((mplacep,mdatep,mnotesp,spousep,chlist))
            self.mtuple=tuplel
            educatedp = getElement(p,'EDUCATED')
            if educatedp:
                elistp = educatedp.childNodes
                elist = []
                for enode in elistp:
                    if enode.nodeType == Node.ELEMENT_NODE:
                        if enode.nodeName == 'PLACE':
                            eplace = getTextContent(enode)
                            etriple = (eplace,None,None)
                            elist.append(etriple)
                self.elist = elist
            self.notesp=getElement(p,'NOTES')
            self.loaded=1
Chunk referenced in 7

5. The DTD file for persons

"person.dtd" 13 =
<!ELEMENT PERSON (NAME,FATHER?,MOTHER?,BIRTH?,EDUCATED?,MARRIED*,DEATH?,NOTES?)>
<!ATTLIST PERSON ID CDATA #REQUIRED
                 SEX CDATA #REQUIRED>
<!ELEMENT NAME (GIVEN,SURNAME)>
<!ELEMENT GIVEN (#PCDATA)>
<!ELEMENT SURNAME (#PCDATA)>
<!ELEMENT FATHER EMPTY>
<!ATTLIST FATHER PERSON CDATA #REQUIRED>
<!ELEMENT MOTHER EMPTY>
<!ATTLIST MOTHER PERSON CDATA #REQUIRED>
<!ELEMENT BIRTH (PLACE,DATE,NOTES?)>
<!ELEMENT EDUCATED (PLACE,DATE?,NOTES?)*>
<!ELEMENT MARRIED (PLACE?,DATE?,SPOUSE,CHILD*,NOTES?)>
<!ELEMENT DEATH (PLACE,DATE,NOTES?)>
<!ELEMENT SPOUSE EMPTY>
<!ATTLIST SPOUSE PERSON CDATA #REQUIRED>
<!ELEMENT CHILD EMPTY>
<!ATTLIST CHILD PERSON CDATA #REQUIRED>
<!ELEMENT NOTES (#PCDATA)>
<!ELEMENT PLACE (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>

This is the Document Type Definition file. It defines the structure of the XML files that describe each person in the database. The fields are largely self-explanatory. Note that every person ID is composed of all given names and the surname, capitalized and concatenated together, followed by a generation suffix. The suffix has the form "-n" for the nth generation before the base generation (with "-0" denoting the base generation itself), or "+n" for the nth generation after the base generation.

For example, for the database for the author's family, the author is represented by the ID "AngasJohnHurst-0".
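
The ID convention can be sketched as a small function. This is an illustration in Python 3 syntax, not part of the program: the name make_pid and the use of a signed integer for the generation (zero or negative for the base generation and earlier, positive for later ones) are assumptions of this sketch.

```python
def make_pid(given_names, surname, generation):
    """Build a person ID from names and a signed generation offset.

    generation <= 0 means that many generations before the base
    generation (0 is the base itself); generation > 0 means after it.
    """
    words = list(given_names) + [surname]
    # Capitalize only the first letter, leaving the rest untouched.
    stem = "".join(w[:1].upper() + w[1:] for w in words)
    if generation <= 0:
        suffix = "-%d" % abs(generation)
    else:
        suffix = "+%d" % generation
    return stem + suffix

print(make_pid(["Angas", "John"], "Hurst", 0))   # → AngasJohnHurst-0
```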

The NOTES field can be used to provide additional information for various fields.

Children are associated with a particular spouse (MARRIED is used in the sense of Jesus talking with the Samaritan woman at the well; see St John's gospel), and there can be multiple spouses ("spice"?).
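
As an illustration of how children hang off a particular marriage, the sketch below parses a small person document and lists the children recorded under each spouse. The sample record and every ID in it are invented for this example, the function name children_by_spouse is hypothetical, and the code uses Python 3 syntax with the same minidom calls as the Person class.

```python
from xml.dom.minidom import parseString, Node

# Invented sample record; it follows the element order given in the DTD.
SAMPLE = """<PERSON ID="JaneExampleDoe-0" SEX="F">
  <NAME><GIVEN>Jane Example</GIVEN><SURNAME>Doe</SURNAME></NAME>
  <MARRIED>
    <SPOUSE PERSON="FirstSpouseRoe-0"/>
    <CHILD PERSON="AliceDoe+1"/>
    <CHILD PERSON="BobDoe+1"/>
  </MARRIED>
  <MARRIED>
    <SPOUSE PERSON="SecondSpousePoe-0"/>
    <CHILD PERSON="CarolDoe+1"/>
  </MARRIED>
</PERSON>"""

def children_by_spouse(xml_text):
    """Return a list of (spouse id, [child ids]), one entry per MARRIED."""
    root = parseString(xml_text).documentElement
    result = []
    for m in root.childNodes:
        if m.nodeType != Node.ELEMENT_NODE or m.nodeName != "MARRIED":
            continue
        spouse = None
        children = []
        for c in m.childNodes:
            if c.nodeType != Node.ELEMENT_NODE:
                continue  # skip whitespace text nodes
            if c.nodeName == "SPOUSE":
                spouse = c.getAttribute("PERSON")
            elif c.nodeName == "CHILD":
                children.append(c.getAttribute("PERSON"))
        result.append((spouse, children))
    return result

for spouse, children in children_by_spouse(SAMPLE):
    print(spouse, children)
```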


Document History

20050608:105904 ajh 1.0 first literate program
20050822:122350 ajh 1.0.1 added DTD definition