Monday, January 19, 2015

Spreadsheets and Databases

Spreadsheets and database are fraternal twins. Each format stores data in an organized manner and each format allows the user to manipulate that data in order to see data relationships and to see subsets of that data. But some people are more comfortable with one twin than the other. My older daughter is a spreadsheet person; I admire the database.

I have examined Thomas MacEntee’s Excel spreadsheet and have attempted to turn it into database form. Thomas has tabs for three entry-style spreadsheets (the other tabs are information only). I put the fields for each tab into database form; one database for each topic.

This is the database form of Thomas's Research Log.

 Here is the database form of the To Do list.

 And here is the database form of the Search Attempts spreadsheet. This screenshot shows an empty database, because I have no idea how to use this. Still, I used the forms in the database format, and will keep this form for future need. (I've opted for information over formatting here. I'm sorry — I don't know how to fix this, but full screen display of these databases is important to me.)

There are strong visual differences between the two styles: the spreadsheet stretches out in a continuous horizontal line while the database is either a vertical stack or a rectangular block of fields. The spreadsheet stretches down into a vertical stack of entries. The database can also be shown in a list view, which provides a vertical stack of entries in the database format I have shown in these screenshots. I have chosen to show screenshots of a single entry, because I believe this form illustrates my conversion more clearly.

In Thomas’s spreadsheet you can move from one collection of data to another by clicking on the appropriate tab. In a database you do that by clicking on scripted buttons. (Unfortunately I am a poor scripter and my scripts don’t work well. That’s OK, I can always work this out, given time.) In the spreadsheet, when you open one tab, you close the other. In a database, you have the option of keeping more than one database open at the same time.

I have two additional buttons. Since I am keeping narrative-style documents which combine a research goal/plan with a research log, one of my scripted buttons will open the folder where these logs are stored. From this location, I can open the appropriate document or create a new one, keeping the narrative style documents active along with the strict data-storage documents.

Another button opens a database which I created when I created my new family tree in Reunion, at the start of the Genealogy Do-Over studies. This is an index database, which lists each individual entered into the Reunion database, along with the person ID number attached to that person by Reunion as well as the Dollarhide number which I have assigned to that individual. I use these IDs when I am comparing entries between Reunion and my various other software entries and my various online trees.

I have also created a field in which I record the Source ID numbers used for the sources attached to each individual. As we have frequently mentioned on the Facebook page, an entry such as a birth certificate carries information about the subject (the person born), and about one or both of the parents of that person. Census records and wills might have many persons attached to the source. My software will prepare a list of the individuals whose names have been attached to the source. However, it does not tell me of any sources that have not been attached when they should be attached. In my earlier work, I found this index of sources to be a useful tool, especially when I wish to be sure that an online tree and my private tree are coordinated. This field is currently empty, so I haven’t shown it. ( I have created other empty fields in anticipation of other data entries which may need to be indexed.)

Earlier I mentioned that I can open one of my databases in the same screen as my genealogy program, and still maintain a relatively uncluttered screen. I am including some screenshots showing such multiple document configurations.

This shows the Index entry database to the left, ready for entry of information into the genealogy software that fills the remainder of the screen OR to receive information from the software program into the database.

Here we have the Research Log database along with the narrative style-research log for the same data; ready to be coordinated with the software program.

And finally this shows the To Do List in connection with the software screen.

 This blog was written in part because I wished to show my fellow participants how I converted Thomas’s spreadsheet into database form. I also wished to compare database and spreadsheet formats. Each format has strengths and weaknesses. A user's appreciation of one over the other is usually a matter of personal preference. But there is one big debit on the database side.

Constructing a database is expensive! It requires a high learning curve and it requires the use of a database construction program (which us usually costly). Although there may be others, I know of only three database construction programs. In order of dollar expense these are Access from Microsoft (runs on Windows), FileMaker Pro, (for both Windows and Mac), and MySQL (Windows, Mac — and Linux?). MySQL is freeware. In terms of learning curve, Access is hardest, MySQL is next, and FileMaker is less difficult, but still not an intuitive project. 

There were earlier, easier database construction programs that are no longer supported: Bento from Apple, and the database in AppleWorks. When AppleWorks gave way to iWorks, the database was dropped in favor of Keynote, a presentation program. If I remember corrctly, the 1980s MicrosoftWorks also had a simple database construction component.

I made my first database in Apple’s very old HyperTalk (I didn’t know it was a database!), moved on to AppleWorks when HyperTalk became too cumbersome, then graduated to FileMaker (version 1) when the size of my first database outgrew AppleWorks. I pay for an upgrade to each uneven numbered version of FileMaker Pro in order to keep my software compatible with the current operating system, because that database tracks a family library (mostly fiction) approaching 10,000 print volumes plus currently entered ebooks numbering more than 60. (I don’t count the ebooks very often, I know that 60 is too low a number.)

As a final word, almost every person in the Genealogy Do-Over group (if not all of us) uses a database and most of us love it, whichever one it is. From Family Tree Maker to Reunion, your genealogy software is a dedicated database. Most databases that appear on our computers are dedicated databases like our genealogy software. They have been prepared by professional programmers and are tailored to specific needs. The “do-it-yourself” programs are much more rare.

No comments:

Post a Comment