Many of the old index files available were "personsource" for "personbenefit". In plain English, that's one person building a file of index information for personal use. In this circumstance, the text and/or tags entered were often truncated and terse, while still being meaningful to that specific person. They might not be what you or I would use as a search word if we were looking for such items. Such an index may be very useful to that individual, but practically useless to the hobby at large.
If data for a field is missing or incomplete, it will still be missing or incomplete if the file is stuffed into a database.
There are two things that can improve the situation. The first is to go back to the paper copy and enter everything correctly and completely. But if you are going to do that, you might as well do it that way from the start.
The second way to improve things is to apply auto-tagging. This means scanning the text, tag, author and photographer fields for specific words and expanding the tags to add spelling variations, synonyms, name variations, singular/plural forms, etc. While this would probably greatly improve the tagging, it would not make up for missing information.
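To make the auto-tagging idea concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not how any existing index tool works: the record layout (a dict with "text", "tags", "author" and "photographer" keys), the EXPANSIONS table, and the function names are all made up for the example, and the singular/plural rule is deliberately naive.

```python
import re

# Hypothetical lookup table: a word found in a record maps to the extra
# tags it should bring along (spelling, synonym, and name variations).
EXPANSIONS = {
    "colour": {"color"},
    "photo": {"photograph"},
    "wm": {"william"},
}

# Assumed field names; a real index file may call these something else.
SCANNED_FIELDS = ("text", "tags", "author", "photographer")


def tokenize(value):
    """Split a field into a set of lowercase words; missing fields give an empty set."""
    return set(re.findall(r"[a-z0-9]+", (value or "").lower()))


def plural_variant(word):
    """Very naive singular/plural counterpart; real English needs more rules."""
    if word.endswith("ss"):
        return word + "es"
    if word.endswith("s"):
        return word[:-1]
    return word + "s"


def auto_tag(record):
    """Return the record's tags, expanded with known variations."""
    tags = tokenize(record.get("tags"))
    for field in SCANNED_FIELDS:
        for word in tokenize(record.get(field)):
            if word in EXPANSIONS:
                tags.add(word)
                tags |= EXPANSIONS[word]
    # Add a singular/plural counterpart for every tag collected so far.
    tags |= {plural_variant(tag) for tag in tags}
    return sorted(tags)
```

Calling auto_tag on a record whose text reads "Colour photo by Wm Smith" would add "color", "photograph" and "william" to whatever tags were already there. A record with empty fields still comes back with no tags at all, which is the limitation described above: expansion can only work with what was entered.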
The ideal situation is to do original input from the hard copy AND apply auto-tagging. This is, I think, the only way to get reasonably complete items into the database. That's how to get the best crowdbenefit.
Rod Goodwin
indexguy