OldMansBeard wrote...
This is starting to get interesting! Coding aside, it's obviously not enough to parse the full name syntactically into words and spaces; we need to consider the semantics too. <snip> There are different conventions at work. If we can define a set of rules that cover most cases (rather than the single rule of "take the last word", which was my starting point), we can probably code them.
Indeed! I'm glad this has ended up offering so much food for thought... *ponders problem*
Let's imagine that, after finding all the sub-strings separated by spaces, we identify some of them as prepositions and conjunctions by checking them against a list. (If we identify no such sub-strings, we just use the standard method of making the first word the first name and the last word the last name.) Of the remaining words, our default position is that whatever sub-string (possibly including spaces) comes before the first conjunction or preposition (excluding any conjunction or preposition that might begin the whole string) is the most important part of the first name. Similarly, whatever sub-string comes after the last conjunction or preposition is taken to be the most important part of the last name.
That was quite a mouthful, so, to summarise:
The Duke of the Dark Plains of Lirandon -> Duke Lirandon.
Under these rules, OMB's list of names would yield the following results; non-ideal results are bolded:
- John James the Gaunt -> John James (first name), Gaunt (last name)
- John James of the Gaunt Countenance -> John James (first name), Gaunt Countenance (last name)
- John James of Gaunt -> John James (first name), Gaunt (last name)
- John James Gaunt -> John (first name), Gaunt (last name)
- Gaunt John James -> Gaunt (first name), James (last name)
- John James de la pays de Gaunt -> John James (first name), Gaunt (last name)
- John James de la Pays -> John James (first name), Pays (last name)
- John James de Gaunt -> John James (first name), Gaunt (last name)
- John de Gaunt James -> John (first name), Gaunt James (last name)
- John James von Hohen Gaunt -> John James (first name), Hohen Gaunt (last name)
- John von Hohen-Gaunt James -> John (first name), Hohen-Gaunt James (last name)
- John James -> John (first name), James (last name)
- John Jamessohn -> John (first name), Jamessohn (last name)
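For anyone who wants to try the rule out before scripting it in the module, here is a minimal sketch in Python (not the module's own scripting language). The `PARTICLES` list and the `split_name` function are hypothetical names of my own, and the list lumps articles such as "the" and "la" in with the prepositions and conjunctions, since the examples above treat them the same way. What to do with a single-word name is also my assumption: the sketch leaves the last name empty.

```python
# Hypothetical word list: extend it with whatever particles the module needs.
PARTICLES = {"of", "the", "de", "la", "von", "and"}


def split_name(full_name):
    """Return (first_name, last_name) using the particle rule described above."""
    words = full_name.split()
    is_particle = [w.lower() in PARTICLES for w in words]

    # Ignore any particles that begin the whole string ("The Duke ...").
    start = 0
    while start < len(words) and is_particle[start]:
        start += 1

    # Positions of the particles that remain after the leading ones.
    inner = [i for i in range(start, len(words)) if is_particle[i]]

    if not inner:
        # Standard method: first word is the first name, last word the last name.
        rest = words[start:]
        if not rest:
            return "", ""
        # Assumption: a single-word name gets an empty last name.
        return rest[0], (rest[-1] if len(rest) > 1 else "")

    # Everything before the first particle, and everything after the last one.
    first = " ".join(words[start:inner[0]])
    last = " ".join(words[inner[-1] + 1:])
    return first, last


if __name__ == "__main__":
    for name in ("John James of the Gaunt Countenance",
                 "John de Gaunt James",
                 "The Duke of the Dark Plains of Lirandon"):
        print(name, "->", split_name(name))
```

Note that a name ending in a particle (say, "John of") comes back with an empty last name, which is probably a case to hand over to the conversation fallback rather than guess at.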
Of all these, the ones that cause problems are those that have adjectives and nouns playing together, but I cannot think of any way of distinguishing between nouns and adjectives that is not mind-bogglingly clunky. It's possible we might be able to weed out titles that people put into their names, though, as the list of titles is relatively small compared to, say, all adjectives. ;-)
To make provision for eventualities not covered by this system, we could follow Bard's idea of letting the PC choose from a number of possibilities via conversation. I think I'd have the character in my module find out about his or her family through an old book; the PC would get the chance to choose a "best" translation of the family name through that conversation, with possible options for choosing being set via script and the player having the option to type a name into the chat window as a last resort.
Edited by Estelindis, 31 August 2011 - 10:28.