What information we put into entries and from where -- contextualisation of entry content & dictionary sources #6

Open
CallumBeaney opened this issue Oct 23, 2023 · 3 comments

@CallumBeaney
Member

I need to take a proper look at the entry details and decide how useful it is to include all the info present, the context, etc.

@CallumBeaney CallumBeaney self-assigned this Oct 23, 2023
@alexobviously
Member

Is this a parsing thing specific to the Gutenberg dictionaries?

@CallumBeaney CallumBeaney changed the title Entry meaning What information we put into entries and from where -- contextualisation of entry content & dictionary sources Oct 27, 2023
@CallumBeaney
Member Author

CallumBeaney commented Oct 27, 2023

@alexobviously All of them, but in particular, let's start with the numerals in Tolkien's entries. His Vocabulary is an assistant reader for this.

Ok here's a few notes about this from chat

Gawain is written in a very different dialect from most ME texts and is a good example of where struggles will arise
what I propose we do is build out our final dictionary by combining sources, and use these page notations and Roman numeral info
in the Gawain case, an entry field would have page numbers noting usages. I would add a note to that value object saying:

"this refers to (this ISBN) edition of Gawain, and these are the page numbers you can consult"

and likewise for the current dictionary file we have, we can add a note, "hey these numerals reference X Y Z sources, check them for more info"

alternatively we could just remove all the numerals bc I do wonder whether they're actually a bit noisy for the kind of usage we're looking at

if they want proper scholarly references they're gonna check an all-out dictionary resource, right? In which case maybe we keep both a reference-free entry (no numerals etc.) and the original entry. This makes our dictionary marginally heavier, but it also means UX is probably better, and it means our "mouse dictionary" really is just an assistant tool, not a replacement for or bald expropriation of academic resources. If they're just double-clicking words left & right, all they /really/ need to know is:

word
meanings
variants
etymology

^ that makes the entry less bulky, which means the user's word history sidebar doesn't fill up with info they don't need.

this means our final object is like this:

word
entry sans references
full original entry
variants
see ? see : null
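
A minimal sketch of that final object, in case it helps the discussion. The field names, the `buildEntry` helper, and the citation pattern (a Roman numeral followed by a line number, e.g. `VI 152`) are all assumptions for illustration, not anything already in the codebase. The pattern is deliberately naive and could misfire on entry text that happens to look like a citation.

```typescript
// Hypothetical shape for a combined entry; every field name is an assumption.
interface DictionaryEntry {
  word: string;
  entrySansReferences: string;
  fullOriginalEntry: string;
  variants: string[];
  see: string | null;
}

// Assumed citation form: optional leading comma, a Roman numeral, then a line
// number, e.g. ", VI 152". Real entries may need a stricter pattern.
const CITATION_TO_STRIP = /,?\s*\b[IVXLCDM]+\s+\d+\b/g;

// Remove citations so the short entry carries only the words themselves.
function stripReferences(entry: string): string {
  return entry.replace(CITATION_TO_STRIP, "").trim();
}

// Build the combined object: the stripped entry plus the untouched original.
function buildEntry(
  word: string,
  fullOriginalEntry: string,
  variants: string[],
  see?: string,
): DictionaryEntry {
  return {
    word,
    entrySansReferences: stripReferences(fullOriginalEntry),
    fullOriginalEntry,
    variants,
    see: see ? see : null,
  };
}

const example = buildEntry(
  "man",
  "Man, VI 152; Mase, XIV 34, XVI 116;",
  ["man", "mase"],
);
console.log(example.entrySansReferences); // "Man; Mase;"
```

Keeping `fullOriginalEntry` alongside the stripped version means nothing from the source dictionary is lost; the modal can show the short form and offer the original on demand.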

@CallumBeaney
Member Author

OK, suggestion RE: numerals.

  • Each numeral in our dictionary references a specific poem or text

  • Each numeral is followed by the line number in that text where the looked-up word is used. The problem is that if we substitute each numeral with the title, our dictionary entries start getting long.

  • SOLUTION: Each numeral is given a hyperlink in the modal, that links to an instance of that text

  • Each dictionary entry is parsed for Roman numerals; each matching numeral is looked up in a key whose value is a string linking to that text

  • This rule is applied specifically to entries from this first dictionary source.

  • By doing it like this, we do the least amount of processing to make what these numerals reference meaningful to a user. It also keeps the dictionary entries that pop up in the modal as terse as they are now:

    man: imper.pl. [+]
    † Variants: man, mase, makes, maketh, mad, made, maden, maid, maide, maked, maad,
    maade, mad, maid, maide, ymad, ymaked
    Man, VI 152; Mase, XIV 34, XVI 116;

So each source is accessible for quick easy reference.
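
A rough sketch of the numeral-to-link step, under some assumptions: the `SOURCE_LINKS` key, its placeholder URLs, and the `linkNumerals` function are hypothetical, and a citation is assumed to be a Roman numeral followed by whitespace and a line number. Numerals with no key are left as plain text.

```typescript
// Hypothetical numeral-to-URL key; the URLs are placeholders, not real sources.
const SOURCE_LINKS: Record<string, string> = {
  VI: "https://example.org/texts/vi",
  XIV: "https://example.org/texts/xiv",
  XVI: "https://example.org/texts/xvi",
};

// Assumed citation form: Roman numeral (captured), then the line number (captured).
const CITED_NUMERAL = /\b([IVXLCDM]+)(\s+\d+)\b/g;

// Wrap each known numeral in an anchor tag and leave the line number as text.
function linkNumerals(entry: string, links: Record<string, string>): string {
  return entry.replace(
    CITED_NUMERAL,
    (match: string, numeral: string, line: string) => {
      const url = links[numeral];
      // Unknown numerals are returned unchanged rather than guessed at.
      return url ? `<a href="${url}">${numeral}</a>${line}` : match;
    },
  );
}

console.log(linkNumerals("Man, VI 152; Mase, XIV 34, XVI 116;", SOURCE_LINKS));
```

Since the key is a plain lookup table, a second dictionary source with its own numbering could get its own table without touching the parsing code. The entry text is assumed to be trusted here; anything user-supplied would need HTML escaping before being put into the modal.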

2 participants