Antonyms in the thesaurus #21
Comments
Hi, that's just the way the thesaurus is, I guess. There's no relational
metadata like antonym or hypernym, etc., afaik, nor does there seem to be any
real maintenance or ownership of the work these days. All I could find
was the file containing a list of adjacency lists in textual form
(words.txt, ~24M), which is what the runtime files are built on.
You might want to check wordnet - it's part of the same public
domain umbrella as the Gutenberg Project (or vice versa) and contains the
kind of metadata you're interested in. It doesn't have every term in Moby,
but it had more than I expected last I checked and might help you here.
The alternative would be in the machine learning space and would be a
really heavy lift from the get-go unless you have some familiarity with it.
I think wordnet is your best bet -Adriaan
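For anyone who wants to try separating the antonyms, here is a minimal sketch of the idea above, not the project's actual code. It assumes each line of words.txt is a comma-separated adjacency list with the root word first (check the real file before relying on this), and that you bring your own antonym pairs from a source such as wordnet. The function names `parse_adjacency_lists` and `drop_antonyms` and the sample data are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def parse_adjacency_lists(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each root word to its related terms.

    ASSUMPTION: one entry per line, comma-separated, root word first.
    """
    thesaurus: Dict[str, List[str]] = {}
    for line in lines:
        fields = [f.strip() for f in line.split(",") if f.strip()]
        if not fields:
            continue  # skip blank lines
        thesaurus[fields[0]] = fields[1:]
    return thesaurus


def drop_antonyms(
    thesaurus: Dict[str, List[str]],
    antonym_pairs: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Remove related terms that form a known antonym pair with the root."""
    # Unordered pairs, so ("good", "bad") also blocks ("bad", "good").
    blocked = {frozenset(pair) for pair in antonym_pairs}
    return {
        root: [t for t in related if frozenset((root, t)) not in blocked]
        for root, related in thesaurus.items()
    }


# Hypothetical sample data, not copied from words.txt.
sample = ["good,fine,bad,excellent", "cold,chilly,warm,icy"]
pairs = [("good", "bad"), ("cold", "warm")]

thesaurus = parse_adjacency_lists(sample)
cleaned = drop_antonyms(thesaurus, pairs)
print(cleaned)
```

The filter is only as good as the antonym list you feed it, so terms that wordnet does not cover would stay in the output.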
On Sun, Nov 10, 2024 at 7:01 AM oriolgalceran wrote:
Why is the thesaurus full of antonyms (good-bad, cold-warm)? I understand
this comes from the original dataset, but I'm wondering if there is any
explanation (or any way to separate them!)
Thanks!
Thanks! Just so you see what the source of my issue is: I built this game, www.synonuity.com, and I'm having real trouble finding a reliable (and open) synonym dataset in English. I looked at wordnet way back when I built it and decided against using it, though I can't really remember why. I'm going to have a look at it again. Thanks for your reply!
I have a further question: in the entry for "computer" on the site https://moby-thesaurus.org/computer, "machine" is included as a synonym. However, when I look in words.txt, it's not there... why is that? I've seen this with other words too.
But "machine" *is* included in words.txt in the form of multi-word phrases
like "IBM machine" and "information machine", etc., which are in the synset
for "computer". Don't forget that case is important as well; that can get you
sometimes.
Wordnet is definitely your best shot. Sourcing words from Moby seems like
the minimum-effort way to get you going. I'd be interested in your results.
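To make the phrase and case points concrete, here is a small sketch of a lookup that reports exact entries separately from multi-word phrases that merely contain the term. The helper `find_term` is hypothetical, and the sample list only reuses the two phrases mentioned above plus one made-up entry; it is not the real "computer" synset.

```python
from typing import List, Tuple


def find_term(
    synonyms: List[str], term: str, ignore_case: bool = True
) -> Tuple[List[str], List[str]]:
    """Return (exact matches, phrases that contain the term as a word)."""

    def norm(text: str) -> str:
        return text.lower() if ignore_case else text

    target = norm(term)
    exact: List[str] = []
    phrases: List[str] = []
    for entry in synonyms:
        normalized = norm(entry)
        if normalized == target:
            exact.append(entry)
        elif target in normalized.split():
            phrases.append(entry)
    return exact, phrases


# Hypothetical excerpt; "calculator" is an invented filler entry.
computer_entry = ["IBM machine", "information machine", "calculator"]

exact, phrases = find_term(computer_entry, "machine")
print(exact)    # standalone entries equal to the term
print(phrases)  # multi-word entries that include the term

# A case-sensitive search for "ibm" misses "IBM machine".
print(find_term(computer_entry, "ibm", ignore_case=False))
```

A site that splits phrases into single words for display would show "machine" under "computer" even though the file only has it inside phrases, which is one possible explanation for the mismatch.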