Need a CFG file that works for spaCy v3 with web_trf to train a custom NER #10064
-
What is the correct way to create a .cfg file for spaCy v3 with the web_trf model to train a custom NER? I am using the .cfg shown below but getting 0 accuracy after several epochs:

```
=========================== Initializing pipeline ===========================
============================= Training pipeline =============================
/usr/local/lib/python3.7/dist-packages/torch/autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
```

The relevant sections of the .cfg (values omitted in the paste):

```ini
[nlp]
[components]
[components.ner]
[components.ner.model]
[components.ner.model.tok2vec]
[components.tok2vec]
[components.tok2vec.model]
[components.tok2vec.model.embed]
[components.tok2vec.model.encode]
```
-
You're using a tok2vec and a transformer in the same pipeline, which is not necessary and is probably causing weird things to happen. You should have either a transformer or a tok2vec in a pipeline, not both. You can source the transformer from the pretrained pipeline (see the docs on sourcing components for how to do that), though I would recommend just training from scratch using a GPU config from the quickstart. Also, it looks like you either do not have a GPU or do not have it configured correctly; note that training transformers on CPU is possible but extremely slow and not recommended.
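A rough sketch of the shape such a config can take, assuming the transformer is sourced from en_core_web_trf and only a new NER component is added on top (architecture names and hyperparameters follow the quickstart GPU template and may need adjusting for your setup):

```ini
[nlp]
lang = "en"
pipeline = ["transformer","ner"]

[components]

# Reuse the pretrained transformer instead of defining a new one
[components.transformer]
source = "en_core_web_trf"

[components.ner]
factory = "ner"

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = false
nO = null

# The NER listens to the shared transformer; no separate tok2vec component
[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
upstream = "*"

[components.ner.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```

For a known-good starting point, `python -m spacy init config config.cfg --lang en --pipeline ner --gpu` generates the full transformer config from the quickstart. As a quick GPU sanity check, `python -c "import spacy; print(spacy.prefer_gpu())"` should print True; the torch warning in the log above says CUDA is not available, which means training is falling back to CPU.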
-
Update: even after installing spacy-lookups-data, the number of tokens recognized as the new labels is still zero. Interestingly, the new labels appear in the eval output, but no tokens are recognized as such. The input samples do have a sufficient number of custom entities.
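One way to check that the annotations actually reach training is `python -m spacy debug data config.cfg`, which reports per-label entity counts. Alternatively, a minimal sketch that counts gold entity labels directly in the binary corpus (the "train.spacy" path is a placeholder for your own file):

```python
from collections import Counter

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")  # only the vocab is needed for inspection

# Load the serialized training corpus ("train.spacy" is a placeholder path)
doc_bin = DocBin().from_disk("train.spacy")

# Count how many gold entity spans each label actually has
label_counts = Counter(
    ent.label_
    for doc in doc_bin.get_docs(nlp.vocab)
    for ent in doc.ents
)
print(label_counts)
```

If the new labels show zero counts here, the problem is in the data conversion rather than in the config.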
-
Your part of speech and dependency data look like they're just invalid and you won't be able to train a useful model. I see that in your real config you are trying to train these components - do you actually want to train these, or do you just want to use the pretrained models?
>>> NO, I do not want to train POS/DEP. Where in the CFG does it indicate that I am doing this? How do I ensure that I use the pretrained models?
To clear things up a little I have two questions:
1. Can you give an example of a sentence with your entities in it?
>>> Will do. I have to anonymize it first.
2. What is the ultimate goal here? You want to have en_core_web_trf + NER for your custom entities, right?
>>> YES.
Are you actually going to use the tagger, parser, lemmatizer etc.?
>>> NO. (Those components are frozen in the CFG; see the sketch below.) I only want to use trf + NER for two new entities.
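For reference, a hedged sketch of what that freezing could look like, assuming the extra components are sourced from en_core_web_trf (component names are taken from that pipeline's standard layout; only the tagger and parser sections are shown in full):

```ini
[nlp]
lang = "en"
pipeline = ["transformer","tagger","parser","attribute_ruler","lemmatizer","ner"]

[components.tagger]
source = "en_core_web_trf"

[components.parser]
source = "en_core_web_trf"

[components.attribute_ruler]
source = "en_core_web_trf"

[components.lemmatizer]
source = "en_core_web_trf"

[training]
# Frozen components keep their pretrained weights; only transformer + ner update
frozen_components = ["tagger","parser","attribute_ruler","lemmatizer"]
```

One caveat: if the shared transformer is updated while training the new NER, the frozen components that listen to it can degrade. Sourcing them with `replace_listeners = ["model.tok2vec"]` (so each keeps its own copy of the embedding layer), or simply training a separate transformer + NER pipeline, are the usual ways around that.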
________________________________
From: polm
Sent: Sunday, February 6, 2022 1:41 AM
Subject: Re: [explosion/spaCy] Need a CFG file that works for spacy v3 with webtrf to train a custom ner (Discussion #10064)
Thanks, that's very helpful.
Couple of things that jump out at me from that output:
It looks like your training and dev data have a lot of overlap. That's fine for getting started but given your data volume they should be completely separate to test your model properly.
✘ 461 invalid whitespace entity spans
This is not a huge number given the size of your training data, but it is unusually large, and these kinds of errors are not normal. You have included spaces before or after entities in the entity annotations; those annotations will be basically unusable. You should look at why that is happening.
For the "Low number of examples" warnings, the model is probably going to forget all those labels because you have basically no data for them.
Your part of speech and dependency data look like they're just invalid and you won't be able to train a useful model. I see that in your real config you are trying to train these components - do you actually want to train these, or do you just want to use the pretrained models?
To clear things up a little I have two questions:
1. Can you give an example of a sentence with your entities in it?
2. What is the ultimate goal here? You want to have en_core_web_trf + NER for your custom entities, right? Are you actually going to use the tagger, parser, lemmatizer etc.?
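Regarding the invalid whitespace entity spans flagged in the message above: a minimal sketch of how one might detect and trim them in raw character-offset annotations before converting to .spacy format (the (text, start, end) representation here is an assumption; adapt it to however your annotations are stored):

```python
def strip_entity_whitespace(text, start, end):
    """Shrink a character span so it excludes leading/trailing whitespace."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end

# Example: an annotation that accidentally includes a trailing space
text = "Acme Corp announced earnings."
start, end = 0, 10            # covers "Acme Corp " -- note the trailing space
print(repr(text[start:end]))  # 'Acme Corp '
start, end = strip_entity_whitespace(text, start, end)
print(repr(text[start:end]))  # 'Acme Corp'
```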