Take over the original PyPI project? #32
Comments
I reached out to the original author a long time ago but got no response. I had forgotten about PEP 541; thank you for bringing it to my attention. I will submit an application. Edit: After reading the PEP 541 requirements for reachability, it seems I need to reach out to PyYoshi a few more times before the owner is considered "unreachable".
Thanks. I'm not sure if you are actually supposed to do that, and not the person handling your request. After all, how can PyPI admins know that you've actually contacted them? I think filing a bug on their GitHub would also be a good step, as that is publicly visible.
I would forward emails to PyPI admins as evidence.
Agreed, I don't like the idea of invoking PEP 541, but it seems that this project is in need of it. Opening up an issue in advance would be morally right. Edit: Sorry, I'm tired. I misread
Howdy! Are there any updates on this? Barring that, is there a future where the top-level name of this package is changed to alleviate collisions? (Granted it is useful that you can install this in place of cchardet and magically make other packages that know nothing about it work, but it does make a lot of situations messy, as the OP noted.)
Sorry, there are no updates on this at the moment. I have not been able to allocate the time to work on this. 😓
Reached out to the original developer, haven't heard back.
Could we use GitHub Actions to automate the release of this package to PyPI under a second, separate namespace? That way people who are experiencing conflicts over
Incomplete GitHub Actions idea
name: Publish to PyPI with Renamed Namespace
on:
  push:
    tags:
      - 'v*'
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with: { python-version: '3.x' }
      - name: Rename directory
        run: mv src/cchardet src/faust_cchardet
      - name: Update imports (if necessary)
        run: >-
          find . -type f -name '*.py' -exec
          sed -i 's/import cchardet/import faust_cchardet/g' {} +
      - name: Build the package
        run: python setup.py sdist bdist_wheel
      - name: Publish the package to PyPI
        uses: pypa/gh-action-pypi-publish@v1.4.2
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
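The rename and import-rewrite steps in the workflow above could also be done by a small script instead of `mv` and `sed`. The following is only a sketch under stated assumptions: the `src/cchardet` layout and the `faust_cchardet` name come from the workflow, while the function names `rewrite_imports` and `rename_package` are hypothetical. Unlike the `sed` expression, the pattern also catches `from cchardet import ...` and skips names such as `cchardet_extra`.

```python
import re
import shutil
from pathlib import Path

# Matches "import cchardet" or "from cchardet" at the start of a line
# (after optional indentation), with cchardet as a whole word.
_IMPORT_RE = re.compile(r"^([ \t]*)(import|from)([ \t]+)cchardet\b", re.MULTILINE)


def rewrite_imports(source: str, new_name: str = "faust_cchardet") -> str:
    """Return source with top-level references to cchardet in import lines renamed."""
    return _IMPORT_RE.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{m.group(3)}{new_name}", source
    )


def rename_package(root: Path, old: str = "cchardet", new: str = "faust_cchardet") -> None:
    """Move src/<old> to src/<new>, then rewrite imports in every .py file under root."""
    shutil.move(str(root / "src" / old), str(root / "src" / new))
    for path in root.rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        updated = rewrite_imports(text, new)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
```

Calling `rename_package(Path("."))` from the repository root would stand in for the two workflow steps. This sketch shares the limits of the original idea: later uses such as `cchardet.detect(...)` after a plain `import cchardet` are not rewritten, `.pyx` files are not touched, and the extension module name in `setup.py` would still need to change.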
Hi, sorry I've been away! I bit off more than I could chew; I didn't expect this revival to become so important as a dependency. I'll file a PEP 541 request for
no longer true
diff
cd $(mktemp -d)
git clone --depth=1 https://github.com/PyYoshi/cChardet
cd cChardet/
git remote add faust-cchardet https://github.com/faust-streaming/cChardet
git fetch faust-cchardet master
git rev-parse master
# fa74a8e43a2685767296f4cc5bc4594d28713ab1
git rev-parse faust-cchardet/master
# 3af7068fc6f04dc777531da021057bfbe75313b2
git diff --stat master faust-cchardet/master -- src/cchardet/
git diff master faust-cchardet/master -- src/cchardet/
git diff --stat
git diff
diff --git a/src/cchardet/__init__.py b/src/cchardet/__init__.py
index f616d7f..c6db442 100644
--- a/src/cchardet/__init__.py
+++ b/src/cchardet/__init__.py
@@ -1,7 +1,5 @@
-from . import _cchardet
-
-version = (2, 2, 0, "alpha", 3)
-__version__ = "2.2.0a3"
+from cchardet import _cchardet
+from .version import __version__
def detect(msg):
@@ -17,10 +15,6 @@ def detect(msg):
encoding, confidence = _cchardet.detect_with_confidence(msg)
if isinstance(encoding, bytes):
encoding = encoding.decode()
-
- if encoding == "MAC-CENTRALEUROPE":
- encoding = "maccentraleurope"
-
return {"encoding": encoding, "confidence": confidence}
diff --git a/src/cchardet/__main__.py b/src/cchardet/__main__.py
deleted file mode 100644
index a3e0fd8..0000000
--- a/src/cchardet/__main__.py
+++ /dev/null
@@ -1,4 +0,0 @@
-from .cli.cchardetect import main
-
-if __name__ == "__main__":
- main()
diff --git a/src/cchardet/_cchardet.pyx b/src/cchardet/_cchardet.pyx
index 27d9f55..75af096 100644
--- a/src/cchardet/_cchardet.pyx
+++ b/src/cchardet/_cchardet.pyx
@@ -1,26 +1,19 @@
-# coding: utf-8
-#cython: embedsignature=True, c_string_encoding=ascii, language_level=3
-
cdef extern from *:
ctypedef char* const_char_ptr "const char*"
- ctypedef unsigned long size_t
-# uchardet v0.0.8
cdef extern from "uchardet.h":
ctypedef void* uchardet_t
cdef uchardet_t uchardet_new()
cdef void uchardet_delete(uchardet_t ud)
- cdef int uchardet_handle_data(uchardet_t ud, const_char_ptr data, size_t length)
+ cdef int uchardet_handle_data(uchardet_t ud, const_char_ptr data, int length)
cdef void uchardet_data_end(uchardet_t ud)
cdef void uchardet_reset(uchardet_t ud)
cdef const_char_ptr uchardet_get_charset(uchardet_t ud)
- cdef float uchardet_get_confidence(uchardet_t ud, size_t i)
- # cdef const_char_ptr uchardet_get_encoding(uchardet_t ud, size_t i)
- # cdef const_char_ptr uchardet_get_language(uchardet_t ud, size_t i)
+ cdef float uchardet_get_confidence(uchardet_t ud)
def detect_with_confidence(bytes msg):
- cdef size_t length = len(msg)
-
+ cdef int length = len(msg)
+
cdef uchardet_t ud = uchardet_new()
cdef int result = uchardet_handle_data(ud, msg, length)
@@ -30,17 +23,8 @@ def detect_with_confidence(bytes msg):
uchardet_data_end(ud)
- cdef bytes detected_charset
- # cdef bytes detected_encoding
- # cdef const_char_ptr detected_language
- cdef float detected_confidence
-
- detected_charset = uchardet_get_charset(ud)
- # detected_encoding = uchardet_get_encoding(ud, 0)
- # detected_language = uchardet_get_language(ud, 0)
- detected_confidence = uchardet_get_confidence(ud, 0)
-
- uchardet_reset(ud)
+ cdef bytes detected_charset = uchardet_get_charset(ud)
+ cdef float detected_confidence = uchardet_get_confidence(ud)
uchardet_delete(ud)
if detected_charset:
@@ -53,8 +37,6 @@ cdef class UniversalDetector:
cdef int _done
cdef int _closed
cdef bytes _detected_charset
- # cdef bytes _detected_encoding
- # cdef const_char_ptr _detected_language
cdef float _detected_confidence
def __init__(self):
@@ -62,8 +44,6 @@ cdef class UniversalDetector:
self._done = 0
self._closed = 0
self._detected_charset = b""
- # self._detected_encoding = b""
- # self._detected_language = b""
self._detected_confidence = 0.0
def reset(self):
@@ -71,8 +51,6 @@ cdef class UniversalDetector:
self._done = 0
self._closed = 0
self._detected_charset = b""
- # self._detected_encoding = b""
- # self._detected_language = b""
self._detected_confidence = 0.0
uchardet_reset(self._ud)
@@ -95,18 +73,13 @@ cdef class UniversalDetector:
self._done = 1
self._detected_charset = uchardet_get_charset(self._ud)
- # self._detected_encoding = uchardet_get_encoding(self._ud, 0)
- # self._detected_language = uchardet_get_language(self._ud, 0)
- self._detected_confidence = uchardet_get_confidence(self._ud, 0)
+ self._detected_confidence = uchardet_get_confidence(self._ud)
def close(self):
if not self._closed:
uchardet_data_end(self._ud)
-
self._detected_charset = uchardet_get_charset(self._ud)
- # self._detected_encoding = uchardet_get_encoding(self._ud, 0)
- # self._detected_language = uchardet_get_language(self._ud, 0)
- self._detected_confidence = uchardet_get_confidence(self._ud, 0)
+ self._detected_confidence = uchardet_get_confidence(self._ud)
uchardet_delete(self._ud)
self._closed = 1
diff --git a/src/cchardet/cli/__init__.py b/src/cchardet/cli/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/src/cchardet/cli/cchardetect.py b/src/cchardet/cli/cchardetect.py
deleted file mode 100755
index 485174c..0000000
--- a/src/cchardet/cli/cchardetect.py
+++ /dev/null
@@ -1,40 +0,0 @@
-import argparse
-import sys
-
-from .. import UniversalDetector, __version__
-
-
-def read_chunks(f, chunk_size):
- chunk = f.read(chunk_size)
- while chunk:
- yield chunk
- chunk = f.read(chunk_size)
-
-
-def main():
- parser = argparse.ArgumentParser()
- parser.add_argument(
- "files",
- nargs="*",
- help="Files to detect encoding of",
- type=argparse.FileType("rb"),
- default=[sys.stdin.buffer],
- )
- parser.add_argument("--chunk-size", type=int, default=(256 * 1024))
- parser.add_argument("--version", action="version", version="%(prog)s {0}".format(__version__))
- args = parser.parse_args()
-
- for f in args.files:
- detector = UniversalDetector()
- for chunk in read_chunks(f, args.chunk_size):
- detector.feed(chunk)
- detector.close()
- print(
- "{file.name}: {result[encoding]} with confidence {result[confidence]}".format(
- file=f, result=detector.result
- )
- )
-
-
-if __name__ == "__main__":
- main()
diff --git a/src/cchardet/version.py b/src/cchardet/version.py
new file mode 100644
index 0000000..f43fee1
--- /dev/null
+++ b/src/cchardet/version.py
@@ -0,0 +1 @@
+__version__ = '2.1.19'
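One visible behavioural difference in the diff above is that the `-` side of `detect()` maps `"MAC-CENTRALEUROPE"` to `"maccentraleurope"` while the `+` side does not. Code that has to behave the same whichever build is installed could apply that mapping itself. A minimal sketch, assuming the result dictionary has the `encoding` and `confidence` keys shown in the diff; `normalize_result` is a hypothetical helper, not part of either package.

```python
def normalize_result(result: dict) -> dict:
    """Return a copy of a detect() result with the encoding name normalised.

    Mirrors the mapping on the '-' side of the diff; every other value,
    including a None encoding, is passed through unchanged.
    """
    encoding = result.get("encoding")
    if encoding == "MAC-CENTRALEUROPE":
        encoding = "maccentraleurope"
    return {**result, "encoding": encoding}
```

A caller would wrap the library call, for example `normalize_result(cchardet.detect(data))`.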
Since the original cchardet project is clearly no longer maintained, have you tried contacting the original author to give you permissions to take the PyPI project? And if that failed, applying for PEP 541 name reuse?
Creating a fork has the problem that some packages will now require cchardet and some will require faust-cchardet, and both can't be installed simultaneously, which causes major problems for distributions.
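The collision described here can be checked in a given environment by asking which installed distributions claim the same top-level import name. The sketch below assumes Python 3.10 or later for `importlib.metadata.packages_distributions()`; the helper names `providers` and `has_collision` are hypothetical, and what the mapping reports depends on the metadata each installed distribution ships.

```python
from importlib import metadata


def providers(top_level, mapping=None):
    """Return the sorted distribution names that provide a top-level module.

    mapping defaults to the live environment; pass a dict to test offline.
    """
    if mapping is None:
        # Available from Python 3.10; maps import names to distribution names.
        mapping = metadata.packages_distributions()
    return sorted(set(mapping.get(top_level, [])))


def has_collision(top_level, mapping=None):
    """True when more than one installed distribution provides top_level."""
    return len(providers(top_level, mapping)) > 1
```

In an environment where both packages were installed, `providers("cchardet")` would be expected to list both distribution names, which a packaging or CI check could treat as an error.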