Commit

Merge branch 'master' into selector-text
# Conflicts:
#	parsel/selector.py
#	tests/test_selector.py
#	tests/typing/selector.py
#	tox.ini
kmike committed Apr 24, 2024
2 parents 419af4b + 7407342 commit b8d0352
Showing 33 changed files with 1,030 additions and 520 deletions.
3 changes: 3 additions & 0 deletions .bandit.yml
@@ -1,3 +1,6 @@
skips:
- B101
- B311
- B320
- B410
exclude_dirs: ['tests']
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.7.0
current_version = 1.9.1
commit = True
tag = True
tag_name = v{new_version}
1 change: 0 additions & 1 deletion .coveragerc
@@ -1,6 +1,5 @@
[run]
branch = true
include = parsel/*

[report]
exclude_lines =
3 changes: 2 additions & 1 deletion .flake8
@@ -1,5 +1,5 @@
[flake8]
ignore = E203
ignore = E203,W503
per-file-ignores =
docs/conftest.py:E501
parsel/csstranslator.py:E501
@@ -9,6 +9,7 @@ per-file-ignores =
setup.py:E501
tests/test_selector.py:E501
tests/test_selector_csstranslator.py:E501
tests/test_selector_jmespath.py:E501
tests/test_utils.py:E501
tests/test_xpathfuncs.py:E501
tests/typing/*.py:E,F
2 changes: 2 additions & 0 deletions .git-blame-ignore-revs
@@ -0,0 +1,2 @@
# applying pre-commit hooks to the project
a57c23e3b7be0f001595bd8767fe05e40a66e730
21 changes: 9 additions & 12 deletions .github/workflows/checks.yml
@@ -8,30 +8,27 @@ jobs:
fail-fast: false
matrix:
include:
- python-version: "3.11"
- python-version: "3.12"
env:
TOXENV: security
- python-version: "3.11"
env:
TOXENV: flake8
- python-version: "3.11"
TOXENV: pre-commit
- python-version: "3.12"
env:
TOXENV: pylint
- python-version: "3.11" # Keep in sync with .readthedocs.yml
- python-version: "3.12"
env:
TOXENV: docs
- python-version: "3.11"
- python-version: "3.12"
env:
TOXENV: typing
- python-version: "3.11"
- python-version: "3.12"
env:
TOXENV: black
TOXENV: twinecheck

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

8 changes: 4 additions & 4 deletions .github/workflows/publish.yml
@@ -7,12 +7,12 @@ jobs:
if: startsWith(github.event.ref, 'refs/tags/')

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Set up Python 3.10
uses: actions/setup-python@v4
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: "3.10"
python-version: "3.12"

- name: Check Tag
id: check-release-tag
14 changes: 7 additions & 7 deletions .github/workflows/tests.yml
@@ -8,9 +8,6 @@ jobs:
fail-fast: false
matrix:
include:
- python-version: "3.7"
env:
TOXENV: py
- python-version: "3.8"
env:
TOXENV: py
@@ -23,21 +20,24 @@ jobs:
- python-version: "3.11"
env:
TOXENV: py
- python-version: pypy3.9
- python-version: "3.12"
env:
TOXENV: py
- python-version: pypy3.10
env:
TOXENV: pypy3

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Install system libraries
if: contains(matrix.python-version, 'pypy3.9')
if: contains(matrix.python-version, 'pypy')
run: |
sudo apt-get update
sudo apt-get install libxml2-dev libxslt-dev
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

1 change: 1 addition & 0 deletions .gitignore
@@ -24,6 +24,7 @@ pip-log.txt

# Unit test / coverage reports
.coverage
/coverage.xml
.tox
nosetests.xml
htmlcov
2 changes: 2 additions & 0 deletions .isort.cfg
@@ -0,0 +1,2 @@
[settings]
profile = black
18 changes: 18 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,18 @@
repos:
- repo: https://github.com/PyCQA/bandit
rev: 1.7.8
hooks:
- id: bandit
args: [-r, -c, .bandit.yml]
- repo: https://github.com/PyCQA/flake8
rev: 7.0.0
hooks:
- id: flake8
- repo: https://github.com/psf/black.git
rev: 24.2.0
hooks:
- id: black
- repo: https://github.com/pycqa/isort
rev: 5.13.2
hooks:
- id: isort
2 changes: 1 addition & 1 deletion .readthedocs.yml
@@ -8,7 +8,7 @@ build:
tools:
# For available versions, see:
# https://docs.readthedocs.io/en/stable/config-file/v2.html#build-tools-python
python: "3.11" # Keep in sync with .github/workflows/checks.yml
python: "3.12" # Keep in sync with .github/workflows/checks.yml
python:
install:
- requirements: docs/requirements.txt
74 changes: 0 additions & 74 deletions Makefile

This file was deleted.

45 changes: 45 additions & 0 deletions NEWS
@@ -3,6 +3,51 @@
History
-------

1.9.1 (2024-04-08)
~~~~~~~~~~~~~~~~~~

* Removed the dependency on ``pytest-runner``.
* Removed the obsolete ``Makefile``.

1.9.0 (2024-03-14)
~~~~~~~~~~~~~~~~~~

* Now requires ``cssselect >= 1.2.0`` (this minimum version was required since
1.8.0 but that wasn't properly recorded)
* Removed support for Python 3.7
* Added support for Python 3.12 and PyPy 3.10
* Fixed an exception when calling ``__str__`` or ``__repr__`` on some JSON
selectors
* Code formatted with ``black``
* CI fixes and improvements

1.8.1 (2023-04-18)
~~~~~~~~~~~~~~~~~~

* Remove a Sphinx reference from NEWS to fix the PyPI description
* Add a ``twine check`` CI check to detect such problems

1.8.0 (2023-04-18)
~~~~~~~~~~~~~~~~~~

* Add support for JMESPath: you can now create a selector for a JSON document
and call ``Selector.jmespath()``. See `the documentation`_ for more
information and examples.
* Selectors can now be constructed from ``bytes`` (using the ``body`` and
``encoding`` arguments) instead of ``str`` (using the ``text`` argument), so
that there is no internal conversion from ``str`` to ``bytes`` and the memory
usage is lower.
* Typing improvements
* The ``pkg_resources`` module (which was absent from the requirements) is no
longer used
* Documentation build fixes
* New requirements:

* ``jmespath``
* ``typing_extensions`` (on Python 3.7)

.. _the documentation: https://parsel.readthedocs.io/en/latest/usage.html

1.7.0 (2022-11-01)
~~~~~~~~~~~~~~~~~~

42 changes: 29 additions & 13 deletions README.rst
@@ -19,9 +19,16 @@ Parsel
:alt: Coverage report


Parsel is a BSD-licensed Python_ library to extract and remove data from HTML_
and XML_ using XPath_ and CSS_ selectors, optionally combined with
`regular expressions`_.
Parsel is a BSD-licensed Python_ library to extract data from HTML_, JSON_, and
XML_ documents.

It supports:

- CSS_ and XPath_ expressions for HTML and XML documents

- JMESPath_ expressions for JSON documents

- `Regular expressions`_

Find the Parsel online documentation at https://parsel.readthedocs.org.

@@ -30,15 +37,18 @@ Example (`open online demo`_):
.. code-block:: python
>>> from parsel import Selector
>>> selector = Selector(text="""<html>
<body>
<h1>Hello, Parsel!</h1>
<ul>
<li><a href="http://example.com">Link 1</a></li>
<li><a href="http://scrapy.org">Link 2</a></li>
</ul>
</body>
</html>""")
>>> text = """
<html>
<body>
<h1>Hello, Parsel!</h1>
<ul>
<li><a href="http://example.com">Link 1</a></li>
<li><a href="http://scrapy.org">Link 2</a></li>
</ul>
<script type="application/json">{"a": ["b", "c"]}</script>
</body>
</html>"""
>>> selector = Selector(text=text)
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
@@ -47,12 +57,18 @@ Example (`open online demo`_):
... print(li.xpath('.//@href').get())
http://example.com
http://scrapy.org
>>> selector.css('script::text').jmespath("a").get()
'b'
>>> selector.css('script::text').jmespath("a").getall()
['b', 'c']
.. _CSS: https://en.wikipedia.org/wiki/Cascading_Style_Sheets
.. _HTML: https://en.wikipedia.org/wiki/HTML
.. _JMESPath: https://jmespath.org/
.. _JSON: https://en.wikipedia.org/wiki/JSON
.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true&sandboxMode=true
.. _Python: https://www.python.org/
.. _regular expressions: https://docs.python.org/library/re.html
.. _XML: https://en.wikipedia.org/wiki/XML
.. _XPath: https://en.wikipedia.org/wiki/XPath
