Copilot/dependabot merge requests #126
Open
mathun3003 wants to merge 44 commits into
Bumps [filelock](https://github.com/tox-dev/py-filelock) from 3.20.1 to 3.20.3. - [Release notes](https://github.com/tox-dev/py-filelock/releases) - [Changelog](https://github.com/tox-dev/filelock/blob/main/docs/changelog.rst) - [Commits](tox-dev/filelock@3.20.1...3.20.3) --- updated-dependencies: - dependency-name: filelock dependency-version: 3.20.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [sentencepiece](https://github.com/google/sentencepiece) from 0.1.99 to 0.2.1. - [Release notes](https://github.com/google/sentencepiece/releases) - [Commits](google/sentencepiece@v0.1.99...v0.2.1) --- updated-dependencies: - dependency-name: sentencepiece dependency-version: 0.2.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 4.25.8 to 5.29.6. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 5.29.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [orjson](https://github.com/ijl/orjson) from 3.9.15 to 3.11.6. - [Release notes](https://github.com/ijl/orjson/releases) - [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md) - [Commits](ijl/orjson@3.9.15...3.11.6) --- updated-dependencies: - dependency-name: orjson dependency-version: 3.11.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [black](https://github.com/psf/black) from 24.3.0 to 26.3.1. - [Release notes](https://github.com/psf/black/releases) - [Changelog](https://github.com/psf/black/blob/main/CHANGES.md) - [Commits](psf/black@24.3.0...26.3.1) --- updated-dependencies: - dependency-name: black dependency-version: 26.3.1 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.5.1 to 6.5.5. - [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst) - [Commits](tornadoweb/tornado@v6.5.1...v6.5.5) --- updated-dependencies: - dependency-name: tornado dependency-version: 6.5.5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [scrapy](https://github.com/scrapy/scrapy) from 2.13.4 to 2.14.2. - [Release notes](https://github.com/scrapy/scrapy/releases) - [Changelog](https://github.com/scrapy/scrapy/blob/master/docs/news.rst) - [Commits](scrapy/scrapy@2.13.4...2.14.2) --- updated-dependencies: - dependency-name: scrapy dependency-version: 2.14.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.6 to 46.0.7. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](pyca/cryptography@46.0.6...46.0.7) --- updated-dependencies: - dependency-name: cryptography dependency-version: 46.0.7 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.4 to 9.0.3. - [Release notes](https://github.com/pytest-dev/pytest/releases) - [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst) - [Commits](pytest-dev/pytest@7.4.4...9.0.3) --- updated-dependencies: - dependency-name: pytest dependency-version: 9.0.3 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [python-dotenv](https://github.com/theskumar/python-dotenv) from 1.0.0 to 1.2.2. - [Release notes](https://github.com/theskumar/python-dotenv/releases) - [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md) - [Commits](theskumar/python-dotenv@v1.0.0...v1.2.2) --- updated-dependencies: - dependency-name: python-dotenv dependency-version: 1.2.2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 7.14.2 to 7.17.1. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Changelog](https://github.com/jupyter/nbconvert/blob/main/CHANGELOG.md) - [Commits](jupyter/nbconvert@v7.14.2...v7.17.1) --- updated-dependencies: - dependency-name: nbconvert dependency-version: 7.17.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [lxml](https://github.com/lxml/lxml) from 5.1.0 to 6.1.0. - [Release notes](https://github.com/lxml/lxml/releases) - [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt) - [Commits](lxml/lxml@lxml-5.1.0...lxml-6.1.0) --- updated-dependencies: - dependency-name: lxml dependency-version: 6.1.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [notebook](https://github.com/jupyter/notebook) from 7.4.6 to 7.5.6. - [Release notes](https://github.com/jupyter/notebook/releases) - [Changelog](https://github.com/jupyter/notebook/blob/@jupyter-notebook/tree@7.5.6/CHANGELOG.md) - [Commits](https://github.com/jupyter/notebook/compare/@jupyter-notebook/tree@7.4.6...@jupyter-notebook/tree@7.5.6) --- updated-dependencies: - dependency-name: notebook dependency-version: 7.5.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [jupyterlab](https://github.com/jupyterlab/jupyterlab) from 4.4.8 to 4.5.7. - [Release notes](https://github.com/jupyterlab/jupyterlab/releases) - [Changelog](https://github.com/jupyterlab/jupyterlab/blob/main/RELEASE.md) - [Commits](https://github.com/jupyterlab/jupyterlab/compare/@jupyterlab/lsp@4.4.8...@jupyterlab/lsp@4.5.7) --- updated-dependencies: - dependency-name: jupyterlab dependency-version: 4.5.7 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [jupyter-server](https://github.com/jupyter-server/jupyter_server) from 2.12.5 to 2.18.0. - [Release notes](https://github.com/jupyter-server/jupyter_server/releases) - [Changelog](https://github.com/jupyter-server/jupyter_server/blob/main/CHANGELOG.md) - [Commits](jupyter-server/jupyter_server@v2.12.5...v2.18.0) --- updated-dependencies: - dependency-name: jupyter-server dependency-version: 2.18.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [mistune](https://github.com/lepture/mistune) from 3.0.2 to 3.2.1. - [Release notes](https://github.com/lepture/mistune/releases) - [Changelog](https://github.com/lepture/mistune/blob/main/docs/changes.rst) - [Commits](lepture/mistune@v3.0.2...v3.2.1) --- updated-dependencies: - dependency-name: mistune dependency-version: 3.2.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.41 to 3.1.50. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](gitpython-developers/GitPython@3.1.41...3.1.50) --- updated-dependencies: - dependency-name: gitpython dependency-version: 3.1.50 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.6.0 to 2.7.0. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](urllib3/urllib3@2.6.0...2.7.0) --- updated-dependencies: - dependency-name: urllib3 dependency-version: 2.7.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [idna](https://github.com/kjd/idna) from 3.7 to 3.15. - [Release notes](https://github.com/kjd/idna/releases) - [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.md) - [Commits](kjd/idna@v3.7...v3.15) --- updated-dependencies: - dependency-name: idna dependency-version: '3.15' dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [dulwich](https://github.com/dulwich/dulwich) from 0.21.7 to 1.2.5. - [Release notes](https://github.com/dulwich/dulwich/releases) - [Changelog](https://github.com/jelmer/dulwich/blob/main/NEWS) - [Commits](jelmer/dulwich@dulwich-0.21.7...dulwich-1.2.5) --- updated-dependencies: - dependency-name: dulwich dependency-version: 1.2.5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
# Conflicts: # poetry.lock
Pull request overview
This PR updates core scraping/tooling dependencies and refactors the Goodreads and AZQuotes spiders to tighten parsing behavior and improve robustness when expected page elements are missing.
Changes:
- Bumped key dependencies in `pyproject.toml` (notably Scrapy/Black/Pytest).
- Refined `GoodreadsSpider` parsing flow and added explicit missing-element checks before constructing scraped items.
- Simplified spider `parse` method signatures by removing unused `**kwargs` (and related unused imports).
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `pyproject.toml` | Updates Scrapy/Black/Pytest versions used by the app and CI tooling. |
| `quotes_recommender/quote_scraper/spiders/goodreads_spider.py` | Tightens parse/subpage parsing and adds explicit validations before item construction. |
| `quotes_recommender/quote_scraper/spiders/azquotes_spider.py` | Removes unused typing import and simplifies parse signature. |
Comments suppressed due to low confidence (3)
`quotes_recommender/quote_scraper/spiders/goodreads_spider.py:86`
- After extracting likes with `re.findall(...)`, the code indexes `num_likes_list[0]` without handling the case where the regex finds no numbers. This will raise an `IndexError` instead of a clear error when likes text is present but doesn't contain a number.

      num_likes_list: list[str] = re.findall(self.NUM_LIKES_REGEX, likes_text)
      if len(num_likes_list) > 1:
          raise StopDownload(fail=True)
      if not num_likes_list[0].isdigit():
          raise ValueError('num_likes is not a digit. Failed to convert to int.')
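
One possible way to close this gap is sketched below, under stated assumptions: the pattern `r"\d+"` is only a stand-in for `NUM_LIKES_REGEX` (its real definition is not shown in this PR), `parse_num_likes` is a hypothetical helper name, and every failure branch raises `ValueError` instead of `StopDownload` purely to keep the example free of Scrapy imports.

```python
import re

# Assumed stand-in for the spider's NUM_LIKES_REGEX; the real pattern is not shown in this PR.
NUM_LIKES_REGEX = r"\d+"


def parse_num_likes(likes_text: str) -> int:
    """Return the single like count found in likes_text, or raise ValueError."""
    num_likes_list: list[str] = re.findall(NUM_LIKES_REGEX, likes_text)
    # Guard the empty case first, so indexing below can never raise IndexError.
    if not num_likes_list:
        raise ValueError(f"No like count found in {likes_text!r}.")
    # More than one number means the text is ambiguous.
    if len(num_likes_list) > 1:
        raise ValueError(f"Ambiguous like count in {likes_text!r}: {num_likes_list}.")
    return int(num_likes_list[0])
```

With the empty-list check in place, likes text that contains no number fails with a message naming the offending text rather than with an `IndexError`.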
Comment on lines 45 to 50

    """
    Function to select data from an object.
    :param response: web response from scrapy
    :param kwargs: additional kwargs
    :return: Generator object
    """

Comment on lines 51 to +53

    for feed in response.css(self.QUOTE_FEED).extract():
        yield scrapy.Request(response.urljoin(feed), callback=self.parse_subpage)
        if feed is not None:
            yield scrapy.Request(response.urljoin(feed), callback=self.parse_subpage)

Comment on lines 55 to 57

    next_page = response.css(self.NEXT_SELECTOR).extract_first()
    if next_page:
    if next_page is not None:
        yield scrapy.Request(response.urljoin(next_page))

    yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response, **kwargs: Any) -> Generator[QuoteItem, None, None]:
    def parse(self, response) -> Generator[QuoteItem, None, None]:
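
The two `next_page` checks shown in the diff differ only for values that are falsy but not `None`. A small sketch in plain Python illustrates this, assuming `extract_first()` yields either `None` or a string; the helper names and the sample URL are hypothetical and not part of the spider.

```python
from typing import Optional


def follows_if_truthy(next_page: Optional[str]) -> bool:
    """Mirror `if next_page:`, which skips both None and the empty string."""
    return bool(next_page)


def follows_if_not_none(next_page: Optional[str]) -> bool:
    """Mirror `if next_page is not None:`, which skips only None."""
    return next_page is not None


# Hypothetical sample values: no match, an empty href, and a relative link.
for candidate in (None, "", "/quotes?page=2"):
    print(repr(candidate), follows_if_truthy(candidate), follows_if_not_none(candidate))
```

Under the same assumption, an empty `href` would still produce a request with the `is not None` form. The strings yielded by `.extract()` in the `for feed` loop should never be `None`, so a `None` check there is unlikely to filter anything.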
Owner
Author
@copilot fix the three failed code style checks. Fix the black and pylint linting errors. Also fix the mypy typing errors.
Copilot stopped work on behalf of mathun3003 due to an error (May 31, 2026 10:40)
This pull request primarily updates dependencies and refactors the `GoodreadsSpider` and `AZQuotesSpider` spiders to improve code robustness and clarity. The most significant changes are grouped below:

Dependency Updates:
- Updated dependencies in `pyproject.toml`, including `scrapy` to `2.14.2`, `pytest` to `9.0.3`, and `black` to `26.3.1`, ensuring compatibility with the latest features and bug fixes.

Refactoring and Robustness Improvements in Spiders:
- Removed the unused `Any` import from both `azquotes_spider.py` and `goodreads_spider.py` for cleaner type imports.
- Simplified the `parse` method signatures in both spiders by removing unnecessary `**kwargs` and updating return types for clarity and correctness.
- In `GoodreadsSpider`, improved error handling in `parse_subpage` by adding explicit checks for missing elements (likes text, author, author profile, quote text) and raising `ValueError` with clear messages when data is missing.
- Reworked the construction of `QuoteItem` in `parse_subpage` for better readability and to avoid repeated code by extracting and validating fields before use.
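
The validate-then-construct approach described for `parse_subpage` could look roughly like the sketch below. It is not the PR's actual code: `require` and `build_quote` are hypothetical helpers, the field names are assumptions, and a plain dictionary stands in for `QuoteItem` so the example runs without Scrapy.

```python
from typing import Optional


def require(value: Optional[str], field: str) -> str:
    """Return the stripped value, or raise ValueError naming the missing field."""
    if value is None or not value.strip():
        raise ValueError(f"Missing {field} on quote page.")
    return value.strip()


def build_quote(
    text: Optional[str],
    author: Optional[str],
    author_profile: Optional[str],
    likes_text: Optional[str],
) -> dict[str, str]:
    """Validate every extracted field first, then assemble the item once."""
    # The dictionary is a stand-in for QuoteItem; its keys are assumed names.
    return {
        "text": require(text, "quote text"),
        "author": require(author, "author"),
        "author_profile": require(author_profile, "author profile"),
        "likes_text": require(likes_text, "likes text"),
    }
```

Routing every field through one helper keeps the error messages uniform and means the item is built in a single place, only after all checks have passed.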