Copilot/dependabot merge requests #126

Open
mathun3003 wants to merge 44 commits into main from copilot/dependabot-merge-requests

@mathun3003

Owner

This pull request primarily updates dependencies and refactors the GoodreadsSpider and AZQuotesSpider spiders to improve code robustness and clarity. The most significant changes are grouped below:

Dependency Updates:

  • Updated key dependencies in pyproject.toml, including scrapy to 2.14.2, pytest to 9.0.3, and black to 26.3.1, ensuring compatibility with the latest features and bug fixes.

Refactoring and Robustness Improvements in Spiders:

  • Removed unused Any import from both azquotes_spider.py and goodreads_spider.py for cleaner type imports.
  • Simplified the parse method signatures in both spiders by removing unnecessary **kwargs and updating return types for clarity and correctness.
  • In GoodreadsSpider, improved error handling in parse_subpage by adding explicit checks for missing elements (likes text, author, author profile, quote text) and raising ValueError with clear messages when data is missing.
  • Refactored the construction of the QuoteItem in parse_subpage for better readability and to avoid repeated code by extracting and validating fields before use.
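
The "extract and validate before use" pattern described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `require_field` and `build_quote` are hypothetical names, and a plain dict stands in for QuoteItem.

```python
from typing import Optional


def require_field(value: Optional[str], field_name: str) -> str:
    """Return the stripped value, or raise ValueError if it is missing or blank."""
    if value is None or not value.strip():
        raise ValueError(f"Missing required field: {field_name}")
    return value.strip()


def build_quote(author: Optional[str], text: Optional[str]) -> dict[str, str]:
    # Hypothetical stand-in for QuoteItem construction: every field is
    # validated first, so the item is only built from complete data.
    return {
        "author": require_field(author, "author"),
        "text": require_field(text, "quote text"),
    }
```

Validating each field through one helper keeps the error messages consistent and avoids repeating the same None check before every use.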

dependabot Bot and others added 30 commits January 13, 2026 21:02
Bumps [filelock](https://github.com/tox-dev/py-filelock) from 3.20.1 to 3.20.3.
- [Release notes](https://github.com/tox-dev/py-filelock/releases)
- [Changelog](https://github.com/tox-dev/filelock/blob/main/docs/changelog.rst)
- [Commits](tox-dev/filelock@3.20.1...3.20.3)

---
updated-dependencies:
- dependency-name: filelock
  dependency-version: 3.20.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [sentencepiece](https://github.com/google/sentencepiece) from 0.1.99 to 0.2.1.
- [Release notes](https://github.com/google/sentencepiece/releases)
- [Commits](google/sentencepiece@v0.1.99...v0.2.1)

---
updated-dependencies:
- dependency-name: sentencepiece
  dependency-version: 0.2.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 4.25.8 to 5.29.6.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Commits](https://github.com/protocolbuffers/protobuf/commits)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-version: 5.29.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [orjson](https://github.com/ijl/orjson) from 3.9.15 to 3.11.6.
- [Release notes](https://github.com/ijl/orjson/releases)
- [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md)
- [Commits](ijl/orjson@3.9.15...3.11.6)

---
updated-dependencies:
- dependency-name: orjson
  dependency-version: 3.11.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [black](https://github.com/psf/black) from 24.3.0 to 26.3.1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](psf/black@24.3.0...26.3.1)

---
updated-dependencies:
- dependency-name: black
  dependency-version: 26.3.1
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.5.1 to 6.5.5.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](tornadoweb/tornado@v6.5.1...v6.5.5)

---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [scrapy](https://github.com/scrapy/scrapy) from 2.13.4 to 2.14.2.
- [Release notes](https://github.com/scrapy/scrapy/releases)
- [Changelog](https://github.com/scrapy/scrapy/blob/master/docs/news.rst)
- [Commits](scrapy/scrapy@2.13.4...2.14.2)

---
updated-dependencies:
- dependency-name: scrapy
  dependency-version: 2.14.2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.6 to 46.0.7.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@46.0.6...46.0.7)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.7
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.4 to 9.0.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@7.4.4...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [python-dotenv](https://github.com/theskumar/python-dotenv) from 1.0.0 to 1.2.2.
- [Release notes](https://github.com/theskumar/python-dotenv/releases)
- [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md)
- [Commits](theskumar/python-dotenv@v1.0.0...v1.2.2)

---
updated-dependencies:
- dependency-name: python-dotenv
  dependency-version: 1.2.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 7.14.2 to 7.17.1.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Changelog](https://github.com/jupyter/nbconvert/blob/main/CHANGELOG.md)
- [Commits](jupyter/nbconvert@v7.14.2...v7.17.1)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-version: 7.17.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [lxml](https://github.com/lxml/lxml) from 5.1.0 to 6.1.0.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](lxml/lxml@lxml-5.1.0...lxml-6.1.0)

---
updated-dependencies:
- dependency-name: lxml
  dependency-version: 6.1.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [notebook](https://github.com/jupyter/notebook) from 7.4.6 to 7.5.6.
- [Release notes](https://github.com/jupyter/notebook/releases)
- [Changelog](https://github.com/jupyter/notebook/blob/@jupyter-notebook/tree@7.5.6/CHANGELOG.md)
- [Commits](https://github.com/jupyter/notebook/compare/@jupyter-notebook/tree@7.4.6...@jupyter-notebook/tree@7.5.6)

---
updated-dependencies:
- dependency-name: notebook
  dependency-version: 7.5.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [jupyterlab](https://github.com/jupyterlab/jupyterlab) from 4.4.8 to 4.5.7.
- [Release notes](https://github.com/jupyterlab/jupyterlab/releases)
- [Changelog](https://github.com/jupyterlab/jupyterlab/blob/main/RELEASE.md)
- [Commits](https://github.com/jupyterlab/jupyterlab/compare/@jupyterlab/lsp@4.4.8...@jupyterlab/lsp@4.5.7)

---
updated-dependencies:
- dependency-name: jupyterlab
  dependency-version: 4.5.7
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [jupyter-server](https://github.com/jupyter-server/jupyter_server) from 2.12.5 to 2.18.0.
- [Release notes](https://github.com/jupyter-server/jupyter_server/releases)
- [Changelog](https://github.com/jupyter-server/jupyter_server/blob/main/CHANGELOG.md)
- [Commits](jupyter-server/jupyter_server@v2.12.5...v2.18.0)

---
updated-dependencies:
- dependency-name: jupyter-server
  dependency-version: 2.18.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [mistune](https://github.com/lepture/mistune) from 3.0.2 to 3.2.1.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/main/docs/changes.rst)
- [Commits](lepture/mistune@v3.0.2...v3.2.1)

---
updated-dependencies:
- dependency-name: mistune
  dependency-version: 3.2.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.41 to 3.1.50.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](gitpython-developers/GitPython@3.1.41...3.1.50)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-version: 3.1.50
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.6.0 to 2.7.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@2.6.0...2.7.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.7.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [idna](https://github.com/kjd/idna) from 3.7 to 3.15.
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.md)
- [Commits](kjd/idna@v3.7...v3.15)

---
updated-dependencies:
- dependency-name: idna
  dependency-version: '3.15'
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [dulwich](https://github.com/dulwich/dulwich) from 0.21.7 to 1.2.5.
- [Release notes](https://github.com/dulwich/dulwich/releases)
- [Changelog](https://github.com/jelmer/dulwich/blob/main/NEWS)
- [Commits](jelmer/dulwich@dulwich-0.21.7...dulwich-1.2.5)

---
updated-dependencies:
- dependency-name: dulwich
  dependency-version: 1.2.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Copilot AI left a comment


Pull request overview

This PR updates core scraping/tooling dependencies and refactors the Goodreads and AZQuotes spiders to tighten parsing behavior and improve robustness when expected page elements are missing.

Changes:

  • Bumped key dependencies in pyproject.toml (notably Scrapy/Black/Pytest).
  • Refined GoodreadsSpider parsing flow and added explicit missing-element checks before constructing scraped items.
  • Simplified spider parse method signatures by removing unused **kwargs (and related unused imports).

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Updates Scrapy/Black/Pytest versions used by the app and CI tooling. |
| quotes_recommender/quote_scraper/spiders/goodreads_spider.py | Tightens parse/subpage parsing and adds explicit validations before item construction. |
| quotes_recommender/quote_scraper/spiders/azquotes_spider.py | Removes unused typing import and simplifies parse signature. |
Comments suppressed due to low confidence (3)

quotes_recommender/quote_scraper/spiders/goodreads_spider.py:86

  • After extracting likes with re.findall(...), the code indexes num_likes_list[0] without handling the case where the regex finds no numbers. This will raise an IndexError instead of a clear error when likes text is present but doesn't contain a number.
        num_likes_list: list[str] = re.findall(self.NUM_LIKES_REGEX, likes_text)
        if len(num_likes_list) > 1:
            raise StopDownload(fail=True)
        if not num_likes_list[0].isdigit():
            raise ValueError('num_likes is not a digit. Failed to convert to int.')
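
One possible way to close the gap described in this comment is to check the number of matches before indexing. The sketch below is an assumption-laden illustration: `parse_num_likes` is a hypothetical helper, the regex is a guess (the real NUM_LIKES_REGEX is not shown here), and it raises ValueError in every failure case, whereas the quoted code raises StopDownload when more than one number is found.

```python
import re

# Assumed pattern; the real NUM_LIKES_REGEX is not shown in this PR.
NUM_LIKES_REGEX = r"\d+"


def parse_num_likes(likes_text: str) -> int:
    """Extract the single like count from the text, or raise ValueError."""
    matches: list[str] = re.findall(NUM_LIKES_REGEX, likes_text)
    # Covers both the empty case (previously an IndexError) and the
    # ambiguous case where more than one number appears in the text.
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one number in likes text, "
            f"found {len(matches)}: {likes_text!r}"
        )
    return int(matches[0])
```

With this shape, a likes text without digits fails with a descriptive message instead of an IndexError.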


Comment on lines 45 to 50
"""
Function to select data from an object.
:param response: web response from scrapy
:param kwargs: additional kwargs
:return: Generator object
"""
Comment on lines 51 to +53
         for feed in response.css(self.QUOTE_FEED).extract():
-            yield scrapy.Request(response.urljoin(feed), callback=self.parse_subpage)
+            if feed is not None:
+                yield scrapy.Request(response.urljoin(feed), callback=self.parse_subpage)
Comment on lines 55 to 57
         next_page = response.css(self.NEXT_SELECTOR).extract_first()
-        if next_page:
+        if next_page is not None:
         yield scrapy.Request(response.urljoin(next_page))
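
The two conditions in the hunk for lines 55 to 57 are not equivalent. `extract_first()` returns None when nothing matches, and both checks skip that case, but only the truthiness check also skips an empty string. A small sketch of the difference, using hypothetical helper names and a made-up URL path:

```python
from typing import Optional


def follows_if_truthy(next_page: Optional[str]) -> bool:
    # Mirrors `if next_page:` and rejects both None and "".
    return bool(next_page)


def follows_if_not_none(next_page: Optional[str]) -> bool:
    # Mirrors `if next_page is not None:` and rejects only None.
    return next_page is not None


for candidate in (None, "", "/quotes?page=2"):
    print(repr(candidate), follows_if_truthy(candidate), follows_if_not_none(candidate))
```

Whether an empty string can actually reach this branch depends on the selector, which is not shown here. If it can, the `is not None` form would build a request from an empty link.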
             yield scrapy.Request(url=url, callback=self.parse)

-    def parse(self, response, **kwargs: Any) -> Generator[QuoteItem, None, None]:
+    def parse(self, response) -> Generator[QuoteItem, None, None]:
mathun3003 self-assigned this May 31, 2026
@mathun3003

Owner Author

@copilot fix the three failed code style checks. Fix the black and pylint linting errors. Also fix the mypy typing errors.

3 participants