Support delta and iceberg via builder pattern #146

Merged
george-zubrienko merged 13 commits into main from iceberg on Dec 18, 2025

@george-zubrienko george-zubrienko commented Dec 17, 2025

Contributor

Key changes:

1. SparkSessionProvider Refactor and Catalog Support

  • Refactored SparkSessionProvider to support modular configuration of Delta Lake, Iceberg REST, and Hive Metastore backends through dedicated with_* methods and config models.
  • Added support for Apache Iceberg through the session configurator.
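
A minimal sketch of the builder idea described above, not the PR's actual code. `SessionConfigBuilder`, `IcebergRestSettings`, and the concrete `with_delta` / `with_iceberg_rest` / `with_hive_metastore` names are hypothetical stand-ins for SparkSessionProvider and its config models. To stay runnable without pyspark, the sketch only collects Spark configuration keys instead of creating a session. The configuration keys themselves are the standard Delta Lake, Iceberg, and Hive ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IcebergRestSettings:
    """Hypothetical stand-in for an Iceberg REST catalog config model."""

    catalog_name: str
    uri: str
    warehouse: Optional[str] = None


@dataclass
class SessionConfigBuilder:
    """Hypothetical builder that accumulates Spark conf entries per backend."""

    conf: Dict[str, str] = field(default_factory=dict)
    extensions: List[str] = field(default_factory=list)

    def with_delta(self) -> "SessionConfigBuilder":
        # Standard Delta Lake session extension and catalog implementation.
        self.extensions.append("io.delta.sql.DeltaSparkSessionExtension")
        self.conf["spark.sql.catalog.spark_catalog"] = (
            "org.apache.spark.sql.delta.catalog.DeltaCatalog"
        )
        return self

    def with_iceberg_rest(self, settings: IcebergRestSettings) -> "SessionConfigBuilder":
        # Registers a named Iceberg catalog backed by a REST catalog service.
        prefix = f"spark.sql.catalog.{settings.catalog_name}"
        self.extensions.append(
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        )
        self.conf[prefix] = "org.apache.iceberg.spark.SparkCatalog"
        self.conf[f"{prefix}.type"] = "rest"
        self.conf[f"{prefix}.uri"] = settings.uri
        if settings.warehouse:
            self.conf[f"{prefix}.warehouse"] = settings.warehouse
        return self

    def with_hive_metastore(self, uri: str) -> "SessionConfigBuilder":
        # Points Spark's Hive catalog implementation at an external metastore.
        self.conf["spark.sql.catalogImplementation"] = "hive"
        self.conf["spark.hadoop.hive.metastore.uris"] = uri
        return self

    def build(self) -> Dict[str, str]:
        # spark.sql.extensions accepts a comma-separated list of extensions.
        result = dict(self.conf)
        if self.extensions:
            result["spark.sql.extensions"] = ",".join(self.extensions)
        return result


# Usage: each with_* call returns the builder, so backends can be chained.
example_conf = (
    SessionConfigBuilder()
    .with_delta()
    .with_iceberg_rest(IcebergRestSettings("lakehouse", "http://localhost:8181"))
    .build()
)
```

In a real provider the resulting entries would presumably be passed to `SparkSession.builder.config(...)` before `getOrCreate()`. The catalog name `lakehouse` and the URI are placeholders for illustration.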

2. Docker Compose Stack for Local Development

  • Added a new docker-compose.yaml for testing Iceberg integration.

3. CI/CD Workflow and Dependency Updates

  • Updated the GitHub Actions build and release workflows: switched to Docker Compose for environment setup, upgraded action versions, moved dependency installation to a reusable Poetry action, and configured code coverage to exclude tests.
  • Modernized dependencies in pyproject.toml: made delta-spark and kubernetes optional and updated dev dependencies to their latest major versions. Added pytest configuration that specifies the test paths.
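
With delta-spark and kubernetes now optional, code that uses them may need to check for their presence at runtime. The following is a hedged sketch of one common way to do that. The helper names and the extra name `delta` are assumptions for illustration and are not taken from this PR. The only facts relied on are that delta-spark is imported as `delta` and the Kubernetes client as `kubernetes`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str) -> None:
    """Raise a descriptive ImportError when an optional dependency is missing.

    `extra` is a hypothetical extras name used only in the error message.
    """
    if not is_installed(module_name):
        raise ImportError(
            f"Optional dependency '{module_name}' is not installed; "
            f"install the package with the '{extra}' extra to use this feature."
        )


# Usage: guard Delta-specific code paths before touching the `delta` module.
if is_installed("delta"):
    delta_available = True
else:
    delta_available = False
```

A guard like this keeps the base install importable while still giving a clear error when a Delta or Kubernetes feature is requested without the matching package.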

4. Linting and Minor Configuration Improvements

  • Adjusted .pylintrc to enable missing module docstring warnings and raise the maximum number of arguments.

AI generated summary - edited

github-actions Bot commented Dec 17, 2025


Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| bootstrap-lk.py | 13 | 10 | 23% | 6–46, 50 |
| spark_utils/common | | | | |
| functions.py | 32 | 4 | 88% | 56, 62, 80, 112 |
| spark_job_args.py | 68 | 9 | 87% | 106, 113, 135, 151, 170, 179–183 |
| spark_session_provider.py | 83 | 24 | 71% | 60–61, 110, 128, 155, 171, 190–201, 211–281, 294, 310 |
| spark_sql_utils.py | 20 | 10 | 50% | 54–63, 79–82 |
| spark_udf.py | 4 | 2 | 50% | 36–37 |
| spark_utils/dataframes | | | | |
| functions.py | 98 | 48 | 51% | 48, 55–65, 76–78, 93–95, 108–122, 141, 151, 191, 199–202, 211, 230, 246–250, 259–272 |
| spark_utils/dataframes/sets | | | | |
| functions.py | 6 | 4 | 33% | 39–42 |
| spark_utils/delta_lake | | | | |
| delta_log.py | 17 | 1 | 94% | 52 |
| functions.py | 60 | 38 | 37% | 57–67, 97–127, 140–160, 196–197, 200, 220–223 |
| spark_utils/models | | | | |
| iceberg_rest_config.py | 21 | 3 | 86% | 51–53 |
| k8s_config.py | 18 | 2 | 89% | 18–19 |
| test | | | | |
| test_iceberg.py | 15 | 4 | 73% | 9–10, 20–21 |
| test_spark_session_provider.py | 24 | 6 | 75% | 11–12, 24–25, 35–36 |
| TOTAL | 693 | 165 | 76% | |

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 59 | 0 💤 | 0 ❌ | 0 🔥 | 1m 30s ⏱️ |

@george-zubrienko george-zubrienko marked this pull request as ready for review December 18, 2025 08:50
@george-zubrienko george-zubrienko requested a review from a team as a code owner December 18, 2025 08:50
@george-zubrienko george-zubrienko merged commit 8dbdf4a into main Dec 18, 2025
1 check passed
@george-zubrienko george-zubrienko deleted the iceberg branch December 18, 2025 08:53