This repository was archived by the owner on Jun 13, 2026. It is now read-only.

Update solr 8.3 to 8.11.2 dockerfile #124

Closed
aprilrieger wants to merge 90 commits into notch8:main from wvulibraries:update-solr-8.3-to-8.11.2-Dockerfile

Conversation

@aprilrieger

Member

Brings in the updated Solr image build and a symlink to the solr folder it needs to build the image; that folder lives inside Hyku and is reached through the knapsack.

trmccormick and others added 30 commits February 26, 2026 17:03
… using docker volumes for production. some changes broke stackcar.
…veloment reverted change due to stack_car issues.
Added .env to each env_file section.
…tion

config/environments/production.rb in the knapsack is never loaded by Rails
(only the main app's environment files are loaded). The host allowlist must
be set from a knapsack initializer, which the engine loads via
hyku_knapsack.load_initializers after :load_config_initializers.

Allows:
  *.lib.wvu.edu (production)
  *.localhost.direct (local compose testing)
  localhost (local compose testing)

Patterns include optional port suffix to match raw_host_with_port used by
Rails 7.2 HostAuthorization middleware.
hyrax-webapp hard-codes config.force_ssl = true but the local compose stack
has no SSL-terminating reverse proxy. Without SSL termination, Rails redirects
every HTTP request to HTTPS, Puma receives an SSL handshake it can't handle,
and the stack is unreachable.

When DISABLE_FORCE_SSL=true the knapsack initializer removes ActionDispatch::SSL
from the middleware stack, allowing plain HTTP access. Set this in the local
.env.production (which is gitignored via .env.*).

On a real deployment where a CDN / LB provides SSL termination, this var
should be absent (default behaviour: force_ssl enforced).
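
As a rough illustration of the escape hatch, the check could look like the sketch below. The helper name `force_ssl_disabled?` and the case-insensitive comparison are assumptions; the commit only states that the variable must be set to `true`.

```ruby
# Hypothetical helper: returns true only when DISABLE_FORCE_SSL is set to
# "true" (compared case-insensitively, which is an assumption of this sketch).
# An absent or empty variable keeps the default behaviour: SSL is enforced.
def force_ssl_disabled?(env = ENV)
  env.fetch("DISABLE_FORCE_SSL", "").strip.casecmp?("true")
end

# In a Rails initializer the guard might wrap a call such as
#   Rails.application.config.middleware.delete(ActionDispatch::SSL)
# It is left as a comment so that this sketch runs without Rails.
```

Defaulting to "enforced" means a missing or mistyped variable on a real deployment fails safe.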
…uction

localhost.direct is specific to the Stackcar/Traefik local dev tooling and
is not a general solution for local production-mode testing.

Keep only the *.lib.wvu.edu Regexp for the real VM deployment.
Retain the DISABLE_FORCE_SSL escape hatch (off by default, must be
explicitly set in .env.production to activate).
'.lib.wvu.edu' with a leading dot is Rails' standard subdomain wildcard —
it matches request.host (no port) as a suffix, covering hyku.lib.wvu.edu
and any other *.lib.wvu.edu hostname. Simpler and consistent with the
established WVU pattern.
Handles the three tasks that must be done manually on the VM after
first boot (unlike the Stackcar/Traefik local dev setup where the
admin tenant is created automatically):

  1. Database - db:create + db:schema:load (or db:migrate) + db:seed
  2. Solr configset upload and assignment
  3. Admin tenant Account record via CreateAccount service

The script is idempotent (safe to re-run) and uses the Rails runner
heredoc form to avoid shell quoting issues.

Run from the host after the stack is healthy:
  docker-compose -f docker-compose.production.yml exec web \
    sh /app/samvera/scripts/setup.sh

No extra volume mount needed - ./ is already bind-mounted at
/app/samvera in the web container.
- startup-solr.sh: replace 'solr start -f' with solr-foreground via SolrCloud
  - Push security.json to ZooKeeper before starting
  - Use SOLR_ENABLE_CLOUD_MODE=yes read by solr-foreground script
  - Remove manual core directory setup (not needed in cloud mode)

- docker-compose.production.yml: align Solr service with hyrax-webapp pattern
  - Add SOLR_ENABLE_CLOUD_MODE=yes, SOLR_CLOUD_BOOTSTRAP=yes, ZK_HOST env vars
  - Add depends_on: zoo: condition: service_healthy
  - Change volume from hydra_prod core dir to /var/solr (cloud layout)
  - Update healthcheck to use SolrCloud collections API with credentials

- scripts/setup.sh: fix Step 3 - admin host is NOT an Account record
  - Hyku routes admin interface separately; HYKU_ADMIN_HOST cname is reserved
  - Step 3 now creates first *repository* tenant from optional env vars
  - If unset, skip and direct user to admin UI

- .env.production.example: add HYKU_FIRST_TENANT_NAME/CNAME optional vars
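
The revised Step 3 logic (create the first repository tenant only when the optional variables are present, otherwise skip) can be sketched in Ruby. The helper `first_tenant_plan` and its return shape are hypothetical, and reading "HYKU_FIRST_TENANT_NAME/CNAME" as the two variables `HYKU_FIRST_TENANT_NAME` and `HYKU_FIRST_TENANT_CNAME` is an assumption.

```ruby
# Hypothetical helper: decide what Step 3 should do from the optional variables.
# Variable names are an assumed expansion of "HYKU_FIRST_TENANT_NAME/CNAME".
def first_tenant_plan(env = ENV)
  name  = env["HYKU_FIRST_TENANT_NAME"].to_s.strip
  cname = env["HYKU_FIRST_TENANT_CNAME"].to_s.strip

  # If either value is missing, skip and point the operator at the admin UI.
  if name.empty? || cname.empty?
    return { action: :skip, hint: "create the first tenant from the admin UI" }
  end

  { action: :create, name: name, cname: cname }
end
```

Returning a plan instead of performing the creation keeps the decision testable without a database; the real script would hand the `:create` case to the account-creation service.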
…ction

- Rename solr9-setup/ to solr-setup/ (Solr image is actually 8.3.1, not 9)
- startup-solr.sh: use 'solr start -f -c -z' for SolrCloud mode instead of
  solr-foreground (which breaks bind mounts due to cp -p permission errors)
  - Seed solr.xml from /opt/solr/server/solr/solr.xml when data dir is fresh
  - Reference /solr-setup/security.json (renamed folder)
- docker-compose.production.yml:
  - Volume: /var/solr/data bind mount (not full /var/solr)
  - solr-setup:/solr-setup (renamed volume)
  - initialize_app: make hydra-production collection explicit
- .env.production.example: SOLR_COLLECTION_NAME=hydra-production, fix comment
These were only used by the old standalone startup-solr.sh.
The configset is now uploaded from hyrax-webapp/solr/conf by initialize_app.
Only security.json remains, used to push auth config to ZooKeeper.
HSTS cached by browsers for localhost.direct (from Stack Car/Traefik)
causes them to force HTTPS, breaking plain-HTTP Puma in local prod testing.

- host_authorization.rb: read HYKU_EXTRA_HOSTS (comma-separated) and add
  each entry to config.hosts at boot - keeps production config clean
- .env.production.example: document HYKU_EXTRA_HOSTS and lvh.me pattern

Local .env.production (gitignored) uses:
  HYKU_ROOT_HOST=lvh.me
  HYKU_ADMIN_HOST=admin-wvu_knapsack.lvh.me
  HYKU_DEFAULT_HOST=%{tenant}-wvu_knapsack.lvh.me
  HYKU_EXTRA_HOSTS=.lvh.me

Access admin UI at: http://admin-wvu_knapsack.lvh.me:3000
On VM: leave HYKU_EXTRA_HOSTS unset; set HYKU_*_HOST to lib.wvu.edu values.
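
Parsing the comma-separated `HYKU_EXTRA_HOSTS` value might look like the following. This is a sketch under assumptions: the helper name `extra_hosts` and the trimming of whitespace and empty entries are not taken from the commit.

```ruby
# Hypothetical helper: split HYKU_EXTRA_HOSTS into a clean list of host entries.
# An unset variable yields an empty list, so production config stays untouched.
def extra_hosts(env = ENV)
  env.fetch("HYKU_EXTRA_HOSTS", "")
     .split(",")
     .map(&:strip)
     .reject(&:empty?)
end

# In a Rails initializer each entry would then be appended to the allowlist,
# for example: extra_hosts.each { |host| Rails.application.config.hosts << host }
# (left as a comment so that this sketch runs without Rails).
```

With the local value from the commit message, `extra_hosts("HYKU_EXTRA_HOSTS" => ".lvh.me")` yields a single entry, `.lvh.me`.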
trmccormick and others added 27 commits March 9, 2026 20:07
- Split .env.production into per-service files:
  .env.db      → POSTGRES_* vars for postgres container
  .env.redis   → REDIS_PASSWORD for redis container
  .env.solr    → SOLR_ADMIN_* + ZooKeeper for solr container
  .env.fedora  → fully-expanded JAVA_OPTS for fcrepo container
- Add .env.*.example committed templates for each service file
- Wire per-service env_file: in both docker-compose files
- Fix redis auth: requirepass via sh -c wrapper (exec form
  does not expand shell vars without sh -c)
- Fix solr healthcheck: use $$SOLR_ADMIN_USER:$$SOLR_ADMIN_PASSWORD
- Remove fcrepo environment: block (JAVA_OPTS now in .env.fedora)
- Remove db YAML translation block (POSTGRES_* read directly)
- Update HYKU_BUILD_GUIDE.md: repo layout, Step 1/2, Key files table
fits was present in docker-compose.yml (Stack Car) via extends from
hyrax-webapp but was omitted from the standalone production files.

- Add fits service (ghcr.io/samvera/fitsservlet:1.6.0) to both
  docker-compose.local.yml and docker-compose.production.yml
- Add fits: condition: service_started to worker and web depends_on
  in both files
…arm64 backlog

- docker-compose.local.yml: start_period 300s→600s, retries 30→60
  (~16 min total window; M4 QEMU amd64 JVM can take 10+ min)
- Guide: update RHEL vs Mac perf note to call out M4 specifically
- Guide: add Solr unhealthy troubleshooting row with M4/QEMU root cause
- Note: native arm64 Solr image is on Notch8 backlog
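
The "~16 min total window" figure can be checked with a little arithmetic. The sketch below approximates the window as the start period plus one probe interval per retry and ignores the per-probe timeout. The healthcheck interval is not stated in the commit; the 6-second value is an assumption chosen because it reproduces the quoted figure.

```ruby
# Rough window before a container is marked unhealthy:
# start_period + retries * interval (all in seconds; probe timeout ignored).
def healthcheck_window_seconds(start_period:, retries:, interval:)
  start_period + retries * interval
end

# New settings from the commit, with an ASSUMED 6-second interval.
new_window = healthcheck_window_seconds(start_period: 600, retries: 60, interval: 6)
puts "#{new_window} s = #{new_window / 60.0} min"
```

Under the same assumed interval, the previous settings (300 s start period, 30 retries) give 480 seconds, so the change roughly doubles the time Solr has to come up.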
…bug)

Root cause:
- hyrax-webapp's FileSetDerivativesServiceDecorator passes layer: 0
  unconditionally for all image types (intended for pyramidal TIFFs).
- hydra-derivatives' selected_layers() treats integer 0 as truthy and
  calls image.layers[0], which returns Image.new(path+'[0]') — @tempfile nil.
- Image#format('jpg') with @tempfile nil computes new_path by calling
  Pathname(path+[0]).sub_ext('.jpg') → strips '[0]' → same as source path.
- IM7 runs: magick convert file.jpg[0] file.jpg (same src/dest),
  truncates the destination before reading → No such file or directory.

Symptom: ValkyrieCreateDerivativesJob fails on every image ingest;
UV shows broken images because no thumbnail derivative is written.

Fix (config/initializers/derivatives_im7_fix.rb):
- Prepend a patch after_initialize that overrides create_image_derivatives
  to only pass layer: 0 for TIFF and PDF sources (the only formats that
  actually have multiple layers requiring selection).
- For JPEG and other single-layer formats, layer: 0 is omitted, so
  selected_layers returns the original Image.open() with @tempfile set,
  and format() correctly writes to a separate temp file.

Re-run derivatives on existing broken records via rails console:
  ValkyrieCreateDerivativesJob.perform_later(file_set.id.to_s)
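
Two Ruby behaviours behind this root cause can be reproduced without any imaging libraries, and the shape of the fix can be sketched alongside them. The helper names `truthy?`, `format_destination`, and `derivative_options`, as well as the MIME-type list, are hypothetical; the real patch overrides `create_image_derivatives` in the initializer named above.

```ruby
require "pathname"

# 1. Integer 0 is truthy in Ruby, so a guard such as `if layer` fires for layer: 0.
def truthy?(value)
  value ? true : false
end

# 2. Pathname#sub_ext treats ".jpg[0]" as the extension, so swapping it for
#    ".jpg" drops the "[0]" suffix and lands back on the source path.
def format_destination(layered_path)
  Pathname.new(layered_path).sub_ext(".jpg").to_s
end

# 3. Hypothetical shape of the fix: only multi-layer formats get layer: 0.
LAYERED_MIME_TYPES = %w[image/tiff application/pdf].freeze

def derivative_options(mime_type, base = {})
  LAYERED_MIME_TYPES.include?(mime_type) ? base.merge(layer: 0) : base
end

puts truthy?(0)                               # true
puts format_destination("/tmp/file.jpg[0]")   # /tmp/file.jpg
```

Because the computed destination equals the source, the conversion overwrites its own input, which matches the "No such file or directory" symptom described in the commit.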
…OPTS→.env.fedora, SOLR_ADMIN_*→.env.solr)
… marks importer failed instead of crashing worker
… fix stale script names and compose file in README
@gitguardian

gitguardian Bot commented Apr 7, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 29776573 | Triggered | Generic Database Assignment | f4c742f | .env.production.example |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

@aprilrieger

Member Author

wrong repo

@aprilrieger aprilrieger closed this Apr 7, 2026

3 participants