Sync with WVU Main sha #9074370 #123
… using docker volumes for production. some changes broke stackcar.
…veloment reverted change due to stack_car issues.
Added .env to each env_file section.
…tion
config/environments/production.rb in the knapsack is never loaded by Rails (only the main app's environment files are loaded). The host allowlist must be set from a knapsack initializer, which the engine loads via hyku_knapsack.load_initializers after :load_config_initializers.
Allows:
- *.lib.wvu.edu (production)
- *.localhost.direct (local compose testing)
- localhost (local compose testing)
Patterns include an optional port suffix to match raw_host_with_port used by Rails 7.2 HostAuthorization middleware.
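The optional port suffix described above can be illustrated with a minimal Ruby sketch. The constant `WVU_HOST_PATTERN`, the helper `allowed_host?`, and the exact pattern are assumptions for illustration; the initializer's actual contents are not shown in this PR text.

```ruby
# Hypothetical pattern (not taken from the PR): one or more subdomain labels
# under lib.wvu.edu, followed by an optional ":port" suffix.
WVU_HOST_PATTERN = /\A(?:[a-z0-9-]+\.)+lib\.wvu\.edu(?::\d+)?\z/i

# Hypothetical helper: true when the host (with or without port) matches.
def allowed_host?(host_with_port)
  WVU_HOST_PATTERN.match?(host_with_port)
end

puts allowed_host?("hyku.lib.wvu.edu")
puts allowed_host?("hyku.lib.wvu.edu:3000")
puts allowed_host?("lib.wvu.edu.example.com")
```

In a Rails initializer, a pattern like this would typically be appended with `Rails.application.config.hosts << WVU_HOST_PATTERN`.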
hyrax-webapp hard-codes config.force_ssl = true but the local compose stack has no SSL-terminating reverse proxy. Without SSL termination, Rails redirects every HTTP request to HTTPS, Puma receives an SSL handshake it can't handle, and the stack is unreachable. When DISABLE_FORCE_SSL=true the knapsack initializer removes ActionDispatch::SSL from the middleware stack, allowing plain HTTP access. Set this in the local .env.production (which is gitignored via .env.*). On a real deployment where a CDN / LB provides SSL termination, this var should be absent (default behaviour: force_ssl enforced).
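A minimal sketch of such an escape hatch, assuming the flag is read as the string "true". The helper name `force_ssl_disabled?` is hypothetical, and the PR's real initializer may differ. The Rails call is guarded so the snippet also runs outside a Rails process.

```ruby
# Hypothetical helper: only the value "true" (case-insensitive) opts out of
# forced SSL; an unset or empty variable keeps the default behaviour.
def force_ssl_disabled?(env = ENV)
  env.fetch("DISABLE_FORCE_SSL", "").strip.casecmp?("true")
end

# Assumption: this runs from an initializer, before the middleware stack is
# built. The guard skips the block when Rails is not loaded.
if defined?(Rails) && Rails.respond_to?(:application) && Rails.application && force_ssl_disabled?
  Rails.application.config.middleware.delete ActionDispatch::SSL
end
```

Keeping the check in a small predicate makes the default (SSL enforced) explicit and easy to verify in isolation.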
…uction
localhost.direct is specific to the Stackcar/Traefik local dev tooling and is not a general solution for local production-mode testing. Keep only the *.lib.wvu.edu Regexp for the real VM deployment. Retain the DISABLE_FORCE_SSL escape hatch (off by default, must be explicitly set in .env.production to activate).
'.lib.wvu.edu' with a leading dot is Rails' standard subdomain wildcard — it matches request.host (no port) as a suffix, covering hyku.lib.wvu.edu and any other *.lib.wvu.edu hostname. Simpler and consistent with the established WVU pattern.
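The leading-dot behaviour can be approximated in plain Ruby as follows. This is a sketch, not Rails' own implementation: `suffix_host_pattern` is a hypothetical helper, and details such as port handling and how many subdomain levels match may differ between Rails versions.

```ruby
# Approximation only: turn a leading-dot entry such as ".lib.wvu.edu" into a
# pattern that accepts the domain itself or one subdomain label in front of it.
def suffix_host_pattern(entry)
  domain = Regexp.escape(entry.delete_prefix("."))
  /\A(?:[a-z0-9-]+\.)?#{domain}\z/i
end

pattern = suffix_host_pattern(".lib.wvu.edu")
puts pattern.match?("hyku.lib.wvu.edu")
puts pattern.match?("evil-lib.wvu.edu")
```

Anchoring with `\A` and `\z` is what keeps look-alike hosts such as `lib.wvu.edu.example.com` from slipping through.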
Handles the three tasks that must be done manually on the VM after
first boot (unlike the Stackcar/Traefik local dev setup where the
admin tenant is created automatically):
1. Database - db:create + db:schema:load (or db:migrate) + db:seed
2. Solr configset upload and assignment
3. Admin tenant Account record via CreateAccount service
The script is idempotent (safe to re-run) and uses the Rails runner
heredoc form to avoid shell quoting issues.
Run from the host after the stack is healthy:
docker-compose -f docker-compose.production.yml exec web \
sh /app/samvera/scripts/setup.sh
No extra volume mount needed - ./ is already bind-mounted at
/app/samvera in the web container.
- startup-solr.sh: replace 'solr start -f' with solr-foreground via SolrCloud
  - Push security.json to ZooKeeper before starting
  - Use SOLR_ENABLE_CLOUD_MODE=yes read by solr-foreground script
  - Remove manual core directory setup (not needed in cloud mode)
- docker-compose.production.yml: align Solr service with hyrax-webapp pattern
  - Add SOLR_ENABLE_CLOUD_MODE=yes, SOLR_CLOUD_BOOTSTRAP=yes, ZK_HOST env vars
  - Add depends_on: zoo: condition: service_healthy
  - Change volume from hydra_prod core dir to /var/solr (cloud layout)
  - Update healthcheck to use SolrCloud collections API with credentials
- scripts/setup.sh: fix Step 3 - admin host is NOT an Account record
  - Hyku routes admin interface separately; HYKU_ADMIN_HOST cname is reserved
  - Step 3 now creates first *repository* tenant from optional env vars
  - If unset, skip and direct user to admin UI
- .env.production.example: add HYKU_FIRST_TENANT_NAME/CNAME optional vars
…ction
- Rename solr9-setup/ to solr-setup/ (Solr image is actually 8.3.1, not 9)
- startup-solr.sh: use 'solr start -f -c -z' for SolrCloud mode instead of solr-foreground (which breaks bind mounts due to cp -p permission errors)
  - Seed solr.xml from /opt/solr/server/solr/solr.xml when data dir is fresh
  - Reference /solr-setup/security.json (renamed folder)
- docker-compose.production.yml:
  - Volume: /var/solr/data bind mount (not full /var/solr)
  - solr-setup:/solr-setup (renamed volume)
  - initialize_app: make hydra-production collection explicit
- .env.production.example: SOLR_COLLECTION_NAME=hydra-production, fix comment
These were only used by the old standalone startup-solr.sh. The configset is now uploaded from hyrax-webapp/solr/conf by initialize_app. Only security.json remains, used to push auth config to ZooKeeper.
HSTS cached by browsers for localhost.direct (from Stack Car/Traefik)
causes them to force HTTPS, breaking plain-HTTP Puma in local prod testing.
- host_authorization.rb: read HYKU_EXTRA_HOSTS (comma-separated) and add
each entry to config.hosts at boot - keeps production config clean
- .env.production.example: document HYKU_EXTRA_HOSTS and lvh.me pattern
Local .env.production (gitignored) uses:
HYKU_ROOT_HOST=lvh.me
HYKU_ADMIN_HOST=admin-wvu_knapsack.lvh.me
HYKU_DEFAULT_HOST=%{tenant}-wvu_knapsack.lvh.me
HYKU_EXTRA_HOSTS=.lvh.me
Access admin UI at: http://admin-wvu_knapsack.lvh.me:3000
On VM: leave HYKU_EXTRA_HOSTS unset; set HYKU_*_HOST to lib.wvu.edu values.
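The comma-separated parsing described for host_authorization.rb might look like the sketch below. The helper name `extra_hosts` is hypothetical and the real initializer may be written differently. The Rails call is guarded so the snippet also runs standalone.

```ruby
# Hypothetical helper: split HYKU_EXTRA_HOSTS on commas, trim whitespace, and
# drop empty entries. Returns [] when the variable is unset (the VM case).
def extra_hosts(env = ENV)
  env.fetch("HYKU_EXTRA_HOSTS", "").split(",").map(&:strip).reject(&:empty?)
end

# Assumption: executed from an initializer at boot; skipped outside Rails.
if defined?(Rails) && Rails.respond_to?(:application) && Rails.application
  extra_hosts.each { |host| Rails.application.config.hosts << host }
end

p extra_hosts({ "HYKU_EXTRA_HOSTS" => ".lvh.me" })
```

Returning an empty array when the variable is unset means the production allowlist is left untouched on the VM.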
- Split .env.production into per-service files:
  - .env.db → POSTGRES_* vars for postgres container
  - .env.redis → REDIS_PASSWORD for redis container
  - .env.solr → SOLR_ADMIN_* + ZooKeeper for solr container
  - .env.fedora → fully-expanded JAVA_OPTS for fcrepo container
- Add .env.*.example committed templates for each service file
- Wire per-service env_file: in both docker-compose files
- Fix redis auth: requirepass via sh -c wrapper (exec form does not expand shell vars without sh -c)
- Fix solr healthcheck: use $$SOLR_ADMIN_USER:$$SOLR_ADMIN_PASSWORD
- Remove fcrepo environment: block (JAVA_OPTS now in .env.fedora)
- Remove db YAML translation block (POSTGRES_* read directly)
- Update HYKU_BUILD_GUIDE.md: repo layout, Step 1/2, Key files table
fits was present in docker-compose.yml (Stack Car) via extends from hyrax-webapp but was omitted from the standalone production files.
- Add fits service (ghcr.io/samvera/fitsservlet:1.6.0) to both docker-compose.local.yml and docker-compose.production.yml
- Add fits: condition: service_started to worker and web depends_on in both files
…arm64 backlog
- docker-compose.local.yml: start_period 300s→600s, retries 30→60 (~16 min total window; M4 QEMU amd64 JVM can take 10+ min)
- Guide: update RHEL vs Mac perf note to call out M4 specifically
- Guide: add Solr unhealthy troubleshooting row with M4/QEMU root cause
- Note: native arm64 Solr image is on Notch8 backlog
…bug)
Root cause:
- hyrax-webapp's FileSetDerivativesServiceDecorator passes layer: 0 unconditionally for all image types (intended for pyramidal TIFFs).
- hydra-derivatives' selected_layers() treats integer 0 as truthy and calls image.layers[0], which returns Image.new(path+'[0]') with @tempfile nil.
- Image#format('jpg') with @tempfile nil computes new_path by calling Pathname(path+'[0]').sub_ext('.jpg'), which strips '[0]' and yields the same path as the source.
- IM7 runs: magick convert file.jpg[0] file.jpg (same src/dest), truncates the destination before reading → No such file or directory.
Symptom: ValkyrieCreateDerivativesJob fails on every image ingest; UV shows broken images because no thumbnail derivative is written.
Fix (config/initializers/derivatives_im7_fix.rb):
- Prepend a patch after_initialize that overrides create_image_derivatives to only pass layer: 0 for TIFF and PDF sources (the only formats that actually have multiple layers requiring selection).
- For JPEG and other single-layer formats, layer: 0 is omitted, so selected_layers returns the original Image.open() with @tempfile set, and format() correctly writes to a separate temp file.
Re-run derivatives on existing broken records via rails console:
ValkyrieCreateDerivativesJob.perform_later(file_set.id.to_s)
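The path collapse and the format gate described above can be reproduced in plain Ruby without hydra-derivatives. This is a sketch under stated assumptions: `MULTI_LAYER_MIME_TYPES` and `derivative_options` are hypothetical names, and the real patch overrides create_image_derivatives rather than calling a helper like this.

```ruby
require "pathname"

# Part 1: why source and destination collapse to the same file.
# File.extname treats ".jpg[0]" as the extension, so sub_ext drops the "[0]".
source = "file.jpg"
layered = "#{source}[0]"
collapsed = Pathname(layered).sub_ext(".jpg").to_s
puts collapsed == source # the convert command would read and write one file

# Part 2: a hypothetical gate that passes layer: 0 only for multi-layer sources.
MULTI_LAYER_MIME_TYPES = %w[image/tiff application/pdf].freeze

def derivative_options(mime_type, base = {})
  return base unless MULTI_LAYER_MIME_TYPES.include?(mime_type)
  base.merge(layer: 0)
end

p derivative_options("image/jpeg", format: "jpg")
p derivative_options("image/tiff", format: "jpg")
```

Omitting the `layer` key entirely, rather than passing `layer: nil` or `layer: 0`, sidesteps the truthiness check, since `0` is truthy in Ruby.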
…event uid 1001 permission errors
…r is writable on fresh clone
…l.sh / down.sc.local.sh
…ontainer reads .env.db)
…OPTS→.env.fedora, SOLR_ADMIN_*→.env.solr)
… marks importer failed instead of crashing worker
…xes UV ERR_CONNECTION_REFUSED locally)
… fix stale script names and compose file in README
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| - | - | Generic Database Assignment | f4c742f | .env.production.example |