Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! 이번 PR은 서비스의 안정적인 운영을 위해 무중단 배포 체계를 도입하고, 장애 발생 시 신속하게 대응할 수 있도록 Slack 알림 시스템을 구축하는 것을 목적으로 합니다. Traefik을 이용한 Blue-Green 배포 구조를 통해 배포 중 서비스 중단을 최소화하며, 커스텀 로깅 필터와 레이아웃을 통해 가독성 높은 에러 리포팅 환경을 마련했습니다. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Blue and Green in dance, Traefik guides the traffic flow, Errors caught in Slack, Stability starts to grow, Deployment without a woe. Footnotes
|
Summary by CodeRabbit릴리스 노트
Walkthrough블루-그린 배포 아키텍처가 도입되었으며, Traefik 리버스 프록시를 통해 무중단 배포를 구현하고, Slack 통합 로깅 기능이 추가되었습니다. 배포 및 롤백 스크립트가 헬스 체크와 함께 제공됩니다. Changes
Sequence Diagram(s)sequenceDiagram
participant CD as GitHub Actions
participant Docker as Docker Daemon
participant Blue as API Blue
participant Green as API Green
participant Traefik as Traefik
participant Health as Health Check
participant Slack as Slack
CD->>Docker: 새로운 타겟(Green) 시작
Docker->>Green: 컨테이너 시작
Green-->>Docker: 시작됨
CD->>Health: Green 헬스 체크 시작 (poll)
loop 헬스 대기
Health->>Docker: inspect Green 상태
Docker-->>Health: Running...
alt Healthy 확인
Health-->>CD: 건강 상태 확인
end
end
CD->>Traefik: 라우팅 안정화 대기
CD->>Docker: Blue 컨테이너 종료 (graceful)
Docker->>Blue: SIGTERM 신호
Blue-->>Docker: 종료됨
Docker->>Blue: 강제 제거
Note over Traefik: 트래픽 Green → Blue 전환 (라벨 기반)
CD->>Health: 외부 헬스 체크 (curl)
Health-->>CD: 200 OK
CD->>Slack: 배포 완료 로그 전송
sequenceDiagram
participant User as 사용자
participant Script as Rollback Script
participant Docker as Docker Daemon
participant Blue as API Blue
participant Green as API Green
participant Health as Health Check
User->>Script: rollback.sh 실행
Script->>Docker: 활성 컨테이너 확인 (Blue/Green)
Docker-->>Script: Green 활성 (Blue 비활성)
Script->>Docker: Blue 존재 여부 확인
Docker-->>Script: Blue 없음
Script->>Docker: Blue 생성 및 시작
Docker->>Blue: 컨테이너 시작
Blue-->>Docker: Running...
Script->>Health: Blue 헬스 체크 폴링 시작
loop 헬스 대기
Health->>Docker: inspect Blue 상태
Docker-->>Health: Health: healthy
end
Script->>Docker: 라우팅 안정화 대기
Script->>Docker: Green 컨테이너 종료
Docker->>Green: SIGTERM
Green-->>Docker: 종료됨
Docker->>Green: 강제 제거
Script-->>User: 롤백 완료
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements a Blue-Green deployment strategy using Traefik and Docker Compose, supported by new deployment and rollback scripts. It also integrates a Slack notification system for error logging, including a custom layout and a rate-limiting filter to manage alert frequency. Feedback focuses on enhancing code quality and maintainability, specifically recommending the use of thread-safe DateTimeFormatter for date handling, utilizing YAML anchors to eliminate configuration redundancy, and removing sudo from deployment scripts to improve portability. Additionally, it is suggested to use variables for timeout values in scripts to ensure consistency.
| public class SlackErrorLayout extends LayoutBase<ILoggingEvent> { | ||
|
|
||
| private static final int MAX_STACK_TRACE_LENGTH = 2000; | ||
| private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); |
There was a problem hiding this comment.
SimpleDateFormat은 스레드 안전(thread-safe)하지 않습니다. Layout.doLayout은 여러 스레드에서 동시에 호출될 수 있으므로, 날짜 형식이 손상되거나 예외가 발생할 위험이 있습니다. Java 8 이상에서는 스레드 안전한 java.time.format.DateTimeFormatter를 사용하는 것이 권장됩니다.
| private static final SimpleDateFormat DATE_FORMAT = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); | |
| private static final java.time.format.DateTimeFormatter DATE_FORMAT = java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(java.time.ZoneId.systemDefault()); |
| sb.append(":label: *Request ID:* `").append(requestId).append("`\n"); | ||
| } | ||
|
|
||
| sb.append(":clock3: *Time:* ").append(DATE_FORMAT.format(new Date(event.getTimeStamp()))).append("\n"); |
There was a problem hiding this comment.
DateTimeFormatter를 사용하도록 변경함에 따라 날짜 포맷팅 로직을 수정해야 합니다.
| sb.append(":clock3: *Time:* ").append(DATE_FORMAT.format(new Date(event.getTimeStamp()))).append("\n"); | |
| sb.append(":clock3: *Time:* ").append(DATE_FORMAT.format(java.time.Instant.ofEpochMilli(event.getTimeStamp()))).append("\n"); |
| container_name: nginx | ||
| ports: | ||
| - "80:80" | ||
| kokomen-api-green: |
| # Step 1: Traefik이 실행 중인지 확인 | ||
| if ! docker ps -q -f name=traefik | grep -q .; then | ||
| log_info "Step 0: Traefik 시작" | ||
| sudo -E docker compose -f $COMPOSE_FILE up -d traefik |
|
|
||
| # 현재 활성 컨테이너 종료 | ||
| log_info "현재 컨테이너 종료: kokomen-api-$CURRENT" | ||
| docker stop -t 65 "kokomen-api-$CURRENT" || true |
Test Results 50 files 50 suites 1m 23s ⏱️ Results for commit 8552a6d. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/cd-api-prod.yml (1)
53-62:⚠️ Potential issue | 🟠 Major배포 job이 빌드한 커밋이 아니라
main최신 상태를 다시 가져오고 있습니다.Line 59-62는 deploy 시점의
mainHEAD를 다시 checkout/pull하므로, build job이 만든 이미지와 다른 revision의deploy.sh/docker-compose-prod.yml/traefik.yml조합을 배포할 수 있습니다. blue-green 전환 로직은 이미지와 배포 스크립트가 같은 커밋이어야 하니${{ github.sha }}로 고정하는 편이 안전합니다.수정 예시
- git fetch origin main - git checkout main + git fetch origin "${GITHUB_SHA}" + git checkout --detach "${GITHUB_SHA}" git sparse-checkout set docker/prod/api - git pull origin main🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/cd-api-prod.yml around lines 53 - 62, The deployment step "Pull docker compose and deployment files" currently checks out and pulls the latest main, causing deploy to potentially use a different revision than the build; change it to fetch and checkout the exact build commit (use the GitHub Actions commit variable ${{ github.sha }}) instead of pulling main so the deployed docker/prod/api files (docker-compose-prod.yml, deploy.sh, traefik.yml) match the built image; specifically, replace the git checkout/pull of main with a fetch of the specific commit and checkout that commit (or use FETCH_HEAD/$GITHUB_SHA) after sparse-checkout set, ensuring the workflow variable ${{ github.sha }} is used to lock the revision.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/cd-api-prod.yml:
- Around line 93-96: The "Verify deployment" step currently swallows failures by
using `curl -sf ... || echo ...`; change it so a failed health check causes the
job to fail by replacing the swallow logic with an explicit non-zero exit path
(e.g., run the same curl and on failure print the message and exit 1). Locate
the step named "Verify deployment" and the line containing `curl -sf
http://localhost:80/actuator/health || echo "Health check endpoint not
responding"` and update it to print the error message and then exit non-zero so
the workflow fails and can trigger rollback/alerting.
In `@build.gradle`:
- Line 57: Replace the incompatible Logback Slack appender dependency
'com.github.maricn:logback-slack-appender:1.6.1' with the maintained fork
'com.cyfrania:logback-slack-appender:1.2' in the Gradle dependencies so the app
uses a Logback 1.5.x–compatible, actively maintained artifact; update the
dependency declaration where the current implementation line contains
'com.github.maricn:logback-slack-appender:1.6.1' to use
'com.cyfrania:logback-slack-appender:1.2' instead.
In `@docker/prod/api/deploy.sh`:
- Around line 13-21: The script mixes plain docker calls with sudo -E docker
compose causing failures on hosts where docker requires sudo; define a single
command variable (e.g., DOCKER_CMD) that detects whether sudo is needed (try
running `docker ps` and fall back to `sudo -E docker`) and use that variable
everywhere instead of raw `docker` or `sudo -E docker compose`, and update all
affected helpers (get_active, wait_healthy, the old-container cleanup logic and
any docker compose invocations) to invoke $DOCKER_CMD (and $DOCKER_CMD compose)
so all docker operations run under the same privilege model.
- Line 4: COMPOSE_FILE is set to a relative file name
("docker-compose-prod.yml") which breaks when deploy.sh is run outside
docker/prod/api; change the assignment in deploy.sh so COMPOSE_FILE points to
the docker-compose-prod.yml located next to the script by resolving the script's
directory (i.e., base the path on the script's directory rather than the current
working directory) so references to "docker-compose-prod.yml" always work
regardless of where the script is executed.
In `@docker/prod/api/docker-compose-prod.yml`:
- Around line 49-57: The Traefik labels currently use identical router/service
names (traefik.http.routers.api.* and traefik.http.services.api.*) so blue and
green containers are merged into one LB pool; change the strategy to perform
real blue-green switches by either (a) assigning distinct service/router names
per color (e.g., traefik.http.services.api-blue.* and
traefik.http.services.api-green.* with matching
traefik.http.routers.api-blue.rule / api-green.rule), or (b) toggling
traefik.enable=true/false on the inactive container from your deployment scripts
(deploy.sh / rollback.sh), or (c) use router priority and switch the active
router during deploy; update the docker-compose labels and the deploy/rollback
scripts to implement one of these approaches so only the active color receives
traffic.
In `@docker/prod/api/rollback.sh`:
- Around line 12-29: Mixed use of sudo causes failures when Docker requires
elevated privileges; make Docker invocations consistent by introducing a single
command variable (e.g., DOCKER_CMD) that conditionally includes "sudo -E" and
use that variable everywhere (replace direct calls in get_active,
check_container_exists and all other docker ps/inspect/start/stop/rm/docker
compose usages across the script) so active color detection, health checks and
container cleanup all run with the same privilege context.
- Around line 5-6: 현재 HEALTH_TIMEOUT(60)과 HEALTH_INTERVAL(5)은 Compose
healthcheck 설정(start_period: 40s, interval: 10s)의 마지막 가능한 검사 시점보다 짧아 false
negative가 발생할 수 있으므로 HEALTH_TIMEOUT 값을 Compose의 start_period 및 interval을 고려해 늘려야
합니다; 업데이트할 곳은 스크립트 상단의 HEALTH_TIMEOUT/HEALTH_INTERVAL 변수와 루프 검사 타이밍을 사용하는
rollback logic(해당 블록 및 37-47행 근처)이고, 간단히 HEALTH_TIMEOUT을 start_period + interval
(또는 그 이상, 예: 40s + 10s = 50s보다 크게) 기준으로 재계산하거나 문서화된 Compose 값에 맞춰 더 큰 값(예: 70s)을
설정하여 마지막 Compose 건강검사 후에도 롤백 타임아웃이 남도록 조정하세요.
- Line 4: The COMPOSE_FILE variable in rollback.sh is currently a bare filename
which breaks when the script is run outside docker/prod/api; update rollback.sh
to build an explicit path for COMPOSE_FILE (e.g., derive the script directory
with dirname "$0" or resolve the repository root) and join it with
"docker-compose-prod.yml" so the script always points to the correct file
regardless of current working directory; change the COMPOSE_FILE assignment in
rollback.sh accordingly (refer to the COMPOSE_FILE variable and the rollback.sh
script).
In
`@src/main/java/com/samhap/kokomen/global/logging/RateLimitingTurboFilter.java`:
- Around line 11-44: The current RateLimitingTurboFilter class (decide method)
is a TurboFilter that returns DENY for rate-limited ERROR events which prevents
all appenders (Slack, file, console) from receiving them; change this to an
appender-level Filter implementation (e.g., create RateLimitingFilter that
extends ch.qos.logback.core.filter.Filter and implements decide/decideInternal)
and attach that filter only to the Slack appender so only Slack notifications
are rate-limited; update places that register RateLimitingTurboFilter to instead
add the new RateLimitingFilter to the Slack appender configuration and keep the
global TurboFilter logic removed or converted to a NEUTRAL-only TurboFilter if
global behavior is still needed.
- Around line 29-37: The current pattern in RateLimitingTurboFilter using
lastLogTimes.get(errorKey) followed by lastLogTimes.put(errorKey, now) is not
atomic and can race; replace that logic by using lastLogTimes.compute(errorKey,
...) to atomically inspect the previous timestamp and decide whether to update
to now or keep the old value so the cooldownMillis check is done atomically;
update the code around buildErrorKey(...) / lastLogTimes / cooldownMillis so
compute returns the correct stored timestamp and use its result to return
FilterReply.DENY when the cooldown has not expired, otherwise allow logging and
set the timestamp to now.
In `@src/main/java/com/samhap/kokomen/global/logging/SlackErrorLayout.java`:
- Around line 13-15: SlackErrorLayout currently defines DATE_FORMAT as a shared
SimpleDateFormat which is not thread-safe; replace it with a thread-safe
java.time DateTimeFormatter (e.g. DateTimeFormatter.ofPattern("yyyy-MM-dd
HH:mm:ss")) and update all code in SlackErrorLayout that uses DATE_FORMAT to
format timestamps via java.time types (Instant/LocalDateTime/ZonedDateTime with
a ZoneId) instead of SimpleDateFormat; keep MAX_STACK_TRACE_LENGTH as-is and
ensure the new DateTimeFormatter is a static final constant so formatting is
immutable and safe in multithreaded contexts.
In `@src/main/resources/logback-spring.xml`:
- Around line 49-58: The AsyncAppender named ASYNC_SLACK is currently using
blocking behavior (default neverBlock=false) which can delay request threads
when the queue is full; update the ASYNC_SLACK appender configuration to enable
non-blocking mode by adding the neverBlock element with value true (i.e., set
neverBlock to true on the AsyncAppender definition) so events are dropped
instead of blocking request processing; ensure you modify the <appender
name="ASYNC_SLACK" class="ch.qos.logback.classic.AsyncAppender"> block (the
AsyncAppender configuration that references SLACK) to include
<neverBlock>true</neverBlock> and keep includeCallerData, queueSize and
discardingThreshold as appropriate.
- Around line 9-12: The global turboFilter using
com.samhap.kokomen.global.logging.RateLimitingTurboFilter currently denies
repeated ERROR events for all appenders, causing FILE logs to be dropped; update
the logging config so rate-limiting applies only to the Slack appender: remove
or disable the <turboFilter> block that registers RateLimitingTurboFilter
globally and instead add an equivalent rate-limit filter configuration inside
the SLACK appender (or create a separate appender-ref/group for SLACK that uses
RateLimitingTurboFilter), ensuring the FILE appender remains unfiltered so ERROR
events still reach file logs.
---
Outside diff comments:
In @.github/workflows/cd-api-prod.yml:
- Around line 53-62: The deployment step "Pull docker compose and deployment
files" currently checks out and pulls the latest main, causing deploy to
potentially use a different revision than the build; change it to fetch and
checkout the exact build commit (use the GitHub Actions commit variable ${{
github.sha }}) instead of pulling main so the deployed docker/prod/api files
(docker-compose-prod.yml, deploy.sh, traefik.yml) match the built image;
specifically, replace the git checkout/pull of main with a fetch of the specific
commit and checkout that commit (or use FETCH_HEAD/$GITHUB_SHA) after
sparse-checkout set, ensuring the workflow variable ${{ github.sha }} is used to
lock the revision.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 845eb313-365c-4e95-9725-e52a7abad74e
📒 Files selected for processing (13)
.github/workflows/cd-api-dev.yml.github/workflows/cd-api-prod.ymlbuild.gradledocker/dev/docker-compose-dev.ymldocker/prod/api/deploy.shdocker/prod/api/docker-compose-prod.ymldocker/prod/api/rollback.shdocker/prod/api/traefik/traefik.ymlsrc/main/java/com/samhap/kokomen/global/logging/RateLimitingTurboFilter.javasrc/main/java/com/samhap/kokomen/global/logging/SlackErrorLayout.javasrc/main/resources/application-dev.ymlsrc/main/resources/application-prod.ymlsrc/main/resources/logback-spring.xml
| implementation 'org.springframework.boot:spring-boot-starter-actuator' | ||
| implementation 'ch.qos.logback.contrib:logback-json-classic:0.1.5' | ||
| implementation 'ch.qos.logback.contrib:logback-jackson:0.1.5' | ||
| implementation 'com.github.maricn:logback-slack-appender:1.6.1' |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Is com.github.maricn:logback-slack-appender:1.6.1 compatible with Logback 1.5.x and Spring Boot 3.5.x?
💡 Result:
No, com.github.maricn:logback-slack-appender:1.6.1 is likely not fully compatible with Logback 1.5.x (used by Spring Boot 3.5.x). The original repository is archived since Feb 2024 with no updates since 2021, and the maintainer recommends the maintained fork https://github.com/cyfrania/logback-slack-appender, which explicitly supports Logback 1.3.x/1.4.x but notes Java 11+ for 1.4.x (Spring Boot 3.5.x requires Java 21+). The original appender depends on Logback via ${logback.version} (unspecified concrete version), but lacks confirmation of testing with Logback 1.5.x changes (e.g., supplier patterns). Potential configuration or API incompatibilities may arise, as seen in other Logback appenders with Spring Boot 3.x. Use the cyfrania fork (v1.2, 2024) for better compatibility.
Citations:
- 1: https://github.com/maricn/logback-slack-appender
- 2: https://central.sonatype.com/artifact/com.github.maricn/logback-slack-appender
- 3: https://github.com/cyfrania/logback-slack-appender
- 4: Upgrade to Logback 1.5.22 spring-projects/spring-boot#48507
- 5: https://central.sonatype.com/artifact/com.cyfrania/logback-slack-appender
🏁 Script executed:
cat build.gradleRepository: samhap-soft/kokomen-backend
Length of output: 1905
Spring Boot 3.5.3과 호환되지 않는 Logback appender 라이브러리 교체 필요
Line 57의 com.github.maricn:logback-slack-appender:1.6.1은 Spring Boot 3.5.3에서 사용하는 Logback 1.5.x와 호환되지 않습니다. 또한 원본 리포지토리는 2024년 2월부터 아카이브된 상태로 2021년 이후 업데이트가 없습니다. 로깅이 애플리케이션 초기화 경로에 직접 영향을 주므로 배포 시 런타임 오류가 발생할 수 있습니다.
대신 유지보수 중인 포크인 com.cyfrania:logback-slack-appender:1.2를 사용하세요. 이 버전은 최근에 업데이트되었으며 현재 스택과의 호환성이 보장됩니다.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@build.gradle` at line 57, Replace the incompatible Logback Slack appender
dependency 'com.github.maricn:logback-slack-appender:1.6.1' with the maintained
fork 'com.cyfrania:logback-slack-appender:1.2' in the Gradle dependencies so the
app uses a Logback 1.5.x–compatible, actively maintained artifact; update the
dependency declaration where the current implementation line contains
'com.github.maricn:logback-slack-appender:1.6.1' to use
'com.cyfrania:logback-slack-appender:1.2' instead.
| get_active() { | ||
| if docker ps -q -f name=kokomen-api-blue | grep -q .; then | ||
| echo "blue" | ||
| elif docker ps -q -f name=kokomen-api-green | grep -q .; then | ||
| echo "green" | ||
| else | ||
| echo "none" | ||
| fi | ||
| } |
There was a problem hiding this comment.
sudo 사용이 섞여 있어서 Docker 권한이 필요한 서버에서는 배포가 중간에 끊깁니다.
docker compose만 sudo -E로 띄우고, 상태 조회/헬스체크/정리 단계는 일반 docker를 쓰고 있습니다. 이 구성은 Docker 그룹 권한이 없는 서버에서 get_active, wait_healthy, old container 정리 단계가 전부 실패합니다.
🔧 제안 수정
+DOCKER="sudo -E docker"
...
- if docker ps -q -f name=kokomen-api-blue | grep -q .; then
+ if $DOCKER ps -q -f name=kokomen-api-blue | grep -q .; then
...
- status=$(docker inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null || echo "starting")
+ status=$($DOCKER inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null || echo "starting")
...
- if ! docker ps -q -f name=traefik | grep -q .; then
+ if ! $DOCKER ps -q -f name=traefik | grep -q .; then
...
- sudo -E docker compose -f $COMPOSE_FILE up -d traefik
+ $DOCKER compose -f "$COMPOSE_FILE" up -d traefik
...
- sudo -E docker compose -f $COMPOSE_FILE --profile $TARGET up -d "kokomen-api-$TARGET"
+ $DOCKER compose -f "$COMPOSE_FILE" --profile "$TARGET" up -d "kokomen-api-$TARGET"
...
- docker rm -f "kokomen-api-$TARGET" 2>/dev/null || true
+ $DOCKER rm -f "kokomen-api-$TARGET" 2>/dev/null || true
...
- docker stop -t $GRACEFUL_SHUTDOWN_WAIT "$OLD" || true
- docker rm -f "$OLD" 2>/dev/null || true
+ $DOCKER stop -t "$GRACEFUL_SHUTDOWN_WAIT" "$OLD" || true
+ $DOCKER rm -f "$OLD" 2>/dev/null || trueAlso applies to: 23-44, 67-75, 79-82, 93-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker/prod/api/deploy.sh` around lines 13 - 21, The script mixes plain
docker calls with sudo -E docker compose causing failures on hosts where docker
requires sudo; define a single command variable (e.g., DOCKER_CMD) that detects
whether sudo is needed (try running `docker ps` and fall back to `sudo -E
docker`) and use that variable everywhere instead of raw `docker` or `sudo -E
docker compose`, and update all affected helpers (get_active, wait_healthy, the
old-container cleanup logic and any docker compose invocations) to invoke
$DOCKER_CMD (and $DOCKER_CMD compose) so all docker operations run under the
same privilege model.
| labels: | ||
| - "traefik.enable=true" | ||
| - "traefik.http.routers.api.rule=Host(`api.kokomen.kr`)" | ||
| - "traefik.http.routers.api.entrypoints=web" | ||
| - "traefik.http.services.api.loadbalancer.server.port=8080" | ||
| - "traefik.http.services.api.loadbalancer.healthcheck.path=/actuator/health" | ||
| - "traefik.http.services.api.loadbalancer.healthcheck.port=8081" | ||
| - "traefik.http.services.api.loadbalancer.healthcheck.interval=5s" | ||
| - "traefik.http.services.api.loadbalancer.healthcheck.timeout=3s" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "docker-compose-prod.yml" -o -name "deploy.sh" -o -name "rollback.sh" | head -20Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
# Get the full structure of the docker directory
fd -t f "docker" | grep -E "(compose|deploy|rollback)" | head -30Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
# Look for the specific file mentioned in the review
cat docker/prod/api/docker-compose-prod.yml 2>/dev/null || find . -name "docker-compose-prod.yml" -exec cat {} \;Repository: samhap-soft/kokomen-backend
Length of output: 1905
🌐 Web query:
Traefik identical labels multiple containers blue-green deployment behavior
💡 Result:
Traefik's behavior with multiple Docker containers having identical labels in blue-green deployments depends on how the labels are configured, particularly the router and service names. By default, Traefik creates one router and one service per container based on container-specific identifiers (like container name or ID). Identical router rules (e.g., same Host(example.com)) across multiple containers lead to configuration conflicts, as Traefik detects multiple routers with the same rule but different services, resulting in errors like "Router defined multiple times with different configurations" and failure to route properly. To enable load balancing across multiple containers with identical routing needs (useful for scaling or canary/blue-green with weights), all containers must explicitly share the same service name via labels like traefik.http.services.shared-service.loadbalancer.server.port=80, and reference it in routers with traefik.http.routers.myrouter.service=shared-service. Traefik then aggregates matching containers into a single load-balanced service with servers for each container instance. For strict blue-green deployments (zero-downtime version switching without concurrent load balancing): - Use separate containers/services for blue/green with distinct router names or priorities (e.g., traefik.http.routers.app-blue.rule=Host(app.localhost); traefik.http.routers.app-blue.priority=10 vs green with higher priority). - Enable/disable via traefik.enable=true/false or weights (e.g., traefik.http.services.app.loadbalancer.weight=100 for active, 0 for inactive). - Switch by updating labels (recreating containers), health checks, or dynamic file provider configs pointing to @docker services. This avoids conflicts while allowing atomic traffic shifts. Official docs confirm Traefik creates services with servers per container instance when configurations match, supporting load balancing natively when service names align. Conflicts arise only from mismatched router/service declarations across identical-rule containers.
Citations:
- 1: https://doc.traefik.io/reference/routing-configuration/other-providers/docker/
- 2: https://doc.traefik.io/traefik/v3.5/reference/routing-configuration/other-providers/docker/
- 3: Cannot load balance multiple instances of the same Docker container image traefik/traefik#8694
- 4: https://lours.me/posts/compose-tip-015-blue-green-deployments/
- 5: Docker service labels prevent assigning multiple containers to a backend traefik/traefik#2293
🌐 Web query:
Traefik http.routers http.services same labels multiple backends load balancing
💡 Result:
In Traefik, to configure an HTTP router (e.g., http.routers.http) to load balance across multiple backends using the same service labels, use the Docker provider with identical service names across multiple containers. Traefik automatically aggregates containers with the same service name into a single loadBalancer service, adding each container's instance as a server. Key steps: 1. Define the same service name (e.g., my-service) on multiple containers via labels like traefik.http.services.my-service.loadbalancer.server.port=8080. 2. Point one or more routers to that service name using traefik.http.routers.http.service=my-service. 3. Traefik detects matching service configurations and load balances traffic across all containers exposing that service (using strategies like wrr by default). Example Docker Compose for two whoami containers load balanced by router "http": version: '3' services: whoami-1: image: traefik/whoami labels: - traefik.enable=true - traefik.http.routers.http.rule=Host(example.com) - traefik.http.routers.http.service=my-service - traefik.http.services.my-service.loadbalancer.server.port=80 whoami-2: image: traefik/whoami labels: - traefik.enable=true - traefik.http.routers.http.service=my-service # Router points to shared service - traefik.http.services.my-service.loadbalancer.server.port=80 # Same service name deploy: replicas: 1 # Scale as needed Traefik creates a loadBalancer for my-service with servers from both containers. Without explicit router labels on the second container, Traefik auto-creates routers if rules are defined, but explicit shared service ensures aggregation. For static/file provider, explicitly list servers: http: services: my-service: loadBalancer: servers: - url: http://backend1:80 - url: http://backend2:80 Load balancing strategies (wrr, p2c, etc.) configurable via traefik.http.services.my-service.loadbalancer.strategy=wrr. This works in Traefik v2+ and v3 (current as of 2026), per official docs.
Citations:
- 1: https://doc.traefik.io/reference/routing-configuration/http/load-balancing/service/
- 2: https://docs.traefik.io/reference/routing-configuration/other-providers/docker/
- 3: Cannot load balance multiple instances of the same Docker container image traefik/traefik#8694
- 4: https://stackoverflow.com/questions/59856722/multiple-routers-and-services-on-the-same-container-with-traefik-2
- 5: https://doc.traefik.io/traefik/v3.3/routing/providers/docker/
🌐 Web query:
Traefik blue-green deployment routing cutover strategy
💡 Result:
Traefik supports blue-green deployment routing through its dynamic configuration and weighted services, enabling zero-downtime cutovers by switching traffic between blue (current) and green (new) environments. Core Strategy: Maintain two identical environments: blue (live) and green (staging new version). Deploy and test the new version in green while blue handles all production traffic. Once validated, perform cutover by updating Traefik routing to direct 100% traffic to green. Rollback by switching back to blue. Cutover Methods by Environment: 1. Docker/Docker Compose (using labels): - Enable/disable services via labels like traefik.enable=true/false. - Update docker-compose.yml with env vars: BLUE_ENABLED=true, GREEN_ENABLED=false initially. - Deploy: docker compose up -d - Test green, then cutover: BLUE_ENABLED=false GREEN_ENABLED=true; docker compose up -d - Traefik auto-detects changes and switches instantly. - Example labels: traefik.http.services.app-blue.loadbalancer.server.port=3000 traefik.enable=${BLUE_ENABLED:-true} 2. File Provider (dynamic config): - Define separate services for blue/green. - Router points to active service: service: green@file - Edit config file (watched by Traefik), change service reference, save for instant switch. 3. Kubernetes (CRDs recommended): - Deploy blue/green deployments + services (e.g., app-blue-svc, app-green-svc). - Use IngressRoute with TraefikService for weighted routing: apiVersion: traefik.io/v1alpha1 kind: TraefikService metadata: name: bg-service spec: weighted: services: - name: app-blue-svc port: 80 weight: 100 # Initially 100% blue - name: app-green-svc port: 80 weight: 0 - IngressRoute references bg-service. - Cutover: Update weights (100->0 blue, 0->100 green), apply. Supports gradual shift (90/10 -> 50/50 -> 0/100). - Health checks ensure only healthy services receive traffic. Additional Best Practices: - Use health checks (Traefik auto-removes unhealthy servers). - Weighted Round Robin (WRR) for gradual cutover or canary. - Traffic mirroring for green testing (duplicate requests to green while blue responds). - Tools like Flagger automate with metrics-based promotion. - Monitor via Traefik dashboard. This leverages Traefik's service discovery and hot-reload for zero-downtime.
Citations:
- 1: https://traefik.io/glossary/kubernetes-deployment-strategies-blue-green-canary
- 2: https://www.straypaper.com/blue-green/
- 3: https://lours.me/posts/compose-tip-015-blue-green-deployments/
- 4: https://abendigo.github.io/2021/03/16/traefik_blue_green.html
- 5: https://traefik.io/blog/achieve-zero-downtime-deployments-with-traefik-and-kubernetes
- 6: https://doc.traefik.io/reference/routing-configuration/http/load-balancing/service/
- 7: https://doc.traefik.io/reference/routing-configuration/kubernetes/crd/http/traefikservice/
- 8: https://doc.traefik.io/traefik/v3.0/routing/services/
- 9: https://doc.traefik.io/reference/routing-configuration/kubernetes/crd/http/ingressroute/
현재 Traefik 설정은 동시에 두 백엔드로 트래픽을 분산합니다. 진정한 blue-green 전환이 아닙니다.
동일한 traefik.http.routers.api.*와 traefik.http.services.api.* 라벨을 blue/green 컨테이너에 붙이면, Traefik이 두 컨테이너를 하나의 로드밸런싱 풀로 통합하여 동시에 트래픽을 분산합니다. blue-green 디플로이먼트에서 진정한 zero-downtime 전환을 위해서는 다음 중 하나가 필요합니다:
traefik.enable=true/false를 사용한 활성/비활성 제어- 각 색상별 별도의 서비스 이름 지정
- 라우터 우선순위 설정 후 전환 시 활성 라우터 변경
현재 deploy.sh/rollback.sh에 이러한 전환 메커니즘이 없으면, 실제 동작은 두 버전이 동시에 트래픽을 받는 rolling/canary 배포에 가깝습니다.
Also applies to: 96-104
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker/prod/api/docker-compose-prod.yml` around lines 49 - 57, The Traefik
labels currently use identical router/service names (traefik.http.routers.api.*
and traefik.http.services.api.*) so blue and green containers are merged into
one LB pool; change the strategy to perform real blue-green switches by either
(a) assigning distinct service/router names per color (e.g.,
traefik.http.services.api-blue.* and traefik.http.services.api-green.* with
matching traefik.http.routers.api-blue.rule / api-green.rule), or (b) toggling
traefik.enable=true/false on the inactive container from your deployment scripts
(deploy.sh / rollback.sh), or (c) use router priority and switch the active
router during deploy; update the docker-compose labels and the deploy/rollback
scripts to implement one of these approaches so only the active color receives
traffic.
| public class RateLimitingTurboFilter extends TurboFilter { | ||
|
|
||
| private static final int MAX_CACHE_SIZE = 1000; | ||
|
|
||
| private final ConcurrentHashMap<String, Long> lastLogTimes = new ConcurrentHashMap<>(); | ||
| private long cooldownMillis = 60000; | ||
|
|
||
| public void setCooldownSeconds(int seconds) { | ||
| this.cooldownMillis = seconds * 1000L; | ||
| } | ||
|
|
||
| @Override | ||
| public FilterReply decide(Marker marker, Logger logger, Level level, | ||
| String format, Object[] params, Throwable t) { | ||
| if (level != Level.ERROR) { | ||
| return FilterReply.NEUTRAL; | ||
| } | ||
|
|
||
| String errorKey = buildErrorKey(logger, format, t); | ||
| long now = System.currentTimeMillis(); | ||
|
|
||
| Long lastTime = lastLogTimes.get(errorKey); | ||
| if (lastTime != null && (now - lastTime) < cooldownMillis) { | ||
| return FilterReply.DENY; | ||
| } | ||
|
|
||
| lastLogTimes.put(errorKey, now); | ||
|
|
||
| if (lastLogTimes.size() > MAX_CACHE_SIZE) { | ||
| cleanupOldEntries(now); | ||
| } | ||
|
|
||
| return FilterReply.NEUTRAL; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -type f -name "*.xml" | grep -i logback | head -20Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
rg "RateLimitingTurboFilter" -A 5 -B 5Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
fd -type f -name "logback*.xml"Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
rg "TurboFilter" -A 5 -B 5Repository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
rg "appender" -t xml | head -30Repository: samhap-soft/kokomen-backend
Length of output: 1905
🌐 Web query:
Logback TurboFilter vs Filter appender level difference
💡 Result:
TurboFilters and appender-level Filters (like ThresholdFilter) in Logback serve different purposes in the logging pipeline, with TurboFilters operating earlier and more globally than appender-specific filters. Key differences: Scope: TurboFilters apply to the entire logging context (all loggers), while Filters are attached to specific appenders. Timing: TurboFilters are invoked on every logging request, before LoggingEvent creation (using parameters like level, logger, format), enabling high-performance early rejection. Regular Filters act after LoggingEvent creation, within the appender's doAppend method. Level handling: Appender level thresholds (or ThresholdFilter) determine the minimum level events reaching that appender will be processed/output. Logger levels enable/disable requests via the basic selection rule (request level >= effective logger level). TurboFilters can override or supplement these by deciding ACCEPT/DENY/NEUTRAL early based on dynamic criteria (e.g., DynamicThresholdFilter uses MDC for per-user levels), potentially bypassing logger/appender level checks. Processing order (from architecture): 1. TurboFilters (early), 2. Logger effective level check, 3. LoggingEvent creation, 4. Appender filters (including ThresholdFilter). Example: A TurboFilter can deny DEBUG logs for certain users before logger level evaluation, while a ThresholdFilter on an appender denies below INFO after the event reaches it. Appender "level" often refers to ThresholdFilter's or appender threshold, filtering post-event.
Citations:
- 1: https://logback.qos.ch/manual/filters.html
- 2: https://logback.qos.ch/apidocs/ch.qos.logback.classic/ch/qos/logback/classic/turbo/TurboFilter.html
- 3: https://logback.qos.ch/apidocs/ch.qos.logback.classic/ch/qos/logback/classic/turbo/DynamicThresholdFilter.html
- 4: https://stackoverflow.com/questions/41394606/which-is-the-difference-between-logger-level-and-threshold-filter-level-in-logba
- 5: https://logback.qos.ch/manual/architecture.html
TurboFilter로 구현하면 Slack뿐 아니라 모든 ERROR 로그가 누락됩니다.
TurboFilter는 appender 선택 전 전역 범위에서 동작하므로, 여기서 DENY를 반환하면 Slack뿐 아니라 파일/콘솔 appender까지 모든 ERROR 이벤트를 잃게 됩니다. 의도가 "Slack 알림만 rate limit"이라면 Slack appender에 붙는 일반 filter로 구현해야 합니다.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/java/com/samhap/kokomen/global/logging/RateLimitingTurboFilter.java`
around lines 11 - 44, The current RateLimitingTurboFilter class (decide method)
is a TurboFilter that returns DENY for rate-limited ERROR events which prevents
all appenders (Slack, file, console) from receiving them; change this to an
appender-level Filter implementation (e.g., create RateLimitingFilter that
extends ch.qos.logback.core.filter.Filter and implements decide/decideInternal)
and attach that filter only to the Slack appender so only Slack notifications
are rate-limited; update places that register RateLimitingTurboFilter to instead
add the new RateLimitingFilter to the Slack appender configuration and keep the
global TurboFilter logic removed or converted to a NEUTRAL-only TurboFilter if
global behavior is still needed.
| String errorKey = buildErrorKey(logger, format, t); | ||
| long now = System.currentTimeMillis(); | ||
|
|
||
| Long lastTime = lastLogTimes.get(errorKey); | ||
| if (lastTime != null && (now - lastTime) < cooldownMillis) { | ||
| return FilterReply.DENY; | ||
| } | ||
|
|
||
| lastLogTimes.put(errorKey, now); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/main/java/com/samhap/kokomen/global/logging/RateLimitingTurboFilter.javaRepository: samhap-soft/kokomen-backend
Length of output: 1905
🏁 Script executed:
web_search
Java ConcurrentHashMap race condition get-put pattern thread safety best practicesRepository: samhap-soft/kokomen-backend
Length of output: 1905
🌐 Web query:
Java ConcurrentHashMap race condition get-put pattern atomicity thread safety
💡 Result:
ConcurrentHashMap provides thread-safe individual get and put operations, but the common get-put pattern (e.g., if (!map.containsKey(key)) map.put(key, value)) is NOT atomic and susceptible to race conditions in multithreaded environments. Individual operations like get, put, putIfAbsent, etc., are atomic and thread-safe. Retrievals (get) reflect the most recently completed updates due to happens-before guarantees: an update operation for a key happens-before any subsequent non-null retrieval for that key. However, composing get followed by put (or similar patterns like check-then-act) creates a race window where multiple threads can pass the check concurrently before any completes the put, leading to duplicate computations/insertions or lost updates. To achieve atomicity for get-put patterns: - Use computeIfAbsent(key, k -> computeValue): Atomically checks absence and computes/inserts if needed (Java 8+). - Use putIfAbsent(key, value) for simple insertion if absent. - For counters: map.computeIfAbsent(key, k -> new AtomicInteger).incrementAndGet. Official docs confirm: "Retrieval operations generally do not block, so may overlap with update operations." Compound user logic requires atomic methods. Examples from sources demonstrate races in naive patterns and fixes via atomic methods.
Citations:
- 1: https://docs.oracle.com/en/java/javase/25/docs/api/java.base/java/util/concurrent/ConcurrentHashMap.html
- 2: https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/util/concurrent/ConcurrentHashMap.html
- 3: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html
- 4: https://stackoverflow.com/questions/76234152/how-to-maintain-atomicity-with-concurrenthashmap-get-and-put-methods
- 5: https://stackoverflow.com/questions/1770166/is-concurrenthashmap-get-guaranteed-to-see-a-previous-concurrenthashmap-put
- 6: https://dzone.com/articles/java-8-concurrenthashmap-atomic-updates
중복 억제 체크 시 race condition으로 인한 동시성 문제를 해결해야 합니다.
get() 후 put()을 하는 패턴은 원자적이지 않아서, 같은 에러가 동시에 들어올 때 여러 스레드가 모두 cooldown 체크를 통과할 수 있습니다. compute() 메서드로 한 번에 처리하여 실제 rate limiting이 작동하도록 수정하세요.
🔧 제안 수정
+import java.util.concurrent.atomic.AtomicBoolean;
...
- Long lastTime = lastLogTimes.get(errorKey);
- if (lastTime != null && (now - lastTime) < cooldownMillis) {
- return FilterReply.DENY;
- }
-
- lastLogTimes.put(errorKey, now);
+ AtomicBoolean allowed = new AtomicBoolean(false);
+ lastLogTimes.compute(errorKey, (key, lastTime) -> {
+ if (lastTime == null || (now - lastTime) >= cooldownMillis) {
+ allowed.set(true);
+ return now;
+ }
+ return lastTime;
+ });
if (lastLogTimes.size() > MAX_CACHE_SIZE) {
cleanupOldEntries(now);
}
- return FilterReply.NEUTRAL;
+ return allowed.get() ? FilterReply.NEUTRAL : FilterReply.DENY;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/java/com/samhap/kokomen/global/logging/RateLimitingTurboFilter.java`
around lines 29 - 37, The current pattern in RateLimitingTurboFilter using
lastLogTimes.get(errorKey) followed by lastLogTimes.put(errorKey, now) is not
atomic and can race; replace that logic by using lastLogTimes.compute(errorKey,
...) to atomically inspect the previous timestamp and decide whether to update
to now or keep the old value so the cooldownMillis check is done atomically;
update the code around buildErrorKey(...) / lastLogTimes / cooldownMillis so
compute returns the correct stored timestamp and use its result to return
FilterReply.DENY when the cooldown has not expired, otherwise allow logging and
set the timestamp to now.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
docker/prod/api/deploy.sh (2)
70-70:⚠️ Potential issue | 🟡 Minor변수를 따옴표로 감싸 word splitting을 방지하세요.
Shellcheck SC2086:
$COMPOSE_FILE변수가 공백이나 특수문자를 포함할 경우 word splitting이 발생할 수 있습니다.🔧 제안 수정
- sudo -E docker compose -f $COMPOSE_FILE up -d traefik + sudo -E docker compose -f "$COMPOSE_FILE" up -d traefik🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/prod/api/deploy.sh` at line 70, The shell command invoking docker compose uses an unquoted variable and can suffer word-splitting; update the invocation that references $COMPOSE_FILE in the deploy script so the variable is wrapped in double quotes (e.g., use "$COMPOSE_FILE") to prevent splitting and globbing when the variable contains spaces or special characters; ensure any other similar variables in the same script are quoted consistently.
76-76:⚠️ Potential issue | 🟡 Minor변수를 따옴표로 감싸 word splitting을 방지하세요.
Shellcheck SC2086: 여러 변수가 따옴표 없이 사용되고 있습니다.
🔧 제안 수정
- sudo -E docker compose -f $COMPOSE_FILE --profile $TARGET up -d "kokomen-api-$TARGET" + sudo -E docker compose -f "$COMPOSE_FILE" --profile "$TARGET" up -d "kokomen-api-$TARGET"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/prod/api/deploy.sh` at line 76, Quote all shell variable expansions in the docker compose command to prevent word splitting: replace unquoted $COMPOSE_FILE and $TARGET with "$COMPOSE_FILE" and "$TARGET" (also use --profile="$TARGET") so the command becomes sudo -E docker compose -f "$COMPOSE_FILE" --profile="$TARGET" up -d "kokomen-api-$TARGET"; this ensures safe handling of paths/values containing spaces or special characters.docker/prod/api/rollback.sh (1)
80-80:⚠️ Potential issue | 🟡 Minor변수를 따옴표로 감싸 word splitting을 방지하세요.
Shellcheck SC2086: 여러 변수가 따옴표 없이 사용되고 있습니다.
🔧 제안 수정
- sudo -E docker compose -f $COMPOSE_FILE --profile $ROLLBACK_TARGET up -d "kokomen-api-$ROLLBACK_TARGET" + sudo -E docker compose -f "$COMPOSE_FILE" --profile "$ROLLBACK_TARGET" up -d "kokomen-api-$ROLLBACK_TARGET"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/prod/api/rollback.sh` at line 80, The docker compose invocation uses unquoted variables which can cause word-splitting (SC2086); update the command in rollback.sh to quote the variables: wrap $COMPOSE_FILE and $ROLLBACK_TARGET in double quotes (i.e., use "$COMPOSE_FILE" and "$ROLLBACK_TARGET") while keeping the existing "kokomen-api-$ROLLBACK_TARGET" quoted, so the flags --profile and -f receive single arguments even if the variables contain spaces.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/cd-api-prod.yml:
- Around line 89-91: Split the assignment and export so the exit status of the
hostname command isn't masked: first run the hostname command and capture its
result into HOSTNAME (and ensure you propagate failure by checking the exit
status or using set -e), then export the HOSTNAME variable before running
./deploy.sh; update the workflow lines that currently use export
HOSTNAME=$(hostname) to perform assignment, error-checking, and then export
HOSTNAME so a failed hostname does not return a successful status.
In `@src/main/java/com/samhap/kokomen/global/logging/SlackErrorLayout.java`:
- Around line 30-33: The SlackErrorLayout can lose requestId for async ERROR
logs because MDC is cleared in LoggingFilter before async events are handled; to
fix, ensure requestId is captured into a context property before clearing MDC
and update SlackErrorLayout to fallback to that context property: in
LoggingFilter, grab the MDC requestId and call
logger.getLoggerContext().putProperty("requestId", requestId) (or similar)
before MDC.clear(); in SlackErrorLayout, when
event.getMDCPropertyMap().get("requestId") is null/empty, read
event.getLoggerContextVO().getPropertyMap().get("requestId") and include that
value (or a clear "missing" marker) in the message; reference requestId,
LoggingFilter, ASYNC_SLACK, includeCallerData, SlackErrorLayout,
event.getMDCPropertyMap(), and event.getLoggerContextVO() when making the
changes.
---
Duplicate comments:
In `@docker/prod/api/deploy.sh`:
- Line 70: The shell command invoking docker compose uses an unquoted variable
and can suffer word-splitting; update the invocation that references
$COMPOSE_FILE in the deploy script so the variable is wrapped in double quotes
(e.g., use "$COMPOSE_FILE") to prevent splitting and globbing when the variable
contains spaces or special characters; ensure any other similar variables in the
same script are quoted consistently.
- Line 76: Quote all shell variable expansions in the docker compose command to
prevent word splitting: replace unquoted $COMPOSE_FILE and $TARGET with
"$COMPOSE_FILE" and "$TARGET" (also use --profile="$TARGET") so the command
becomes sudo -E docker compose -f "$COMPOSE_FILE" --profile="$TARGET" up -d
"kokomen-api-$TARGET"; this ensures safe handling of paths/values containing
spaces or special characters.
In `@docker/prod/api/rollback.sh`:
- Line 80: The docker compose invocation uses unquoted variables which can cause
word-splitting (SC2086); update the command in rollback.sh to quote the
variables: wrap $COMPOSE_FILE and $ROLLBACK_TARGET in double quotes (i.e., use
"$COMPOSE_FILE" and "$ROLLBACK_TARGET") while keeping the existing
"kokomen-api-$ROLLBACK_TARGET" quoted, so the flags --profile and -f receive
single arguments even if the variables contain spaces.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: de277bf1-58b9-4c06-8e06-b8768382b395
📒 Files selected for processing (6)
.github/workflows/cd-api-prod.ymldocker/prod/api/deploy.shdocker/prod/api/rollback.shsrc/main/java/com/samhap/kokomen/global/logging/RateLimitingFilter.javasrc/main/java/com/samhap/kokomen/global/logging/SlackErrorLayout.javasrc/main/resources/logback-spring.xml
| run: | | ||
| export HOSTNAME=$(hostname) | ||
| cd kokomen-backend/docker/prod/api | ||
| sudo -E docker compose -f docker-compose-prod.yml up -d | ||
| sudo -E ./deploy.sh |
There was a problem hiding this comment.
export와 명령어 할당을 분리하여 반환값 마스킹을 방지하세요.
Shellcheck SC2155: export HOSTNAME=$(hostname)은 hostname 명령의 반환값을 마스킹합니다. hostname이 실패해도 export의 성공으로 인해 $?가 0이 됩니다.
🔧 제안 수정
run: |
- export HOSTNAME=$(hostname)
+ HOSTNAME=$(hostname)
+ export HOSTNAME
sudo -E ./deploy.sh📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| run: | | |
| export HOSTNAME=$(hostname) | |
| cd kokomen-backend/docker/prod/api | |
| sudo -E docker compose -f docker-compose-prod.yml up -d | |
| sudo -E ./deploy.sh | |
| run: | | |
| HOSTNAME=$(hostname) | |
| export HOSTNAME | |
| sudo -E ./deploy.sh |
🧰 Tools
🪛 actionlint (1.7.12)
[error] 89-89: shellcheck reported issue in this script: SC2155:warning:1:8: Declare and assign separately to avoid masking return values
(shellcheck)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/cd-api-prod.yml around lines 89 - 91, Split the assignment
and export so the exit status of the hostname command isn't masked: first run
the hostname command and capture its result into HOSTNAME (and ensure you
propagate failure by checking the exit status or using set -e), then export the
HOSTNAME variable before running ./deploy.sh; update the workflow lines that
currently use export HOSTNAME=$(hostname) to perform assignment, error-checking,
and then export HOSTNAME so a failed hostname does not return a successful
status.
| String requestId = event.getMDCPropertyMap().get("requestId"); | ||
| if (requestId != null && !requestId.isEmpty()) { | ||
| sb.append(":label: *Request ID:* `").append(requestId).append("`\n"); | ||
| } |
There was a problem hiding this comment.
비동기 로깅 시 requestId가 누락될 수 있습니다.
LoggingFilter에서 요청 처리 완료 후 MDC.clear()를 호출하므로, 비동기적으로 발생하는 ERROR 로그는 requestId를 가져올 수 없을 수 있습니다. 현재 구현은 null/empty 체크를 하고 있어 크래시는 발생하지 않지만, 디버깅 시 유용한 정보가 누락될 수 있습니다.
ASYNC_SLACK appender에서 includeCallerData=true가 설정되어 있지만, MDC 데이터는 이벤트 생성 시점에 캡처되므로 동기적으로 발생한 ERROR는 정상적으로 포함됩니다.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/java/com/samhap/kokomen/global/logging/SlackErrorLayout.java` around
lines 30 - 33, The SlackErrorLayout can lose requestId for async ERROR logs
because MDC is cleared in LoggingFilter before async events are handled; to fix,
ensure requestId is captured into a context property before clearing MDC and
update SlackErrorLayout to fallback to that context property: in LoggingFilter,
grab the MDC requestId and call
logger.getLoggerContext().putProperty("requestId", requestId) (or similar)
before MDC.clear(); in SlackErrorLayout, when
event.getMDCPropertyMap().get("requestId") is null/empty, read
event.getLoggerContextVO().getPropertyMap().get("requestId") and include that
value (or a clear "missing" marker) in the message; reference requestId,
LoggingFilter, ASYNC_SLACK, includeCallerData, SlackErrorLayout,
event.getMDCPropertyMap(), and event.getLoggerContextVO() when making the
changes.
closed #351
작업 내용
스크린샷
참고 사항