Feature: migration of merge requests that were merged and their branches deleted #253

@woutdenolf

Merge requests that were merged and whose source branches were deleted are currently migrated as issues:

Pull request #60 (source branch ....' does not exist => cannot migrate pull request, creating an issue instead.

My first idea to improve this was to restore all branches, back up and reset the main branch, run the migration, and then restore the main branch:

#!/usr/bin/env bash
set -euo pipefail

echo "LOAD ENVIRONMENT VARIABLES"
source ./scripts/load-env.sh

echo "RE-CREATE MR BRANCHES ON GITLAB"
./scripts/recreate-mr-branches.sh

echo "CLONE PROJECT AND PUSH TO GITHUB"
./scripts/clone-and-push.sh

echo "EMPTY MAIN BRANCH"
./scripts/empty-main.sh

echo "MIGRATE"
./scripts/migrate.sh

echo "RESTORE ORIGINAL MAIN"
./scripts/restore-main.sh
recreate-mr-branches.sh
#!/usr/bin/env bash
#
# export GITLAB_URL=
# export GITLAB_TOKEN=
# export GITLAB_PROJECT_ID=
#
# ./recreate-mr-branches.sh           # real execution
# ./recreate-mr-branches.sh --dry-run # no changes
# ./recreate-mr-branches.sh -n        # shorthand

set -euo pipefail

# -------------------------
# Configuration (env vars)
# -------------------------
: "${GITLAB_URL:?Missing GITLAB_URL}"
: "${GITLAB_TOKEN:?Missing GITLAB_TOKEN}"
: "${GITLAB_PROJECT_ID:?Missing GITLAB_PROJECT_ID}"

PER_PAGE=100
PAGE=1
DRY_RUN=false

# -------------------------
# Parse arguments
# -------------------------
for arg in "$@"; do
  case "$arg" in
    --dry-run|-n)
      DRY_RUN=true
      ;;
    *)
      echo "Unknown argument: $arg"
      exit 1
      ;;
  esac
done

# -------------------------
# Helpers
# -------------------------
_log() {
  echo -e "$1"
}

_api() {
  curl -s \
    --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "$@"
}

_http_status() {
  curl -s -o /dev/null -w "%{http_code}" \
    --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "$1"
}

# -------------------------
# Main loop
# -------------------------
_log "🔁 Starting merge request branch recreation"
_log "🧪 Dry-run mode: $DRY_RUN"
_log ""

while :; do
  _log "📄 Fetching merge requests (page $PAGE)"

  MRS=$(_api \
    "$GITLAB_URL/api/v4/projects/$GITLAB_PROJECT_ID/merge_requests?state=all&per_page=$PER_PAGE&page=$PAGE")

  COUNT=$(echo "$MRS" | jq length)
  [ "$COUNT" -eq 0 ] && break

  echo "$MRS" | jq -c '.[]' | while read -r mr; do
    IID=$(echo "$mr" | jq -r '.iid')
    SOURCE_BRANCH=$(echo "$mr" | jq -r '.source_branch')

    # Prefer diff_refs.head_sha (most accurate)
    HEAD_SHA=$(echo "$mr" | jq -r '.diff_refs.head_sha // .sha')

    _log "• MR !${IID}"
    _log "    branch: $SOURCE_BRANCH"
    _log "    sha:    $HEAD_SHA"

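    # Check if branch exists (URL-encode the name: branches like "feature/x" contain slashes)
    ENCODED_BRANCH=$(jq -rn --arg b "$SOURCE_BRANCH" '$b | @uri')
    STATUS=$(_http_status \
      "$GITLAB_URL/api/v4/projects/$GITLAB_PROJECT_ID/repository/branches/$ENCODED_BRANCH")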

    if [ "$STATUS" -eq 200 ]; then
      _log "    ✔ Branch exists"
      continue
    fi

    _log "    ⚠ Branch missing"

    if [ "$DRY_RUN" = true ]; then
      _log "    🧪 DRY-RUN → would create branch '$SOURCE_BRANCH' at $HEAD_SHA"
      continue
    fi

    _log "    🔧 Creating branch..."

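    # Capture the HTTP status so a failed creation is not reported as success
    CREATE_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
      --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
      --data-urlencode "branch=$SOURCE_BRANCH" \
      --data-urlencode "ref=$HEAD_SHA" \
      "$GITLAB_URL/api/v4/projects/$GITLAB_PROJECT_ID/repository/branches")

    if [[ "$CREATE_STATUS" == 2* ]]; then
      _log "    ✅ Branch recreated"
    else
      _log "    ❌ Branch creation failed (HTTP $CREATE_STATUS)"
    fi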
  done

  PAGE=$((PAGE + 1))
done

_log ""
_log "🎉 Done processing merge request branches"
load-env.sh
#!/usr/bin/env bash

[ -f ./env.sh ] || { echo "env.sh not found"; exit 1; }
source ./env.sh

REQUIRED_VARS=(
  GITLAB_DOMAIN
  GITLAB_PROJECT_ID
  GITLAB_REPO
  GITLAB_GROUP
  GITLAB_TOKEN
  GITLAB_SESSION_COOKIE
  GITHUB_DOMAIN
  GITHUB_TOKEN
  GITHUB_TOKEN_OWNER
  GITHUB_REPO
  GITHUB_OWNER
  GITHUB_OWNER_IS_ORG
)

MISSING=false

for var in "${REQUIRED_VARS[@]}"; do
  if [ -z "${!var:-}" ]; then
    echo "❌ Environment variable $var is not set"
    MISSING=true
  fi
done

if [ "$MISSING" = true ]; then
  echo
  echo "Please set the missing environment variables before running the migration."
  exit 1
fi

echo "✅ All required environment variables are set."

export GITLAB_URL="https://$GITLAB_DOMAIN"
export GITHUB_URL="https://$GITHUB_DOMAIN"
export GITHUB_API_URL="https://api.$GITHUB_DOMAIN"

export GIT_GITLAB_REPO="git@$GITLAB_DOMAIN:$GITLAB_GROUP/$GITLAB_REPO.git"
export GIT_GITHUB_REPO="git@$GITHUB_DOMAIN:$GITHUB_OWNER/$GITHUB_REPO.git"
export GIT_WORKDIR="$GITLAB_REPO.git"
export GIT_TMP_WORKDIR="$GITLAB_REPO-tmp.git"
clone-and-push.sh
#!/usr/bin/env bash
set -euo pipefail

echo "Cloning GitLab repository (mirror) to $GIT_WORKDIR"
rm -rf "$GIT_WORKDIR"
git clone --mirror "$GIT_GITLAB_REPO" "$GIT_WORKDIR"

echo "Set GitHub as push target and mirror-push"
cd "$GIT_WORKDIR"
git remote set-url --push origin "$GIT_GITHUB_REPO"
git push --no-verify --mirror
cd ..
empty-main.sh
#!/usr/bin/env bash
set -euo pipefail

echo "Backup original main branch locally..."
rm -rf "$GIT_TMP_WORKDIR"
git clone "$GIT_GITHUB_REPO" "$GIT_TMP_WORKDIR"
cd "$GIT_TMP_WORKDIR"

git branch backup-main main

echo "Reset main branch to first commit..."
git checkout main
FIRST_COMMIT=$(git rev-list --max-parents=0 HEAD)
git reset --hard "$FIRST_COMMIT"

echo "Force push reset main branch to GitHub..."
git push --force origin main:main

cd ..
echo "Main branch temporarily reset to first commit."
restore-main.sh
#!/usr/bin/env bash
set -euo pipefail

cd "$GIT_TMP_WORKDIR"

echo "Restore original main branch on GitHub..."
git push --force origin backup-main:main

cd ..
echo "Original main branch restored."
migrate.sh
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

if [ ! -d node-gitlab-2-github ]; then
  git clone https://github.com/piceaTech/node-gitlab-2-github
fi

cp "$SCRIPT_DIR/../config/settings.ts" "node-gitlab-2-github/settings.ts"

cd node-gitlab-2-github

npm install --no-save
npm run start

cd ..

The merge requests are created and closed, so that part works. However, their diffs do not look right, because each diff is always computed against the first commit of the project.

So this can probably be fixed by advancing the target branch and creating the missing source branch for each merge request in createPullRequestAndComments:

  1. reset the target branch to the commit just before the merge request was merged
  2. restore the source branch if it is missing
  3. create the pull request
  4. restore the target branch to where it was
  5. remove the source branch if it had to be restored in step 2
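
To make the steps above concrete, here is a rough sketch that only prints the git commands one merge request would need, without running anything. The function name `plan_pr_replay` and all of its arguments are hypothetical. In particular, how to obtain the commit "just before the merge" (for example the first parent of the merge commit) is an assumption, and the pull request creation itself is left as a placeholder because it would go through the GitHub API.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print (do not run) the git commands that would replay one merged merge request.
# All arguments are hypothetical inputs the migration tool would have to supply:
#   $1 target branch                 $2 source branch
#   $3 commit on the target branch just before the merge (assumed known)
#   $4 head commit of the source branch
#   $5 current tip of the target branch (restored afterwards)
#   $6 "true" if the source branch no longer exists
plan_pr_replay() {
  local target="$1" src="$2" base_sha="$3" head_sha="$4" tip_sha="$5" missing="$6"

  # Move the target branch back to the point before the merge.
  echo "git push --force origin ${base_sha}:refs/heads/${target}"

  # Re-create the source branch if it was deleted.
  if [ "$missing" = true ]; then
    echo "git push origin ${head_sha}:refs/heads/${src}"
  fi

  # Placeholder: the pull request would be created through the GitHub API.
  echo "# create pull request ${src} -> ${target}"

  # Put the target branch back where it was.
  echo "git push --force origin ${tip_sha}:refs/heads/${target}"

  # Drop the temporary source branch again.
  if [ "$missing" = true ]; then
    echo "git push origin --delete ${src}"
  fi
}

# Example with made-up branch names and commit ids.
plan_pr_replay main feature/x 1111111 2222222 3333333 true
```

Printing a plan first mirrors the --dry-run option of recreate-mr-branches.sh: the force pushes can be reviewed before anything touches the GitHub repository.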

I'm not well versed in TypeScript so apart from small improvements like #252 I won't be able to help much.

@spruce Do you think this is worth pursuing and if yes, are you motivated to help out?
