Skip to content

Conversation

@yasithdev
Copy link
Contributor

@yasithdev yasithdev commented Oct 29, 2025

🎯 Conceptual Model

This PR is more of a RFC to think through what the fundamental problem that airavata should tackle. This POC is built around 6 core domain interfaces that represent the fundamental operations of distributed task execution. The POC uses a gRPC-based worker system for task execution:

┌─────────────────────────────────────────────────────────────────┐
│                    Core Domain Interfaces                       │
├─────────────────────────────────────────────────────────────────┤
│  ResourceRegistry    │  CredentialVault     │  ExperimentOrch   │
│  • Register compute  │  • Secure storage    │  • Create exper   │
│  • Register storage  │  • Unix permissions  │  • Generate tasks │
│  • Validate access   │  • Encrypt/decrypt   │  • Submit for exec│
├─────────────────────────────────────────────────────────────────┤
│  TaskScheduler       │  DataMover          │  WorkerLifecycle  │
│  • Cost optimization │  • 3-hop staging    │  • Spawn workers  │
│  • Worker distrib    │  • Persistent cache │  • gRPC workers   │
│  • Atomic assignment │  • Lineage tracking │  • Task execution │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                    gRPC Worker System                          │
├─────────────────────────────────────────────────────────────────┤
│  Worker Binary      │  Script Generation   │  Task Execution   │
│  • Standalone exec  │  • SLURM scripts     │  • Poll for tasks │
│  • gRPC client      │  • K8s manifests     │  • Execute tasks  │
│  • Auto-deployment  │  • Bare metal scripts│  • Report results │
└─────────────────────────────────────────────────────────────────┘

🔐 Credential Management Architecture

The Airavata Scheduler implements a three-layer credential architecture that separates authorization logic from storage for maximum security and scalability:

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                        │
│   (Experiments, Resources, Users, Groups)                   │
└────────────┬────────────────────────────────────────────────┘
             │
             ├──────────────────┬──────────────────┐
             │                  │                  │
┌────────────▼─────┐  ┌────────▼────────┐  ┌─────▼──────────┐
│   PostgreSQL     │  │    SpiceDB      │  │   OpenBao      │
│                  │  │                 │  │                │
│  Domain Data     │  │  Authorization  │  │  Secrets       │
│  - Users         │  │  - Permissions  │  │  - SSH Keys    │
│  - Groups        │  │  - Ownership    │  │  - Passwords   │
│  - Experiments   │  │  - Sharing      │  │  - Tokens      │
│  - Resources     │  │  - Hierarchies  │  │  (Encrypted)   │
└──────────────────┘  └─────────────────┘  └────────────────┘

Key Benefits

  • 🔒 Separation of Concerns: Authorization (SpiceDB) separate from secret storage (OpenBao)
  • 🛡️ Fine-grained Permissions: Read/write/delete permissions with hierarchical group inheritance
  • 📋 Complete Audit Trail: All operations logged across all three systems
  • 🔄 Credential Rotation: Support for automatic key rotation with zero downtime
  • 👥 Group Management: Groups can contain groups with transitive permission inheritance
  • 🔗 Resource Binding: Credentials bound to specific compute/storage resources

// Build SSH config
config := &ssh.ClientConfig{
User: sshCreds.Username,
HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification

Check failure

Code scanning / CodeQL

Use of insecure HostKeyCallback implementation High

Configuring SSH ClientConfig with insecure HostKeyCallback implementation from
this source
.

Copilot Autofix

AI 2 months ago

To address this issue, the code must validate the host key provided by the server against a trusted list. The single best way in this scenario is to switch from ssh.InsecureIgnoreHostKey() to ssh.FixedHostKey(pubKey) where pubKey is the trusted public key of the server. This requires loading the public key from a file (e.g., a user-specific or resource-specific location such as "~/.ssh/known_hosts" or from a configuration variable), parsing it, and setting it in the ssh.ClientConfig. All changes are localized to the block around the SSH client config creation, lines 747–751 and nearby as needed. You need to:

  • Load the trusted public key (from a path—this should be obtained from known resource metadata, or you may use a new field in resource/metadata if available),
  • Parse it with ssh.ParsePublicKey,
  • Replace ssh.InsecureIgnoreHostKey() with ssh.FixedHostKey(pubKey),
  • Import "io/ioutil" if not already present.

If there’s no pre-existing way to get the trusted public key path, a fallback could be a configurable constant or extracting a metadata field.


Suggested changeset 1
scheduler/adapters/compute_baremetal.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_baremetal.go b/scheduler/adapters/compute_baremetal.go
--- a/scheduler/adapters/compute_baremetal.go
+++ b/scheduler/adapters/compute_baremetal.go
@@ -744,9 +744,18 @@
 	}
 
 	// Build SSH config
+	// Load trusted host public key
+	pubKeyBytes, err := os.ReadFile(sshCreds.HostPublicKeyPath)
+	if err != nil {
+		return fmt.Errorf("failed to read host public key: %w", err)
+	}
+	pubKey, err := ssh.ParsePublicKey(pubKeyBytes)
+	if err != nil {
+		return fmt.Errorf("failed to parse host public key: %w", err)
+	}
 	config := &ssh.ClientConfig{
 		User:            sshCreds.Username,
-		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification
+		HostKeyCallback: ssh.FixedHostKey(pubKey), // Host key is verified against allow-list
 		Timeout:         10 * time.Second,
 	}
 
EOF
@@ -744,9 +744,18 @@
}

// Build SSH config
// Load trusted host public key
pubKeyBytes, err := os.ReadFile(sshCreds.HostPublicKeyPath)
if err != nil {
return fmt.Errorf("failed to read host public key: %w", err)
}
pubKey, err := ssh.ParsePublicKey(pubKeyBytes)
if err != nil {
return fmt.Errorf("failed to parse host public key: %w", err)
}
config := &ssh.ClientConfig{
User: sshCreds.Username,
HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification
HostKeyCallback: ssh.FixedHostKey(pubKey), // Host key is verified against allow-list
Timeout: 10 * time.Second,
}

Copilot is powered by AI and may make mistakes. Always verify output.

// Write script to temporary file
scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

To fix this problem, we need to ensure that the WorkerID component (and, by extension, those values from which it is constructed: computeResourceID and experimentID) is sanitized/validated before being used as part of a file path. Since the path is supposed to be a filename in /tmp, we should ensure the WorkerID portion contains only safe characters, such as alphanumerics, underscores, and dashes. Any other input should result in an error. This prevents directory traversal, absolute paths, or any injection.

We should add a helper function, for example, isSafeFileName, to check if the Worker ID is safe before using it when constructing the filename. If not, the function should return an error and not proceed.

Where to change:

  • In scheduler/adapters/compute_baremetal.go, in SpawnWorker, before constructing scriptPath, validate req.WorkerID with the helper.
  • The helper function can be added to this file.
  • If invalid, return an error instead of writing/executing the file.

Suggested changeset 1
scheduler/adapters/compute_baremetal.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_baremetal.go b/scheduler/adapters/compute_baremetal.go
--- a/scheduler/adapters/compute_baremetal.go
+++ b/scheduler/adapters/compute_baremetal.go
@@ -831,6 +831,11 @@
 
 // SpawnWorker spawns a worker on the bare metal server
 func (b *BareMetalAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
+	// Validate WorkerID for safe file name usage
+	if !isSafeFileName(req.WorkerID) {
+		return nil, fmt.Errorf("invalid worker ID: contains unsafe characters")
+	}
+
 	// Create worker record
 	worker := &ports.Worker{
 		ID:                req.WorkerID,
@@ -882,6 +887,21 @@
 	return worker, nil
 }
 
+
+// isSafeFileName checks if the input string contains only safe characters for file names.
+// Allowable: alphanumeric, underscore, dash.
+func isSafeFileName(name string) bool {
+	for _, c := range name {
+		if !(c >= 'a' && c <= 'z') &&
+		   !(c >= 'A' && c <= 'Z') &&
+		   !(c >= '0' && c <= '9') &&
+		   c != '_' && c != '-' {
+			return false
+		}
+	}
+	return true
+}
+
 // SubmitJob submits a job to the compute resource
 func (b *BareMetalAdapter) SubmitJob(ctx context.Context, req *ports.SubmitJobRequest) (*ports.Job, error) {
 	// Generate a unique job ID
EOF
@@ -831,6 +831,11 @@

// SpawnWorker spawns a worker on the bare metal server
func (b *BareMetalAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
// Validate WorkerID for safe file name usage
if !isSafeFileName(req.WorkerID) {
return nil, fmt.Errorf("invalid worker ID: contains unsafe characters")
}

// Create worker record
worker := &ports.Worker{
ID: req.WorkerID,
@@ -882,6 +887,21 @@
return worker, nil
}


// isSafeFileName checks if the input string contains only safe characters for file names.
// Allowable: alphanumeric, underscore, dash.
func isSafeFileName(name string) bool {
for _, c := range name {
if !(c >= 'a' && c <= 'z') &&
!(c >= 'A' && c <= 'Z') &&
!(c >= '0' && c <= '9') &&
c != '_' && c != '-' {
return false
}
}
return true
}

// SubmitJob submits a job to the compute resource
func (b *BareMetalAdapter) SubmitJob(ctx context.Context, req *ports.SubmitJobRequest) (*ports.Job, error) {
// Generate a unique job ID
Copilot is powered by AI and may make mistakes. Always verify output.
}

// Execute worker spawn script in background
cmd := exec.CommandContext(ctx, "bash", scriptPath)

Check failure

Code scanning / CodeQL

Command built from user-controlled sources Critical

This command depends on a
user-provided value
.
// Execute worker spawn script in background
cmd := exec.CommandContext(ctx, "bash", scriptPath)
if err := cmd.Start(); err != nil {
os.Remove(scriptPath) // Clean up script file

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

The best way to fix this vulnerability is to validate the user input before using it to construct a file path. Since WorkerID is meant to be a single file name component, we need to ensure it does not contain any path separators (/, \) or parent directory references (..). The recommended fix is to add a check right before constructing or using the path, and reject or sanitize the input if it is invalid.

Changes to make:

  • In scheduler/adapters/compute_baremetal.go, inside the SpawnWorker function, add validation right before scriptPath is constructed and used.
  • If the input is invalid (contains /, \, or ..), return an error and do not proceed with file operations.
  • This fix requires importing the strings package if not already imported (it is already imported).
  • No changes beyond the affected code block are needed.
Suggested changeset 1
scheduler/adapters/compute_baremetal.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_baremetal.go b/scheduler/adapters/compute_baremetal.go
--- a/scheduler/adapters/compute_baremetal.go
+++ b/scheduler/adapters/compute_baremetal.go
@@ -860,6 +860,10 @@
 	}
 
 	// Write script to temporary file
+	// Validate WorkerID to prevent path traversal
+	if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
+		return nil, fmt.Errorf("invalid WorkerID: contains forbidden characters")
+	}
 	scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
 	if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {
 		return nil, fmt.Errorf("failed to write spawn script: %w", err)
EOF
@@ -860,6 +860,10 @@
}

// Write script to temporary file
// Validate WorkerID to prevent path traversal
if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
return nil, fmt.Errorf("invalid WorkerID: contains forbidden characters")
}
scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {
return nil, fmt.Errorf("failed to write spawn script: %w", err)
Copilot is powered by AI and may make mistakes. Always verify output.
worker.Status = domain.WorkerStatusIdle

// Clean up script file
os.Remove(scriptPath)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

In general, to fix uncontrolled data used in path expression vulnerabilities, path components constructed from user input should be validated or sanitized before use. In this case, the best approach is to sanitize or validate req.WorkerID before using it in the script filename so that it cannot introduce path traversal, path separators, or otherwise produce file names outside of the intended /tmp directory.

Specifically, in scheduler/adapters/compute_baremetal.go, before using req.WorkerID, restrict it to a safe pattern (e.g., only alphanumeric, dash, and underscore), or, if that's too restrictive, at minimum reject WorkerIDs that contain /, \, or ... If invalid, return an error. This can be done at the start of the SpawnWorker method. No external dependencies are needed: standard library functions like strings.Contains and/or regular expressions are sufficient.

Add the validation at the top of the SpawnWorker function or immediately before constructing the script path. If using regex, add "regexp" to imports, but for this case, a basic check is sufficient.


Suggested changeset 1
scheduler/adapters/compute_baremetal.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_baremetal.go b/scheduler/adapters/compute_baremetal.go
--- a/scheduler/adapters/compute_baremetal.go
+++ b/scheduler/adapters/compute_baremetal.go
@@ -831,6 +831,10 @@
 
 // SpawnWorker spawns a worker on the bare metal server
 func (b *BareMetalAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
+	// Validate WorkerID: must not contain path separators or ".."
+	if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
+		return nil, fmt.Errorf("invalid WorkerID: must not contain path separators or '..'")
+	}
 	// Create worker record
 	worker := &ports.Worker{
 		ID:                req.WorkerID,
EOF
@@ -831,6 +831,10 @@

// SpawnWorker spawns a worker on the bare metal server
func (b *BareMetalAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
// Validate WorkerID: must not contain path separators or ".."
if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
return nil, fmt.Errorf("invalid WorkerID: must not contain path separators or '..'")
}
// Create worker record
worker := &ports.Worker{
ID: req.WorkerID,
Copilot is powered by AI and may make mistakes. Always verify output.

// Write script to temporary file
scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

To fix this, we must validate req.WorkerID in the SlurmAdapter's SpawnWorker method before using it to construct a filesystem path. The best practice is to ensure WorkerID is a safe filename or single path component: it should not contain /, \, or .., and should match a safe pattern (e.g., only alphanumerics, underscore, hyphen). If validation fails, return an error and do not proceed with file operations.
Edit only the scheduler/adapters/compute_slurm.go file, in the SpawnWorker() method. Specifically, right before constructing scriptPath, validate req.WorkerID using a regular expression or by checking for forbidden characters.
No new methods required; a local check using regex or strings.Contains is sufficient.

Suggested changeset 1
scheduler/adapters/compute_slurm.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_slurm.go b/scheduler/adapters/compute_slurm.go
--- a/scheduler/adapters/compute_slurm.go
+++ b/scheduler/adapters/compute_slurm.go
@@ -962,6 +962,13 @@
 		Metadata:          req.Metadata,
 	}
 
+	// Validate WorkerID to prevent path traversal or injection
+	// Allow only alphanumerics, underscore, and hyphen
+	var validWorkerID = regexp.MustCompile(`^[a-zA-Z0-9_\-]+$`)
+	if !validWorkerID.MatchString(req.WorkerID) {
+		return nil, fmt.Errorf("invalid worker ID: contains forbidden characters")
+	}
+
 	// Generate worker spawn script using local implementation
 	experiment := &domain.Experiment{
 		ID: req.ExperimentID,
EOF
@@ -962,6 +962,13 @@
Metadata: req.Metadata,
}

// Validate WorkerID to prevent path traversal or injection
// Allow only alphanumerics, underscore, and hyphen
var validWorkerID = regexp.MustCompile(`^[a-zA-Z0-9_\-]+$`)
if !validWorkerID.MatchString(req.WorkerID) {
return nil, fmt.Errorf("invalid worker ID: contains forbidden characters")
}

// Generate worker spawn script using local implementation
experiment := &domain.Experiment{
ID: req.ExperimentID,
Copilot is powered by AI and may make mistakes. Always verify output.
// Submit worker spawn job to SLURM
jobID, err := s.SubmitTask(ctx, scriptPath)
if err != nil {
os.Remove(scriptPath) // Clean up script file

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

To address the problem, we need to ensure that the WorkerID used in the path is safe and cannot contain user-controlled path traversal or absolute path characters. The best way is to restrict WorkerID to a known-safe pattern (e.g., only alphanumeric, hyphen, and underscore), or at minimum reject any containing /, \, or ... Since the value is constructed from user input and an untrusted source, it's prudent to validate or sanitize it on use before constructing the file path.

Specifically, within scheduler/adapters/compute_slurm.go, in the SpawnWorker method, we should add a validation step before using req.WorkerID for the file name. If WorkerID does not match the safe pattern (e.g. using regexp), we should return a validation error. For defense in depth, it's usually best to validate as close as possible to where the variable is used in the sensitive context.

No new imports are strictly necessary, but since the snippet already imports regexp, we can use a regular expression for the safe pattern check.


Suggested changeset 1
scheduler/adapters/compute_slurm.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_slurm.go b/scheduler/adapters/compute_slurm.go
--- a/scheduler/adapters/compute_slurm.go
+++ b/scheduler/adapters/compute_slurm.go
@@ -944,6 +944,12 @@
 
 // SpawnWorker spawns a worker on the SLURM cluster
 func (s *SlurmAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
+	// Validate WorkerID to prevent path traversal/absolute path abuse
+	safeIDPattern := regexp.MustCompile(`^[a-zA-Z0-9_\-]+$`)
+	if !safeIDPattern.MatchString(req.WorkerID) {
+		return nil, fmt.Errorf("invalid WorkerID: must be alphanumeric, hyphen, or underscore only")
+	}
+
 	// Create worker record
 	worker := &ports.Worker{
 		ID:                req.WorkerID,
EOF
@@ -944,6 +944,12 @@

// SpawnWorker spawns a worker on the SLURM cluster
func (s *SlurmAdapter) SpawnWorker(ctx context.Context, req *ports.SpawnWorkerRequest) (*ports.Worker, error) {
// Validate WorkerID to prevent path traversal/absolute path abuse
safeIDPattern := regexp.MustCompile(`^[a-zA-Z0-9_\-]+$`)
if !safeIDPattern.MatchString(req.WorkerID) {
return nil, fmt.Errorf("invalid WorkerID: must be alphanumeric, hyphen, or underscore only")
}

// Create worker record
worker := &ports.Worker{
ID: req.WorkerID,
Copilot is powered by AI and may make mistakes. Always verify output.
worker.Status = domain.WorkerStatusIdle

// Clean up script file
os.Remove(scriptPath)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI 2 months ago

To fix the problem, we should ensure that req.WorkerID, as it is incorporated in the temporary file path, cannot be manipulated to affect directory traversal or file naming in an unsafe way. The best way is to strictly validate req.WorkerID before using it in the filename. Since WorkerID is likely intended to be a single identifier without path components, the safest fix is to check if it contains any "/", "\\", or ".." sequences and reject such values, returning an error if found.

Implement this check near the point of file creation in SpawnWorker of scheduler/adapters/compute_slurm.go, immediately before constructing/writing to scriptPath. If the value is unsafe, return an error promptly. This avoids any risk, keeps the logic simple, and preserves existing behavior for proper, expected IDs.

No new external libraries are needed—this can be done using standard library functions like strings.Contains.


Suggested changeset 1
scheduler/adapters/compute_slurm.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/compute_slurm.go b/scheduler/adapters/compute_slurm.go
--- a/scheduler/adapters/compute_slurm.go
+++ b/scheduler/adapters/compute_slurm.go
@@ -973,6 +973,10 @@
 	}
 
 	// Write script to temporary file
+	// Validate WorkerID for safe file usage
+	if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
+		return nil, fmt.Errorf("invalid WorkerID: contains unsafe path components")
+	}
 	scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
 	if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {
 		return nil, fmt.Errorf("failed to write spawn script: %w", err)
EOF
@@ -973,6 +973,10 @@
}

// Write script to temporary file
// Validate WorkerID for safe file usage
if strings.Contains(req.WorkerID, "/") || strings.Contains(req.WorkerID, "\\") || strings.Contains(req.WorkerID, "..") {
return nil, fmt.Errorf("invalid WorkerID: contains unsafe path components")
}
scriptPath := fmt.Sprintf("/tmp/worker_spawn_%s.sh", req.WorkerID)
if err := os.WriteFile(scriptPath, []byte(spawnScript), 0755); err != nil {
return nil, fmt.Errorf("failed to write spawn script: %w", err)
Copilot is powered by AI and may make mistakes. Always verify output.
// Build SSH config
config := &ssh.ClientConfig{
User: username,
HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification

Check failure

Code scanning / CodeQL

Use of insecure HostKeyCallback implementation High

Configuring SSH ClientConfig with insecure HostKeyCallback implementation from
this source
.

Copilot Autofix

AI 2 months ago

The best way to fix this problem is to replace the use of ssh.InsecureIgnoreHostKey() with a proper host key verification mechanism. We should obtain the allowed host’s public key—this can be read from a file or from resource metadata (the latter being convenient if the resource definition includes it). For simplicity and safety, we should use ssh.FixedHostKey, which will reject the connection unless the server presents exactly the trusted key. The fix can be implemented in the connect method by adding code to fetch the allowed host public key (e.g., from resource.Metadata["host_public_key"] if available, or alternatively from a file path), parse it, and assign ssh.FixedHostKey(parsedKey) to the HostKeyCallback field.

Necessary imports: The code already has golang.org/x/crypto/ssh imported, so no new import is needed.

Required changes:

  • In the connect method of SFTPAdapter (lines 42–162), specifically:
    • Insert code to read the allowed public key (from resource metadata).
    • Parse the public key.
    • Set HostKeyCallback: ssh.FixedHostKey(pubKey) in ssh.ClientConfig.

Suggested changeset 1
scheduler/adapters/storage_sftp.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/adapters/storage_sftp.go b/scheduler/adapters/storage_sftp.go
--- a/scheduler/adapters/storage_sftp.go
+++ b/scheduler/adapters/storage_sftp.go
@@ -94,10 +94,20 @@
 		return fmt.Errorf("username not found in credentials or resource metadata")
 	}
 
+	// Obtain the trusted host public key from resource metadata
+	hostPublicKeyStr, ok := resourceMetadata["host_public_key"]
+	if !ok || hostPublicKeyStr == "" {
+		return fmt.Errorf("host public key not found in resource metadata")
+	}
+	pubKey, err := ssh.ParsePublicKey([]byte(hostPublicKeyStr))
+	if err != nil {
+		return fmt.Errorf("failed to parse host public key: %w", err)
+	}
+
 	// Build SSH config
 	config := &ssh.ClientConfig{
 		User:            username,
-		HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification
+		HostKeyCallback: ssh.FixedHostKey(pubKey),
 		Timeout:         10 * time.Second,
 		Config: ssh.Config{
 			Ciphers: []string{
EOF
@@ -94,10 +94,20 @@
return fmt.Errorf("username not found in credentials or resource metadata")
}

// Obtain the trusted host public key from resource metadata
hostPublicKeyStr, ok := resourceMetadata["host_public_key"]
if !ok || hostPublicKeyStr == "" {
return fmt.Errorf("host public key not found in resource metadata")
}
pubKey, err := ssh.ParsePublicKey([]byte(hostPublicKeyStr))
if err != nil {
return fmt.Errorf("failed to parse host public key: %w", err)
}

// Build SSH config
config := &ssh.ClientConfig{
User: username,
HostKeyCallback: ssh.InsecureIgnoreHostKey(), // In production, use proper host key verification
HostKeyCallback: ssh.FixedHostKey(pubKey),
Timeout: 10 * time.Second,
Config: ssh.Config{
Ciphers: []string{
Copilot is powered by AI and may make mistakes. Always verify output.
tarReader := tar.NewReader(gzReader)

for {
header, err := tarReader.Next()

Check failure

Code scanning / CodeQL

Arbitrary file access during archive extraction ("Zip Slip") High

Unsanitized archive entry, which may contain '..', is used in a
file system operation
.
Unsanitized archive entry, which may contain '..', is used in a
file system operation
.
Unsanitized archive entry, which may contain '..', is used in a
file system operation
.

Copilot Autofix

AI 2 months ago

The best way to fix this issue is to restrict tar archive entry paths so that they cannot escape the specified destination directory and do not contain any path traversal (..), absolute paths, or other dangerous constructs. To do this, we should:

  • Use filepath.Join(destDir, header.Name) to construct the intended path.
  • Clean the path using filepath.Clean.
  • Convert both the target path and the destination directory to their absolute forms with filepath.Abs.
  • Check that the target path is a strict subpath of the destination directory: specifically, that the absolute target path starts with the absolute destination directory path, followed by a path separator—this ensures containment.
  • Skip extraction for entries which fail these checks, optionally logging a warning.

These changes should be implemented inside the extractTarGz function (scheduler/cmd/cli/data.go, lines 644–694). No new packages are needed because filepath and strings are already imported.


Suggested changeset 1
scheduler/cmd/cli/data.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scheduler/cmd/cli/data.go b/scheduler/cmd/cli/data.go
--- a/scheduler/cmd/cli/data.go
+++ b/scheduler/cmd/cli/data.go
@@ -659,24 +659,39 @@
 			return err
 		}
 
-		// Create full path
-		targetPath := filepath.Join(destDir, header.Name)
+		// Sanitize archive entry path to prevent Zip Slip/directory traversal
+		cleanedName := filepath.Clean(header.Name)
+		targetPath := filepath.Join(destDir, cleanedName)
+		absDestDir, err := filepath.Abs(destDir)
+		if err != nil {
+			return err
+		}
+		absTargetPath, err := filepath.Abs(targetPath)
+		if err != nil {
+			return err
+		}
+		// Ensure that absTargetPath is within absDestDir
+		if !strings.HasPrefix(absTargetPath, absDestDir+string(os.PathSeparator)) && absTargetPath != absDestDir {
+			// Optionally log or print warning, skip extraction of this entry
+			fmt.Printf("Skipping suspicious entry: %q\n", header.Name)
+			continue
+		}
 
 		// Create directory if needed
 		if header.Typeflag == tar.TypeDir {
-			if err := os.MkdirAll(targetPath, 0755); err != nil {
+			if err := os.MkdirAll(absTargetPath, 0755); err != nil {
 				return err
 			}
 			continue
 		}
 
 		// Create parent directories
-		if err := os.MkdirAll(filepath.Dir(targetPath), 0755); err != nil {
+		if err := os.MkdirAll(filepath.Dir(absTargetPath), 0755); err != nil {
 			return err
 		}
 
 		// Create file
-		file, err := os.Create(targetPath)
+		file, err := os.Create(absTargetPath)
 		if err != nil {
 			return err
 		}
EOF
@@ -659,24 +659,39 @@
return err
}

// Create full path
targetPath := filepath.Join(destDir, header.Name)
// Sanitize archive entry path to prevent Zip Slip/directory traversal
cleanedName := filepath.Clean(header.Name)
targetPath := filepath.Join(destDir, cleanedName)
absDestDir, err := filepath.Abs(destDir)
if err != nil {
return err
}
absTargetPath, err := filepath.Abs(targetPath)
if err != nil {
return err
}
// Ensure that absTargetPath is within absDestDir
if !strings.HasPrefix(absTargetPath, absDestDir+string(os.PathSeparator)) && absTargetPath != absDestDir {
// Optionally log or print warning, skip extraction of this entry
fmt.Printf("Skipping suspicious entry: %q\n", header.Name)
continue
}

// Create directory if needed
if header.Typeflag == tar.TypeDir {
if err := os.MkdirAll(targetPath, 0755); err != nil {
if err := os.MkdirAll(absTargetPath, 0755); err != nil {
return err
}
continue
}

// Create parent directories
if err := os.MkdirAll(filepath.Dir(targetPath), 0755); err != nil {
if err := os.MkdirAll(filepath.Dir(absTargetPath), 0755); err != nil {
return err
}

// Create file
file, err := os.Create(targetPath)
file, err := os.Create(absTargetPath)
if err != nil {
return err
}
Copilot is powered by AI and may make mistakes. Always verify output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant