Sadservers with instance on demand#451
Conversation
61cc74f to
66d89f4
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces “instance on demand” support for Sadservers-style challenges by adding Redis-backed caching for per-user instances and dynamic port allocation, alongside new API endpoints and periodic cleanup/backup routines.
Changes:
- Add Redis cache layer for instance lifecycle management and port allocation/freeing.
- Add manager + API endpoints to spawn/extend/kill instances; add background cleanup for expired/orphaned instances.
- Replace flag submission endpoint with a container
check.sh-based verification flow and update challenge/config schemas to support instancing.
Reviewed changes
Copilot reviewed 44 out of 45 changed files in this pull request and generated 32 comments.
Show a summary per file
| File | Description |
|---|---|
| utils/prompt.go | Fix defaulting behavior for empty integer input. |
| utils/datatypes.go | Reframe ParsePortMapping for port ranges (but still used broadly). |
| utils/cache.go | Add Redis key helpers for host/container port tracking. |
| pkg/remoteManager/ssh.go | Add remote docker exec command runner returning exit code/output. |
| pkg/cr/containers.go | Add ExecResult, support extra labels, and local container exec helper. |
| core/manager/utils.go | Store instancing metadata in DB entries; switch port table population to cached ports. |
| core/manager/sync.go | Remove used-port list refresh during sync. |
| core/manager/pipeline.go | Allocate ports from configured ranges via Redis; skip deploy stage for instanced challenges. |
| core/manager/instance.go | New instance orchestration (spawn/extend/kill), local+remote deploy/verify helpers. |
| core/manager/health_check.go | Add instance cleanup prober + cache backups; orphan cleanup logic. |
| core/manager/challenge.go | Kill active instances on undeploy; free container ports via cache. |
| core/database/tag.go | Include instancing/status fields in tag-related metadata queries. |
| core/database/challenges.go | Add instancing fields to Challenge model; select extra metadata fields; rename submission saver. |
| core/constants.go | Add Redis/check-script constants; change SSH_PORT type. |
| core/config/config.go | Add Redis + instance config, localhost/remote port ranges, and validation helpers. |
| core/config/challenge.go | Add instanced challenge metadata; remove explicit port-mapping support. |
| core/cache/ports.go | New Redis-backed port allocation/free APIs. |
| core/cache/instance.go | New Redis-backed instance store + deletion queue + TTL handling. |
| core/cache/cache.go | New Redis connection/backup/reset utilities. |
| cmd/beast/run.go | Add cache backup/connection termination during shutdown. |
| cmd/beast/init.go | Add Redis bootstrap (ACL user creation); adjust DB init DSN usage. |
| cmd/beast/config.go | Prompt for Redis config; enable health prober by default. |
| cmd/beast/commands.go | Register cache backup/reset/restore commands. |
| cmd/beast/cache.go | New CLI commands for cache reset/restore. |
| cmd/beast/backup.go | Add backup-cache command. |
| api/router.go | Add instance/admin instance routes; swap /submit to /check. |
| api/response.go | Extend challenge metadata response; rename submit response type. |
| api/main.go | Initialize Redis cache on API startup. |
| api/instance.go | New user/admin instance management endpoints. |
| api/info.go | Change used-ports response (now empty). |
| api/check.go | New check-script-based challenge verification endpoint. |
| api/submit.go | Removed legacy flag submission endpoint. |
| _examples/example.config.toml | Add Redis + port-range examples (needs schema alignment). |
| _examples/instanced-service/* | Add instanced service example challenge + docs. |
| _examples/instanced-compose/* | Add instanced docker-compose example challenge + docs. |
| go.mod / go.sum | Add Redis client dependency and update module sums. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| port_range_start = 30000 # Start of port range for instances | ||
| port_range_end = 40000 # End of port range for instances | ||
| default_expiration = 300 # Default TTL in seconds (5 minutes) | ||
| max_extension = 600 # Maximum extension time (10 minutes) | ||
| max_instances_per_user = 3 # Max concurrent instances per user |
There was a problem hiding this comment.
The README’s global config snippet uses port_range_start / port_range_end under [instance_config], but the implementation configures port ranges via local_host_port_range and available_servers.*.port_range. Update this snippet to reflect the actual config keys.
| port_range_start = 30000 # Start of port range for instances | |
| port_range_end = 40000 # End of port range for instances | |
| default_expiration = 300 # Default TTL in seconds (5 minutes) | |
| max_extension = 600 # Maximum extension time (10 minutes) | |
| max_instances_per_user = 3 # Max concurrent instances per user | |
| local_host_port_range = "30000-40000" # Host port range for instances | |
| default_expiration = 300 # Default TTL in seconds (5 minutes) | |
| max_extension = 600 # Maximum extension time (10 minutes) | |
| max_instances_per_user = 3 # Max concurrent instances per user | |
| [available_servers.local] | |
| port_range = "30000-40000" # Port range available on this server |
api/check.go
Outdated
| localDeploy := challenge.ServerDeployed == core.LOCALHOST || challenge.ServerDeployed == "" | ||
| instance, err := cache.GetInstance(instanceId) | ||
| if err != nil { | ||
| c.JSON(http.StatusInternalServerError, HTTPErrorResp{ | ||
| Error: "CACHE ERROR while processing the request.", |
There was a problem hiding this comment.
After loading the instance from cache, the handler doesn’t verify that it belongs to the authenticated user (or matches the requested challenge). This allows submitting someone else’s instance_id to claim a solve. Validate instance.UserID against the current user and instance.ChallengeName/challenge ID before executing check.sh.
api/info.go
Outdated
| c.JSON(http.StatusOK, PortsInUseResp{ | ||
| MinPortValue: core.ALLOWED_MIN_PORT_VALUE, | ||
| MaxPortValue: core.ALLOWED_MAX_PORT_VALUE, | ||
| PortsInUse: cfg.USED_PORTS_LIST, | ||
| PortsInUse: []uint32{}, |
There was a problem hiding this comment.
/api/info/usedPorts now always returns an empty list, which breaks the contract of this endpoint and hides actual port usage. If the old USED_PORTS_LIST mechanism is being removed, consider sourcing this from the Redis port allocation set(s) instead (e.g., SMEMBERS host:<host>), or remove/deprecate the endpoint explicitly.
There was a problem hiding this comment.
Will resolve this on the redis-instance branch
cmd/beast/init.go
Outdated
| } | ||
|
|
||
| func initCache() error { | ||
| log.Infoln("Ininializing cache...") |
There was a problem hiding this comment.
Typo in log message: “Ininializing cache...” should be “Initializing cache...”.
| log.Infoln("Ininializing cache...") | |
| log.Infoln("Initializing cache...") |
| /* both ports are inclusive */ | ||
| portRange := lastPort - firstPort - 1 | ||
|
|
There was a problem hiding this comment.
portRange := lastPort - firstPort - 1 contradicts the “both ports are inclusive” comment and can underflow (e.g., when lastPort <= firstPort+1), yielding a huge uint32 and causing GetFreePort to scan an unintended range. Compute the inclusive range size as lastPort - firstPort + 1 and guard against invalid/small ranges before subtracting.
| client, err := CreateSSHClient(server) | ||
| if err != nil { | ||
| return result, fmt.Errorf("failed to create session: %s", err) | ||
| } | ||
| session, err := client.NewSession() |
There was a problem hiding this comment.
RunCommandInContainerOnServer creates an SSH client but never closes it (only the session is closed). This can leak SSH connections over time. Add defer client.Close() after CreateSSHClient succeeds.
core/config/config.go
Outdated
| } | ||
|
|
||
| if firstPort < core.ALLOWED_MIN_PORT_VALUE { | ||
| return fmt.Errorf("invalid port range, range cannot preceed %v", core.ALLOWED_MIN_PORT_VALUE) |
There was a problem hiding this comment.
Typo in error message: “preceed” should be “precede”. This message is user-facing during config validation.
| return fmt.Errorf("invalid port range, range cannot preceed %v", core.ALLOWED_MIN_PORT_VALUE) | |
| return fmt.Errorf("invalid port range, range cannot precede %v", core.ALLOWED_MIN_PORT_VALUE) |
| func GetFreePort(host string, firstPort uint32, portRange uint32) (uint32, error) { | ||
| CacheMutex.Lock() | ||
| defer CacheMutex.Unlock() | ||
|
|
||
| ctx := context.Background() |
There was a problem hiding this comment.
These helpers dereference CacheMutex/Cache without checking initialization. CLI flows like beast challenge deploy don’t call cache.Init(), and cache.Init() can also fail to connect, which would panic here. Return a clear error when the cache isn’t initialized, or ensure all entrypoints init and validate cache connectivity first.
| "redis-cli", | ||
| "-h", cacheConfig.RedisConfig.Host, | ||
| "-p", cacheConfig.RedisConfig.Port, | ||
| "--user", cacheConfig.RedisConfig.User, | ||
| "-n", strconv.Itoa(cacheConfig.RedisConfig.DB), |
There was a problem hiding this comment.
ResetCache always includes --user even when redis_config.user is empty (contrast with BackupCache). This will cause redis-cli to fail for default-user setups. Only include --user when a non-empty user is configured, or default it explicitly.
There was a problem hiding this comment.
don't think I should change this, since I'm assuming that the config file hasn't been tampered with. If it has been then errors should be expected. Am leaving the same in database reset as well. @v1bh475u is that okay?
| $query = "SELECT * FROM users WHERE username='$username' AND password='$password'"; | ||
| $result = $conn->query($query); |
There was a problem hiding this comment.
The login handler builds an SQL query by directly interpolating $_POST['username'] and $_POST['password'] into the string, which makes this endpoint vulnerable to SQL injection. An attacker can craft input such as admin' OR '1'='1 in the username field to bypass authentication and read arbitrary data from the users and secrets tables (including the flag). Use parameterized queries/prepared statements and avoid concatenating raw user input into SQL strings.
Cleanup: Remove unused handler. Fix: fix typo. Fix: fix port range. Refactor: refactor constant paths. Fix: fix variable management. Update: fix port range in example Fix: Add descriptive errors. Fix: fix error handling. Feat: Cleanup undeployment Sanity: add sanity init in cache helpers
8927581 to
e496285
Compare
This branch is based on the
redis-instancebranch and contains specific changes for sad server type challenges.