Conversation
will fix (squash) commits soon
Pull request overview
This pull request adds support for instanced challenges - a feature that allows each user to spawn their own isolated container instance for a challenge, rather than sharing a single deployed instance. This is particularly useful for challenges that require per-user isolation, such as pwn challenges or vulnerable web applications.
Changes:
- Adds Redis as a caching layer for managing dynamic port allocation and instance metadata
- Implements instance lifecycle management (spawn, extend, kill) with automatic expiration
- Introduces new database fields (Instanced, InstanceExpiration) to the Challenge model
- Adds comprehensive API endpoints for users and admins to manage instances
- Removes static port allocation from challenge configs in favor of dynamic allocation from port ranges
- Includes example challenges demonstrating both simple service and docker-compose based instancing
Reviewed changes
Copilot reviewed 40 out of 41 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| go.mod, go.sum | Adds redis client dependency and removes unused dependencies |
| core/config/config.go | Adds Redis config, instance config, and port range validation |
| core/config/challenge.go | Adds instanced metadata fields and simplifies port configuration |
| core/database/challenges.go | Adds Instanced and InstanceExpiration fields to Challenge model |
| core/cache/*.go | New package for Redis-based caching of instances and port allocations |
| core/manager/instance.go | New file implementing instance lifecycle management |
| core/manager/pipeline.go | Updates deployment to use dynamic port allocation |
| core/manager/health_check.go | Adds instance cleanup prober for expired instances |
| core/manager/challenge.go | Updates undeploy to kill active instances first |
| core/manager/utils.go | Updates port registration to use cache instead of static config |
| api/instance.go | New API handlers for instance management |
| api/router.go | Adds instance API routes |
| cmd/beast/*.go | Adds Redis initialization and cache management commands |
| utils/datatypes.go | Updates port mapping comment to reflect new usage |
| _examples/instanced-* | Example challenges demonstrating instancing feature |
Comments suppressed due to low confidence (1)
core/manager/pipeline.go:387
- The error check at line 382 is performed after registering ports (lines 375-380). If the container creation failed (err != nil), the ports should not be registered. The error check should be moved before port registration to avoid registering ports for a failed container creation. Additionally, if the container creation fails, the allocated ports should be freed.
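The reordering this comment describes could look like the sketch below. All helper names (`createContainer`, `registerPorts`, `freePorts`) are hypothetical stand-ins for the pipeline's real functions, not the beast API; the point is only the control flow: check the creation error first, and free the allocated ports on failure.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the pipeline's real helpers.
func createContainer() (string, error) { return "", errors.New("image pull failed") }
func registerPorts(ports []uint32)     { fmt.Println("registered", ports) }
func freePorts(ports []uint32)         { fmt.Println("freed", ports) }

func deploy(ports []uint32) error {
	containerID, err := createContainer()
	// Check the creation error BEFORE registering ports, so a failed
	// container never leaves ports marked as in use.
	if err != nil {
		freePorts(ports) // release the dynamically allocated ports
		return fmt.Errorf("container creation failed: %w", err)
	}
	registerPorts(ports)
	fmt.Println("deployed", containerID)
	return nil
}

func main() {
	if err := deploy([]uint32{10001}); err != nil {
		fmt.Println(err)
	}
}
```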
```toml
# DO NOT specify ports for instanced challenges!
# Instead, use default_port to indicate which container port to expose
```
The comment "# DO NOT specify ports for instanced challenges!" conflicts with the validation logic. According to core/config/challenge.go line 308, either ports or default_port must be specified. The comment should be updated to clarify that you should use default_port instead of ports for instanced challenges, not that you should avoid specifying any port information.
Suggested change:
```toml
# Do not configure a ports list for instanced challenges.
# Instead, you must use default_port to indicate which container port to expose.
```
```toml
tags = ["web", "sql", "instanced"]
maxAttemptLimit = 100
instanced = true
instance_expiration = 12
```
The instance_expiration value is set to 12 seconds, which is extremely short for a challenge instance. This seems like it's intended for testing purposes. For actual use, this should be a more reasonable duration (e.g., 600 seconds as mentioned in the README). Consider updating this to match the recommended value or adding a comment indicating this is a test value.
Suggested change:
```toml
instance_expiration = 600
```
```go
// Postgresql database for beast. The Db variable is the connection variable for the
// database, which is not closed after creating a connection here and can
```
The comment says "Postgresql database" but this is for cache/Redis initialization. This comment was likely copied from the database package and should be updated to reflect that this is for Redis/cache initialization.
Suggested change:
```go
// Redis cache for beast. The Cache variable is the client connection for the
// cache, which is not closed after creating a connection here and can
```
Force-pushed 9786785 to 42be447
```go
projectName := fmt.Sprintf("beast-instance-%s-%s", coreUtils.EncodeID(challengeName), instanceID)
composeFile := filepath.Join(stagingDir, challengeName, config.Challenge.Env.DockerCompose)

if serverDeployed == core.LOCALHOST || serverDeployed == "" {
```
I don't understand the inconsistency here. Why is this function manually executing the command when the same thing can be done with docker compose for localhost as well? I think it needs some refactoring; duplicating the logic in this function seems very redundant.
```go
	return core.LOCALHOST
}

func deployInstanceContainer(instanceID, challengeName string, hostPort uint32, imageID string, config *cfg.BeastChallengeConfig, serverDeployed string) (string, error) {
```
It seems to me that both the deploy functions and the instance-killing logic should not live here but in pkg/cr and pkg/remoteManager. Either that, or the deployment logic should move into those files rather than sit in the functions mentioned here. Thoughts?
```go
	return ttl, nil
}

func QueueInstanceForDeletion(instanceID string) error {
```
I am not convinced that using this queue and the probing logic is a good fit for our problem.
What other cleanup mechanisms could we use?
- Cleanup: Remove unused handler.
- Fix: fix typo.
- Fix: fix port range.
- Refactor: refactor constant paths.
- Fix: fix variable management.
- Update: fix port range in example
- Fix: Add descriptive errors.
- Fix: fix error handling.
- Feat: Cleanup undeployment
- Sanity: add sanity init in cache helpers
…lve timestamps and optimized database queries
Pull request overview
Copilot reviewed 48 out of 49 changed files in this pull request and generated 14 comments.
```go
var portRegex = regexp.MustCompile(`\$\{([^}]+)}`)

func ExtractPortsFromCompose(composeFile string) ([]string, error) {
	data, err := os.ReadFile(composeFile)
	if err != nil {
		return nil, fmt.Errorf("error while reading compose file: %w", err)
	}
	var raw Compose
	err = yaml.Unmarshal(data, &raw)
	if err != nil {
		return nil, fmt.Errorf("error while parsing compose file: %s", err.Error())
	}

	portVariables := make([]string, 0)
	seen := make(map[string]bool)
	for _, service := range raw.Services {
		for _, port := range service.Ports {
			matches := portRegex.FindAllStringSubmatch(port, -1)

			if len(matches) == 0 {
				return nil, fmt.Errorf("port %s is not mapped using an env variable", port)
			}
```
ExtractPortsFromCompose currently captures the full ${...} expression, so a mapping like ${INSTANCE_PORT:-8080}:80 will yield variable name INSTANCE_PORT:-8080, which is not a valid env var name and will break dynamic port injection. Also, returning an error for port mappings without ${...} will now fail existing non-instanced compose challenges (e.g., _examples/compose-type/docker-compose.yml uses "10020:80"). Consider parsing only the variable name portion (strip :-default / -default modifiers) and either supporting fixed mappings or gating this requirement to instanced/dynamic-port compose deployments with a clearer error message.
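The variable-name parsing the comment asks for could be sketched as below. This is an illustrative standalone snippet, not the PR's code: `extractVarName` and `portVariableNames` are hypothetical helpers that strip the `:-default` / `-default` modifiers and treat fixed mappings like `10020:80` as valid (returning no variables) rather than erroring.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matches ${VAR}, ${VAR:-default}, ${VAR-default} in compose port mappings.
var portVarRegex = regexp.MustCompile(`\$\{([^}]+)\}`)

// extractVarName strips ":-default" / "-default" modifiers so that
// "${INSTANCE_PORT:-8080}" yields "INSTANCE_PORT".
func extractVarName(expr string) string {
	if i := strings.Index(expr, ":-"); i != -1 {
		return expr[:i]
	}
	if i := strings.Index(expr, "-"); i != -1 {
		return expr[:i]
	}
	return expr
}

// portVariableNames returns the env var names used in a port mapping, or nil
// for fixed mappings like "10020:80" (still valid for non-instanced challenges).
func portVariableNames(port string) []string {
	var names []string
	for _, m := range portVarRegex.FindAllStringSubmatch(port, -1) {
		names = append(names, extractVarName(m[1]))
	}
	return names
}

func main() {
	fmt.Println(portVariableNames("${INSTANCE_PORT:-8080}:80")) // [INSTANCE_PORT]
	fmt.Println(portVariableNames("10020:80"))                  // []
}
```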
```go
server := cfg.Cfg.AvailableServers[serverDeployed]
if deploymentType == core.DEPLOYMENT_TYPES["docker_compose"] {
	stagingDir := filepath.Join(core.BEAST_GLOBAL_DIR, core.BEAST_STAGING_DIR)
	err := remoteManager.ComposePurgeRemote(challengeName, stagingDir, server)
	if err != nil {
		return fmt.Errorf("failed to stop compose on remote: %w", err)
	}
```
killInstanceContainer uses ComposePurgeRemote(challengeName, ...), but instanced compose deployments use a per-instance project name (via projectName := utils.GetProjectName(projectBase) where projectBase is instance-...). This will purge beast-<challengeName> instead of beast-instance-..., leaving remote instance projects/containers orphaned and ports potentially leaked. Pass the instance project base/name through to the remote purge/down call (or add a remote purge helper that accepts an explicit project name).
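The fix suggested here is to thread an explicit project name through to the purge call. A minimal sketch, with all names (`composePurgeProject`, `killInstanceCompose`, `Server`) being illustrative stand-ins for the beast internals rather than its actual API:

```go
package main

import "fmt"

type Server struct{ Host string }

// composePurgeProject purges exactly the project name it is given, so the
// caller controls whether that is "beast-<challenge>" or "beast-instance-<...>".
func composePurgeProject(projectName, stagingDir string, server Server) error {
	fmt.Printf("docker compose -p %s down --volumes (on %s)\n", projectName, server.Host)
	return nil
}

func killInstanceCompose(instanceProject, stagingDir string, server Server) error {
	// Pass the per-instance project name, not the challenge-derived one.
	return composePurgeProject(instanceProject, stagingDir, server)
}

func main() {
	_ = killInstanceCompose("beast-instance-simple-web-abc123", "/tmp/staging", Server{Host: "10.0.0.2"})
}
```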
```go
func instanceToResponse(instance *cache.Instance) InstanceResponse {
	ttl := time.Until(instance.ExpiresAt).Seconds()
	if ttl < 0 {
		ttl = 0
	}

	return InstanceResponse{
		InstanceID:    instance.InstanceID,
		ChallengeName: instance.ChallengeName,
		Port:          instance.Port,
		CreatedAt:     instance.CreatedAt,
		ExpiresAt:     instance.ExpiresAt,
		TTLSeconds:    int64(ttl),
	}
```
instanceToResponse never populates HostedAddress, so API responses will always return an empty hosted_address even though clients need the server hostname/IP. Use instance.ServerDeployed (and map it to config.Cfg.AvailableServers[name].Host when applicable, similar to challengeInfoHandler) to set HostedAddress.
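The server-name-to-host mapping the comment suggests could look like this. The types and map below are minimal stand-ins assumed for illustration; the real `cache.Instance` and `config.Cfg.AvailableServers` definitions live in the PR.

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-ins for the real cache/config types.
type Instance struct {
	InstanceID     string
	ServerDeployed string
	ExpiresAt      time.Time
}

type ServerConfig struct{ Host string }

var availableServers = map[string]ServerConfig{
	"remote-1": {Host: "203.0.113.7"},
}

// hostedAddress resolves the deployment server name to its configured host,
// falling back to localhost, mirroring what challengeInfoHandler does.
func hostedAddress(inst *Instance) string {
	if inst.ServerDeployed == "" || inst.ServerDeployed == "localhost" {
		return "localhost"
	}
	if s, ok := availableServers[inst.ServerDeployed]; ok {
		return s.Host
	}
	return inst.ServerDeployed
}

func main() {
	inst := &Instance{InstanceID: "abc", ServerDeployed: "remote-1"}
	fmt.Println(hostedAddress(inst)) // 203.0.113.7
}
```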
```go
Select("DISTINCT ON (user_challenges.user_id, user_challenges.challenge_id) user_challenges.user_id, users.username, user_challenges.created_at, challenges.points, false AS is_hint").
	Joins("JOIN challenges ON user_challenges.challenge_id = challenges.id").
	Joins("JOIN users ON user_challenges.user_id = users.id").
	Where("user_challenges.user_id IN ? AND user_challenges.solved = ?", topUserId, true).
```
The user_challenges query uses DISTINCT ON (user_id, challenge_id) but no longer specifies an ORDER BY. With DISTINCT ON, PostgreSQL selects an arbitrary row unless ordering is explicit, which can produce incorrect/unstable solve timestamps and time-series scores. Add an ORDER BY user_id, challenge_id, created_at ASC (or DESC, depending on desired semantics) to deterministically pick the intended row before scanning.
Suggested change:
```go
Where("user_challenges.user_id IN ? AND user_challenges.solved = ?", topUserId, true).
	Order("user_challenges.user_id, user_challenges.challenge_id, user_challenges.created_at ASC").
```
```toml
[instance_config]
# Port Range for localhost. per-server this is configured via `port-range`
local_host_port_range = '10000:11000'
default_expiration = 300
max_extension = 600
max_instances_per_user = 3
```
The [instance_config] section documents local_host_port_range, but the code only supports per-server available_servers.<name>.port_range and does not read local_host_port_range. This mismatch means copying the example config can leave localhost/servers without a usable port range, breaking instance/compose deployment. Align the example and code: either remove this key from docs or add support + fallback logic in config validation.
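The fallback option mentioned above could be sketched like this. The struct and field names are hypothetical simplifications of the beast config, chosen only to show the resolution order: per-server `port_range` first, then the global `local_host_port_range`, then an error.

```go
package main

import "fmt"

// Illustrative stand-ins for the config types.
type InstanceConfig struct{ LocalHostPortRange string }
type Server struct{ PortRange string }

// effectivePortRange prefers the per-server range and falls back to the
// global local_host_port_range, erroring if neither is configured.
func effectivePortRange(server Server, ic InstanceConfig) (string, error) {
	if server.PortRange != "" {
		return server.PortRange, nil
	}
	if ic.LocalHostPortRange != "" {
		return ic.LocalHostPortRange, nil
	}
	return "", fmt.Errorf("no port range configured for server and no local_host_port_range fallback")
}

func main() {
	r, err := effectivePortRange(Server{}, InstanceConfig{LocalHostPortRange: "10000:11000"})
	fmt.Println(r, err)
}
```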
```diff
 // ParsePortMapping parses the port mapping string and return the required ports
 // If the portMapping string is not valid, this returns an error.
-// The format of the port mapping is `HOST_PORT:CONTAINER_PORT`
+// The format of the port mapping is `PORT_FIRST:PORT_LAST`
 func ParsePortMapping(portMap string) (uint32, uint32, error) {
-	ports := strings.Split(portMap, mappingDelimeter)
+	ports := strings.Split(portMap, core.MappingDelimeter)

 	if len(ports) != 2 {
 		return 0, 0, errors.New("port mapping string is not valid")
 	}

-	hostPort, err := strconv.ParseUint(ports[0], 10, 32)
+	firstPort, err := strconv.ParseUint(ports[0], 10, 32)
 	if err != nil {
 		return 0, 0, fmt.Errorf("host port is not a valid port in: %s", portMap)
 	}

-	containerPort, err := strconv.ParseUint(ports[1], 10, 32)
+	lastPort, err := strconv.ParseUint(ports[1], 10, 32)
 	if err != nil {
 		return 0, 0, fmt.Errorf("container port is not a valid port in: %s", portMap)
 	}

-	return uint32(hostPort), uint32(containerPort), nil
+	return uint32(firstPort), uint32(lastPort), nil
```
ParsePortMapping now parses a port range (PORT_FIRST:PORT_LAST), but the error messages still refer to “host port” / “container port”, which is confusing for callers and operators. Update the messages to reference “first port” / “last port” (and ideally validate firstPort <= lastPort here too, so all callers get consistent validation).
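Putting both suggestions together, the function could read as follows. This is a sketch of the proposed wording and validation (the delimiter is inlined as `":"` to keep the snippet self-contained), not the PR's final code.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// ParsePortMapping parses a port range string of the form PORT_FIRST:PORT_LAST,
// with range-oriented error messages and a firstPort <= lastPort check.
func ParsePortMapping(portMap string) (uint32, uint32, error) {
	ports := strings.Split(portMap, ":")
	if len(ports) != 2 {
		return 0, 0, errors.New("port mapping string is not valid")
	}

	firstPort, err := strconv.ParseUint(ports[0], 10, 32)
	if err != nil {
		return 0, 0, fmt.Errorf("first port is not a valid port in: %s", portMap)
	}

	lastPort, err := strconv.ParseUint(ports[1], 10, 32)
	if err != nil {
		return 0, 0, fmt.Errorf("last port is not a valid port in: %s", portMap)
	}

	if firstPort > lastPort {
		return 0, 0, fmt.Errorf("first port %d is greater than last port %d in: %s", firstPort, lastPort, portMap)
	}

	return uint32(firstPort), uint32(lastPort), nil
}

func main() {
	first, last, err := ParsePortMapping("10000:11000")
	fmt.Println(first, last, err) // 10000 11000 <nil>
}
```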
This PR introduces instanced challenges: a new challenge type where each contestant gets their own isolated container environment, automatically cleaned up after expiration. It also replaces the old static port-list approach with a Redis-backed dynamic port allocation system.