
Add Instanced challenges #450

Open

gqvz wants to merge 43 commits into bl4ze/dev from redis-instance

Conversation

gqvz (Member) commented Feb 5, 2026

This PR introduces instanced challenges: a new challenge type where each contestant gets their own isolated container environment, automatically cleaned up after expiration. It also replaces the old static port-list approach with a Redis-backed dynamic port allocation system.
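The PR's Redis-backed allocator itself is not shown in this thread, but the allocation contract it presumably implements can be sketched in memory. Everything below (type and method names included) is a hypothetical illustration, not the PR's code; the real system stores this state in Redis so allocations are shared across processes and survive restarts.

```go
package main

import (
	"errors"
	"fmt"
)

// PortAllocator hands out ports from a fixed range and reclaims them on
// release. This in-memory version only illustrates the contract; the PR
// keeps the equivalent state in Redis.
type PortAllocator struct {
	first, last uint32
	inUse       map[uint32]bool
}

func NewPortAllocator(first, last uint32) *PortAllocator {
	return &PortAllocator{first: first, last: last, inUse: make(map[uint32]bool)}
}

// Allocate returns the lowest free port in the range, or an error when the
// range is exhausted.
func (a *PortAllocator) Allocate() (uint32, error) {
	for p := a.first; p <= a.last; p++ {
		if !a.inUse[p] {
			a.inUse[p] = true
			return p, nil
		}
	}
	return 0, errors.New("port range exhausted")
}

// Release returns a port to the pool so a later Allocate can reuse it.
func (a *PortAllocator) Release(port uint32) {
	delete(a.inUse, port)
}

func main() {
	a := NewPortAllocator(10000, 10002)
	p, _ := a.Allocate()
	fmt.Println("allocated", p) // allocated 10000
	a.Release(p)
}
```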

gqvz (Member, Author) commented Feb 5, 2026

will fix (squash) commits soon

Copilot AI left a comment

Pull request overview

This pull request adds support for instanced challenges - a feature that allows each user to spawn their own isolated container instance for a challenge, rather than sharing a single deployed instance. This is particularly useful for challenges that require per-user isolation, such as pwn challenges or vulnerable web applications.

Changes:

  • Adds Redis as a caching layer for managing dynamic port allocation and instance metadata
  • Implements instance lifecycle management (spawn, extend, kill) with automatic expiration
  • Introduces new database fields (Instanced, InstanceExpiration) to the Challenge model
  • Adds comprehensive API endpoints for users and admins to manage instances
  • Removes static port allocation from challenge configs in favor of dynamic allocation from port ranges
  • Includes example challenges demonstrating both simple service and docker-compose based instancing

Reviewed changes

Copilot reviewed 40 out of 41 changed files in this pull request and generated 19 comments.

Summary per file:

  • go.mod, go.sum: Adds redis client dependency and removes unused dependencies
  • core/config/config.go: Adds Redis config, instance config, and port range validation
  • core/config/challenge.go: Adds instanced metadata fields and simplifies port configuration
  • core/database/challenges.go: Adds Instanced and InstanceExpiration fields to Challenge model
  • core/cache/*.go: New package for Redis-based caching of instances and port allocations
  • core/manager/instance.go: New file implementing instance lifecycle management
  • core/manager/pipeline.go: Updates deployment to use dynamic port allocation
  • core/manager/health_check.go: Adds instance cleanup prober for expired instances
  • core/manager/challenge.go: Updates undeploy to kill active instances first
  • core/manager/utils.go: Updates port registration to use cache instead of static config
  • api/instance.go: New API handlers for instance management
  • api/router.go: Adds instance API routes
  • cmd/beast/*.go: Adds Redis initialization and cache management commands
  • utils/datatypes.go: Updates port mapping comment to reflect new usage
  • _examples/instanced-*: Example challenges demonstrating instancing feature
Comments suppressed due to low confidence (1)

core/manager/pipeline.go:387

  • The error check at line 382 is performed after registering ports (lines 375-380). If the container creation failed (err != nil), the ports should not be registered. The error check should be moved before port registration to avoid registering ports for a failed container creation. Additionally, if the container creation fails, the allocated ports should be freed.


Comment on lines +21 to +22
# DO NOT specify ports for instanced challenges!
# Instead, use default_port to indicate which container port to expose

Copilot AI Feb 6, 2026

The comment "# DO NOT specify ports for instanced challenges!" conflicts with the validation logic. According to core/config/challenge.go line 308, either ports or default_port must be specified. The comment should be updated to clarify that you should use default_port instead of ports for instanced challenges, not that you should avoid specifying any port information.

Suggested change
# DO NOT specify ports for instanced challenges!
# Instead, use default_port to indicate which container port to expose
# Do not configure a ports list for instanced challenges.
# Instead, you must use default_port to indicate which container port to expose.

Comment thread core/cache/cache.go Outdated
tags = ["web", "sql", "instanced"]
maxAttemptLimit = 100
instanced = true
instance_expiration = 12

Copilot AI Feb 6, 2026

The instance_expiration value is set to 12 seconds, which is extremely short for a challenge instance. This seems like it's intended for testing purposes. For actual use, this should be a more reasonable duration (e.g., 600 seconds as mentioned in the README). Consider updating this to match the recommended value or adding a comment indicating this is a test value.

Suggested change
instance_expiration = 12
instance_expiration = 600

Comment thread core/cache/ports.go Outdated
Comment thread core/manager/pipeline.go Outdated
Comment thread core/cache/cache.go
Comment on lines +72 to +73
// Postgresql database for beast. The Db variable is the connection variable for the
// database, which is not closed after creating a connection here and can

Copilot AI Feb 6, 2026

The comment says "Postgresql database" but this is for cache/Redis initialization. This comment was likely copied from the database package and should be updated to reflect that this is for Redis/cache initialization.

Suggested change
// Postgresql database for beast. The Db variable is the connection variable for the
// database, which is not closed after creating a connection here and can
// Redis cache for beast. The Cache variable is the client connection for the
// cache, which is not closed after creating a connection here and can

Comment thread core/cache/cache.go Outdated
Comment thread cmd/beast/init.go Outdated
Comment thread cmd/beast/init.go Outdated
Comment thread core/manager/pipeline.go Outdated
@kunrex force-pushed the redis-instance branch 3 times, most recently from 9786785 to 42be447, on February 9, 2026 16:20
Comment thread api/main.go
Comment thread core/cache/cache.go
Comment thread pkg/remoteManager/init.go Outdated
Comment thread utils/datatypes.go Outdated
Comment thread core/manager/instance.go Outdated
Comment thread core/manager/instance.go Outdated
Comment thread core/manager/instance.go Outdated
projectName := fmt.Sprintf("beast-instance-%s-%s", coreUtils.EncodeID(challengeName), instanceID)
composeFile := filepath.Join(stagingDir, challengeName, config.Challenge.Env.DockerCompose)

if serverDeployed == core.LOCALHOST || serverDeployed == "" {

Collaborator

I don't understand the disproportion here. Why is this function manually executing the command when the same thing can be done using docker compose for localhost as well? I think it might need some refactoring, but spilling the logic into this function seems very redundant.

Comment thread core/manager/instance.go Outdated
return core.LOCALHOST
}

func deployInstanceContainer(instanceID, challengeName string, hostPort uint32, imageID string, config *cfg.BeastChallengeConfig, serverDeployed string) (string, error) {

Collaborator

It seems to me that both the deploy functions and the instance-killing logic should not live here but in pkg/cr and pkg/remoteManager. Either that, or the deployment logic should be in those files rather than in the functions mentioned here. Thoughts?

Comment thread core/manager/instance.go Outdated
Comment thread core/cache/instance.go
return ttl, nil
}

func QueueInstanceForDeletion(instanceID string) error {

Collaborator

I am not convinced that using this queue plus the probing logic is a good fit for our problem.


What other clean-up mechanisms can we use?

Copilot AI left a comment

Pull request overview

Copilot reviewed 48 out of 49 changed files in this pull request and generated 14 comments.



Comment thread utils/compose.go
Comment on lines +16 to +37
var portRegex = regexp.MustCompile(`\$\{([^}]+)}`)

func ExtractPortsFromCompose(composeFile string) ([]string, error) {
data, err := os.ReadFile(composeFile)
if err != nil {
return nil, fmt.Errorf("error while reading compose file: %w", err)
}
var raw Compose
err = yaml.Unmarshal(data, &raw)
if err != nil {
return nil, fmt.Errorf("error while parsing compose file: %s", err.Error())
}

portVariables := make([]string, 0)
seen := make(map[string]bool)
for _, service := range raw.Services {
for _, port := range service.Ports {
matches := portRegex.FindAllStringSubmatch(port, -1)

if len(matches) == 0 {
return nil, fmt.Errorf("port %s is not mapped using an env variable", port)
}

Copilot AI Apr 13, 2026

ExtractPortsFromCompose currently captures the full ${...} expression, so a mapping like ${INSTANCE_PORT:-8080}:80 will yield variable name INSTANCE_PORT:-8080, which is not a valid env var name and will break dynamic port injection. Also, returning an error for port mappings without ${...} will now fail existing non-instanced compose challenges (e.g., _examples/compose-type/docker-compose.yml uses "10020:80"). Consider parsing only the variable name portion (strip :-default / -default modifiers) and either supporting fixed mappings or gating this requirement to instanced/dynamic-port compose deployments with a clearer error message.

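The suggested fix (keep only the variable name, strip the `:-default` / `-default` modifier, and tolerate fixed mappings) can be sketched as below. This is an illustration of the review suggestion, not the PR's code; Compose also supports other substitution forms not handled here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// portVarRegex captures the contents of a ${...} expression in a compose
// port mapping string.
var portVarRegex = regexp.MustCompile(`\$\{([^}]+)}`)

// extractVarName strips the ":-default" and "-default" modifier forms, so
// "INSTANCE_PORT:-8080" yields "INSTANCE_PORT".
func extractVarName(expr string) string {
	if i := strings.Index(expr, ":-"); i >= 0 {
		return expr[:i]
	}
	// POSIX env var names cannot contain '-', so the first '-' starts a
	// "${VAR-default}" modifier.
	if i := strings.Index(expr, "-"); i >= 0 {
		return expr[:i]
	}
	return expr
}

// portVariablesIn returns the env variable names used in one port mapping;
// a fixed mapping like "10020:80" yields no names instead of an error, so
// non-instanced compose challenges keep working.
func portVariablesIn(mapping string) []string {
	var names []string
	for _, m := range portVarRegex.FindAllStringSubmatch(mapping, -1) {
		names = append(names, extractVarName(m[1]))
	}
	return names
}

func main() {
	fmt.Println(portVariablesIn("${INSTANCE_PORT:-8080}:80")) // [INSTANCE_PORT]
	fmt.Println(portVariablesIn("10020:80"))                  // []
}
```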
Comment thread core/manager/instance.go
Comment on lines +388 to +394
server := cfg.Cfg.AvailableServers[serverDeployed]
if deploymentType == core.DEPLOYMENT_TYPES["docker_compose"] {
stagingDir := filepath.Join(core.BEAST_GLOBAL_DIR, core.BEAST_STAGING_DIR)
err := remoteManager.ComposePurgeRemote(challengeName, stagingDir, server)
if err != nil {
return fmt.Errorf("failed to stop compose on remote: %w", err)
}

Copilot AI Apr 13, 2026

killInstanceContainer uses ComposePurgeRemote(challengeName, ...), but instanced compose deployments use a per-instance project name (via projectName := utils.GetProjectName(projectBase) where projectBase is instance-...). This will purge beast-<challengeName> instead of beast-instance-..., leaving remote instance projects/containers orphaned and ports potentially leaked. Pass the instance project base/name through to the remote purge/down call (or add a remote purge helper that accepts an explicit project name).

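The gist of the fix is that the purge command must be built from an explicit project name rather than derived from the challenge name. A minimal sketch of the command construction, assuming the purge ultimately shells out to `docker compose down` (the real code runs this via the remote manager, and the helper name here is hypothetical):

```go
package main

import (
	"fmt"
	"os/exec"
)

// composeDownCmd builds a `docker compose down` invocation with an explicit
// -p project name, so instanced deployments (project "beast-instance-...")
// are purged instead of the shared "beast-<challenge>" project.
func composeDownCmd(projectName, composeFile string) *exec.Cmd {
	return exec.Command("docker", "compose",
		"-p", projectName,
		"-f", composeFile,
		"down", "--volumes", "--remove-orphans")
}

func main() {
	cmd := composeDownCmd("beast-instance-web-sqli-ab12",
		"/tmp/staging/web-sqli/docker-compose.yml")
	fmt.Println(cmd.Args)
}
```

Passing the project name through the call chain (or adding a remote purge helper that accepts one) keeps the shared challenge project untouched when a single instance is killed.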
Comment thread api/instance.go
Comment on lines +35 to +48
func instanceToResponse(instance *cache.Instance) InstanceResponse {
ttl := time.Until(instance.ExpiresAt).Seconds()
if ttl < 0 {
ttl = 0
}

return InstanceResponse{
InstanceID: instance.InstanceID,
ChallengeName: instance.ChallengeName,
Port: instance.Port,
CreatedAt: instance.CreatedAt,
ExpiresAt: instance.ExpiresAt,
TTLSeconds: int64(ttl),
}

Copilot AI Apr 13, 2026

instanceToResponse never populates HostedAddress, so API responses will always return an empty hosted_address even though clients need the server hostname/IP. Use instance.ServerDeployed (and map it to config.Cfg.AvailableServers[name].Host when applicable, similar to challengeInfoHandler) to set HostedAddress.

Select("DISTINCT ON (user_challenges.user_id, user_challenges.challenge_id) user_challenges.user_id, users.username, user_challenges.created_at, challenges.points, false AS is_hint").
Joins("JOIN challenges ON user_challenges.challenge_id = challenges.id").
Joins("JOIN users ON user_challenges.user_id = users.id").
Where("user_challenges.user_id IN ? AND user_challenges.solved = ?", topUserId, true).

Copilot AI Apr 13, 2026

The user_challenges query uses DISTINCT ON (user_id, challenge_id) but no longer specifies an ORDER BY. With DISTINCT ON, PostgreSQL selects an arbitrary row unless ordering is explicit, which can produce incorrect/unstable solve timestamps and time-series scores. Add an ORDER BY user_id, challenge_id, created_at ASC (or DESC, depending on desired semantics) to deterministically pick the intended row before scanning.

Suggested change
Where("user_challenges.user_id IN ? AND user_challenges.solved = ?", topUserId, true).
Where("user_challenges.user_id IN ? AND user_challenges.solved = ?", topUserId, true).
Order("user_challenges.user_id, user_challenges.challenge_id, user_challenges.created_at ASC").

Comment thread core/database/challenges.go Outdated
Comment on lines +131 to +137
[instance_config]
# Port Range for localhost. per-server this is configured via `port-range`
local_host_port_range = '10000:11000'
default_expiration = 300
max_extension = 600
max_instances_per_user = 3


Copilot AI Apr 13, 2026

The [instance_config] section documents local_host_port_range, but the code only supports per-server available_servers.<name>.port_range and does not read local_host_port_range. This mismatch means copying the example config can leave localhost/servers without a usable port range, breaking instance/compose deployment. Align the example and code: either remove this key from docs or add support + fallback logic in config validation.

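The fallback option mentioned at the end of the comment could look like the sketch below: prefer the per-server `port_range`, fall back to a global `local_host_port_range`, and fail validation when neither is set. The function and field names are hypothetical, chosen for illustration.

```go
package main

import "fmt"

// resolvePortRange picks the effective port range for a server: the
// per-server port_range wins, local_host_port_range is the fallback, and
// an empty result is a configuration error.
func resolvePortRange(serverRange, localHostRange string) (string, error) {
	if serverRange != "" {
		return serverRange, nil
	}
	if localHostRange != "" {
		return localHostRange, nil
	}
	return "", fmt.Errorf("no port range configured for instanced deployments")
}

func main() {
	r, err := resolvePortRange("", "10000:11000")
	fmt.Println(r, err) // 10000:11000 <nil>
}
```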
Comment thread core/constants.go Outdated
Comment thread utils/datatypes.go
Comment on lines 46 to +66
  // ParsePortMapping parses the port mapping string and return the required ports
  // If the portMapping string is not valid, this returns an error.
- // The format of the port mapping is `HOST_PORT:CONTAINER_PORT`
+ // The format of the port mapping is `PORT_FIRST:PORT_LAST`
  func ParsePortMapping(portMap string) (uint32, uint32, error) {
- 	ports := strings.Split(portMap, mappingDelimeter)
+ 	ports := strings.Split(portMap, core.MappingDelimeter)

  	if len(ports) != 2 {
  		return 0, 0, errors.New("port mapping string is not valid")
  	}

- 	hostPort, err := strconv.ParseUint(ports[0], 10, 32)
+ 	firstPort, err := strconv.ParseUint(ports[0], 10, 32)
  	if err != nil {
  		return 0, 0, fmt.Errorf("host port is not a valid port in: %s", portMap)
  	}

- 	containerPort, err := strconv.ParseUint(ports[1], 10, 32)
+ 	lastPort, err := strconv.ParseUint(ports[1], 10, 32)
  	if err != nil {
  		return 0, 0, fmt.Errorf("container port is not a valid port in: %s", portMap)
  	}

- 	return uint32(hostPort), uint32(containerPort), nil
+ 	return uint32(firstPort), uint32(lastPort), nil
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParsePortMapping now parses a port range (PORT_FIRST:PORT_LAST), but the error messages still refer to “host port” / “container port”, which is confusing for callers and operators. Update the messages to reference “first port” / “last port” (and ideally validate firstPort <= lastPort here too, so all callers get consistent validation).

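With the review's suggestions applied, the parser could look like the sketch below: error messages that talk about first/last ports, plus a `first <= last` check so every caller gets consistent validation. This is an illustrative version, not the PR's code; the delimiter is hard-coded here instead of using `core.MappingDelimeter`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortRange parses a `PORT_FIRST:PORT_LAST` string and validates that
// the range is well-formed and non-inverted.
func parsePortRange(portMap string) (uint32, uint32, error) {
	parts := strings.Split(portMap, ":")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("port range string is not valid: %s", portMap)
	}
	first, err := strconv.ParseUint(parts[0], 10, 32)
	if err != nil {
		return 0, 0, fmt.Errorf("first port is not a valid port in: %s", portMap)
	}
	last, err := strconv.ParseUint(parts[1], 10, 32)
	if err != nil {
		return 0, 0, fmt.Errorf("last port is not a valid port in: %s", portMap)
	}
	if first > last {
		return 0, 0, fmt.Errorf("first port %d is greater than last port %d", first, last)
	}
	return uint32(first), uint32(last), nil
}

func main() {
	first, last, err := parsePortRange("10000:11000")
	fmt.Println(first, last, err) // 10000 11000 <nil>
}
```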
Comment thread core/manager/pipeline.go Outdated
Comment thread core/manager/pipeline.go Outdated