Topograph accepts its configuration file path using the -c command-line parameter. The configuration file is a YAML document. A sample configuration file is located at config/topograph-config.yaml.
The configuration file supports the following parameters:
# serving topograph endpoint
http:
  # port: specifies the port on which the API server will listen (required).
  port: 49021
  # ssl: enables HTTPS protocol if set to `true` (optional).
  ssl: false
# provider: the provider that topograph will use (optional)
# Valid options include "aws", "oci", "gcp", "nebius", "nscale", "netq", "dra", "infiniband-k8s", "infiniband-bm" or "test".
# Can be overridden if the provider is specified in a topology request to topograph
provider: test
# engine: the engine that topograph will use (optional)
# Valid options include "slurm", "k8s", "slinky", or "graph".
# Can be overridden if the engine is specified in a topology request to topograph
engine: slurm
# requestAggregationDelay: defines the delay before processing a request (required).
# Topograph aggregates multiple sequential requests within this delay into a single request,
# processing only if no new requests arrive during the specified duration.
requestAggregationDelay: 15s
# pageSize: sets the page size for topology requests against a CSP API (optional).
pageSize: 100
# ssl: specifies the paths to the TLS certificate, private key,
# and CA certificate (required if `http.ssl=true`).
ssl:
  cert: /etc/topograph/ssl/server-cert.pem
  key: /etc/topograph/ssl/server-key.pem
  ca_cert: /etc/topograph/ssl/ca-cert.pem
# credentialsPath: specifies the path to a YAML file containing API credentials (optional).
# When using credentials in Kubernetes-based engines ("k8s" or "slinky"),
# the secret file must be named `credentials.yaml`. For example:
# `kubectl create secret generic <secret-name> --from-file=credentials.yaml=<path to credentials>`
# For more details about credential configuration, refer to the docs/providers section.
# credentialsPath:
# env: environment variable names and values to inject into Topograph's shell (optional).
# The `PATH` variable, if provided, will append the specified value to the existing `PATH`.
# env:
#   SLURM_CONF: /etc/slurm/slurm.conf
#   PATH:

Topograph exposes three endpoints for interacting with the service. Below are the details of each endpoint:
- URL: GET http://<server>:<port>/healthz
- Description: This endpoint verifies the service status. It returns a "200 OK" HTTP response if the service is reachable.

- URL: POST http://<server>:<port>/v1/generate
- Description: This endpoint is used to request a new cluster topology.
- Payload: The request body is a JSON object organized into three top-level sections:
  - provider: (optional) Selects the topology source and provides any provider-specific authentication or parameters.
    - name: (optional) A string specifying the service provider, such as `aws`, `oci`, `gcp`, `nebius`, `nscale`, `netq`, `dra`, `infiniband-k8s`, `infiniband-bm`, or `test`. This parameter overrides the provider set in the Topograph config.
    - creds: (optional) A key-value map with provider-specific parameters for authentication.
    - params: (optional) A key-value map with provider-specific parameters. The `test` provider uses these parameters for response simulation; for complete behavior and examples, see Test Mode and Test Provider.
  - engine: (optional) Selects the topology output and provides any engine-specific parameters.
    - name: (optional) A string specifying the topology output: `slurm`, `k8s`, `slinky`, or `graph`. This parameter overrides the engine set in the Topograph config.
    - params: (optional) A key-value map with engine-specific parameters.
      - plugin: (optional) Used in: [`slurm`, `slinky`]. A string specifying the cluster-wide topology plugin: `topology/tree` or `topology/block`. For `slurm`, this defaults to `topology/tree` when neither `plugin` nor `topologies` is set. Do not set `plugin` together with `topologies`.
      - blockSizes: (optional) Used in: [`slurm`, `slinky`]. An array of block sizes for `topology/block`.
      - topologyConfigPath: Used in: [`slurm`, `slinky`, `graph`]. Optional for `slurm` and `graph`; required for `slinky`. For `slurm`, a file path for the topology configuration; if omitted, the topology config content is returned in the HTTP response. For `slinky`, the key for the topology config in the ConfigMap. For `graph`, an existing path on the Topograph host where instance JSON should be written; if omitted, the JSON is returned in the topology response.
      - topologies: (optional) Used in: [`slurm`, `slinky`]. A map of named per-partition topology settings. Do not set top-level `plugin` together with `topologies`.
        - plugin: Used in: [`slurm`, `slinky`]. A required string specifying the per-partition topology plugin: `topology/tree`, `topology/block`, or `topology/flat`.
        - blockSizes: (optional) Used in: [`slurm`, `slinky`]. An array of block sizes for `topology/block`.
        - nodes: (optional) Used in: [`slurm`, `slinky`]. An explicit list of SLURM nodes for this topology. If omitted, Topograph can discover membership from `podSelector` (`slinky` only) or `partition`.
        - partition: (optional) Used in: [`slurm`, `slinky`]. A SLURM partition name used to discover nodes with `scontrol show partition` when `nodes` is not set. For `slinky`, this fallback is used only when the topology entry does not set `podSelector`.
        - podSelector: (optional) Used in: [`slinky`]. A Kubernetes label selector for slurmd pods in this partition. `nodes` and `podSelector` are mutually exclusive on the same topology entry.
        - clusterDefault: (optional) Used in: [`slurm`, `slinky`]. If `true`, marks this topology as the default for nodes not assigned to another topology; commonly used with `plugin: topology/flat`.
      - reconfigure: (optional) Used in: [`slurm`]. If `true`, invokes `scontrol reconfigure` after the topology config is generated. Default: `false`.
      - namespace: Used in: [`slinky`]. The required namespace where the SLURM cluster is running.
      - podSelector: Used in: [`slinky`]. A required Kubernetes label selector for pods running SLURM nodes.
      - nodeSelector: (optional) Used in: [`k8s`, `slinky`]. A Kubernetes node label map that filters which nodes participate in topology generation.
      - topologyConfigmapName: Used in: [`slinky`]. The required name of the ConfigMap containing the topology config.
      - useDynamicNodes: (optional) Used in: [`slinky`]. If `true`, Kubernetes nodes matched by `nodeSelector` are annotated with the topology spec.
      - configUpdateMode: (optional) Used in: [`slinky`]. By default, the full topology YAML is written to the Slurm ConfigMap. `skeleton-only` writes switches or blocks only (no node lines); `none` skips updating the topology key in the ConfigMap.
  - nodes: (optional) Supplies the cluster nodes used for topology generation as an array of regions mapping instance IDs to node names.
Example:
{
  "provider": {
    "name": "aws",
    "creds": {
      "accessKeyId": "id",
      "secretAccessKey": "secret"
    }
  },
  "engine": {
    "name": "slurm",
    "params": {
      "plugin": "topology/block",
      "blockSizes": [30, 120]
    }
  },
  "nodes": [
    {
      "region": "region1",
      "instances": {
        "instance1": "node1",
        "instance2": "node2",
        "instance3": "node3"
      }
    },
    {
      "region": "region2",
      "instances": {
        "instance4": "node4",
        "instance5": "node5",
        "instance6": "node6"
      }
    }
  ]
}
- Response: This endpoint immediately returns a "202 Accepted" status with a unique request ID if the request is valid. If not, it returns an appropriate error code.
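
Several of the `engine.params` rules in the payload description are easy to violate by accident, so it can help to check a payload on the client side before posting it. The following is a minimal sketch of such a check, not part of Topograph. The function name `check_engine_params` and the wording of the messages are hypothetical, and it covers only the constraints stated in this section (the real server may enforce more).

```python
from typing import Any

# Plugin values listed in this section for the cluster-wide and per-partition settings.
TOP_LEVEL_PLUGINS = {"topology/tree", "topology/block"}
PARTITION_PLUGINS = TOP_LEVEL_PLUGINS | {"topology/flat"}
# Parameters this section marks as required for the "slinky" engine.
SLINKY_REQUIRED = ("namespace", "podSelector", "topologyConfigmapName", "topologyConfigPath")


def check_engine_params(engine: str, params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means no documented rule was violated."""
    problems: list[str] = []

    # Top-level plugin and per-partition topologies must not be combined.
    if "plugin" in params and "topologies" in params:
        problems.append("plugin and topologies must not be set together")
    if "plugin" in params and params["plugin"] not in TOP_LEVEL_PLUGINS:
        problems.append(f"unsupported plugin: {params['plugin']}")

    if engine == "slinky":
        for key in SLINKY_REQUIRED:
            if key not in params:
                problems.append(f"slinky requires {key}")

    for name, topo in params.get("topologies", {}).items():
        # Each per-partition entry needs its own plugin.
        if topo.get("plugin") not in PARTITION_PLUGINS:
            problems.append(f"topology {name}: missing or unsupported plugin")
        if "nodes" in topo and "podSelector" in topo:
            problems.append(f"topology {name}: nodes and podSelector are mutually exclusive")

    return problems


# The engine section of the example payload passes the check.
print(check_engine_params("slurm", {"plugin": "topology/block", "blockSizes": [30, 120]}))  # prints []
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.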

- URL: GET http://<server>:<port>/v1/topology
- Description: This endpoint retrieves the result of a topology request.
- URL Query Parameters:
  - uid: Specifies the request ID returned by the topology request endpoint.
- Response: Depending on the request's execution stage, this endpoint can return:
  - "200 OK" - The request has completed successfully.
  - "202 Accepted" - The request is still in progress and has not completed yet.
  - "404 Not Found" - The specified request ID does not exist.
  - Other error responses encountered by Topograph during request execution.
Example usage:
id=$(curl -s -X POST -H "Content-Type: application/json" -d @payload.json http://localhost:49021/v1/generate)
curl -s "http://localhost:49021/v1/topology?uid=$id"
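
The two `curl` calls above fetch the result only once, but the topology endpoint returns "202 Accepted" until processing finishes, and processing does not start until `requestAggregationDelay` has elapsed. The sketch below shows one way to submit a request and poll until a final status arrives. It is illustrative only: the names `generate_and_wait` and `http_transport`, the polling interval, and the attempt limit are hypothetical, and it assumes, as the `curl` example does, that the body of the `/v1/generate` response is the request ID.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Optional, Tuple

# (method, url, body) -> (HTTP status, response text); replaceable for testing.
Transport = Callable[[str, str, Optional[bytes]], Tuple[int, str]]


def http_transport(method: str, url: str, body: Optional[bytes]) -> Tuple[int, str]:
    """Send one HTTP request with urllib and return (status, body text)."""
    req = urllib.request.Request(url, data=body, method=method)
    if body is not None:
        req.add_header("Content-Type", "application/json")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; convert back to a status/body pair.
        return err.code, err.read().decode()


def generate_and_wait(
    base_url: str,
    payload: dict,
    transport: Transport = http_transport,
    interval: float = 5.0,
    attempts: int = 60,
) -> str:
    """POST the payload to /v1/generate, then poll /v1/topology until it completes."""
    status, body = transport("POST", f"{base_url}/v1/generate", json.dumps(payload).encode())
    if status != 202:
        raise RuntimeError(f"generate request rejected: {status} {body}")

    # Assumption: the response body is the request ID, as in the curl example.
    request_id = body.strip()
    url = f"{base_url}/v1/topology?uid={urllib.parse.quote(request_id)}"

    for _ in range(attempts):
        status, body = transport("GET", url, None)
        if status == 200:
            return body  # request completed successfully
        if status != 202:
            # 404 (unknown request ID) or another error reported by Topograph
            raise RuntimeError(f"topology request failed: {status} {body}")
        time.sleep(interval)  # still in progress; wait before polling again
    raise TimeoutError(f"request {request_id} did not complete after {attempts} polls")
```

With a server listening on the sample port, a call such as `generate_and_wait("http://localhost:49021", payload)` would return the topology response body. Passing the transport as a parameter keeps the polling logic testable without a running server.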