Workload input and output

What you send into the system, what you get out, and where to send your results.

Input contract (what you provide)

To run a workload, you provide the following. All of these are configured when you create a deployment in the portal (or via the API).

Image URI and tag — The Docker image that runs your job (e.g. from Amazon ECR or another registry). We pull and run this image on the provisioned instance.
Compute profile — Instance type (e.g. g6.xlarge, p5.48xlarge) and optional region preferences. The platform uses this to request Spot capacity and to meter usage for billing.
Environment variables and secrets — Optional key-value pairs or references to secrets. These are injected into the container at runtime.
Startup command — Optional override for the container’s default command (e.g. a training script or entrypoint).
Storage options — Any mounted volumes or paths you need (if supported by the product). Check the portal for available options.

No SSH access

You do not get SSH or direct host access. All interaction is through the Docker image you submit and any outputs your workload writes to a destination you control (e.g. S3).

Output contract (what the platform produces)

The platform produces the following. You can use these for monitoring, debugging, and billing.

Deployment IDs and status — Each deployment has a unique ID. Status (e.g. running, migrated, stopped) is visible in the portal and via API.
Health and migration events — When a Spot interruption is detected, we checkpoint, migrate, and restore. You can see migration events and health state in the app.
Logs — Container logs are available as configured (e.g. streamed or stored). Check the portal for how to access logs for your deployment.
Checkpoints and manifests — During migration, checkpoint data and manifests are written to our internal S3. These are used for restore only; for your own artifacts (model weights, results), use your workload’s output path (see Result destinations).
Metering and billing records — Usage is reported to AWS Marketplace (e.g. GPU hours per instance type). Your AWS bill reflects Spot cost plus our stability fee. See AWS Marketplace — Billing and metering.

Result destinations (where to send your results)

Perpetual Compute runs your Docker container; it does not automatically ship your application’s outputs (e.g. model checkpoints, logs, artifacts) to a location you choose. Your workload should write its results to a destination you control.

S3 (recommended)

Have your container write outputs to an S3 bucket (your own or a shared bucket). For example:

Use the AWS CLI or an SDK inside the container, with credentials provided via environment variables or an IAM role.
Stream checkpoints, logs, or final artifacts to s3://your-bucket/prefix/.

This way, even if the instance is migrated or replaced, your results are already in S3 and available from any other tool or region.
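As a minimal sketch of this pattern in Python, the helper below uploads one local file under a prefix using boto3's S3 client. The helper names (build_key, upload_artifact), the bucket, the prefix, and the file paths are all hypothetical, and boto3 is assumed to be installed in your image; credentials are assumed to come from environment variables or an IAM role, which boto3 resolves on its own.

```python
import posixpath


def build_key(prefix: str, filename: str) -> str:
    """Join an S3 prefix and a file name into an object key (no leading slash)."""
    return posixpath.join(prefix.strip("/"), filename)


def upload_artifact(s3_client, local_path: str, bucket: str, prefix: str) -> str:
    """Upload one local file and return the s3:// URI it was written to."""
    key = build_key(prefix, posixpath.basename(local_path))
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Usage inside the container (hypothetical bucket, prefix, and path):
#   import boto3
#   uri = upload_artifact(boto3.client("s3"), "/workspace/model.pt",
#                         "your-bucket", "prefix")
```

Passing the client in as an argument keeps credential handling outside the helper and makes it easy to exercise with a stub object in tests.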

Other destinations

Your Docker image can also push to other endpoints: another cloud storage service (e.g. GCS, Azure Blob), an API, or a database. Configure credentials and endpoints via environment variables or secrets, and implement the upload logic inside your workload. The platform does not restrict outbound traffic for your container beyond standard security policies.

Design for portability

Designing your workload to write all important state and outputs to an external store (S3 or otherwise) makes it resilient to migration and easy to integrate with the rest of your pipeline.

Reporting custom metrics from your workload

Your container can push arbitrary key-value metrics (e.g. training loss, epoch, custom KPIs) to the platform. These are stored on the deployment and shown in the dashboard and admin UI, which is useful for tracking live progress and debugging without scraping logs.

Endpoint and authentication

Send a POST request to {API_URL}/deployments/{instance_id}/metrics. Include the header X-Instance-Token: <token>. The token is the deployment's metrics token, injected into the instance (e.g. via METRICS_TOKEN in the workload environment or UserData). Only that deployment can push metrics for that instance_id.

Payload

Send a JSON object: keys are metric names, and values are strings, numbers, or booleans. The server adds a received_at timestamp; do not send that key yourself.
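The request described above could be sketched in Python with only the standard library, as follows. The environment variable names API_URL and INSTANCE_ID are assumptions (only METRICS_TOKEN is mentioned above, and only as an example), the Content-Type header is assumed rather than documented, and the function names are hypothetical.

```python
import json
import os
import urllib.request


def build_metrics_request(api_url: str, instance_id: str, token: str,
                          metrics: dict) -> urllib.request.Request:
    """Build the metrics POST request without sending it."""
    url = f"{api_url.rstrip('/')}/deployments/{instance_id}/metrics"
    return urllib.request.Request(
        url,
        data=json.dumps(metrics).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",  # assumed, not documented
            "X-Instance-Token": token,
        },
    )


def push_metrics(metrics: dict, timeout: float = 5.0) -> int:
    """Send metrics and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = build_metrics_request(
        os.environ["API_URL"],        # assumed variable name
        os.environ["INSTANCE_ID"],    # assumed variable name
        os.environ["METRICS_TOKEN"],  # example name from the text above
        metrics,
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In a training loop you would likely wrap push_metrics in a try/except so that a failed metrics call never interrupts the job itself.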

Limits

Keys: Must start with a letter, then only letters, digits, and underscores; max 128 characters. Reserved: received_at.
Count: Up to 50 keys per request (extra keys are ignored).
Values: Strings capped at 2,048 characters; numbers stored as-is (floats rounded to 10 decimal places); booleans stored as "true" / "false".
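To stay within these limits, a workload can clean its metrics before sending them. The sketch below is one possible client-side check, not the server's own validation: it assumes "letters" means ASCII letters, truncates long strings (how the server treats over-length strings is not stated here), and drops anything that is not a string, number, or boolean. The name sanitize_metrics and the constants are hypothetical.

```python
import re

KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")  # assumes ASCII letters
MAX_KEY_LENGTH = 128
MAX_KEYS = 50
MAX_STRING_LENGTH = 2048
RESERVED_KEYS = {"received_at"}


def sanitize_metrics(metrics: dict) -> dict:
    """Return a copy of metrics that fits the documented limits."""
    clean = {}
    for key, value in metrics.items():
        if len(clean) >= MAX_KEYS:
            break  # extra keys would be ignored anyway, so stop here
        if not isinstance(key, str) or key in RESERVED_KEYS:
            continue
        if len(key) > MAX_KEY_LENGTH or not KEY_PATTERN.fullmatch(key):
            continue
        if isinstance(value, str):
            clean[key] = value[:MAX_STRING_LENGTH]  # client-side truncation
        elif isinstance(value, (bool, int, float)):
            clean[key] = value
        # any other type (list, dict, None) is dropped
    return clean
```

Because extra keys are ignored, put the metrics you care about most first in the dictionary; Python dictionaries preserve insertion order.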

Each POST replaces the entire metrics object for that deployment (last write wins).

Example payload:

{"epoch": 5, "loss": 0.42, "lr": 1e-5}
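Because each POST replaces the whole metrics object, a metric that is left out of a later request disappears from the deployment. One way to handle this, sketched below, is to keep a running snapshot in the workload and send the full snapshot every time. The MetricsSnapshot class is hypothetical, and send stands for any callable that POSTs a dict to the metrics endpoint.

```python
class MetricsSnapshot:
    """Keeps the latest value of every metric so each POST carries the full set."""

    def __init__(self, send):
        self._send = send    # callable that takes a dict and POSTs it
        self._current = {}

    def update(self, **changes) -> dict:
        """Merge changes into the snapshot, then send the whole snapshot."""
        self._current.update(changes)
        payload = dict(self._current)  # copy so later updates do not alter it
        self._send(payload)
        return payload


# Usage with a stand-in sender that just prints the payload:
#   snapshot = MetricsSnapshot(send=print)
#   snapshot.update(epoch=5, loss=0.42)
#   snapshot.update(loss=0.40)   # second payload still includes epoch
```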

Stored metrics appear in the portal under the deployment (e.g. "Workload metrics" in the clusters or deployments table).

Summary

You provide: image, compute profile, env, command, and optional storage. The platform provides: deployment identity, status, events, logs, and billing. You are responsible for sending results (models, artifacts, logs) to your own S3 or other destination from inside your Docker workload.
