The Retire-Cluster REST API exposes HTTP endpoints for managing and monitoring a distributed computing cluster: cluster status, devices, tasks, and configuration.
Install with API support:
pip install retire-cluster[api]
# Start with default settings
retire-cluster-api
# Start with custom configuration
retire-cluster-api --host 0.0.0.0 --port 8081 --auth --api-key your-secret-key
# Connect to specific cluster node
retire-cluster-api --cluster-host 192.168.1.100 --cluster-port 8080
# Check API health
curl http://localhost:8081/health
# Get cluster status
curl http://localhost:8081/api/v1/cluster/status
# List devices
curl http://localhost:8081/api/v1/devices
# Submit a task
curl -X POST http://localhost:8081/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{"task_type": "echo", "payload": {"message": "Hello World"}}'
http://localhost:8081/api/v1
The API supports optional API key authentication:
# Include API key in header
curl -H "X-API-Key: your-secret-key" http://localhost:8081/api/v1/cluster/config
# Or use Authorization header
curl -H "Authorization: Bearer your-secret-key" http://localhost:8081/api/v1/cluster/config
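The two header forms above can be produced from one helper. This is a minimal sketch: the function name `auth_headers` and its `scheme` argument are hypothetical and not part of Retire-Cluster.

```python
def auth_headers(api_key, scheme="x-api-key"):
    """Return request headers carrying the API key in one of the two documented forms.

    `scheme` is a hypothetical switch for this sketch:
    "x-api-key" -> X-API-Key header, "bearer" -> Authorization: Bearer header.
    """
    if not api_key:
        # Authentication is optional, so no key means no extra headers.
        return {}
    if scheme == "x-api-key":
        return {"X-API-Key": api_key}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unknown scheme: {scheme!r}")


print(auth_headers("your-secret-key"))
print(auth_headers("your-secret-key", scheme="bearer"))
```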
All API responses follow a consistent format:
{
"status": "success|error",
"data": {},
"message": "Optional message",
"timestamp": "2023-01-01T12:00:00Z",
"request_id": "uuid"
}
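A client may want to turn this envelope into a typed object. The sketch below assumes the fields shown above and that `timestamp` is an ISO 8601 UTC string ending in `Z`; the `Envelope` class and `parse_envelope` function are illustrative names, not part of the API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class Envelope:
    status: str
    data: Any
    message: Optional[str]
    timestamp: datetime
    request_id: str


def parse_envelope(body):
    """Convert a decoded JSON response body (a dict) into an Envelope."""
    # Replace the trailing "Z" so datetime.fromisoformat accepts it on older Pythons.
    timestamp = datetime.fromisoformat(body["timestamp"].replace("Z", "+00:00"))
    return Envelope(
        status=body["status"],
        data=body.get("data"),
        message=body.get("message"),  # documented as optional
        timestamp=timestamp,
        request_id=body["request_id"],
    )


envelope = parse_envelope({
    "status": "success",
    "data": {},
    "timestamp": "2023-01-01T12:00:00Z",
    "request_id": "uuid",
})
print(envelope.status, envelope.timestamp.isoformat())
```

The `/health` example later in this document has no `request_id` and uses `"healthy"` as its status, so this helper is intended for the `/api/v1` endpoints only.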
GET /api/v1/cluster/status
Returns comprehensive cluster statistics including device count, health percentage, and resource totals.
Response:
{
"status": "success",
"data": {
"cluster_stats": {
"total_devices": 5,
"online_devices": 4,
"offline_devices": 1,
"health_percentage": 80.0,
"total_resources": {
"cpu_cores": 32,
"memory_gb": 128,
"storage_gb": 2000
},
"by_role": {
"compute": 2,
"mobile": 2,
"storage": 1
},
"by_platform": {
"linux": 2,
"android": 2,
"windows": 1
}
},
"server_info": {
"version": "1.0.0",
"uptime": "2d 4h 30m",
"host": "0.0.0.0",
"port": 8080
}
}
}
GET /health
Simple health check endpoint for monitoring.
Response:
{
"status": "healthy",
"timestamp": "2023-01-01T12:00:00Z",
"components": {
"api": "healthy",
"cluster_server": "healthy",
"task_scheduler": "healthy"
}
}
GET /api/v1/cluster/metrics
Detailed performance and utilization metrics.
GET /api/v1/cluster/config
Requires Authentication
Returns cluster configuration settings.
GET /api/v1/devices?page=1&page_size=20&status=online&role=compute
Query Parameters:
page (int): Page number (default: 1)
page_size (int): Items per page (default: 20, max: 100)
status (string): Filter by status (online, offline, all)
role (string): Filter by role (worker, compute, storage, mobile)
platform (string): Filter by platform (linux, windows, android, darwin)
tags (array): Filter by tags
Response:
{
"status": "success",
"data": [
{
"device_id": "laptop-001",
"role": "compute",
"platform": "linux",
"status": "online",
"ip_address": "192.168.1.101",
"last_heartbeat": "2023-01-01T12:00:00Z",
"uptime": "2h 30m",
"capabilities": {
"cpu_count": 8,
"memory_total_gb": 16,
"storage_total_gb": 500,
"has_gpu": true
},
"tags": ["development", "gpu-capable"]
}
],
"pagination": {
"page": 1,
"page_size": 20,
"total_items": 5,
"total_pages": 1,
"has_next": false,
"has_previous": false
}
}
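To walk every page of the device list, a client can follow the `has_next` flag in the `pagination` object. The sketch below is transport-agnostic: `fetch_page` is a hypothetical callable that performs the `GET /api/v1/devices` request and returns the decoded JSON body.

```python
def iter_devices(fetch_page, page_size=20, **filters):
    """Yield every device across all pages.

    fetch_page(page=..., page_size=..., **filters) is assumed to return the
    decoded response body shown above (with "data" and "pagination" keys).
    """
    page = 1
    while True:
        body = fetch_page(page=page, page_size=page_size, **filters)
        yield from body.get("data", [])
        if not body.get("pagination", {}).get("has_next"):
            break
        page += 1


# Example with an in-memory stand-in for the HTTP call.
_pages = {
    1: {"data": [{"device_id": "laptop-001"}], "pagination": {"has_next": True}},
    2: {"data": [{"device_id": "phone-002"}], "pagination": {"has_next": False}},
}


def _fake_fetch(page, page_size, **filters):
    return _pages[page]


print([d["device_id"] for d in iter_devices(_fake_fetch, status="online")])
```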
GET /api/v1/devices/{device_id}
Returns detailed information about a specific device.
GET /api/v1/devices/{device_id}/status
Get current status and health of a device.
POST /api/v1/devices/{device_id}/ping
Requires Authentication
Test connectivity to a specific device.
DELETE /api/v1/devices/{device_id}
Requires Authentication
Remove a device from the cluster.
GET /api/v1/devices/summary
Get summary statistics of all devices.
POST /api/v1/tasks
Content-Type: application/json
Request Body:
{
"task_type": "echo",
"payload": {
"message": "Hello World"
},
"priority": "normal",
"requirements": {
"min_cpu_cores": 2,
"min_memory_gb": 4,
"required_platform": "linux",
"timeout_seconds": 300
},
"metadata": {
"created_by": "api_user"
}
}
Task Types:
echo: Returns payload as-is
sleep: Sleep for specified duration
system_info: Returns device capabilities
python_eval: Evaluate Python expression
http_request: Make HTTP requests
command: Execute shell commands
Priority Levels:
low, normal, high, urgent
Requirements:
min_cpu_cores (int): Minimum CPU cores
min_memory_gb (float): Minimum memory in GB
min_storage_gb (float): Minimum storage in GB
required_platform (string): Required platform
required_role (string): Required device role
required_tags (array): Required device tags
gpu_required (bool): GPU required
internet_required (bool): Internet access required
timeout_seconds (int): Maximum execution time
max_retries (int): Maximum retry attempts
Response:
{
"status": "success",
"data": {
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"task": {
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"task_type": "echo",
"status": "queued",
"priority": "normal",
"created_at": "2023-01-01T12:00:00Z"
}
}
}
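Before posting a task, a client can assemble the request body and catch obvious mistakes locally. This is a sketch only: `build_task` is a hypothetical helper, and the checks mirror the priority levels and requirement names listed in this section rather than the server's actual validation, which remains authoritative.

```python
PRIORITIES = ("low", "normal", "high", "urgent")
REQUIREMENT_KEYS = {
    "min_cpu_cores", "min_memory_gb", "min_storage_gb",
    "required_platform", "required_role", "required_tags",
    "gpu_required", "internet_required",
    "timeout_seconds", "max_retries",
}


def build_task(task_type, payload, priority="normal", requirements=None, metadata=None):
    """Return a dict suitable as the JSON body of POST /api/v1/tasks."""
    if priority not in PRIORITIES:
        raise ValueError(f"priority must be one of {PRIORITIES}, got {priority!r}")
    unknown = set(requirements or {}) - REQUIREMENT_KEYS
    if unknown:
        raise ValueError(f"unknown requirement keys: {sorted(unknown)}")
    body = {"task_type": task_type, "payload": payload, "priority": priority}
    if requirements:
        body["requirements"] = dict(requirements)
    if metadata:
        body["metadata"] = dict(metadata)
    return body


print(build_task("echo", {"message": "Hello World"},
                 requirements={"min_cpu_cores": 2, "timeout_seconds": 300}))
```

The task type is deliberately not checked here, since the set of supported types is reported by the cluster itself.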
GET /api/v1/tasks?status=running&page=1&page_size=20
Query Parameters:
page (int): Page number
page_size (int): Items per page
status (string): Filter by status
task_type (string): Filter by task type
priority (string): Filter by priority
device_id (string): Filter by assigned device
Task Statuses:
pending: Task created but not queued
queued: Waiting for available device
assigned: Assigned to specific device
running: Currently executing
success: Completed successfully
failed: Failed with error
cancelled: Cancelled by user
timeout: Exceeded timeout limit
GET /api/v1/tasks/{task_id}
Returns complete task information including payload, requirements, and results.
GET /api/v1/tasks/{task_id}/status
Get current status and execution progress.
GET /api/v1/tasks/{task_id}/result
Get execution result for completed tasks.
Response:
{
"status": "success",
"data": {
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "success",
"result_data": {
"echo": {"message": "Hello World"}
},
"execution_time_seconds": 0.05,
"worker_device_id": "laptop-001",
"started_at": "2023-01-01T12:00:00Z",
"completed_at": "2023-01-01T12:00:00Z"
}
}
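A result is only available once a task has finished, so clients typically poll the status endpoint first. The sketch below assumes that `success`, `failed`, `cancelled`, and `timeout` are the terminal statuses (an inference from the status list, not something the API states), and that `get_status` is a hypothetical callable returning the task's current status string.

```python
import time

# Assumed terminal statuses; the remaining ones (pending, queued, assigned,
# running) are treated as "still in progress".
TERMINAL_STATUSES = {"success", "failed", "cancelled", "timeout"}


def wait_for_task(get_status, task_id, interval=1.0, max_wait=60.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(task_id) until a terminal status is seen or max_wait elapses."""
    deadline = clock() + max_wait
    while True:
        status = get_status(task_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {max_wait}s")
        sleep(interval)


# Example with a scripted sequence instead of real HTTP calls.
_sequence = iter(["queued", "running", "success"])
print(wait_for_task(lambda task_id: next(_sequence), "demo-task", sleep=lambda s: None))
```

Injecting `sleep` and `clock` keeps the loop easy to test without real delays.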
POST /api/v1/tasks/{task_id}/cancel
Requires Authentication
Cancel a running or queued task.
POST /api/v1/tasks/{task_id}/retry
Requires Authentication
Retry a failed task.
GET /api/v1/tasks/statistics
Get task execution statistics and performance metrics.
GET /api/v1/tasks/types
Get list of supported task types across all devices.
GET /api/v1/config
Requires Authentication
Get complete system configuration.
GET /api/v1/config/server
Requires Authentication
PUT /api/v1/config/server
Content-Type: application/json
Requires Authentication
{
"max_connections": 100
}
GET /api/v1/config/heartbeat
Requires Authentication
PUT /api/v1/config/heartbeat
Content-Type: application/json
Requires Authentication
{
"interval": 60,
"timeout": 300,
"max_missed": 3
}
POST /api/v1/config/reset
Requires Authentication
Reset all configuration to defaults.
{
"status": "error",
"message": "Error description",
"error_code": "ERROR_CODE",
"error_details": {
"field": "field_name",
"reason": "validation_error"
},
"timestamp": "2023-01-01T12:00:00Z",
"request_id": "uuid"
}
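On the client side, this error body maps naturally onto an exception. The following is a minimal sketch; `ClusterAPIError` and `raise_for_error` are hypothetical names, and the field names are taken from the example above.

```python
class ClusterAPIError(Exception):
    """Carries the fields of an error response body."""

    def __init__(self, message, error_code=None, details=None, request_id=None):
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}
        self.request_id = request_id


def raise_for_error(body):
    """Return the body unchanged unless its status is "error"."""
    if body.get("status") != "error":
        return body
    raise ClusterAPIError(
        body.get("message", "unknown error"),
        error_code=body.get("error_code"),
        details=body.get("error_details"),
        request_id=body.get("request_id"),
    )


try:
    raise_for_error({
        "status": "error",
        "message": "Error description",
        "error_code": "VALIDATION_ERROR",
        "error_details": {"field": "field_name", "reason": "validation_error"},
        "request_id": "uuid",
    })
except ClusterAPIError as exc:
    print(exc.error_code, exc.details.get("field"), exc)
```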
VALIDATION_ERROR: Request validation failed
NOT_FOUND: Resource not found
DEVICE_NOT_FOUND: Device not found
TASK_NOT_FOUND: Task not found
UNAUTHORIZED: Authentication required
RATE_LIMIT_EXCEEDED: Too many requests
INTERNAL_ERROR: Server error
200: Success
201: Created
400: Bad Request
401: Unauthorized
404: Not Found
429: Too Many Requests
500: Internal Server Error
503: Service Unavailable
Default rate limits:
Rate limit headers:
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1640995200
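A client that receives a 429 response can use these headers to decide how long to wait. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this document does not confirm; `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds remain until the rate-limit window resets.

    `headers` is a plain mapping of response headers. Returns 0.0 when the
    header is missing or the reset time has already passed.
    """
    now = time.time() if now is None else now
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return 0.0
    # Assumption: the value is epoch seconds.
    return max(0.0, float(reset) - now)


example_headers = {"X-RateLimit-Remaining": "59", "X-RateLimit-Reset": "1640995200"}
print(seconds_until_reset(example_headers, now=1640995170))
```

The lookup is case-sensitive on a plain dict; the headers object returned by `requests` is case-insensitive and can be passed directly.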
Real-time updates will be available via WebSocket:
const ws = new WebSocket('ws://localhost:8081/ws');
ws.onmessage = (event) => {
const update = JSON.parse(event.data);
console.log('Cluster update:', update);
};
import requests


class RetireClusterAPI:
    def __init__(self, base_url, api_key=None):
        self.base_url = base_url
        self.headers = {'Content-Type': 'application/json'}
        if api_key:
            self.headers['X-API-Key'] = api_key

    def get_cluster_status(self):
        response = requests.get(
            f"{self.base_url}/cluster/status",
            headers=self.headers
        )
        return response.json()

    def submit_task(self, task_type, payload, **kwargs):
        data = {
            'task_type': task_type,
            'payload': payload,
            **kwargs
        }
        response = requests.post(
            f"{self.base_url}/tasks",
            json=data,
            headers=self.headers
        )
        return response.json()


# Usage
api = RetireClusterAPI('http://localhost:8081/api/v1')
status = api.get_cluster_status()
task = api.submit_task('echo', {'message': 'Hello'})
class RetireClusterAPI {
  constructor(baseUrl, apiKey) {
    this.baseUrl = baseUrl;
    this.headers = {'Content-Type': 'application/json'};
    if (apiKey) {
      this.headers['X-API-Key'] = apiKey;
    }
  }

  async getClusterStatus() {
    const response = await fetch(`${this.baseUrl}/cluster/status`, {
      headers: this.headers
    });
    return response.json();
  }

  async submitTask(taskType, payload, options = {}) {
    const data = {
      task_type: taskType,
      payload: payload,
      ...options
    };
    const response = await fetch(`${this.baseUrl}/tasks`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify(data)
    });
    return response.json();
  }
}

// Usage
const api = new RetireClusterAPI('http://localhost:8081/api/v1');
const status = await api.getClusterStatus();
const task = await api.submitTask('echo', {message: 'Hello'});
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:8081 retire_cluster.api.wsgi:app
FROM python:3.11-slim
COPY . /app
WORKDIR /app
RUN pip install retire-cluster[api]
EXPOSE 8081
CMD ["retire-cluster-api", "--host", "0.0.0.0", "--port", "8081"]
export RETIRE_CLUSTER_API_HOST=0.0.0.0
export RETIRE_CLUSTER_API_PORT=8081
export RETIRE_CLUSTER_API_KEY=your-secret-key
export RETIRE_CLUSTER_CLUSTER_HOST=localhost
export RETIRE_CLUSTER_CLUSTER_PORT=8080
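Client scripts and deployment tooling can read the same variables. The sketch below is illustrative: `APISettings` and `load_settings` are hypothetical names, and the fallback values simply reuse the example values above rather than documented defaults.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class APISettings:
    host: str
    port: int
    api_key: Optional[str]
    cluster_host: str
    cluster_port: int


def load_settings(env=None):
    """Build APISettings from an environment mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return APISettings(
        # Fallbacks below are assumptions taken from the example exports.
        host=env.get("RETIRE_CLUSTER_API_HOST", "0.0.0.0"),
        port=int(env.get("RETIRE_CLUSTER_API_PORT", "8081")),
        api_key=env.get("RETIRE_CLUSTER_API_KEY"),
        cluster_host=env.get("RETIRE_CLUSTER_CLUSTER_HOST", "localhost"),
        cluster_port=int(env.get("RETIRE_CLUSTER_CLUSTER_PORT", "8080")),
    )


print(load_settings({"RETIRE_CLUSTER_API_PORT": "9000"}))
```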
# HELP retire_cluster_devices_total Total number of registered devices
# TYPE retire_cluster_devices_total gauge
retire_cluster_devices_total{status="online"} 4
# HELP retire_cluster_tasks_total Total number of tasks
# TYPE retire_cluster_tasks_total counter
retire_cluster_tasks_total{status="success"} 1250
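The sample lines above look like the Prometheus text exposition format. For quick scripting without a Prometheus server, they can be parsed with a small function. This is a deliberately simplified sketch (no escaped quotes in label values, no timestamps); `parse_samples` is a hypothetical helper.

```python
import re

# name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match["labels"] or ""))
            samples.append((match["name"], labels, float(match["value"])))
    return samples


metrics_text = """\
# HELP retire_cluster_devices_total Total number of registered devices
# TYPE retire_cluster_devices_total gauge
retire_cluster_devices_total{status="online"} 4
retire_cluster_tasks_total{status="success"} 1250
"""
for sample in parse_samples(metrics_text):
    print(sample)
```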
{
"timestamp": "2023-01-01T12:00:00Z",
"level": "INFO",
"logger": "api.requests",
"message": "Request processed",
"request_id": "uuid",
"method": "GET",
"path": "/api/v1/cluster/status",
"status_code": 200,
"duration_ms": 45.2
}
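Structured log entries like this one are easy to filter programmatically. The sketch below assumes the server writes one JSON object per line (the example above is shown expanded for readability); `slow_requests` and its threshold are hypothetical.

```python
import json


def slow_requests(lines, threshold_ms=100.0):
    """Return (path, duration_ms) for log records slower than threshold_ms."""
    slow = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON lines such as startup banners
        if not isinstance(record, dict):
            continue
        duration = record.get("duration_ms")
        if isinstance(duration, (int, float)) and duration > threshold_ms:
            slow.append((record.get("path"), float(duration)))
    return slow


log_lines = [
    '{"level": "INFO", "path": "/api/v1/cluster/status", "status_code": 200, "duration_ms": 45.2}',
    '{"level": "INFO", "path": "/api/v1/tasks", "status_code": 201, "duration_ms": 250}',
]
print(slow_requests(log_lines))
```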
API server won’t start
pip install retire-cluster[api]
netstat -an | grep 8081
Authentication errors
X-API-Key: your-key
--auth
Task submission fails
GET /api/v1/tasks/types
Rate limit exceeded
retire-cluster-api --debug
Enables detailed logging and error traces.