This document describes the gRPC API for the Joblet system, including service definitions, message formats, authentication, and usage examples.
The Joblet API is built on gRPC and uses Protocol Buffers for message serialization. All communication is secured with mutual TLS authentication and supports role-based authorization.
Server Address: <host>:50051
TLS: Required (mutual authentication)
Client Certificates: Required for all operations
Platform: Linux server required for job execution
All API calls require valid client certificates signed by the same Certificate Authority (CA) as the server.
Client Certificate Subject Format:
CN=<client-name>, OU=<role>, O=<organization>
Supported Roles:
- OU=admin → Full access (all operations)
- OU=viewer → Read-only access (get, list, stream)
certs/
├── ca-cert.pem # Certificate Authority
├── client-cert.pem # Client certificate (admin or viewer)
└── client-key.pem # Client private key
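Since the role is carried in the certificate's OU field, the server-side role lookup amounts to parsing the subject string. A minimal sketch of that parsing, assuming comma-separated RDNs (this is illustrative; Joblet's actual certificate handling may differ):

```python
def role_from_subject(subject: str) -> str:
    """Extract the role (OU field) from a subject like 'CN=alice, OU=viewer, O=example'."""
    fields = dict(part.strip().split("=", 1) for part in subject.split(","))
    role = fields.get("OU")
    if role not in ("admin", "viewer"):
        raise PermissionError(f"unknown role: {role!r}")
    return role

print(role_from_subject("CN=alice, OU=viewer, O=example"))  # viewer
```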
Role | RunJob | GetJobStatus | StopJob | ListJobs | GetJobLogs |
---|---|---|---|---|---|
admin | ✅ | ✅ | ✅ | ✅ | ✅ |
viewer | ❌ | ✅ | ❌ | ✅ | ✅ |
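The permission matrix above can be expressed as a simple role-to-operations lookup. This is a sketch of the authorization check, not Joblet's actual implementation:

```python
# Permission matrix from the table above, keyed by role (OU field).
PERMISSIONS = {
    "admin":  {"RunJob", "GetJobStatus", "StopJob", "ListJobs", "GetJobLogs"},
    "viewer": {"GetJobStatus", "ListJobs", "GetJobLogs"},
}

def authorize(role: str, operation: str) -> None:
    """Raise PermissionError if the role may not perform the operation."""
    if operation not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role} is not allowed to perform {operation}")

authorize("admin", "RunJob")     # admin has full access
authorize("viewer", "ListJobs")  # viewer has read-only access
```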
syntax = "proto3";
package joblet;
service JobletService {
// Create and start a new job
rpc RunJob(RunJobReq) returns (RunJobRes);
// Get job information by ID
rpc GetJobStatus(GetJobStatusReq) returns (GetJobStatusRes);
// Stop a running job
rpc StopJob(StopJobReq) returns (StopJobRes);
// List all jobs
rpc ListJobs(EmptyRequest) returns (Jobs);
// Stream job output in real-time
rpc GetJobLogs(GetJobLogsReq) returns (stream DataChunk);
}
Creates and starts a new job with the specified command and resource limits. Jobs execute on the Linux server with complete process isolation.
Authorization: Admin only
rpc RunJob(RunJobReq) returns (RunJobRes);
Request Parameters:
- command (string): Command to execute (required)
- args (repeated string): Command arguments (optional)
- maxCPU (int32): CPU limit percentage (optional, default: 100)
- maxMemory (int32): Memory limit in MB (optional, default: 512)
- maxIOBPS (int32): I/O bandwidth limit in bytes/sec (optional, default: 0=unlimited)

Job Execution Environment:
Response:
Example:
# CLI
rnx run --max-cpu=50 --max-memory=512 python3 script.py
# Expected Response
Job started:
ID: 1
Command: python3 script.py
Status: INITIALIZING
StartTime: 2024-01-15T10:30:00Z
MaxCPU: 50
MaxMemory: 512
Network: host (shared with system)
Retrieves detailed information about a specific job, including current status, resource usage, and execution metadata.
Authorization: Admin, Viewer
rpc GetJobStatus(GetJobStatusReq) returns (GetJobStatusRes);
Request Parameters:
- id (string): Job ID (required)

Response:
Example:
# CLI
rnx status 1
# Expected Response
Id: 1
Command: python3 script.py
Status: RUNNING
Started At: 2024-01-15T10:30:00Z
Ended At:
MaxCPU: 50
MaxMemory: 512
MaxIOBPS: 0
ExitCode: 0
Terminates a running job using graceful shutdown (SIGTERM) followed by force termination (SIGKILL) if necessary.
Authorization: Admin only
rpc StopJob(StopJobReq) returns (StopJobRes);
Request Parameters:
- id (string): Job ID (required)

Termination Process:

1. SIGTERM sent to the process group
2. SIGKILL sent if the process is still alive
3. Job status set to STOPPED
Response:
Example:
# CLI
rnx stop 1
# Expected Response
Job stopped successfully:
ID: 1
Status: STOPPED
ExitCode: -1
EndTime: 2024-01-15T10:45:00Z
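The graceful-then-forceful termination described above can be sketched with standard process-group signalling. The stop_job helper is hypothetical, assuming (as the docs state) SIGTERM to the process group followed by SIGKILL if the process survives the grace period:

```python
import os
import signal
import subprocess

def stop_job(proc: subprocess.Popen, grace_period: float = 5.0) -> int:
    """Graceful-then-forceful termination of a job's process group."""
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)          # 1. SIGTERM to the process group
    try:
        return proc.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)      # 2. SIGKILL if still alive
        return proc.wait()

# The job runs in its own session so the whole group can be signalled.
job = subprocess.Popen(["sleep", "60"], start_new_session=True)
exit_code = stop_job(job, grace_period=2.0)
print(exit_code)  # negative value: terminated by signal
```

This mirrors the ExitCode: -1 in the example response, where a negative code indicates termination by signal rather than a normal exit.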
Lists all jobs with their current status and metadata. Useful for monitoring overall system activity.
Authorization: Admin, Viewer
rpc ListJobs(EmptyRequest) returns (Jobs);
Request Parameters: None
Response:
Example:
# CLI
rnx list
# Expected Response
1 COMPLETED StartTime: 2024-01-15T10:30:00Z Command: echo hello
2 RUNNING StartTime: 2024-01-15T10:35:00Z Command: python3 script.py
3 FAILED StartTime: 2024-01-15T10:40:00Z Command: invalid-command
Streams job output in real-time, including historical logs and live updates. Supports multiple concurrent clients streaming the same job.
Authorization: Admin, Viewer
rpc GetJobLogs(GetJobLogsReq) returns (stream DataChunk);
Request Parameters:
- id (string): Job ID (required)

Streaming Behavior:
Response:
Stream of DataChunk messages containing raw stdout/stderr output.

Example:
# CLI
rnx log -f 1
# Expected Response (streaming)
Logs for job 1 (Press Ctrl+C to exit if streaming):
Starting script...
Processing item 1
Processing item 2
...
Script completed successfully
Core job representation used across all API responses.
message Job {
string id = 1; // Unique job identifier
string name = 2; // Human-readable job name (from workflows, empty for individual jobs)
string command = 3; // Command being executed
repeated string args = 4; // Command arguments
int32 maxCPU = 5; // CPU limit in percent
string cpuCores = 6; // CPU core binding specification
int32 maxMemory = 7; // Memory limit in MB
int32 maxIOBPS = 8; // IO limit in bytes per second
string status = 9; // Current job status
string startTime = 10; // Start time (RFC3339 format)
string endTime = 11; // End time (RFC3339 format, empty if running)
int32 exitCode = 12; // Process exit code
string scheduledTime = 13; // Scheduled execution time (RFC3339 format)
string runtime = 14; // Runtime specification used
map<string, string> environment = 15; // Regular environment variables (visible)
map<string, string> secret_environment = 16; // Secret environment variables (masked)
}
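Since startTime and endTime are RFC3339 strings, a client can compute a job's wall-clock duration directly. A minimal sketch (the helper name is illustrative, not part of the API):

```python
from datetime import datetime

def job_duration_seconds(start: str, end: str) -> float:
    """Compute duration from RFC3339 timestamps, e.g. '2024-01-15T10:30:00Z'."""
    parse = lambda t: datetime.fromisoformat(t.replace("Z", "+00:00"))
    return (parse(end) - parse(start)).total_seconds()

print(job_duration_seconds("2024-01-15T10:30:00Z", "2024-01-15T10:45:00Z"))  # 900.0
```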
INITIALIZING - Job created, setting up isolation and resources
RUNNING - Process executing in isolated namespace
COMPLETED - Process finished successfully (exit code 0)
FAILED - Process finished with error (exit code != 0)
STOPPED - Process terminated by user request or timeout
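The lifecycle above implies a small state machine: COMPLETED, FAILED, and STOPPED are terminal. A sketch of the allowed transitions, inferred from the status descriptions rather than taken from Joblet's source:

```python
# Allowed status transitions (an assumption inferred from the lifecycle above).
TRANSITIONS = {
    "INITIALIZING": {"RUNNING", "FAILED", "STOPPED"},
    "RUNNING":      {"COMPLETED", "FAILED", "STOPPED"},
    "COMPLETED":    set(),  # terminal
    "FAILED":       set(),  # terminal
    "STOPPED":      set(),  # terminal
}

def is_terminal(status: str) -> bool:
    """A status with no successors is terminal."""
    return not TRANSITIONS[status]

print(is_terminal("RUNNING"))    # False
print(is_terminal("COMPLETED"))  # True
```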
Default values when not specified in the configuration (joblet-config.yml):
DefaultCPULimitPercent = 100 // 100% of one core
DefaultMemoryLimitMB = 512 // 512 MB
DefaultIOBPS = 0 // Unlimited I/O
message RunJobReq {
string command = 1; // Required: command to execute
repeated string args = 2; // Optional: command arguments
int32 maxCPU = 3; // Optional: CPU limit percentage
int32 maxMemory = 4; // Optional: memory limit in MB
int32 maxIOBPS = 5; // Optional: I/O bandwidth limit
}
Used for streaming job output with efficient binary transport.
message DataChunk {
bytes payload = 1; // Raw output data (stdout/stderr merged)
}
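Because DataChunk carries raw bytes with no line framing, a streaming client must buffer across chunk boundaries before splitting lines. A self-contained sketch of that reassembly (iter_lines is illustrative, not part of any Joblet client library):

```python
def iter_lines(chunks):
    """Reassemble text lines from a stream of DataChunk payloads (bytes)."""
    buf = b""
    for payload in chunks:
        buf += payload
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            yield line.decode("utf-8", errors="replace")
    if buf:  # flush any trailing partial line
        yield buf.decode("utf-8", errors="replace")

# A line may span two chunks; the buffer stitches it back together.
stream = [b"Starting ", b"script...\nProcessing ", b"item 1\n"]
print(list(iter_lines(stream)))  # ['Starting script...', 'Processing item 1']
```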
Code | Description | Common Causes |
---|---|---|
UNAUTHENTICATED | Invalid or missing client certificate | Certificate expired, wrong CA |
PERMISSION_DENIED | Insufficient role permissions | Viewer trying admin operation |
NOT_FOUND | Job not found | Invalid job ID |
INTERNAL | Server-side error | Job creation failed, system error |
CANCELED | Operation canceled | Client disconnected during stream |
INVALID_ARGUMENT | Invalid request parameters | Empty command, invalid limits |
{
"code": "NOT_FOUND",
"message": "job not found: 999",
"details": []
}
# Missing certificate
Error: failed to extract client role: no TLS information found
# Wrong role (viewer trying to run job)
Error: role viewer is not allowed to perform operation run_job
# Invalid certificate
Error: certificate verify failed: certificate has expired
# Job not found
Error: job not found: 999
# Job not running (for stop operation)
Error: job is not running: 123 (current status: COMPLETED)
# Command validation failed
Error: invalid command: command contains dangerous characters
# Resource limits exceeded
Error: job creation failed: maxMemory exceeds system limits
# Linux platform required
Error: job execution requires Linux server (current: darwin)
# Cgroup setup failed
Error: cgroup setup failed: permission denied
# Namespace creation failed
Error: failed to create isolated environment: operation not permitted
--server string Server address (default "localhost:50051")
--cert string Client certificate path (default "certs/client-cert.pem")
--key string Client private key path (default "certs/client-key.pem")
--ca string CA certificate path (default "certs/ca-cert.pem")
Create and start a new job with optional resource limits.
rnx run [flags] <command> [args...]
Flags:
--max-cpu int Max CPU percentage (default: from config)
--max-memory int Max memory in MB (default: from config)
--max-iobps int Max I/O bytes per second (default: 0=unlimited)
Examples:
rnx run echo "hello world"
rnx run --max-cpu=50 python3 script.py
rnx run --max-memory=1024 java -jar app.jar
rnx run bash -c "sleep 10 && echo done"
Get detailed information about a job by ID.
rnx status <job-id>
Example:
rnx status 1
List all jobs with their current status.
rnx list
Example:
rnx list
Stop a running job gracefully (SIGTERM) or forcefully (SIGKILL).
rnx stop <job-id>
Example:
rnx stop 1
Stream job output in real-time or view historical logs.
rnx log [flags] <job-id>
Flags:
--follow, -f Follow the log stream (default true)
Examples:
rnx log 1 # View all logs
rnx log -f 1 # Follow live output
rnx log --follow=false 1 # Historical logs only
# Connect to remote Linux server from any platform
rnx --server=prod.example.com:50051 \
--cert=certs/admin-client-cert.pem \
--key=certs/admin-client-key.pem \
run echo "remote execution on Linux"
export JOBLET_SERVER="prod.example.com:50051"
export JOBLET_CERT_PATH="./certs/admin-client-cert.pem"
export JOBLET_KEY_PATH="./certs/admin-client-key.pem"
export JOBLET_CA_PATH="./certs/ca-cert.pem"
rnx run python3 script.py
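A client wrapper resolving connection settings might follow the common precedence of explicit flag, then environment variable, then built-in default. This ordering is an assumption (the docs don't state rnx's exact precedence), and resolve is a hypothetical helper:

```python
import os

def resolve(flag_value, env_name, default):
    """Hypothetical resolution order: explicit flag > environment > default."""
    if flag_value is not None:
        return flag_value
    return os.environ.get(env_name, default)

os.environ["JOBLET_SERVER"] = "prod.example.com:50051"
print(resolve(None, "JOBLET_SERVER", "localhost:50051"))          # prod.example.com:50051
print(resolve("10.0.0.5:50051", "JOBLET_SERVER", "localhost:50051"))  # 10.0.0.5:50051
```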
Resource limits and timeouts are configured in /opt/joblet/joblet-config.yml:
joblet:
defaultCpuLimit: 100 # Default CPU percentage
defaultMemoryLimit: 512 # Default memory in MB
defaultIoLimit: 0 # Default I/O limit (0=unlimited)
maxConcurrentJobs: 100 # Maximum concurrent jobs
jobTimeout: "1h" # Maximum job runtime
cleanupTimeout: "5s" # Resource cleanup timeout
grpc:
maxRecvMsgSize: 524288 # 512KB max receive message
maxSendMsgSize: 4194304 # 4MB max send message
keepAliveTime: "30s" # Connection keep-alive
The server provides detailed logging for:
# Structured logging with fields
DEBUG - Detailed execution flow and debugging info
INFO - Job lifecycle events and normal operations
WARN - Resource limit violations, slow clients, recoverable errors
ERROR - Job failures, system errors, authentication failures
# Example log entry
[2024-01-15T10:30:00Z] [INFO] job started successfully | jobId=1 pid=12345 command="python3 script.py" duration=50ms
# Check server health
rnx list
# Verify certificate and connection
rnx --server=your-server:50051 list
# Monitor service status (systemd)
sudo systemctl status joblet
sudo journalctl -u joblet -f
# Job cgroups are created under:
/sys/fs/cgroup/joblet.slice/
Joblet provides comprehensive workflow orchestration through YAML-defined job dependencies. Workflows enable complex multi-job execution with dependency management, resource isolation, and comprehensive monitoring.
Job dependencies are declared with requires clauses. The API provides dedicated workflow services for orchestration:
service JobService {
// Workflow execution
rpc RunWorkflow(RunWorkflowRequest) returns (RunWorkflowResponse);
rpc GetWorkflowStatus(GetWorkflowStatusRequest) returns (GetWorkflowStatusResponse);
rpc ListWorkflows(ListWorkflowsRequest) returns (ListWorkflowsResponse);
rpc GetWorkflowJobs(GetWorkflowJobsRequest) returns (GetWorkflowJobsResponse);
}
Represents a job within a workflow with dependency information.
message WorkflowJob {
string jobId = 1; // Actual job ID for started jobs, "0" for non-started jobs
string jobName = 2; // Human-readable job name from workflow YAML
string status = 3; // Current job status
repeated string dependencies = 4; // List of job names this job depends on
Timestamp startTime = 5; // Job start time
Timestamp endTime = 6; // Job completion time
int32 exitCode = 7; // Process exit code
}
Job ID Behavior:
- For started jobs, jobId contains the actual job ID assigned by joblet (e.g., "42", "43")
- For non-started jobs, jobId shows "0" to indicate the job hasn't been started yet

Provides comprehensive workflow status with job details.
message GetWorkflowStatusResponse {
WorkflowInfo workflow = 1; // Overall workflow information
repeated WorkflowJob jobs = 2; // Detailed job information with dependencies
}
Workflow jobs have human-readable names derived from YAML job keys:
# workflow.yaml
jobs:
setup-data: # Job name: "setup-data"
command: "python3"
args: ["setup.py"]
process-data: # Job name: "process-data"
command: "python3"
args: ["process.py"]
requires:
- setup-data: "COMPLETED"
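The requires mechanism above amounts to dependency-gated scheduling: a job becomes runnable only when every dependency has reached its required status. A minimal sketch of that readiness check (ready_jobs is illustrative, not Joblet's scheduler):

```python
def ready_jobs(jobs, statuses):
    """jobs: name -> list of (dependency_name, required_status).
    statuses: name -> current status for jobs that have started."""
    ready = []
    for name, deps in jobs.items():
        if statuses.get(name) is not None:
            continue  # already started
        if all(statuses.get(dep) == required for dep, required in deps):
            ready.append(name)
    return ready

# The pipeline from the YAML above: process-data waits on setup-data.
pipeline = {
    "setup-data":   [],
    "process-data": [("setup-data", "COMPLETED")],
}
print(ready_jobs(pipeline, {}))                          # ['setup-data']
print(ready_jobs(pipeline, {"setup-data": "COMPLETED"})) # ['process-data']
```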
Job ID vs Job Name:
Status Display:
JOB ID JOB NAME STATUS EXIT CODE DEPENDENCIES
-----------------------------------------------------------------------------------------
42 setup-data COMPLETED 0 -
43 process-data RUNNING - setup-data
Workflow status commands automatically display job names for better visibility:
# Get workflow status with job names and dependencies
rnx status --workflow 1
# List workflows
rnx list --workflow
# Execute workflow
rnx run --workflow=pipeline.yaml