Joblet: Enterprise Linux Job Execution Platform

Joblet is a comprehensive Linux-native job execution platform designed for enterprise workloads. It leverages Linux namespaces and cgroups v2 to provide robust process isolation, resource management, and secure multi-tenant execution environments without the overhead of containerization.
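
Joblet enforces resource limits with cgroups v2 on the server side. As a rough user-space analogy only (not Joblet's actual mechanism), the effect of a hard memory cap on a job's process can be sketched with POSIX rlimits, which need no root privileges:

```python
import resource
import subprocess
import sys

# Illustrative analogy: cap a child process's address space to 512 MiB,
# then try to allocate 1 GiB inside it. Joblet itself uses cgroups v2
# (memory.max); RLIMIT_AS is just the closest unprivileged stand-in.
LIMIT = 512 * 1024 * 1024

child = subprocess.run(
    [sys.executable, "-c", "x = bytearray(1 << 30)"],  # 1 GiB allocation
    preexec_fn=lambda: resource.setrlimit(resource.RLIMIT_AS, (LIMIT, LIMIT)),
    capture_output=True,
)
print("allocation blocked:", child.returncode != 0)
```

The cgroups v2 approach Joblet uses applies the same idea at the kernel level, to the job's whole process tree rather than a single process.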

Executive Summary

Joblet delivers enterprise-grade job execution capabilities by combining native Linux kernel features with modern orchestration patterns. The platform provides deterministic resource allocation, comprehensive security isolation, and seamless integration with existing infrastructure through a unified gRPC API and intuitive command-line interface.

Core Capabilities

Security Architecture

Management Interfaces

Web Interface

System Monitoring

Workflow Management

Enterprise Use Cases

Continuous Integration and Deployment

# Run jobs with pre-built runtime environments
rnx job run --runtime=python-3.11-ml pytest tests/
rnx job run --runtime=openjdk-21 --upload=pom.xml --upload=src/ mvn clean install

Data Engineering and Analytics Workloads

# Isolated data processing with resource limits
rnx job run --max-memory=8192 --max-cpu=400 \
        --volume=data-lake \
        --runtime=python-3.11-ml \
        python process_big_data.py

# GPU-accelerated data processing
rnx job run --gpu=2 --gpu-memory=16GB \
        --max-memory=16384 \
        --runtime=python-3.11-ml \
        python gpu_analysis.py

Microservices Testing and Validation

# Network-isolated service testing
rnx network create test-env --cidr=10.10.0.0/24
rnx job run --network=test-env --runtime=openjdk-21 ./service-a
rnx job run --network=test-env --runtime=python-3.11-ml ./service-b

Complex Workflow Orchestration

# ml-pipeline.yaml
jobs:
  data-extraction:
    command: "python3"
    args: [ "extract.py" ]
    runtime: "python-3.11-ml"
    resources:
      max_memory: 2048
      max_cpu: 100

  model-training:
    command: "python3"
    args: [ "train.py" ]
    runtime: "python-3.11-ml"
    requires:
      - data-extraction: "COMPLETED"
    resources:
      max_memory: 8192
      max_cpu: 400
      gpu_count: 1
      gpu_memory_mb: 8192

# Execute and monitor workflow with job names
rnx job run --workflow=ml-pipeline.yaml
rnx job status --workflow a1b2c3d4-e5f6-7890-1234-567890abcdef

# View workflow status with original YAML content (available from any workstation)
rnx job status --workflow --detail a1b2c3d4-e5f6-7890-1234-567890abcdef

# Output shows job names, node identification, and dependencies:
# JOB UUID        JOB NAME             NODE ID         STATUS       EXIT CODE  DEPENDENCIES
# -----------------------------------------------------------------------------------------
# f47ac10b-...    data-extraction      8f94c5b2-...    COMPLETED    0          -
# a1b2c3d4-...    model-training       8f94c5b2-...    RUNNING      -          data-extraction     
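
The `requires` clauses in the YAML above define a dependency graph between jobs. How such a pipeline can be ordered is sketched below with Python's standard library (this mirrors the example graph, not Joblet's internal scheduler):

```python
from graphlib import TopologicalSorter

# Dependency graph mirroring ml-pipeline.yaml above:
# model-training requires data-extraction to reach COMPLETED first.
requires = {
    "data-extraction": set(),
    "model-training": {"data-extraction"},
}

# static_order() yields jobs so that every dependency comes before
# the jobs that require it.
order = list(TopologicalSorter(requires).static_order())
print(order)
```

With more jobs, `TopologicalSorter.get_ready()` also exposes which jobs are runnable in parallel at each step, which is the shape a workflow engine needs.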

Site Reliability Engineering Operations

# Resource-bounded health checks with timeout
rnx job run --max-cpu=10 --max-memory=64 \
        --runtime=python-3.11 \
        python health_check.py

# Isolated incident response tooling
rnx job run --network=isolated \
        --volume=incident-logs \
        ./debug-analyzer.sh

Artificial Intelligence and Machine Learning Workloads

# Multi-agent system with isolation
rnx job run --max-memory=4096 --runtime=python-3.11-ml \
        python agent_coordinator.py

# GPU-powered ML agents
rnx job run --gpu=1 --gpu-memory=8GB \
        --max-memory=2048 --runtime=python-3.11-ml \
        --network=agent-net \
        python inference_agent.py

rnx job run --max-memory=1024 --runtime=python-3.11-ml \
        --network=agent-net \
        python monitoring_agent.py

Technical Architecture

Linux Kernel Integration

Security Framework

Scalability and Performance

Documentation Resources

Getting Started

User Guides

Advanced Topics

Reference

Quick Start Example

# Install Joblet Server on Linux (see Installation Guide for details)
# Download from GitHub releases and run installation script

# Run your first job
rnx job run echo "Hello, Joblet!"

# Create a workflow
cat > ml-pipeline.yaml << EOF
jobs:
  analyze:
    command: "python3"
    args: ["analyze.py", "--data", "/data/input.csv"]
    runtime: "python-3.11-ml"
    volumes: ["data-volume"]
EOF

# Execute the workflow
rnx job run --workflow=ml-pipeline.yaml

Command Reference

Job Execution

# Run basic commands
rnx job run echo "Hello World"
rnx job run --runtime=python-3.11-ml python script.py
rnx job run --runtime=openjdk-21 java MyApp

# Resource limits
rnx job run --max-memory=2048 --max-cpu=200 intensive-task

# GPU-accelerated jobs
rnx job run --gpu=1 --gpu-memory=4GB python ml_training.py
rnx job run --gpu=2 --runtime=python-3.11-ml python distributed_inference.py

# Multi-process jobs (see PROCESS_ISOLATION.md for details)
rnx job run --runtime=python-3.11-ml bash -c "sleep 30 & sleep 40 & ps aux"
rnx job run --runtime=python-3.11-ml bash -c "task1 & task2 & wait"
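
The `task1 & task2 & wait` pattern in the last command is plain shell job control: fan out background processes, then join on all of them. The same shape in Python, for readers who prefer it, looks like this (inside a Joblet job, all such children share the job's namespaces and cgroup limits):

```python
import subprocess
import sys

# Start two child processes concurrently (the `&` part), then wait for
# both to finish (the `wait` part) and collect their exit codes.
cmds = [
    [sys.executable, "-c", "print('task1 done')"],
    [sys.executable, "-c", "print('task2 done')"],
]
procs = [subprocess.Popen(cmd) for cmd in cmds]
exit_codes = [p.wait() for p in procs]
print(exit_codes)
```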

Node Identification

# View jobs with node identification for distributed tracking
rnx job list

# Example output showing node IDs:
# UUID                                 NAME         NODE ID                              STATUS
# ------------------------------------  ------------ ------------------------------------ ----------
# f47ac10b-58cc-4372-a567-0e02b2c3d479  setup-data   8f94c5b2-1234-5678-9abc-def012345678 COMPLETED
# a1b2c3d4-e5f6-7890-abcd-ef1234567890  process-data 8f94c5b2-1234-5678-9abc-def012345678 RUNNING

# View detailed job status including node information
rnx job status f47ac10b-58cc-4372-a567-0e02b2c3d479

# Node ID information helps identify which Joblet instance executed each job
# Useful for debugging and tracking in multi-node distributed deployments
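
For multi-node deployments, grouping jobs by node ID is a common first step when debugging. A minimal sketch, parsing the tabular layout shown in the example output above (the column layout is taken from that example and may differ between `rnx` versions, so treat this as an illustration, not a stable interface):

```python
# Sample rows copied from the example `rnx job list` output above.
sample = """\
f47ac10b-58cc-4372-a567-0e02b2c3d479  setup-data   8f94c5b2-1234-5678-9abc-def012345678 COMPLETED
a1b2c3d4-e5f6-7890-abcd-ef1234567890  process-data 8f94c5b2-1234-5678-9abc-def012345678 RUNNING"""

# Group (job name, status) pairs under the node that executed them.
jobs_by_node = {}
for line in sample.splitlines():
    uuid, name, node_id, status = line.split()
    jobs_by_node.setdefault(node_id, []).append((name, status))

for node, jobs in jobs_by_node.items():
    print(node, jobs)
```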

Runtime Management

# List available runtimes (Python, Python ML, Java)
rnx runtime list

# Get runtime information
rnx runtime info python-3.11-ml

# Install runtimes
rnx runtime install python-3.11-ml
rnx runtime install python-3.11
rnx runtime install openjdk-21
rnx runtime install graalvmjdk-21

# Remove runtimes
rnx runtime remove python-3.11-ml

# Test runtime functionality
rnx runtime test openjdk-21

Network & Storage

# Create isolated networks
rnx network create my-network --cidr=10.0.0.0/24

# Create persistent volumes
rnx volume create data-vol --size=10GB

# Use in jobs
rnx job run --network=my-network --volume=data-vol app

Business Value Proposition

DevOps and Platform Teams

Development Teams

Operations Teams

Site Reliability Engineering

Machine Learning and AI Platforms
Getting Started

For detailed installation instructions and initial configuration, please refer to the Quick Start Guide. For production deployment considerations, consult the Deployment Guide.