joblet

Security Guide

Comprehensive security guide for Joblet deployment, covering mTLS authentication, authorization, isolation, and best practices.

Table of Contents

Security Overview

Joblet implements multi-layered security:

┌─────────────────────────────────────────┐
│           Transport Layer               │
│         mTLS + Certificate Auth         │
├─────────────────────────────────────────┤
│         Authorization Layer             │
│         RBAC (admin/viewer)             │
├─────────────────────────────────────────┤
│         Process Isolation               │
│    Namespaces + cgroups + chroot        │
├─────────────────────────────────────────┤
│         Network Isolation               │
│       Custom networks + traffic         │
├─────────────────────────────────────────┤
│         Filesystem Isolation            │
│      Per-job workspaces + volumes       │
└─────────────────────────────────────────┘

Key Security Features

mTLS Authentication

Certificate-Based Security

Joblet uses mutual TLS (mTLS) for secure communication:

# Generate CA and certificates
export JOBLET_SERVER_ADDRESS='192.168.1.100'
sudo /usr/local/bin/certs_gen_embedded.sh

This creates:

Certificate Structure

Certificate Authority (CA)
├── Server Certificate (CN=joblet)
│   ├── Used by Joblet daemon
│   └── Validates server identity
└── Client Certificates
    ├── Admin Certificate (OU=admin)
    │   ├── Full access to all operations
    │   └── Can run, stop, manage jobs
    └── Viewer Certificate (OU=viewer)
        ├── Read-only access
        └── Can list, view status, logs

Manual Certificate Generation

# 1. Create CA
openssl genrsa -out ca-key.pem 4096
openssl req -new -x509 -key ca-key.pem -out ca-cert.pem -days 3650 \
  -subj "/CN=Joblet CA"

# 2. Server certificate (CN must be "joblet")
openssl genrsa -out server-key.pem 4096
openssl req -new -key server-key.pem -out server.csr \
  -subj "/CN=joblet"
openssl x509 -req -in server.csr -CA ca-cert.pem -CAkey ca-key.pem \
  -out server-cert.pem -days 365 -CAcreateserial \
  -extensions v3_req -extfile <(echo "[v3_req]
subjectAltName = DNS:localhost,DNS:joblet,IP:127.0.0.1,IP:${SERVER_IP}")

# 3. Admin client certificate
openssl genrsa -out admin-key.pem 4096
openssl req -new -key admin-key.pem -out admin.csr \
  -subj "/CN=admin-client/OU=admin"
openssl x509 -req -in admin.csr -CA ca-cert.pem -CAkey ca-key.pem \
  -out admin-cert.pem -days 365 -CAcreateserial

# 4. Viewer client certificate
openssl genrsa -out viewer-key.pem 4096
openssl req -new -key viewer-key.pem -out viewer.csr \
  -subj "/CN=viewer-client/OU=viewer"
openssl x509 -req -in viewer.csr -CA ca-cert.pem -CAkey ca-key.pem \
  -out viewer-cert.pem -days 365 -CAcreateserial

Certificate Rotation

# 1. Generate new certificates
sudo /usr/local/bin/certs_gen_embedded.sh

# 2. Update server configuration
sudo systemctl restart joblet

# 3. Distribute new client certificates
scp /opt/joblet/config/rnx-config.yml client:~/.rnx/

# 4. Verify new certificates
rnx job list  # Should work with new certs

TLS Configuration

# Server configuration (joblet-config.yml)
security:
  # TLS settings
  tls:
    enabled: true
    min_version: "1.3"
    cipher_suites:
      - TLS_AES_256_GCM_SHA384
      - TLS_CHACHA20_POLY1305_SHA256

  # Certificate verification
  require_client_cert: true
  verify_client_cert: true

  # Certificate embedded in config
  serverCert: |
    -----BEGIN CERTIFICATE-----
    ...
  serverKey: |
    -----BEGIN PRIVATE KEY-----
    ...
  caCert: |
    -----BEGIN CERTIFICATE-----
    ...

Authorization and RBAC

Role-Based Access Control

Joblet uses certificate Organization Unit (OU) for roles:

Role OU Value Permissions
Admin admin Full access: run, stop, list, logs, volumes, networks
Viewer viewer Read-only: list, status, logs (but cannot run/stop jobs)

Admin Role (OU=admin)

# Full access operations
rnx job run echo "Admin can run jobs"
rnx job stop <job-id>
rnx volume create admin-vol --size=1GB
rnx network create admin-net --cidr=10.1.0.0/24

# All monitoring operations
rnx monitor
rnx job list
rnx job status <job-id>
rnx job log <job-id>

Viewer Role (OU=viewer)

# Read-only operations (allowed)
rnx job list
rnx job status <job-id>
rnx job log <job-id>
rnx monitor status

# Write operations (denied)
rnx job run echo "test"          # ERROR: Permission denied
rnx job stop <job-id>            # ERROR: Permission denied
rnx volume create test       # ERROR: Permission denied

Multi-User Setup

# Generate certificates for different users
# DevOps team (admin access)
openssl req -new -key devops-key.pem -out devops.csr \
  -subj "/CN=devops-team/OU=admin"

# Developers (viewer access)
openssl req -new -key dev-key.pem -out dev.csr \
  -subj "/CN=developer/OU=viewer"

# QA team (viewer access)
openssl req -new -key qa-key.pem -out qa.csr \
  -subj "/CN=qa-team/OU=viewer"

# Sign all certificates with CA
for cert in devops dev qa; do
  openssl x509 -req -in ${cert}.csr -CA ca-cert.pem -CAkey ca-key.pem \
    -out ${cert}-cert.pem -days 365 -CAcreateserial
done

Fine-Grained Permissions (Future)

Current RBAC is binary (admin/viewer). For more granular control:

# Conceptual fine-grained permissions
authorization:
  roles:
    admin:
      - job:*
      - volume:*
      - network:*
      - monitor:*

    developer:
      - job:run
      - job:list
      - job:status
      - job:logs
      - volume:list

    qa:
      - job:list
      - job:status
      - job:logs
      - monitor:status

Process Isolation

Service-Based Isolation Architecture

Joblet implements automatic isolation based on which API service initiates jobs:

# Production Jobs (JobService API) - Minimal Chroot
rnx job run echo "Hello World"           # Uses minimal chroot isolation
rnx job run --runtime=java:21 java App  # Secure runtime mounting

# Runtime Build Jobs (RuntimeService API) - Builder Chroot  
rnx runtime install java:21         # Uses builder chroot with host OS access

Isolation Routing:

JobService API          RuntimeService API
     │                       │
     ▼                       ▼
JobType: "standard"     JobType: "runtime-build"
     │                       │
     ▼                       ▼
Minimal Chroot          Builder Chroot
- Production isolation  - Controlled host access
- Secure runtime mounts - Runtime building tools
- No package managers   - Temporary modifications

Dual Chroot System

Production Jobs (Minimal Chroot)

# Minimal filesystem access
rnx job run ls /                    # Limited directories
rnx job run which apt              # Command not found
rnx job run ls /opt/joblet         # No access to joblet internals

Runtime Builds (Builder Chroot)

# Controlled host OS access (ONLY during runtime building)
# - Full host filesystem (read-only)
# - Package managers available 
# - /opt/joblet/runtimes writable
# - Automatic cleanup creates isolated runtime structure

Linux Namespaces

Both job types use identical namespace isolation:

# PID namespace - process isolation
rnx job run ps aux  # Only sees job processes

# Mount namespace - filesystem isolation
rnx job run mount  # Shows only job-specific mounts

# Network namespace - network isolation
rnx job run ip addr show  # Shows only job network interface

# IPC namespace - inter-process communication isolation
rnx job run ipcs  # No shared memory/semaphores from host

# UTS namespace - hostname isolation
rnx job run hostname  # Job-specific hostname

# Cgroup namespace - resource visibility
rnx job run cat /proc/cgroups  # Limited cgroup view

Runtime Isolation Security

Joblet prevents host filesystem exposure through runtime cleanup:

# BEFORE cleanup (INSECURE): Runtime mounts point to host OS paths
# runtime.yml contained:
# - source: "usr/lib/jvm/java-21-openjdk-amd64"  # HOST PATH!

# AFTER cleanup (SECURE): Runtime mounts point to isolated copies  
# runtime.yml contains:
# - source: "isolated/usr/lib/jvm/java-21-openjdk-amd64"  # ISOLATED COPY

# Production jobs using runtimes are completely isolated
rnx job run --runtime=java:21 find /usr -type f | head -5
# Only shows isolated runtime files, not host OS files

Runtime Directory Structure:

/opt/joblet/runtimes/java/openjdk-21/
├── isolated/                    # Self-contained runtime files
│   ├── usr/lib/jvm/            # Copied from host during build
│   ├── usr/bin/                # Runtime binaries (isolated)  
│   └── etc/ssl/certs/          # Certificates (isolated)
├── runtime.yml                 # Uses isolated/ paths only
└── runtime.yml.original        # Backup for audit

Security Context

# Jobs run as unprivileged user
rnx job run id
# Output: uid=65534(nobody) gid=65534(nogroup)

# No sudo/setuid capabilities
rnx job run sudo echo "test"  # Command not found

# Limited filesystem access
rnx job run ls /root  # Permission denied
rnx job run ls /etc/shadow  # Permission denied

Resource Limits (Security)

# Prevent fork bombs
rnx job run --max-cpu=100 :(){ :|:& };:  # Limited by CPU quota

# Prevent memory exhaustion
rnx job run --max-memory=512 bash -c 'a=(); while true; do a+=($a); done'  # Killed by OOM

# Prevent I/O attacks
rnx job run --max-iobps=1048576 dd if=/dev/zero of=/work/attack bs=1M  # Limited bandwidth

Network Security

Network Isolation

# Create isolated networks for different security zones
rnx network create dmz --cidr=10.1.0.0/24           # Public-facing
rnx network create internal --cidr=10.2.0.0/24      # Internal services
rnx network create secure --cidr=10.3.0.0/24        # Sensitive data

# Jobs in different networks cannot communicate
rnx job run --network=dmz ping 10.2.0.1        # Will fail
rnx job run --network=internal ping 10.3.0.1   # Will fail

Zero-Trust Network Model

# No network access for sensitive processing
rnx job run --network=none --volume=sensitive-data \
  python process_classified.py

# Limited network access
rnx job run --network=internal --volume=app-data \
  python internal_service.py

# Full internet access (carefully controlled)
rnx job run --network=bridge \
  curl https://api.trusted-service.com

Traffic Control

# Limit bandwidth to prevent data exfiltration
rnx job run --max-iobps=1048576 --network=bridge \
  curl https://malicious-site.com  # Limited to 1MB/s

# Monitor network usage
rnx job run --network=monitored iftop

Data Security

Sensitive Data Handling

# 1. Create encrypted volume
rnx volume create encrypted-data --size=10GB

# 2. Encrypt data before storage
rnx job run --volume=encrypted-data bash -c '
  echo "sensitive information" | \
  openssl enc -aes-256-cbc -k "$ENCRYPTION_KEY" \
  > /volumes/encrypted-data/secret.enc
'

# 3. Decrypt only when needed
rnx job run --volume=encrypted-data --env=ENCRYPTION_KEY=xxx bash -c '
  openssl enc -aes-256-cbc -d -k "$ENCRYPTION_KEY" \
  < /volumes/encrypted-data/secret.enc
'

Secrets Management

# Avoid embedding secrets in commands (BAD)
rnx job run curl -H "Authorization: Bearer secret123" api.com

# Use environment variables (BETTER)
rnx job run --env=API_TOKEN=secret123 \
  curl -H "Authorization: Bearer \$API_TOKEN" api.com

# Use volume-based secrets (BEST)
echo "secret123" | rnx job run --volume=secrets bash -c '
  cat > /volumes/secrets/api-token
  chmod 600 /volumes/secrets/api-token
'

rnx job run --volume=secrets bash -c '
  API_TOKEN=$(cat /volumes/secrets/api-token)
  curl -H "Authorization: Bearer $API_TOKEN" api.com
'

Data Classification

# Public data - no restrictions
rnx job run --network=bridge --volume=public-data \
  wget https://public-dataset.com/data.csv

# Internal data - network restrictions
rnx job run --network=internal --volume=internal-data \
  python process_internal.py

# Confidential data - maximum isolation
rnx job run --network=none --volume=confidential-data \
  python process_confidential.py

# Secret data - encrypted storage
rnx job run --network=none --volume=encrypted-secrets \
  --env=DECRYPT_KEY=xxx \
  python process_secrets.py

Hardening Practices

Server Hardening

# 1. Minimal server installation
sudo apt install --no-install-recommends joblet

# 2. Disable unnecessary services
sudo systemctl disable apache2 nginx mysql

# 3. Configure firewall
sudo ufw allow 50051/tcp  # Joblet port only
sudo ufw enable

# 4. Regular updates
sudo apt update && sudo apt upgrade

# 5. Secure SSH
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
ChallengeResponseAuthentication no

Configuration Hardening

# Secure server configuration
server:
  tls:
    enabled: true
    min_version: "1.3"

joblet:
  # Validate all commands
  validateCommands: true
  allowedCommands:
    - python3
    - node
    - bash
    - sh

  # Conservative limits
  maxConcurrentJobs: 50
  jobTimeout: "1h"
  defaultMemoryLimit: 1024

security:
  require_client_cert: true
  verify_client_cert: true
  enable_rbac: true

  # Comprehensive auditing
  audit:
    enabled: true
    log_all_operations: true
    log_failed_auth: true

filesystem:
  enable_chroot: true
  readonly_rootfs: true

process:
  default_user: "nobody"
  allow_setuid: false

File Permissions

# Secure configuration files
sudo chmod 600 /opt/joblet/config/joblet-config.yml
sudo chown root:root /opt/joblet/config/joblet-config.yml

# Secure certificates
sudo chmod 600 /opt/joblet/certs/*.pem
sudo chown root:root /opt/joblet/certs/*.pem

# Secure log files
sudo chmod 640 /var/log/joblet/*.log
sudo chown root:joblet /var/log/joblet/*.log

Monitoring and Auditing

Security Logging

# Enable comprehensive logging
logging:
  level: "info"
  format: "json"

  # Security-focused logging
  outputs:
    - type: "file"
      path: "/var/log/joblet/security.log"
      filter: "security"
    - type: "syslog"
      facility: "auth"

security:
  audit:
    enabled: true
    log_file: "/var/log/joblet/audit.log"
    log_successful_auth: true
    log_failed_auth: true
    log_job_operations: true
    log_admin_operations: true

Security Monitoring

# Monitor failed authentication attempts
sudo tail -f /var/log/joblet/audit.log | grep "auth_failed"

# Monitor admin operations
sudo tail -f /var/log/joblet/audit.log | grep "admin_operation"

# Monitor unusual job patterns
sudo tail -f /var/log/joblet/audit.log | jq 'select(.job_count > 100)'

# Monitor resource usage spikes
rnx monitor --json | jq 'select(.cpu_usage > 90 or .memory_usage > 90)'

Alerting Setup

# Create monitoring script
cat > security_monitor.sh << 'EOF'
#!/bin/bash
# Monitor for security events

# Check for multiple failed auth attempts
FAILED_AUTH=$(grep -c "auth_failed" /var/log/joblet/audit.log | tail -100)
if [ $FAILED_AUTH -gt 10 ]; then
  echo "ALERT: Multiple authentication failures detected"
fi

# Check for unusual job patterns
RUNNING_JOBS=$(rnx job list --json | jq '[.[] | select(.status == "RUNNING")] | length')
if [ $RUNNING_JOBS -gt 50 ]; then
  echo "ALERT: Unusual number of running jobs: $RUNNING_JOBS"
fi

# Check for resource exhaustion
CPU_USAGE=$(rnx monitor status --json | jq .cpu_usage)
if [ $(echo "$CPU_USAGE > 95" | bc) -eq 1 ]; then
  echo "ALERT: High CPU usage: $CPU_USAGE%"
fi
EOF

# Schedule monitoring
echo "*/5 * * * * /opt/joblet/scripts/security_monitor.sh" | sudo crontab -

Security Compliance

SOC 2 Compliance

# Configuration for SOC 2
security:
  audit:
    enabled: true
    log_all_operations: true
    retention_days: 2555  # 7 years

  access_control:
    require_mfa: true
    session_timeout: "8h"

logging:
  immutable_logs: true
  log_integrity_check: true

HIPAA Compliance

# HIPAA-compliant setup
# 1. Encrypted storage
rnx volume create phi-data --size=100GB --type=filesystem

# 2. Encrypted transit (already provided by mTLS)

# 3. Access logging
# (Already provided by audit logging)

# 4. Data minimization
rnx job run --network=none --volume=phi-data \
  python anonymize_phi.py

# 5. Secure disposal
rnx volume remove phi-data  # Secure deletion

PCI DSS Compliance

# PCI DSS network segmentation
rnx network create pci-zone --cidr=10.100.0.0/24

# Restricted processing environment
rnx job run \
  --network=pci-zone \
  --volume=pci-secure \
  --max-memory=2048 \
  --env=PCI_MODE=true \
  python process_payments.py

Incident Response

Security Incident Detection

# Automated threat detection
cat > threat_detection.sh << 'EOF'
#!/bin/bash

# Detect privilege escalation attempts
if grep -q "setuid\|sudo\|su " /var/log/joblet/audit.log; then
  echo "THREAT: Privilege escalation attempt detected"
fi

# Detect network scanning
if rnx job list --json | jq '.[] | select(.command | contains("nmap"))' | grep -q .; then
  echo "THREAT: Network scanning detected"
fi

# Detect data exfiltration patterns
if rnx job list --json | jq '.[] | select(.command | contains("curl") and .max_iobps == 0)' | grep -q .; then
  echo "THREAT: Potential data exfiltration (unlimited bandwidth)"
fi
EOF

Incident Response Procedures

# 1. Immediate response
# Stop suspicious jobs
rnx job list --json | jq -r '.[] | select(.status == "RUNNING" and (.command | contains("suspicious"))) | .id' | xargs rnx job stop

# 2. Isolate affected networks
rnx network delete compromised-network

# 3. Preserve evidence
sudo cp -r /var/log/joblet/ /var/incident-evidence/$(date +%Y%m%d-%H%M%S)

# 4. Reset certificates
sudo /usr/local/bin/certs_gen_embedded.sh
sudo systemctl restart joblet

# 5. Audit all access
sudo grep "auth_success" /var/log/joblet/audit.log | tail -1000

Best Practices Summary

✅ Do’s

  1. Always use mTLS - Never disable certificate verification
  2. Implement RBAC - Use viewer certificates for read-only access
  3. Network isolation - Use custom networks for sensitive workloads
  4. Resource limits - Always set appropriate CPU/memory limits
  5. Audit logging - Enable comprehensive security logging
  6. Regular updates - Keep Joblet and system updated
  7. Certificate rotation - Rotate certificates regularly
  8. Principle of least privilege - Use --network=none when possible

❌ Don’ts

  1. Don’t embed secrets in job commands
  2. Don’t use host network for untrusted workloads
  3. Don’t disable TLS in production
  4. Don’t use unlimited resources for untrusted jobs
  5. Don’t ignore audit logs - Monitor for anomalies
  6. Don’t share admin certificates - Use viewer certificates for most users
  7. Don’t run Joblet as non-root - It needs privileges for isolation
  8. Don’t trust user input - Validate and sanitize all inputs

See Also