Kestra Integration | Deploy on Shakudo

Kestra Knowledge Base

Kestra Overview

Kestra is an open-source workflow orchestration platform that lets teams automate data pipelines, ETL jobs, AI workflows, and infrastructure tasks using simple YAML-based definitions. Think of it as the coordination layer: it runs your scripts, calls your APIs, moves your data, and responds to events, all without custom glue code.

In Shakudo environments, Kestra is deployed as a managed stack component. It sits between your data sources, compute services, and downstream systems, giving your team a single place to author, schedule, and monitor all automated workflows.

What Problem Does Kestra Solve?

Without an orchestrator, teams write ad-hoc cron jobs, one-off scripts, and manual pipelines that are hard to monitor, retry, or hand off. Kestra replaces that fragmentation with a single platform where every workflow is versioned, observable, and recoverable.

Replaces cron jobs and manual scripts with tracked, retryable workflows
Gives a UI to monitor all executions, logs, and errors in one place
Supports both scheduled (time-based) and event-driven (trigger-based) execution
Integrates with 1,000+ tools via plugins — databases, cloud services, APIs, ML frameworks

How Kestra Fits in the Shakudo Stack

Kestra typically sits in the orchestration layer, either alongside Airflow or as an alternative to it. It connects to:

MinIO: reads and writes files to object storage as part of pipeline steps
PostgreSQL / Supabase: stores workflow state and runs SQL tasks
External APIs and services: any HTTP endpoint or plugin-supported tool
Containers and scripts: runs Python, Bash, or containerized tasks in isolated environments
LLM services (via LiteLLM or Ollama): orchestrates AI-enrichment steps in data pipelines

Key Concepts

Flow: the core unit in Kestra. A flow is a YAML file that defines tasks, triggers, and execution order.
Task: a single unit of work inside a flow (run a script, call an API, query a database).
Trigger: what starts a flow. Triggers can be time-based (schedule) or event-based (webhook, file arrival).
Namespace: a folder-like grouping for flows. Used to separate environments (dev, prod) or teams.
Execution: a single run of a flow. Executions have logs, inputs, outputs, and status.
Plugin: an extension that adds new task types. Plugins exist for AWS, GCP, Postgres, HTTP, Python, and hundreds more.

What Kestra Is Not

Not a data transformation engine (use dbt or Pandas inside Kestra tasks for that).
Not a feature store or model registry (use MLflow alongside Kestra for ML lifecycle management).
Not a streaming platform (use Kafka or Flink for real-time streams; Kestra handles batch and near-real-time events).

Deployment Runbook

Helm-based deployment of Kestra v1.3.2 on a Shakudo-managed Kubernetes cluster. Uses the shared cluster MinIO for storage and a bundled PostgreSQL instance for state.

What Has Worked in Practice

Deploy from a local kubeconfig with full namespace admin access — CI pipeline service accounts fail due to cross-namespace secret restrictions
Use the Shakudo monorepo Helm chart (branch kestra_upgrade_v1.3.2) not the upstream open-source chart
Use shared cluster MinIO (hyperplane-minio namespace) — do not deploy a per-Kestra MinIO instance
Set the basicAuth username in email format (<user>@<customer-domain>); plain usernames are rejected in v1.3.x
Include both datasources.default and datasources.postgres in values.yaml — Micronaut 4 requires both or silently fails

Required Inputs

Confirm before starting:

Local kubeconfig with full namespace admin access to hyperplane-kestra
GitHub PAT for cloning the Shakudo monorepo (password auth no longer works)
MinIO root credentials (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD)
Chosen kestra-<env> bucket name and service account key
Database password for the bundled PostgreSQL
basicAuth password: at least 8 characters, including an uppercase letter, a lowercase letter, and a digit
Customer domain for the basicAuth username (e.g. <user>@<customer-domain>)
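The two basicAuth rules above can be checked before anything is written to values.yaml. The following is a minimal sketch in POSIX sh that encodes only the rules stated in this runbook; the function names are hypothetical helpers, not part of Kestra or Helm, and Kestra's own validation may be stricter.

```shell
# Hypothetical pre-flight helpers; they encode only the rules listed in this runbook.

# Succeeds when the username looks like an email address: one "@", a dot in the domain, no spaces.
is_email_username() {
  case "$1" in
    *[[:space:]]* | *@*@*) return 1 ;;
    ?*@?*.?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the password has 8+ characters with an uppercase letter, a lowercase letter, and a digit.
is_strong_password() {
  [ "${#1}" -ge 8 ] || return 1
  case "$1" in *[[:upper:]]*) ;; *) return 1 ;; esac
  case "$1" in *[[:lower:]]*) ;; *) return 1 ;; esac
  case "$1" in *[[:digit:]]*) ;; *) return 1 ;; esac
  return 0
}
```

Both functions report their result through the exit status, so they can gate a deploy script, for example: is_strong_password "$KESTRA_PASSWORD" || exit 1 (KESTRA_PASSWORD being whatever variable holds the chosen password).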

Step 1 — Clone the Helm Chart

Step 2 — Pull Chart Dependencies

    helm dependency update .
    ls charts/   # expect: postgresql-*.tgz, minio-*.tgz

Step 3 — Create MinIO Bucket

Step 4 — Configure values.yaml

Critical sections — do not omit any of these:

Step 5 — Deploy

    helm upgrade kestra . \
      --install \
      --namespace hyperplane-kestra \
      --values values.yaml \
      --timeout 10m \
      --wait

Step 6 — Verify Deployment Health

Expected pod state:

kestra-postgres-0 — 2/2 Running
kestra-standalone-* — 2/2 Running
kestra-post-install-* — 1/2 Completed (normal — Istio sidecar stays up after init job finishes)
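The expected state above can be checked mechanically rather than by eye. The sketch below assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) and treats any status other than Running or Completed as unhealthy; unhealthy_pods is a hypothetical helper name, not a kubectl feature.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints "<pod-name> <status>" for every pod that is neither Running nor Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Usage (assumes kubectl access to the namespace):
#   kubectl get pods -n hyperplane-kestra --no-headers | unhealthy_pods
```

An empty result means every pod is in one of the two expected states. The helper does not inspect the READY column, so the 2/2 readiness check still has to be confirmed separately.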

Step 7 — ConfigMap Patch (config changes without full redeploy)

Use this pattern for credential updates, AI Copilot addition, or endpoint changes:


📌 Always quote the heredoc delimiter, <<'YAML' (not <<YAML), when writing YAML heredocs in a terminal. With the unquoted form the shell interprets the body, which can leave you stuck at a dquote> prompt when the YAML contains double-quoted strings.
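The difference between the two heredoc forms is easy to see in isolation. In this small sketch, GREETING and the two render_* functions are made-up names for illustration: with an unquoted delimiter the shell expands variables and command substitutions inside the body, while a quoted delimiter passes the body through literally.

```shell
# Illustrative only: GREETING and render_* are made-up names.
GREETING="expanded"

# Unquoted delimiter: the shell expands $GREETING inside the body.
render_unquoted() {
  cat <<YAML
message: "$GREETING"
YAML
}

# Quoted delimiter: the body is passed through exactly as written.
render_quoted() {
  cat <<'YAML'
message: "$GREETING"
YAML
}
```

For ConfigMap content the quoted form is usually the one you want, since the YAML reaches kubectl exactly as typed.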

Safe Rollback

    # View history first
    helm history kestra -n hyperplane-kestra

    # Roll back to the previous Helm release
    helm rollback kestra -n hyperplane-kestra

Post-Deployment Checklist

All pods Running or Completed (no CrashLoopBackOff)
Pod readiness 2/2 (Istio sidecar + app)
Zero Warning events in namespace
MinIO health returns 200 OK
Kestra API responds on /api/v1/flows
Login works with configured credentials
Test flow created and executed end-to-end
ConfigMap has all three required sections: kestra.*, datasources.default, datasources.postgres

Administration & Best Practices

This page covers how to keep a Kestra deployment stable, organized, and secure in a production Shakudo environment.

Workflow Organisation (Namespaces)

Namespaces in Kestra are like folders. Use them to separate environments, teams, or workflow domains:

production: live, scheduled workflows
staging: test new flows before promoting to production
dev: individual developer workflows
data-team, ops, ai: domain-based grouping within production

Use lowercase with hyphens (e.g. production.data-ingestion). Avoid deep nesting.
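The naming convention can be enforced in CI before flows are pushed. This is a sketch of the convention described here, not of Kestra's own namespace validation, which may accept more characters; is_valid_namespace and the default depth limit of 3 are assumptions chosen for illustration.

```shell
# Hypothetical check for this page's convention: lowercase letters, digits, and hyphens,
# dot-separated, with at most MAX_DEPTH segments (default 3, an arbitrary choice).
# The function body runs in a subshell so LC_ALL=C does not leak into the caller.
is_valid_namespace() (
  LC_ALL=C
  ns="$1"
  max_depth="${2:-3}"
  case "$ns" in
    "" | .* | *. | *..*) exit 1 ;;   # empty, leading or trailing dot, or empty segment
    *[!a-z0-9.-]*) exit 1 ;;         # any character outside the convention
  esac
  dots=$(printf '%s' "$ns" | tr -cd '.' | wc -c)
  [ $((dots + 1)) -le "$max_depth" ]
)
```

For example, is_valid_namespace production.data-ingestion succeeds, while a name with uppercase letters or four dot-separated segments fails.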

Version Control (Git Integration)

Store flow YAML files in a Git repository and sync them to Kestra for change history, peer review, and rollback.

Keep all flow YAML files under a flows/ directory in the monorepo or a dedicated flows repo
Use a CI pipeline to push flows to Kestra via the API on merge to main
Tag each flow with a version comment in the YAML for audit purposes

    # Push a flow via API
    curl -X POST http://localhost:8080/api/v1/flows \
      -H "Content-Type: application/yaml" \
      --data-binary @my-flow.yaml
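For the CI step, the single curl call can be wrapped in a loop over the flows/ directory. The sketch below simply repeats the POST request shown in this section for each file; push_flows is a hypothetical helper, and the base URL and directory are passed in as arguments. It assumes the flows do not exist yet, since updating an existing flow may need a different API call, and any authentication options your instance requires would have to be added to the curl invocation.

```shell
# Hypothetical CI helper: POST every *.yaml / *.yml file in DIR to the flows endpoint.
# Returns non-zero if any push fails, so the pipeline step fails too.
push_flows() {
  base_url="$1"
  dir="$2"
  failed=0
  for file in "$dir"/*.yaml "$dir"/*.yml; do
    [ -f "$file" ] || continue   # skip glob patterns that matched nothing
    if ! curl --fail --silent --show-error -X POST "$base_url/api/v1/flows" \
         -H "Content-Type: application/yaml" \
         --data-binary "@$file" >/dev/null; then
      echo "failed to push $file" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A pipeline would call it as push_flows http://localhost:8080 flows, substituting the address at which the Kestra API is reachable from the CI runner.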

Retry Policies and Error Handling

Always define retry behaviour on tasks that call external services (APIs, databases, SFTP):

    tasks:
      - id: call-api
        type: io.kestra.plugin.core.http.Request
        uri: https://api.example.com/data
        retry:
          type: exponential
          maxAttempts: 3
          multiplier: 2.0
          maxDuration: PT10M

For flow-level error handling, use an errors block to run cleanup or notification tasks:

    errors:
      - id: notify-failure
        type: io.kestra.plugin.core.log.Log
        message: "Flow {{flow.id}} failed on execution {{execution.id}}"

Scaling Workers

Increase worker thread count in values.yaml for higher throughput
For high volume, move to a distributed deployment (separate webserver + workers) — contact the Shakudo team
Stagger cron expressions to avoid hundreds of flows starting at the same second
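One simple way to stagger hourly flows is to derive each flow's minute from its position in the list. The sketch below is only an illustration of that idea: hourly_cron_slot is a hypothetical helper, and it spreads flows evenly across the hour using standard five-field cron syntax.

```shell
# Hypothetical helper: hourly_cron_slot INDEX TOTAL
# Prints an hourly cron expression whose minute spreads TOTAL flows across the hour.
hourly_cron_slot() {
  index="$1"
  total="$2"
  minute=$(( (index % total) * 60 / total ))
  printf '%s * * * *\n' "$minute"
}
```

With four flows, indexes 0 to 3 produce minutes 0, 15, 30, and 45. Beyond 60 flows some minutes are reused, at which point spreading across several hours or moving to distributed workers is the better option.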

Security Basics

BasicAuth credentials

Username must be in email format: <user>@<customer-domain>
Password must be at least 8 characters and include an uppercase letter, a lowercase letter, and a digit
Store credentials in a Kubernetes secret — do not hardcode plain text in values.yaml

RBAC

The open-source version uses basicAuth for a single admin account. Multi-user RBAC requires Kestra Enterprise. Confirm with the Shakudo team if this is needed.

Plugin blacklist

The ICI deployment blacklists the Docker plugin to prevent arbitrary container execution:

    kestra:
      plugins:
        blacklist: ["io.kestra.plugin.docker.*"]

ConfigMap Backup Before Changes

kubectl get cm kestra-config -n hyperplane-kestra -o yaml > backup-$(date +%Y%m%d).yaml

Backup Strategy

PostgreSQL: schedule a pg_dump and upload to MinIO or off-cluster storage
MinIO: include the kestra-<env> bucket in the backup policy
Flows: export all flows periodically via the API if not already in Git

Troubleshooting Guide

Pod stuck in CrashLoopBackOff after install

Check: kubectl logs -n hyperplane-kestra deployment/kestra-standalone -c kestra-standalone
Look for DataSource or Micronaut errors -- usually missing datasources.default block
Fix: add all three datasource sections in values.yaml: kestra.datasources.postgres, datasources.default, datasources.postgres

Login fails or password rejected

Check: username must be in email format (<user>@<customer-domain>)
Check: password must be at least 8 characters and include an uppercase letter, a lowercase letter, and a digit
Fix: update credentials in ConfigMap using Step 7 and rollout restart

helm upgrade hangs or fails with RBAC error

Check: CI service accounts lack cross-namespace secret access
Fix: run helm upgrade from a local kubeconfig with full namespace admin access

Post-install job shows 1/2 Completed

This is normal with Istio enabled -- the sidecar stays running after the init job completes
No action needed unless the init container shows Error or CrashLoopBackOff

Connection Issues

Kestra cannot connect to MinIO

Check: endpoint must use cluster-internal DNS (minio.hyperplane-minio.svc.cluster.local:9000)
Check: run the Step 6 MinIO health check to verify reachability
Fix: re-run Step 3 if the bucket or service account is missing

Kestra cannot connect to PostgreSQL

Check: kestra-postgres-0 must be Running
Check: passwords in ConfigMap datasources.default and datasources.postgres must be correct
Fix: update ConfigMap with correct credentials and rollout restart

Workflow Issues

Flow not triggering on schedule

Check: validate cron expression at crontab.guru
Check: flow must be enabled in the UI
Check: Kestra must have been running at the scheduled time -- missed runs are not replayed
Fix: trigger manually via UI or API to confirm the flow itself works, then re-check trigger

Task fails with plugin not found

Check: copy the plugin type string exactly from the Kestra plugin docs
Check: plugin must not be on the blacklist in values.yaml
Fix: remove from blacklist or use an alternative plugin type

Execution stuck in Running state

Check: all pods in hyperplane-kestra namespace must be healthy
Check: view logs for the stuck execution in the Kestra UI
Fix: check the external resource the task is waiting on; rollout restart if pods are unhealthy

dquote> prompt when applying ConfigMap YAML

Cause: using <<EOF (unquoted) with YAML containing double-quoted strings -- shell misinterprets them
Fix: use single-quoted heredoc <<'YAML' (see Step 7 in the Deployment Runbook)

Performance Issues

Executions are slow or queuing

Check: default worker thread count is low for high-volume environments
Check: PostgreSQL pod CPU/memory -- it is the shared queue backend
Fix: increase workerThread in ConfigMap. For heavy workloads, consider distributed mode

Frequently Asked Questions

Q: How do I add a new plugin?

Kestra ships with many plugins bundled. To add a new one, raise a request with the Shakudo team to include it in the next custom Kestra image build.

Q: How do I promote a flow from staging to production?

Change the namespace in the flow YAML from staging to production and re-create the flow in Kestra. The original staging flow remains until manually deleted.
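The namespace change can be scripted so the promoted copy is generated rather than hand-edited. This is a minimal sketch that assumes the flow file has a single top-level namespace: line matching the source namespace exactly; promote_flow is a hypothetical helper, and namespaces containing dots would need the dots escaped because the name is used inside a sed pattern.

```shell
# Hypothetical helper: promote_flow FILE FROM TO
# Prints the flow YAML with the top-level "namespace: FROM" line rewritten to "namespace: TO".
# The original file is left untouched.
promote_flow() {
  file="$1"
  from="$2"
  to="$3"
  sed "s/^namespace:[[:space:]]*${from}[[:space:]]*\$/namespace: ${to}/" "$file"
}
```

The output can be redirected to a new file, for example promote_flow my-flow.yaml staging production > my-flow.production.yaml, and that file is then created in Kestra as described above.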

Q: Can I run Python scripts inside Kestra tasks?

Yes. Use io.kestra.plugin.scripts.python.Script for inline scripts. Ensure required Python libraries are available in the Kestra image.

Q: What happens if a scheduled flow misses its window because Kestra was down?

Kestra does not replay missed schedules by default. That execution is skipped. Trigger a backfill manually via UI or API if needed. Implement external monitoring if missed executions are critical.

Q: How do I update Kestra to a newer version?

Switch to the new branch in the Shakudo monorepo, update the image tag in values.yaml, re-run helm dependency update, and redeploy with helm upgrade. Always back up the ConfigMap, test in staging first, and review the Kestra changelog for breaking changes.
