Overview
Grafana is a visualization and observability platform for dashboards, metrics, logs, alerts, and operational monitoring.
In a Shakudo environment, Grafana is the main observability UI. It connects to Prometheus, Loki, PostgreSQL, and other data sources so customers can monitor platform health, application status, and business or operational metrics.
This page is written for onboarding and deployment calls. It focuses on what customers need to understand, provide, validate, and troubleshoot in a real environment.
Where it fits in the stack
- Primary role: Grafana is the shared observability layer for the platform, reused across teams and applications rather than deployed as a one-off application.
- Typical deployment model: Kubernetes + Helm, with customer-specific values and secrets.
- Typical access model: private internal endpoint or customer-approved external route.
- Typical support model: validate deployment health first, then validate user workflow and integrations.
Getting Started
Start with one safe workflow in Grafana before enabling production usage. The goal is to prove connectivity, permissions, and operational ownership.
What the customer needs to provide
- data sources such as Prometheus, Loki, PostgreSQL, or cloud monitoring endpoints
- admin credentials or SSO/OIDC configuration details
- dashboard JSON files or dashboard requirements
- alert routing details such as email, Slack, PagerDuty, or webhook
- domain/TLS route for customer access
First workflow
- Log in to Grafana
- Confirm data sources are healthy from Connections > Data sources
- Import or create the first dashboard
- Set the dashboard time range to a window that actually contains data
- Validate each panel query in Explore
- Create alerts only after dashboard panels are confirmed
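The data source health step above can also be scripted, which is useful when repeating the check after a deployment. The following is a minimal sketch, assuming a Grafana version that exposes the per-data-source health endpoint (`/api/datasources/uid/<uid>/health`) and a service account token with permission to read data sources. The base URL and token are placeholders supplied by the caller, and not every data source plugin implements a health check, so treat a non-OK result as a prompt to look in the UI rather than a final verdict.

```python
import json
import urllib.error
import urllib.request


def api_get(base_url, path, token):
    """GET a Grafana API path and return the decoded JSON body."""
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def check_all(base_url, token):
    """Return {data source name: health payload} for every data source."""
    results = {}
    for ds in api_get(base_url, "/api/datasources", token):
        path = f"/api/datasources/uid/{ds['uid']}/health"
        try:
            results[ds["name"]] = api_get(base_url, path, token)
        except urllib.error.HTTPError as exc:
            # A failed health check comes back as an HTTP error status.
            results[ds["name"]] = {"status": "ERROR", "message": f"HTTP {exc.code}"}
    return results


def unhealthy(results):
    """Return the names of data sources whose status is not OK."""
    return [name for name, payload in results.items()
            if payload.get("status") != "OK"]
```

A typical call would be `unhealthy(check_all(grafana_url, token))`, where both arguments come from the customer environment. An empty list suggests every data source that supports health checks responded normally.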
Administration and Best Practices
Use these practices to keep Grafana reliable after the initial deployment.
- Keep dashboard JSON in version control when dashboards are customer-critical
- Use folders and permissions to separate platform, app, and customer dashboards
- Validate data source health after any credential or network change
- For SQL panels, test queries directly in Explore before editing dashboard JSON
- Use alert thresholds that customers understand, not only raw infrastructure metrics
- Back up the Grafana database, or provision dashboards and data sources declaratively so they can be rebuilt from configuration
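Keeping dashboard JSON in version control works best when the exported file is stable between saves. The sketch below assumes the dashboard was fetched from the Grafana HTTP API (`/api/dashboards/uid/<uid>`), which wraps the dashboard model in a `dashboard` key next to a `meta` key. Clearing `id` and dropping `version` is one possible convention for reducing diff noise, not a Grafana requirement, and `normalize_dashboard` is a hypothetical helper name.

```python
import json


def normalize_dashboard(api_response):
    """Return a stable JSON string for committing a dashboard to git.

    Assumes api_response is the decoded body of
    GET /api/dashboards/uid/<uid>, i.e. {"meta": {...}, "dashboard": {...}}.
    """
    dashboard = dict(api_response["dashboard"])  # shallow copy, input untouched
    dashboard["id"] = None          # numeric id is specific to one Grafana instance
    dashboard.pop("version", None)  # increments on every save, noisy in diffs
    # Sorted keys and fixed indentation keep diffs limited to real changes.
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"
```

The `uid` is kept on purpose so that links and provisioning continue to refer to the same dashboard across environments.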
Troubleshooting & FAQ
Use this section during customer debugging calls. Format: Problem → What to check → Fix.
Dashboard shows No data
- What to check: data source health, query time range, variable values, and the panel query output in Explore
- Fix: Correct the data source or query first; only then adjust panel visualization settings
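When the data source is Prometheus, the same check can be made outside Grafana to separate "the query returns nothing" from "the panel is misconfigured". This is a minimal sketch, assuming direct network access to the Prometheus HTTP API and its standard `query_range` response shape. The function names are hypothetical, and Explore remains the primary tool for this check.

```python
from urllib.parse import urlencode


def range_query_url(prom_url, query, start, end, step):
    """Build a Prometheus range-query URL for the given time window."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{prom_url.rstrip('/')}/api/v1/query_range?{params}"


def classify_range_result(body):
    """Summarize a decoded query_range response for a debugging call."""
    if body.get("status") != "success":
        return "query error: " + body.get("error", "unknown")
    series = body.get("data", {}).get("result", [])
    if not series:
        # Nothing matched: widen the time range or check label matchers.
        return "no series matched"
    samples = sum(len(s.get("values", [])) for s in series)
    return f"{len(series)} series, {samples} samples"
```

If the result is "no series matched" for the same window the dashboard uses, the problem is upstream of the panel: check the time range, label matchers, and variable values before touching visualization settings.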
Data source is not visible
- What to check: data source provisioning config, Grafana logs, and org/user permissions
- Fix: Reload provisioning or add the data source through the admin UI
Login fails
- What to check: the admin secret, SSO settings, and whether basic login is enabled
- Fix: Reset the admin password or correct the SSO redirect configuration
Alerts do not fire
- What to check: alert rule state, evaluation interval, contact point, and notification policy
- Fix: Use Test contact point to confirm delivery, then trigger a controlled threshold breach to confirm the rule fires and routes correctly
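For webhook contact points, a throwaway receiver makes it easy to see whether a test notification actually arrives. The sketch below uses only the Python standard library and assumes the webhook body is JSON with a top-level `status` and an `alerts` list carrying `labels`, which is the Alertmanager-style shape Grafana webhooks generally follow. The port and function names are arbitrary choices for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload):
    """One-line summary of a webhook notification body."""
    alerts = payload.get("alerts", [])
    names = [a.get("labels", {}).get("alertname", "?") for a in alerts]
    return f"status={payload.get('status', '?')} alerts={','.join(names) or 'none'}"


class Receiver(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body, print a summary, and acknowledge with 200.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        print(summarize(payload))
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Listen for webhook notifications until interrupted."""
    HTTPServer(("0.0.0.0", port), Receiver).serve_forever()
```

Run `serve()` on a host Grafana can reach, point a temporary webhook contact point at it, and send a test notification. If nothing is printed, the issue is likely network reachability or the contact point URL rather than the alert rule itself.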

