
Commit 0e0ecb7

Merge branch 'main' into release-0.4
2 parents 3d2757d + bd129c3 commit 0e0ecb7

22 files changed

Lines changed: 1646 additions & 1445 deletions

AGENTS.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
1+
# OpenShift Troubleshooting Panel Console Plugin - AI Agent Guide
2+
3+
## Project Overview
4+
A frontend plugin for the OpenShift Console, built on the [Korrel8r](https://github.com/korrel8r/korrel8r) project, which displays a connected web of observability nodes/signals and navigates to them on selection.
5+
6+
## External Dependencies & Operators
7+
8+
| System | Repository | Purpose |
9+
|--------|------------|---------|
10+
| COO | https://github.com/rhobs/observability-operator | Manages troubleshooting-panel-console-plugin |
11+
| Korrel8r | https://github.com/korrel8r/korrel8r | Correlation backend |
12+
| Console SDK | https://github.com/openshift/console | Plugin framework |
13+
14+
### COO (Cluster Observability Operator)
15+
COO is the downstream OpenShift build of the observability-operator project, providing optional observability configuration and features to a Kubernetes cluster. To deploy the troubleshooting-panel and korrel8r, create a UIPlugin with the type `TroubleshootingPanel`. COO can use the state of the cluster (such as the OCP version) and information set in the UIPlugin to pass a set of features or configuration values to the troubleshooting-panel backend. Currently no features or configuration values are used.
16+
17+
18+
- **UIPlugin CR example**:
19+
```yaml
20+
apiVersion: observability.openshift.io/v1alpha1
21+
kind: UIPlugin
22+
metadata:
23+
name: troubleshooting-panel
24+
spec:
25+
type: TroubleshootingPanel
26+
```
27+
28+
### Korrel8r
29+
Korrel8r is a rule-based correlation engine, with an extensible rule set, that can navigate:
30+
- many types of signal and resource data
31+
- using diverse schema, data models and naming conventions
32+
- queried using diverse query languages
33+
- stored in multiple stores with diverse query APIs
34+
35+
Each type of signal or resource is represented by a "domain". Korrel8r can be extended to handle new signals and resources by adding new domains.
36+
37+
Relationships within and between domains are expressed as "rules".
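
As a rough mental model only, rules can be pictured as directed edges between classes, and a search from a starting class as a bounded graph walk. The sketch below is an illustration under that assumption and is not how Korrel8r is implemented; the `Rule` shape, the rule names, and the `neighbourhood` helper are all hypothetical.

```typescript
// Illustrative model only: Korrel8r's real rule engine is more involved.
interface Rule {
  name: string;  // hypothetical rule name
  start: string; // class the rule starts from, e.g. "alert:alert"
  goal: string;  // class the rule leads to, e.g. "k8s:Pod"
}

// Collect every class reachable from `start` in at most `depth` rule steps.
function neighbourhood(rules: Rule[], start: string, depth: number): Set<string> {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let step = 0; step < depth && frontier.length > 0; step++) {
    const next: string[] = [];
    for (const rule of rules) {
      if (frontier.includes(rule.start) && !seen.has(rule.goal)) {
        seen.add(rule.goal);
        next.push(rule.goal);
      }
    }
    frontier = next;
  }
  return seen;
}

// Hypothetical rules linking an alert to a pod, and the pod to its signals.
const exampleRules: Rule[] = [
  { name: 'AlertToPod', start: 'alert:alert', goal: 'k8s:Pod' },
  { name: 'PodToLogs', start: 'k8s:Pod', goal: 'log:application' },
  { name: 'PodToMetric', start: 'k8s:Pod', goal: 'metric:metric' },
];

const reachable = neighbourhood(exampleRules, 'alert:alert', 2);
console.log(Array.from(reachable));
```

Adding a new domain in this picture amounts to adding classes and the rules that connect them to existing ones.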
38+
39+
The full documentation for Korrel8r can be found here:
40+
https://korrel8r.github.io/korrel8r/
41+
42+
### Console Plugin Framework
43+
The OpenShift Console uses a frontend plugin system powered by Webpack's Module Federation. Upon reconciling the UIPlugin, COO creates a ConsolePlugin CR, which enables a route for OpenShift Console users to make requests to the troubleshooting-panel pod. The OpenShift Console first loads a `plugin-manifest.json`, which is rendered from the `./web/console-extensions.json` file at build time, and then uses the information within it to dynamically load the needed chunks of the built JS into the frontend.
44+
45+
The OpenShift Console provides an npm SDK package that is tied to the OCP version it is built for. The package tries to retain compatibility as much as possible, so a single build can be used across multiple OCP versions; specific versions (such as 4.19 and the unreleased 4.22) break backwards compatibility.
46+
47+
## Development Guide
48+
The troubleshooting-panel repo's code is split up into 2 general areas:
49+
- golang backend - `./cmd` and `./pkg` folders
50+
- frontend components - `./web`
51+
52+
All commands should be routed through the `Makefile`.
53+
54+
### Frontend
55+
The troubleshooting-panel uses the following technologies:
56+
- typescript
57+
- react 17
58+
- i18next
59+
- redux
60+
61+
#### i18next
62+
When working with i18next, the `useTranslation` React hook should be passed the troubleshooting panel's namespace, and each piece of static text should be wrapped in the returned translation function. After adding new translated text, run `make build-frontend`, which regenerates the translation files.
63+
64+
```ts
65+
const { t } = useTranslation('plugin__troubleshooting-panel-console-plugin');
66+
return <div>{t('Korrel8r')}</div>;
67+
```
68+
69+
### Backend
70+
The troubleshooting-panel uses the following technologies:
71+
- go
72+
- gorilla/mux
73+
74+
### Console Plugin Framework:
75+
- Dynamic Plugin: https://github.com/openshift/enhancements/blob/master/enhancements/console/dynamic-plugins.md
76+
- Plugin SDK README: https://github.com/openshift/console/blob/main/frontend/packages/console-dynamic-plugin-sdk/README.md
77+
- Plugin SDK API: https://github.com/openshift/console/blob/main/frontend/packages/console-dynamic-plugin-sdk/docs/api.md
78+
- Extensions docs: https://github.com/openshift/console/blob/main/frontend/packages/console-dynamic-plugin-sdk/docs/console-extensions.md
79+
- Example plugin: https://github.com/openshift/console/tree/main/dynamic-demo-plugin
80+
81+
If a new console extension point is needed that is only available when a specific feature is enabled, the `openshift/monitoring-plugin` can be used as an implementation reference:
82+
83+
For reference for adding console extension points or features:
84+
https://github.com/openshift/monitoring-plugin/tree/main/pkg
85+
86+
### Korrel8r
87+
88+
The Korrel8r API client is built off of the swagger documentation for the upstream project. Updating the API client can be accomplished by copying the `swagger.json` file from the upstream project, located [here](https://github.com/korrel8r/korrel8r/blob/main/pkg/rest/docs/swagger.json), and then running `make gen-client`.
89+
90+
#### OpenShift Domain's Location
91+
After determining which domains, signals and queries are connected by querying the perses backend, we need to convert the korrel8r responses into OpenShift URLs so that we can match the current page and locate related pages to navigate to. These conversions are located in `./web/src/korrel8r`. These URL conversions MUST be kept accurate and have extensive unit tests located in `./web/src/__tests__`.
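
To make the idea of a two-way conversion concrete, here is a minimal sketch for a single resource type. It assumes a query of the form `k8s:Pod.v1:{"namespace":"...","name":"..."}` and a console path of the form `/k8s/ns/<namespace>/pods/<name>`; the function names are hypothetical and this is not the code in `./web/src/korrel8r`, which may differ in shape and coverage.

```typescript
// Hypothetical sketch: function names and the exact query shape are assumptions.
interface PodSelector {
  namespace: string;
  name: string;
}

// Parse a query of the assumed form `k8s:Pod.v1:{"namespace":"ns","name":"pod"}`.
function parsePodQuery(query: string): PodSelector | undefined {
  const match = /^k8s:Pod(?:\.[^:]*)?:(\{.*\})$/.exec(query);
  if (!match) return undefined;
  try {
    const selector = JSON.parse(match[1]) as Partial<PodSelector>;
    if (typeof selector.namespace !== 'string' || typeof selector.name !== 'string') {
      return undefined;
    }
    return { namespace: selector.namespace, name: selector.name };
  } catch {
    return undefined; // selector was not valid JSON
  }
}

// Query -> console path, used to navigate from a graph node to a page.
function podQueryToURL(query: string): string | undefined {
  const pod = parsePodQuery(query);
  if (!pod) return undefined;
  return `/k8s/ns/${encodeURIComponent(pod.namespace)}/pods/${encodeURIComponent(pod.name)}`;
}

// Console path -> query, used to match the page currently shown.
function podURLToQuery(path: string): string | undefined {
  const match = /^\/k8s\/ns\/([^/]+)\/pods\/([^/]+)$/.exec(path);
  if (!match) return undefined;
  const selector: PodSelector = {
    namespace: decodeURIComponent(match[1]),
    name: decodeURIComponent(match[2]),
  };
  return `k8s:Pod.v1:${JSON.stringify(selector)}`;
}
```

A round-trip check (URL to query and back to the same URL) is the kind of property the unit tests for these conversions can assert.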
92+
93+
### Development Setup
94+
- See README.md for full setup
95+
- Deployment of COO and other resources: https://github.com/observability-ui/development-tools/
96+
97+
## Release & Testing
98+
99+
### Before submitting a PR run the following and address any errors:
100+
```bash
101+
make build-frontend
102+
make test-frontend
103+
```
104+
105+
### PR Requirements:
106+
- **Title format**: `[JIRA_ISSUE]: Description`
107+
- **Testing**: All linting and tests must pass
108+
- **Translations**: Ensure i18next keys are properly added by wrapping any static frontend text in the translation function returned by `useTranslation`, i.e. `t('Korrel8r')`
109+
110+
### Commit Requirements:
111+
- **Title format**: Conventional Commit format ([link](https://www.conventionalcommits.org/en/v1.0.0/))
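
The two title formats above can be checked mechanically. The sketch below is one possible check, not tooling that exists in this repo. It assumes Jira keys look like `ABC-123` (the key `OU-123` used in the comments is made up), accepts the PR key with or without the square brackets since the guide does not say whether they are literal, and only checks the basic `type(scope)!: description` shape of a Conventional Commit title.

```typescript
// Assumption: a Jira key is uppercase letters/digits, a dash, then a number (e.g. "OU-123").
const JIRA_KEY = '[A-Z][A-Z0-9]+-\\d+';

// Accepts "[KEY-1]: Description" and "KEY-1: Description".
const PR_TITLE = new RegExp(`^(?:\\[${JIRA_KEY}\\]|${JIRA_KEY}): \\S.*$`);

// Basic Conventional Commit shape: type, optional (scope), optional "!", then ": description".
const COMMIT_TITLE = /^[a-z]+(?:\([^()]+\))?!?: \S.*$/;

function isValidPRTitle(title: string): boolean {
  return PR_TITLE.test(title);
}

function isConventionalCommitTitle(title: string): boolean {
  return COMMIT_TITLE.test(title);
}
```

Both helpers only validate the shape of the title; they do not check that the Jira issue exists or that the commit type is one the project accepts.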
112+
113+
---
114+
*This guide is optimized for AI agents and developers. For detailed setup instructions, also refer to README.md and Makefile.*

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
./AGENTS.md

Dockerfile.dev

Lines changed: 15 additions & 6 deletions
@@ -1,13 +1,13 @@
1-
# This Dockerfile is used for local testing of images. All images should be available for public use
2-
FROM registry.access.redhat.com/ubi9/nodejs-22:latest AS web-builder
1+
FROM registry.redhat.io/ubi9/nodejs-22:latest AS web-builder
32

43
WORKDIR /opt/app-root
54

65
USER 0
76

7+
ENV HUSKY=0
88
COPY web/package*.json web/
99
COPY Makefile Makefile
10-
RUN make install-frontend
10+
RUN make install-frontend-ci
1111

1212
COPY web/ web/
1313
RUN make build-frontend
@@ -20,14 +20,23 @@ COPY Makefile Makefile
2020
COPY go.mod go.mod
2121
COPY go.sum go.sum
2222

23-
RUN go mod download
23+
RUN make install-backend
2424

2525
COPY cmd/ cmd/
2626
COPY pkg/ pkg/
2727

28-
RUN make build-backend
28+
ENV GOFLAGS='-mod=mod'
29+
ENV GOEXPERIMENT=strictfipsruntime
30+
ENV CGO_ENABLED=1
2931

30-
FROM registry.access.redhat.com/ubi9/ubi-minimal
32+
RUN make build-backend BUILD_OPTS="-tags strictfipsruntime"
33+
34+
FROM registry.redhat.io/rhel9-4-els/rhel:9.4
35+
36+
RUN mkdir /licenses
37+
COPY LICENSE /licenses/.
38+
39+
USER 1001
3140

3241
COPY --from=web-builder /opt/app-root/web/dist /opt/app-root/web/dist
3342
COPY --from=go-builder /opt/app-root/plugin-backend /opt/app-root

Makefile

Lines changed: 18 additions & 6 deletions
@@ -1,3 +1,13 @@
1+
2+
VERSION ?= latest
3+
PLATFORMS ?= linux/arm64,linux/amd64
4+
ORG ?= openshift-observability-ui
5+
IMAGE ?= quay.io/${ORG}/troubleshooting-panel-console-plugin:${VERSION}
6+
TAG ?= $(VERSION)
7+
8+
.PHONY: all
9+
all: build-frontend build-backend test-frontend
10+
111
.PHONY: test
212
test: test-frontend
313

@@ -50,21 +60,17 @@ install: install-frontend install-backend
5060

5161
.PHONY: build-image
5262
build-image: build-frontend test-frontend
53-
./scripts/build-image.sh
63+
TAG=$(TAG) ./scripts/build-image.sh
5464

5565
.PHONY: start-forward
5666
start-forward:
5767
./scripts/start-forward.sh
5868

59-
export REGISTRY_ORG?= openshift-observability-ui
60-
export TAG?=latest
61-
IMAGE=quay.io/${REGISTRY_ORG}/troubleshooting-panel-console-plugin:${TAG}
62-
6369
.PHONY: deploy
6470
deploy: test-frontend ## Build and push image, reinstall on cluster using helm.
6571
helm uninstall troubleshooting-panel-console-plugin -n troubleshooting-panel-console-plugin || true
6672
PUSH=1 scripts/build-image.sh
67-
helm install troubleshooting-panel-console-plugin charts/openshift-console-plugin -n troubleshooting-panel-console-plugin --create-namespace --set plugin.image=$(IMAGE)
73+
helm install troubleshooting-panel-console-plugin charts/openshift-console-plugin -n troubleshooting-panel-console-plugin --create-namespace --set plugin.image=${IMAGE}
6874

6975
.PHONY: start-devspace-backend
7076
start-devspace-backend:
@@ -77,3 +83,9 @@ gen-client: web/src/korrel8r/client
7783
web/src/korrel8r/client: korrel8r/swagger.json
7884
cd web && npx openapi-typescript-codegen --indent 2 --input ../$< --output ../$@ --name Korrel8rClient
7985
@touch $@
86+
87+
.PHONY: podman-cross-build
88+
podman-cross-build:
89+
podman manifest create -a ${IMAGE}
90+
podman build --platform=${PLATFORMS} --manifest ${IMAGE} -f Dockerfile.dev
91+
podman manifest push ${IMAGE}

OWNERS

Lines changed: 0 additions & 2 deletions
@@ -1,13 +1,11 @@
11
reviewers:
22
- jgbernalp
3-
- kyoto
43
- zhuje
54
- peteryurkovich
65
- alanconway
76
- shwetaap
87
approvers:
98
- jgbernalp
10-
- kyoto
119
- zhuje
1210
- peteryurkovich
1311
- alanconway

doc/README.adoc

Lines changed: 1 addition & 126 deletions
@@ -2,129 +2,4 @@
22
:doctype: book
33
:toc: left
44

5-
The troubleshooting panel displays a graph of resources and observability signals related to whatever is
6-
shown in the main console window.
7-
Nodes in the graph represent a type of resource or signal, edges represent relationships.
8-
9-
Clicking on a node in the graph opens the console page showing details of that resource or signal.
10-
Clicking the "Focus" button re-calculates the graph starting from the current contents of the main window.
11-
12-
The panel provides a map of related information to help you navigate more quickly to relevant data,
13-
or to discover relevant data you may not have been aware of.
14-
15-
We will show an example of troubleshooting an Alert.
16-
17-
NOTE: You can re-create this example alert on your own cluster by following the instructions xref:example-alert[here].
18-
You can also experiment by using the panel with existing resources in your own cluster.
19-
20-
== Opening the panel
21-
22-
Open the troubleshooting panel with the "Signal Correlation" entry in the troubleshooting section of
23-
the "launcher" menu, found at top right of the screen:
24-
25-
[.border]
26-
image::images/launcher.png[]
27-
28-
Opening the panel shows a _neighbourhood_ of the resource currently displayed in the console.
29-
A neighbourhood is a graph that starts at the current resource, and includes related objects up to
30-
3 steps away from the starting point.
31-
32-
NOTE: Not all resource types are currently supported, more will be added in future.
33-
For an unsupported resource, the panel will be empty.
34-
35-
For example here the panel for a `KubeContainerWaiting` alert.
36-
37-
[.border]
38-
image::images/panel-graph.png[]
39-
40-
41-
<1> Alert(1): This node represents the starting point, a `KubeContainerWaiting` alert that was displayed in the console.
42-
<2> Pod(1): This node indicates there is a single Pod resource associated with this alert. Clicking on this node will show the pod details in the console.
43-
<3> Event(2): There are two kuberenetes events associated with the Pod, and you can see them by clicking this node.
44-
<3> Logs(74): The pod has emitted 74 lines of logs. Click to show them.
45-
<4> Metrics(105): There are always many metrics associated with every Pod.
46-
<6> Network(6): There are network events associated with the pod, which means it has communicated with other resources in the cluster.
47-
The remaining Service, Deployment and DaemonSet nodes are the resources that the pod has communicated with.
48-
<7> Focus: Clicking this button will re-calculate the graph starting from the current contents of the main console window.
49-
This may have changed by clicking nodes in the graph, or by using any other links, menus or navigation features of the console.
50-
<8> Show Query: enables experimental features detailed below.
51-
52-
NOTE: Clicking on a node may sometimes show fewer results than are indicated on the graph.
53-
This is a known issue that will be addressed in future.
54-
55-
== Experimental features
56-
57-
[.border]
58-
image::images/query-details.png[]
59-
60-
<1> Hide Query hides the experimental features.
61-
<2> The query that identifies the starting point for the graph. This is normally derived automatically from the contents of the main console window.
62-
You can enter queries manually, but the format of this query language is experimental and likely to change in future.
63-
footnote:[This query language is part of https://korrel8r.github.io/korrel8r[Korrel8r], the correlation engine used to create the graphs]
64-
The "Focus" button updates the query to match the resources in the main console window.
65-
<3> Neighbourhood depth: increase or decrease to see a smaller or larger neighbourhood.
66-
Note: setting a large value in a large cluster may cause the query to fail if the number of results is too big.
67-
<4> Goal class: Selecting this option will do a _goal directed search_ instead of a neighbourhood search.
68-
A goal directed search will show all paths from the starting point to the goal _class_ , which indicates a type of resource or signal.
69-
70-
The format of the goal class is experimental and may change. Currently the valid goal classes are:
71-
72-
[horizontal]
73-
`k8s:__resource[.version.[group]]__` :: Kind of Kuberenetes resource. For example `k8s:Pod` or `k8s:Deployment.apps.v1`.
74-
`alert:alert`:: Any alert.
75-
`metric:metric`:: Any metric.
76-
`netflow:network`:: Any network observability event.
77-
`log:__log_type__`:: Stored logs, __log_type__ must be `application`, `infrastructure` or `audit`
78-
79-
== Optional signal stores
80-
81-
The troubleshooting panel relies on the observability signal stores installed in your cluster.
82-
Kuberenetes resources, alerts and metrics are available by default in an OCP cluster.
83-
84-
Other types of signal require optional components to be installed:
85-
86-
- Logs: "Red Hat Openshift Logging" (collection) and "Loki Operator provided by Red Hat" (store)
87-
- Network Events: "Network Observability provided by Red Hat" (collection) and "Loki Operator provided by Red Hat" (store)
88-
89-
== Creating the example alert
90-
[id="example-alert"]
91-
92-
You can reproduce the example alert shown above as follows.
93-
94-
.Procedure
95-
96-
. Run the following command to create a broken deployment in a system namespace:
97-
+
98-
[source,terminal]
99-
----
100-
kubectl apply -f - << EOF
101-
apiVersion: apps/v1
102-
kind: Deployment
103-
metadata:
104-
name: bad-deployment
105-
namespace: default <1>
106-
spec:
107-
selector:
108-
matchLabels:
109-
app: bad-deployment
110-
template:
111-
metadata:
112-
labels:
113-
app: bad-deployment
114-
spec:
115-
containers: <2>
116-
- name: bad-deployment
117-
image: quay.io/openshift-logging/vector:5.8
118-
----
119-
<1> The deployment must be in a system namespace (such as `default`) to cause the desired alerts.
120-
<2> This container deliberately tries to start a `vector` server with no configuration file. The server will log a few messages, and then exit with an error. Any container could be used for this.
121-
122-
. View the alerts:
123-
.. Go to *Observe* -> *Alerting* and click *clear all filters*. View the `Pending` alerts.
124-
+
125-
[IMPORTANT]
126-
====
127-
Alerts first appear in the `Pending` state. They do not start `Firing` until the container has been crashing for some time. By showing `Pending` alerts you can see them much more quickly.
128-
====
129-
.. Look for `KubeContainerWaiting`, `KubePodCrashLooping`, or `KubePodNotReady` alerts.
130-
.. Select one such alert and open the troubleshooting panel, or click the "Focus" button if it is already open.
5+
See downstream documentation https://docs.redhat.com/en/documentation/red_hat_openshift_cluster_observability_operator/1-latest/html/ui_plugins_for_red_hat_openshift_cluster_observability_operator/troubleshooting-ui-plugin[Chapter 5. Troubleshooting UI plugin]

scripts/build-image.sh

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ set -euo pipefail
44

55
PREFER_PODMAN="${PREFER_PODMAN:-1}"
66
PUSH="${PUSH:-0}"
7-
TAG="${TAG:-v0.1.0}"
7+
TAG="${TAG:-latest}"
88

99
REGISTRY_HOST=${REGISTRY_HOST:-quay.io}
1010
REGISTRY_ORG="${REGISTRY_ORG:-openshift-observability-ui}"

web/.depcheckrc

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
# .depcheckrc config file for `npx depcheck`
2+
# Ignore packages not explicitly imported but still needed.
3+
# For example packages used as or via other executables, or imported via configuration files.
4+
ignores:
5+
- "@types/*"
6+
- css-loader
7+
- cypress-multi-reporters
8+
- eslint
9+
- husky
10+
- jest
11+
- jest-environment-jsdom
12+
- lint-staged
13+
- mocha-junit-reporter
14+
- mochawesome
15+
- prettier
16+
- style-loader
17+
- stylelint-config-standard
18+
- ts-loader
19+
ignore-bin-package: true
