Appliance Troubleshooting Script

The OVA ships with a self-diagnostic script that checks the two most common "nothing works" situations on the appliance:

Are Sysmon events flowing into Kafka? An empty console almost always traces back to here.
Is the local LLM (the GPU Node) reachable and answering? Verdicts that all show error, or nothing being scored, almost always trace back to here.

Run it whenever the appliance is up but data isn't appearing where you expect.

Running it

The script is installed on the App Node at /opt/logster/troubleshoot/logster-troubleshoot.sh. Run it with sudo:

sudo /opt/logster/troubleshoot/logster-troubleshoot.sh

It prints a timestamped PASS/FAIL line for each check, followed by a summary. It exits non-zero if any check fails, so other scripts can call it and branch on the exit status.
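A caller can rely on the exit status rather than parsing the PASS/FAIL lines. The following is a minimal sketch of such a caller, not something shipped with the appliance: the function name run_health_check and the TROUBLESHOOT_CMD override are hypothetical and exist only so the wrapper can be tried without the real script.

```shell
# Hypothetical wrapper around the troubleshooting script.
# TROUBLESHOOT_CMD is an illustrative override, not a variable the appliance defines.
TROUBLESHOOT_CMD="${TROUBLESHOOT_CMD:-sudo /opt/logster/troubleshoot/logster-troubleshoot.sh}"

run_health_check() {
    local status=0
    # Unquoted on purpose: the variable holds a command plus its arguments.
    $TROUBLESHOOT_CMD || status=$?
    if [ "$status" -eq 0 ]; then
        echo "appliance checks passed"
    else
        echo "appliance checks failed (exit status $status)"
    fi
    # Hand the script's own exit status back to the caller.
    return "$status"
}
```

Calling run_health_check from a scheduled job or a deployment script then gives a single line of output and an exit status that mirrors the troubleshooting script's.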

[!NOTE] Start the stack first (App Node → Step 4). The script talks to the running containers; if the stack is down it will report the failure and tell you how to bring it up.

What it checks

Check 1 — Sysmon events flowing into Kafka

The script live-tails the sysmon-logs topic for up to 60 seconds, waiting for a new event to arrive.

PASS — events are flowing. The endpoint side of the pipeline is healthy.
FAIL — no event was seen within the window. The Windows endpoint agent is likely not shipping, or the LAN listener / Winlogbeat output is misconfigured. See Connect Windows Endpoints and confirm the endpoint can reach <EXTERNAL_KAFKA_LAN_HOST>:29092.

The script also prints the manual command it uses, so you can keep watching the topic yourself:

cd /opt/logster/deploy && sudo docker compose exec kafka \
    kafka-console-consumer --bootstrap-server localhost:9092 \
    --topic sysmon-logs

If nothing prints for about a minute, no events are flowing in.
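The "wait up to 60 seconds for one event" behaviour can be approximated by hand by putting a time limit on reading the first line of the consumer's output. This is a sketch of that idea and not the script's actual implementation; the function name wait_for_first_line is hypothetical, and it assumes the coreutils timeout command is available.

```shell
# Read stdin and succeed as soon as one non-empty line arrives.
# Fails if the window (in seconds, default 60) elapses or stdin closes first.
wait_for_first_line() {
    local window line
    window="${1:-60}"
    # timeout stops head if no complete line shows up within the window.
    line=$(timeout "$window" head -n 1)
    if [ -n "$line" ]; then
        echo "PASS: event received: $line"
    else
        echo "FAIL: no event within ${window}s"
        return 1
    fi
}
```

To use it, pipe the consumer command shown above into wait_for_first_line 60. When the output of docker compose exec goes to a pipe rather than a terminal, adding the -T option (which disables pseudo-TTY allocation) is usually necessary.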

Check 2 — Local LLM reachability

The script discovers the LLM endpoint the inference service is actually using (it reads LOCAL_LLM_ENDPOINT from the running inference container, falling back to /etc/logster/logster.env), then:

Calls GET /models on the GPU node to confirm it is reachable and to read the served model id.
Sends a tiny dummy chat completion to confirm the model actually answers.
PASS — the GPU node is reachable and the model answered (HTTP 200). The inference path can reach the model.
FAIL — the message tells you which stage failed and the likely cause:

Symptom reported	Likely cause
`LOCAL_LLM_ENDPOINT is not set`	The value is missing from `logster.env` — see App Node → Step 2.
Cannot resolve the LLM host (DNS)	Wrong hostname in `LOCAL_LLM_ENDPOINT`.
Connection refused	The GPU node is down or the port is wrong.
Connection timed out	Firewall, wrong host, or the node is unreachable.
HTTP 400	Model name mismatch.
HTTP 401	Bad or missing API key.
HTTP 5xx	Error on the GPU node itself.
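When probing the endpoint by hand with curl, the rows of this table can be recognised from curl's exit code (6 for a DNS failure, 7 for a failed connection, 28 for a timeout) and from the HTTP status. The sketch below encodes that mapping. The function names are hypothetical, the messages are paraphrased from the table rather than copied from the script, and the probe sends no API key because the name of the key variable is not given here, so a 401 from it may only mean the key was omitted.

```shell
# Map a curl exit code and an HTTP status to the likely cause from the table.
explain_llm_failure() {
    local rc="$1" status="$2"
    case "$rc" in
        6)  echo "DNS: wrong hostname in LOCAL_LLM_ENDPOINT"; return 1 ;;
        7)  echo "connect failed: GPU node down or wrong port"; return 1 ;;
        28) echo "timed out: firewall, wrong host, or node unreachable"; return 1 ;;
    esac
    case "$status" in
        200) echo "PASS: model endpoint answered"; return 0 ;;
        400) echo "HTTP 400: model name mismatch" ;;
        401) echo "HTTP 401: bad or missing API key" ;;
        5??) echo "HTTP $status: error on the GPU node itself" ;;
        *)   echo "unclassified: curl exit $rc, HTTP $status" ;;
    esac
    return 1
}

# Probe GET /models on the given endpoint and classify the outcome.
probe_llm_models() {
    local endpoint="$1" status rc=0
    status=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 10 \
        "${endpoint%/}/models") || rc=$?
    explain_llm_failure "$rc" "$status"
}
```

For example, probe_llm_models "$LOCAL_LLM_ENDPOINT" run from the App Node prints a single classification line and returns non-zero unless the endpoint answered with HTTP 200.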

[!IMPORTANT] If the local LLM is unreachable the stack still runs, but every window is reported as an error verdict (not benign). Make sure the GPU Node is up and reachable from the App Node before starting — see GPU Node.

Advanced — overriding the defaults

The script targets the appliance layout by default. For a non-standard layout, you can override these environment variables:

Variable	Default	Purpose
`LOGSTER_DEPLOY_DIR`	`/opt/logster/deploy`	Directory holding `docker-compose.yml`.
`LOGSTER_ENV_FILE`	`/etc/logster/logster.env`	Env file the LLM endpoint/key are read from when the container isn't running.

LOGSTER_DEPLOY_DIR=./deploy sudo -E /opt/logster/troubleshoot/logster-troubleshoot.sh

If docker-compose.yml is not found at LOGSTER_DEPLOY_DIR, the script exits with status 2 and prints which directory it expected.
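The override-with-default pattern and the status 2 exit described above can be sketched as follows. This only illustrates the behaviour; it is not the script's source, and the function name check_deploy_dir and the exact messages are hypothetical.

```shell
# Fall back to the appliance defaults when the variables are unset or empty.
DEPLOY_DIR="${LOGSTER_DEPLOY_DIR:-/opt/logster/deploy}"
ENV_FILE="${LOGSTER_ENV_FILE:-/etc/logster/logster.env}"

# Return 2 when docker-compose.yml is missing, naming the directory that was checked.
check_deploy_dir() {
    local dir="${1:-$DEPLOY_DIR}"
    if [ ! -f "$dir/docker-compose.yml" ]; then
        echo "docker-compose.yml not found in $dir" >&2
        return 2
    fi
    echo "using $dir/docker-compose.yml"
}
```

Note that a relative value such as ./deploy is resolved against the directory the command is run from, so an absolute path is the less surprising choice in scheduled jobs.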

Appliance stack issues

Stack won't start:

sudo systemctl status logster.service
sudo docker compose -f /opt/logster/deploy/docker-compose.yml ps
sudo docker compose -f /opt/logster/deploy/docker-compose.yml logs --tail 100

The two most common causes are a missing or invalid license (App Node → Step 3) and an incomplete logster.env (App Node → Step 2).

Reset to a clean state (wipes all data):

sudo systemctl stop logster.service
cd /opt/logster/deploy && sudo docker compose --profile services down -v
sudo systemctl start logster.service

Endpoint not appearing on the console: see Connect Windows Endpoints.