Security Guide: Security Validations
This page documents how Logster validates data as it flows through the pipeline — at ingestion, during normalization, during inference, and at the API layer. It is written so that security engineers can reason about attack surface: "what happens if an attacker injects malformed JSON into Kafka?", "what happens if the model produces a nonsense prediction?", "what happens if the API receives a crafted request?"
Stage 1 — Raw Kafka ingestion
Trust boundary: the five raw topics (sysmon-logs,
linux-auditd-logs, linux-ebpf-process-logs,
linux-ebpf-file-logs, linux-ebpf-network-logs). Everything
published to these topics is treated as untrusted input and must be
validated by the normalizer before being re-published.
What is validated
The normalizer accepts any JSON payload on these topics, but parses it through platform-specific parser functions:
- Windows events go through normalize_sysmon_event().
- Linux auditd events go through normalize_auditd_event().
- Linux eBPF events go through normalize_ebpf_event().
Each parser:
- Checks that the event type is known. Unknown event types are dropped with a parse_errors counter increment.
- Extracts only the fields it understands into the typed data field of the NormalizedEvent. Unrecognized fields are preserved under raw_event for forensic replay but are not routed into downstream logic.
- Normalizes identifiers. Hostnames are lowercased (DESKTOP-X → desktop-x), timestamps are converted to Unix epoch floats, and tenant_id defaults to "default" if not present.
- Assigns a fresh UUID as event_id. This is used by Logstash as the Elasticsearch document id, so duplicates are implicitly deduped at the ES layer.
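The four parser steps above can be sketched in a few lines. This is an illustration only, not the actual normalize_sysmon_event() code: the function name normalize_event, the KNOWN_EVENT_TYPES and KNOWN_FIELDS sets, the counters dict, and the assumption that raw timestamps arrive as ISO 8601 strings are all hypothetical.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional

# Hypothetical: event types and fields this sketch "understands".
KNOWN_EVENT_TYPES = {"process", "file", "network"}
KNOWN_FIELDS = ("image", "command_line", "pid")

# Stand-in for the parse_errors counter.
counters = {"parse_errors": 0}


def normalize_event(raw: dict) -> Optional[dict]:
    """Return a normalized event, or None if the event type is unknown."""
    if raw.get("event_type") not in KNOWN_EVENT_TYPES:
        counters["parse_errors"] += 1
        return None

    # Assumption: the raw timestamp is an ISO 8601 string; naive values are UTC.
    ts = datetime.fromisoformat(raw["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)

    return {
        "event_id": str(uuid.uuid4()),                 # fresh UUID per event
        "event_type": raw["event_type"],
        "hostname": str(raw.get("hostname", "")).lower(),
        "timestamp": ts.timestamp(),                   # Unix epoch float
        "tenant_id": raw.get("tenant_id", "default"),
        # Only known fields reach downstream logic.
        "data": {k: raw[k] for k in KNOWN_FIELDS if k in raw},
        # Everything else is kept for forensic replay only.
        "raw_event": raw,
    }
```

Note that unknown fields never reach data; they survive only inside raw_event.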
What is not validated
- Kafka write access. In PLAINTEXT mode (the default), anyone who can reach the broker can publish to any topic. This is the single most important thing to lock down before production. See Configuration: Kafka authentication.
- Payload schema strictness. The parser is forgiving: fields that don't match expected types are often silently coerced or skipped rather than raising an error. This is an intentional trade-off against the realities of real-world Sysmon/auditd payloads (which routinely include unexpected fields), not a security feature.
- Event authenticity. There is no per-host signature on normalized events. A malicious collector that can write to Kafka can impersonate any host.
[!WARNING] If you cannot fully control Kafka write access, Logster's detections become untrustworthy. An attacker who can write arbitrary events into the raw topics can craft an endpoint history that looks entirely normal, burying real attack evidence.
Stage 2 — Normalized events
Trust boundary: events on normalized-endpoint-events are
trusted by the inference service to be well-formed.
What is validated
The inference service validates that normalized events:
- Have a non-empty endpoint_id, tenant_id, and platform.
- Carry a timestamp that falls within the current sliding window (inference.window minutes).
- Have a recognized event_type for the platform (process, file, network, script, syscall).
Events that fail these checks are dropped from the window, with a log message but no alert.
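As a rough sketch of these three checks, assuming the window covers the last window_minutes up to "now" and using a single set of event types rather than a per-platform mapping (both simplifications; is_valid_for_window is a hypothetical name):

```python
import time
from typing import Optional

RECOGNIZED_EVENT_TYPES = {"process", "file", "network", "script", "syscall"}
REQUIRED_FIELDS = ("endpoint_id", "tenant_id", "platform")


def is_valid_for_window(
    event: dict, window_minutes: float, now: Optional[float] = None
) -> bool:
    """Return True if the event passes the checks listed above."""
    now = time.time() if now is None else now

    # 1. Required identifiers must be present and non-empty.
    if any(not event.get(field) for field in REQUIRED_FIELDS):
        return False

    # 2. Timestamp must be numeric and inside the sliding window.
    ts = event.get("timestamp")
    if isinstance(ts, bool) or not isinstance(ts, (int, float)):
        return False
    if not (now - window_minutes * 60 <= ts <= now):
        return False

    # 3. Event type must be recognized.
    return event.get("event_type") in RECOGNIZED_EVENT_TYPES
```

A caller would log and skip any event for which this returns False, without raising an alert.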
Graph construction safety
When building the heterogeneous graph for a window, the graph builder:
- Caps the number of nodes and edges per window. Abnormally large windows (tens of thousands of events) are truncated to protect the GNN from OOM.
- Deduplicates nodes by deterministic key (e.g. process_guid for Windows processes, pid for Linux). This prevents a malicious event from inflating the graph with thousands of distinct "processes" that are actually the same one.
- Timestamps every edge, so the model can reason about temporal ordering rather than just topology.
If graph construction fails — missing required fields, malformed
relationships, window too small to form any edges — the inference
service emits an InferenceResult with
prediction="error" and moves on. An error prediction never
becomes an alert.
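The node cap and deduplication described above might look like the following sketch. The MAX_NODES value, the helper names, and the choice to scope keys by endpoint_id are assumptions for illustration; the real limits and key layout are not stated on this page.

```python
from typing import Iterable

MAX_NODES = 10_000  # hypothetical cap


def node_key(event: dict) -> tuple:
    """Deterministic node key: process_guid on Windows, pid on Linux."""
    if event["platform"] == "windows":
        return (event["endpoint_id"], event["process_guid"])
    return (event["endpoint_id"], event["pid"])


def collect_nodes(events: Iterable[dict], max_nodes: int = MAX_NODES) -> dict:
    """Build a key -> first-seen-event map, deduplicated and capped."""
    nodes: dict = {}
    for event in events:
        key = node_key(event)
        if key in nodes:
            continue  # same process seen again: no new node
        if len(nodes) >= max_nodes:
            break     # truncate oversized windows
        nodes[key] = event
    return nodes
```

Because the key is derived from the event itself, replaying the same process a thousand times still yields one node.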
Stage 3 — GNN inference
Trust boundary: the pre-trained .pt model file is assumed
trustworthy. If an attacker has swapped the model file, Logster's
detection pipeline is compromised regardless of any other control.
What is validated
- Model file exists and loads cleanly. A load failure crashes the service at startup. Always prefer a hard crash to silently running on a broken model.
- Input tensor shape matches the model's expected input. Shape mismatches are caught and logged as inference errors rather than crashes.
- Output is a valid 2-class softmax. attack_prob is clamped to [0.0, 1.0]. NaN / Inf outputs are treated as errors.
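A minimal sketch of the output check, using plain Python floats in place of the model's output tensor. The function name and the assumption that index 1 holds the attack class are hypothetical.

```python
import math
from typing import Optional, Sequence


def attack_prob_from_output(probs: Sequence[float]) -> Optional[float]:
    """Return a clamped attack probability, or None if the output is invalid."""
    if len(probs) != 2:
        return None  # not a 2-class output
    if not all(math.isfinite(p) for p in probs):
        return None  # NaN / Inf is treated as an error
    # Assumption: index 1 is the attack class.
    return min(1.0, max(0.0, probs[1]))
```

A None return here corresponds to an inference error, which never becomes an alert.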
Threshold enforcement
The threshold between benign and attack is set by
inference.threshold (default 0.7). Below that value, the
inference is labeled benign regardless of how close it came.
The alerts service re-checks this via alerts.min_threshold —
there are two independent thresholds, and raising either one
is a valid way to suppress noise.
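The interaction between the two thresholds can be sketched as two independent checks. The function names are hypothetical, and no default is assumed for alerts.min_threshold because this page does not state one.

```python
def label_inference(attack_prob: float, threshold: float = 0.7) -> str:
    """Inference-side check: inference.threshold (default 0.7)."""
    return "attack" if attack_prob >= threshold else "benign"


def becomes_alert(prediction: str, attack_prob: float, min_threshold: float) -> bool:
    """Alerts-side check: alerts.min_threshold is applied independently."""
    return prediction == "attack" and attack_prob >= min_threshold
```

For example, with inference.threshold at 0.7 and alerts.min_threshold at 0.8, a score of 0.75 is labeled attack but does not produce an alert.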
Model integrity
To verify the model file hasn't been tampered with, compute and record a checksum whenever you deploy a new model:
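One way to do this, as a minimal sketch assuming SHA-256 and hypothetical helper names (where the recorded checksum is stored is left to your deployment process):

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Compare the current file hash with the one recorded at deploy time."""
    return sha256_of_file(path) == recorded_checksum
```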
Then compare it on every restart. See Model Deployment.
Stage 4 — Alerts
Trust boundary: the alerts service trusts its own in-memory state and the inference results it consumes from Kafka.
What is validated
- Minimum threshold. Inferences below alerts.min_threshold never become alerts, even if prediction == "attack".
- Dedup key well-formedness. If tenant_id, endpoint_id, or platform is missing, the result is dropped rather than producing a headless alert.
- Severity derivation. The service maps attack_prob to the severity enum deterministically. There is no way for an inference to "claim" a severity outside the derivation.
- State machine bounds. Analyst state transitions (open → ...) are validated to be valid AlertStatus values. See API User Guide: Reference.
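The dedup-key and severity checks could be sketched as below. The Severity members and the probability cutoffs are invented for illustration; this page does not list the real enum values or bands. The point is only that severity is a pure function of attack_prob.

```python
from enum import Enum
from typing import Optional


class Severity(str, Enum):
    # Hypothetical members; the real enum may differ.
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


def derive_severity(attack_prob: float) -> Severity:
    """Deterministic mapping; the cutoffs here are illustrative only."""
    if attack_prob >= 0.95:
        return Severity.CRITICAL
    if attack_prob >= 0.85:
        return Severity.HIGH
    return Severity.MEDIUM


def dedup_key(result: dict) -> Optional[tuple]:
    """Return the dedup key, or None if any part is missing or empty."""
    parts = tuple(result.get(f) for f in ("tenant_id", "endpoint_id", "platform"))
    return parts if all(parts) else None
```

A result whose dedup_key is None would be dropped rather than turned into an alert.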
What is not validated
- Identity of the analyst performing a state transition. The API accepts any string as resolved_by and analyst. Without a reverse proxy that injects a trusted identity header, an attacker with API access can attribute any verdict to anyone.
Stage 5 — REST API
Trust boundary: in the default build, everything. See the Threat model.
What is validated
- Request schema. FastAPI + Pydantic automatically validate every request body against its schema. Missing required fields or wrong types return 422.
- Path parameters. Alert IDs are passed through to the store unchanged but are not directly executed against any query language (the store is a dict in the default build, not SQL).
- Status enum. PATCH /alerts/{id} validates the status value against the AlertStatus enum. Invalid values return 400.
- Pagination bounds. limit is enforced between 1 and 1000; offset must be >= 0.
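In the service these checks are performed by FastAPI and Pydantic. The framework-free sketch below only illustrates the rules themselves. The enum lists just the two status values that appear on this page (the real AlertStatus enum may have more), and the function names are hypothetical.

```python
from enum import Enum


class AlertStatus(str, Enum):
    # Only the values mentioned on this page; the real enum may have more.
    OPEN = "open"
    FALSE_POSITIVE = "false_positive"


def parse_status(value: str) -> AlertStatus:
    """Raise ValueError for anything outside the enum."""
    return AlertStatus(value)


def check_pagination(limit: int, offset: int) -> None:
    """Raise ValueError unless 1 <= limit <= 1000 and offset >= 0."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    if offset < 0:
        raise ValueError("offset must be >= 0")
```

In an HTTP handler, each ValueError would be translated into the error response documented above.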
What is not validated
- Authentication. There is none — see Configuration.
- Authorization. Everyone who reaches the API sees every tenant and every alert.
- Rate limiting. The API does not rate-limit callers. A broken integration can easily DoS it.
CORS
The CORS middleware is configured from api.cors_origins in
deploy/service-config.yaml.
The default ["*"] accepts any origin. In production, restrict
this to your dashboard origin(s) only. See
Configuration: CORS.
Stage 6 — Dashboard
Trust boundary: The dashboard reads directly from Elasticsearch. Its backend has no write access to the alert store.
What is validated
The dashboard's Express backend validates query parameters on
every route (time ranges, hostnames, platform filters). Invalid
inputs return 400 or 500 and are logged.
What is not validated
- Authentication. DISABLE_AUTH=true by default.
- Authorization. Same problem as the API: every authenticated user sees every tenant's data.
- ES injection via query parameters. The Express backend uses parameterized Elasticsearch queries via the official ES client, so direct injection is not a concern. However, a malicious user could submit extremely expensive queries (very wide time ranges, huge aggregation sizes) and overload the cluster. Mitigate by enforcing an ES query cost budget at the reverse proxy layer.
Audit logging
Logster does not currently produce an explicit audit log of analyst actions. The closest equivalents are:
- Alert update history. Every PATCH / feedback call updates updated_at and resolved_by, and appends to analyst_notes. The alert store keeps the latest state but not the full history.
- Service-level tracing. Every API call is captured by OpenTelemetry and shipped to Tempo (via the in-stack OpenTelemetry Collector). Traces contain timestamps, route paths, and status codes but not request bodies, so you cannot reconstruct "who changed alert X to false_positive" from traces alone.
- Docker container logs. FastAPI logs every request at INFO level. Ship these to your log aggregator if you need a durable audit trail.
For a real audit log of analyst actions, the recommended pattern
is to record every /feedback and /alerts/{id} PATCH at the
reverse proxy layer with the authenticated identity from your
identity provider.
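As a rough sketch of what such a record could contain, the helper below builds one JSON line per audited call. The record fields, the audit_line name, and the route-matching rule are assumptions; in practice the identity would come from whatever header your reverse proxy sets after authentication.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_line(
    method: str,
    path: str,
    identity: str,
    status_code: int,
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Return a JSON audit record for analyst actions, or None for other calls."""
    # Assumption: audited calls are anything under a /feedback route
    # and PATCH requests on /alerts/{id}.
    audited = "/feedback" in path or (
        method == "PATCH" and path.startswith("/alerts/")
    )
    if not audited:
        return None
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": now.isoformat(),
            "identity": identity,  # authenticated identity from the proxy
            "method": method,
            "path": path,
            "status_code": status_code,
        },
        sort_keys=True,
    )
```

Appending these lines to durable, append-only storage gives the full history that the alert store itself does not keep.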
Where to go next
- Configuration — step-by-step hardening procedures for each of the issues described above.
- Security Guide: Overview — the threat model this page is scoped against.
- Admin Guide: Important Considerations — the full production hardening checklist.