Model Deployment
Logster's detection quality comes from a pair of pre-trained graph neural network models — one for Windows, one for Linux. This guide explains how the models are packaged, how they are loaded at inference-service startup, how to swap them, and how to verify integrity.
This guide plays the same role for Logster that a "Llama Guard Deployment" guide plays for that model: it describes the deployment path for the detection model that backs the pipeline.
Model artifacts
Logster ships two models in the default Compose stack:
| Platform | Host path | Container path |
|---|---|---|
| Windows | models/models/balanced_run_20260114_143653/best_model.pt | /app/models/balanced_run_20260114_143653/best_model.pt |
| Linux | models/models/balanced_run_20260222_142924/best_model.pt | /app/models/balanced_run_20260222_142924/best_model.pt |
Both are PyTorch .pt files containing the weights of a
3-layer Heterogeneous Graph Attention Network (GAT) with 128
hidden dimensions, 4 attention heads, and a 2-class softmax output
(benign / attack).
The host directory models/models/ is mounted
read-only into the inference container at /app/models. The
container can never modify the weights.
How the models are loaded
On startup, the inference service:
- Reads model.path and model.linux_model_path from deploy/service-config.yaml.
- Calls torch.load() on each file to load the model state dict.
- Instantiates the HeteroGNNClassifier (Windows) and HeteroGNNClassifierLinux architectures and loads the weights into each.
- Moves each model to the device specified by model.device (cpu or cuda).
- Sets both models to evaluation mode.
If any of these steps fails, the container crashes at startup
rather than silently running on a broken model. Check
docker compose logs inference for the exact traceback.
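The startup sequence above can be sketched roughly as follows. This is an illustration, not the service's actual code: load_models, the builders mapping, and the load_fn hook are hypothetical names, the model section of the config is assumed to be already parsed into a dict, and the model classes are assumed to expose the standard torch.nn.Module methods (load_state_dict, to, eval).

```python
from typing import Any, Callable, Dict, Optional


def load_models(
    model_cfg: Dict[str, str],
    builders: Dict[str, Callable[[], Any]],
    load_fn: Optional[Callable[[str], Any]] = None,
) -> Dict[str, Any]:
    """Load every configured model; any exception propagates and aborts startup.

    model_cfg: the `model` section of the service config (assumed already parsed).
    builders:  config key -> zero-argument factory for the matching architecture
               (hypothetical; the real constructors may take arguments).
    load_fn:   reads a state dict from a path; defaults to torch.load.
    """
    device = model_cfg.get("device", "cpu")
    if load_fn is None:
        import torch  # imported lazily so the sketch can be exercised without PyTorch

        # weights_only=True restricts unpickling to tensors and plain containers.
        load_fn = lambda path: torch.load(path, map_location=device, weights_only=True)

    models: Dict[str, Any] = {}
    for key, build in builders.items():
        state_dict = load_fn(model_cfg[key])  # read the configured path, load the weights
        model = build()                       # instantiate the architecture
        model.load_state_dict(state_dict)     # copy the weights into it
        model.to(device)                      # move to cpu or cuda
        model.eval()                          # evaluation mode
        models[key] = model
    return models
```

Because nothing is caught inside the loop, a missing file or a mismatched state dict surfaces as a traceback at startup, which matches the fail-fast behavior described above.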
Swapping a model
Step 1 — Place the new model file
Copy the new best_model.pt into a new subdirectory under
models/models/. Use a clear, dated directory
name so you have a version history on disk:
cd models/models/
mkdir my_run_2026_04_12
cp /path/to/new_best_model.pt my_run_2026_04_12/best_model.pt
[!IMPORTANT] Do not overwrite existing model directories. Always add new versions side-by-side so that a rollback is a single config edit rather than a file restore.
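If you script this step, the no-overwrite rule can be enforced rather than remembered. The following is a minimal sketch; stage_model is a hypothetical helper, not part of Logster.

```python
import shutil
from pathlib import Path


def stage_model(src: Path, models_root: Path, run_name: str) -> Path:
    """Copy `src` to <models_root>/<run_name>/best_model.pt without overwriting."""
    target_dir = models_root / run_name
    # exist_ok=False makes mkdir raise FileExistsError when the version directory
    # already exists, so an earlier model cannot be replaced by accident.
    target_dir.mkdir(parents=True, exist_ok=False)
    target = target_dir / "best_model.pt"
    shutil.copy2(src, target)
    return target
```

For example, stage_model(Path("/path/to/new_best_model.pt"), Path("models/models"), "my_run_2026_04_12") mirrors the shell commands above and fails loudly if that directory is already in use.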
Step 2 — Update the config
Edit deploy/service-config.yaml:
model:
path: "/app/models/my_run_2026_04_12/best_model.pt" # Windows
linux_model_path: "/app/models/balanced_run_20260222_142924/best_model.pt"
device: "cpu"
Remember the path is the container path. The mount root is
/app/models, so the container sees
models/models/my_run_2026_04_12/best_model.pt as
/app/models/my_run_2026_04_12/best_model.pt.
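The host-to-container translation is mechanical, so it is easy to check in a script before editing the config. A small sketch, assuming the default mount described in this guide (to_container_path is a hypothetical helper):

```python
from pathlib import PurePosixPath

HOST_ROOT = PurePosixPath("models/models")     # host side of the read-only mount
CONTAINER_ROOT = PurePosixPath("/app/models")  # container side of the mount


def to_container_path(host_path: str) -> str:
    """Translate a host path under models/models/ into the path the container sees."""
    # relative_to raises ValueError if the file is outside the mounted directory.
    relative = PurePosixPath(host_path).relative_to(HOST_ROOT)
    return str(CONTAINER_ROOT / relative)
```

A path outside models/models/ raises ValueError, which catches the common mistake of pointing the config at a file the container cannot see.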
Step 3 — Restart the inference service
Restart the inference container, then watch the startup logs (docker compose logs inference) to confirm that the new model loads cleanly.
You should see a log line showing the model path and a successful load.
Step 4 — Smoke-test the new model
Compare a few recent inferences against the previous model's behavior:
# How many inferences in the last 30 minutes?
curl 'http://localhost:9200/logster-inferences/_count' \
-H 'Content-Type: application/json' \
-d '{"query":{"range":{"@timestamp":{"gte":"now-30m"}}}}'
# What's the prediction distribution?
curl 'http://localhost:5001/api/distribution'
A healthy swap looks like:
- The inferences_run metric resumes climbing at the same rate.
- The prediction distribution is similar to pre-swap (big benign majority, small attack tail, low error rate).
- inference_time_ms is comparable to pre-swap.
If any of these look very different from the previous model, you
may have deployed a model with different expectations about input
shape or distribution. Roll back by reverting model.path to the
previous directory and restarting.
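"Similar" can be made measurable by comparing label counts captured before and after the swap. The sketch below uses total variation distance. It assumes you have reduced each snapshot to a mapping of label to count; the response format of /api/distribution is not documented here, so that mapping, the function names, and any threshold you choose are assumptions.

```python
from typing import Dict


def normalize(counts: Dict[str, int]) -> Dict[str, float]:
    """Turn raw label counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no inferences to compare")
    return {label: n / total for label, n in counts.items()}


def distribution_shift(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Total variation distance between two label distributions.

    0.0 means identical proportions; 1.0 means no overlap at all.
    """
    p, q = normalize(before), normalize(after)
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)
```

For instance, moving from 980 benign / 15 attack / 5 error to 975 / 20 / 5 gives a shift of 0.005. What counts as "very different" depends on your traffic, so calibrate the threshold against the normal variation between two windows of the same model.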
Rolling back
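Because you kept the previous model directory under
models/models/, rollback is a one-line config edit and a
restart: set model.path (or model.linux_model_path) in deploy/service-config.yaml back to the previous directory, then restart the inference service.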
Integrity verification
Model files are executable code from PyTorch's perspective:
torch.load() unpickles the file, and unless loading is restricted
(for example with weights_only=True), unpickling can run arbitrary code at load time.
This means tampering with a model file is equivalent to code
injection into the inference service.
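The risk is easy to demonstrate with Python's pickle module, which torch.load() builds on. The sketch below is deliberately harmless and uses only the standard library; the Payload class is illustrative and has nothing to do with Logster's model files.

```python
import os
import pickle


class Payload:
    """An object whose pickle stream instructs the loader to call a function."""

    def __reduce__(self):
        # On load, pickle calls eval() with this string. A real attacker would
        # put something far less harmless here.
        return (eval, ("__import__('os').getpid()",))


blob = pickle.dumps(Payload())  # what a tampered file could contain
result = pickle.loads(blob)     # merely loading the bytes evaluates the expression
```

After the load, result is the process ID returned by the evaluated expression, not a Payload instance: the file's author, not the loading code, decided what ran.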
Record a checksum on deploy
Every time you deploy a new model, record its SHA-256 (for example, by running sha256sum on the model file).
Store the output alongside your deployment notes (in your change management system, config management repo, or however your team tracks infrastructure changes).
Verify on restart
Before each restart of the inference service, compare the current checksum against the recorded value:
# Expected
echo "abc123... models/models/my_run_2026_04_12/best_model.pt" > expected.sum
# Verify
sha256sum -c expected.sum
If the checksum does not match, do not start the service. Investigate the mismatch first.
[!WARNING] If torch.load is loading a tampered file, the container will still start successfully and will happily produce attacker-controlled predictions. There is no automatic integrity check at runtime: you must verify checksums manually, or wire the check into your deployment pipeline.
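One way to wire the check into a pipeline is a small script that records the digest at deploy time and refuses to proceed on a mismatch. The sketch below uses only the standard library and writes the same "digest, two spaces, path" line layout that sha256sum -c reads; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks so large checkpoints are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record(path: Path, manifest: Path) -> None:
    """Write one '<hex digest>  <path>' line for the given model file."""
    manifest.write_text(f"{sha256_of(path)}  {path}\n")


def verify(manifest: Path) -> bool:
    """Return True only if every file listed in the manifest still matches its digest."""
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, _, name = line.partition("  ")
        if sha256_of(Path(name)) != expected:
            return False
    return True
```

A deployment job would call record() once when the model is placed, keep the manifest somewhere the inference host cannot modify, and run verify() before every restart, aborting if it returns False.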
Use signed containers
For the strongest supply chain posture, distribute Logster container images signed with cosign or Notary v2, and gate deployment on signature verification. This defends against a compromised container registry in addition to a compromised model file.
CPU vs GPU deployment
CPU
Set model.device: cpu in
deploy/service-config.yaml. No
container changes required. Works on any Docker host.
Right for: small deployments (≤ 100 endpoints), development environments, cost-sensitive small teams.
GPU
Set model.device: cuda. The inference container needs the
NVIDIA container runtime to access a GPU on the host.
The host must have:
- An NVIDIA driver that supports the CUDA version of the PyTorch build.
- The nvidia-container-toolkit package installed.
Right for: production deployments with hundreds to thousands of endpoints. The Logster SaaS stack runs on NVIDIA RTX 4090 GPUs — see Licensing Guide: Hardware.
Sizing
TBD — real benchmarks required.
The inference service's per-replica throughput depends on:
- Hardware (CPU core count / GPU model)
- inference.window and inference.interval
- Average events per endpoint per window
- Graph size (nodes and edges) per window
Populate this section with measured numbers from your own deployment, or from Logster Support's published benchmarks once available. Do not guess — undersized hardware will silently degrade detection quality.
Metrics to capture when benchmarking:
- inferences_run per second (Prometheus)
- inference_time_ms p50 / p95 / p99 (Prometheus)
- active_endpoints at which inference_time_ms starts rising
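If you collect raw latency samples during a benchmark run instead of reading percentiles from Prometheus, they can be summarized with the standard library. This sketch assumes a plain list of inference_time_ms values; latency_percentiles is a hypothetical helper.

```python
from statistics import quantiles
from typing import Dict, Sequence


def latency_percentiles(samples_ms: Sequence[float]) -> Dict[str, float]:
    """p50 / p95 / p99 of raw latency samples, using linear interpolation."""
    # n=100 yields 99 cut points; cut point k (1-based) is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Recording these three values at several endpoint counts makes it easier to spot the load at which inference_time_ms starts rising.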
Where to go next
- Admin Guide: Installation Parameters (the full reference for model.* config keys).
- Admin Guide: Daily Operations (monitor inference_time_ms and inferences_run to see how the model is performing).
- Security Guide: Overview (the supply-chain threat model for model files).
- Licensing Guide (SaaS vs On-Prem hardware expectations).