Edge-to-HPC integration
The technologies we wire together — and why we wire them this way.
The idea
We work with a set of technologies that, individually, are well understood: Slurm job scheduling, Kubeflow ML pipelines, GPU inference, edge compute on companion computers, flight controllers, sensor ingestion, Flask APIs, relational databases. What's less common is wiring them together into a single backend that takes data in from the edge, processes it with HPC, and pushes results or commands back out — at scale, in real time, reliably.
That integration pattern is the capability. The application could be anything: autonomous drone operations, robot coordination, distributed sensor networks, real-time ISR processing, industrial IoT analytics, environmental monitoring, or something that hasn't been asked for yet. The backend doesn't care what the endpoint is. It cares about data in, compute, and decisions out.
The building blocks
- Slurm — Job scheduling and batch compute. Image processing, model training, large-scale analytics, data aggregation. The workhorse for anything that doesn't need sub-second latency.
- Kubernetes & Kubeflow — ML pipelines, experiment tracking, model registry, and serving. Autoscales to match load. Runs alongside Slurm or independently.
- GPU inference — Real-time model serving for object detection, classification, anomaly detection, or any ML task that needs to return results fast.
- Edge compute — Lightweight processing on the endpoint itself — a Raspberry Pi, a Jetson, any Linux companion computer. Handles preprocessing, buffering, and local decision-making when the backend isn't reachable.
- ArduPilot & flight controllers — The endpoint side for aerial and ground vehicles. Flight plans, waypoints, autonomous navigation. Platform-agnostic — anything ArduPilot-compatible with an uplink port.
- Sensor ingestion — Images, video, lidar, radar, telemetry, GPS. Streamed or batched into the backend for processing and storage.
- APIs & control plane — Python/Flask APIs that handle registration, telemetry, command dispatch, plan management, and integration with third-party services.
- Data layer — Percona MySQL for state, metadata, and telemetry. Scalable object storage for sensor data. The plumbing that makes everything else stateful.
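To make the control-plane block concrete, here is a sketch of the registration and command-dispatch logic that a Flask API would wrap. The function names, fields, and in-memory registry are illustrative assumptions, not a production schema — real state lives in the data layer.

```python
import time
import uuid

# Hypothetical in-memory registry; in production this state sits in MySQL.
ENDPOINTS = {}

def register_endpoint(payload):
    """Register an edge endpoint (drone, robot, sensor node) with the backend."""
    endpoint_id = str(uuid.uuid4())
    ENDPOINTS[endpoint_id] = {
        "kind": payload["kind"],                          # e.g. "drone", "sensor"
        "capabilities": payload.get("capabilities", []),
        "last_seen": time.time(),
        "pending_commands": [],
    }
    return {"endpoint_id": endpoint_id}

def ingest_telemetry(endpoint_id, telemetry):
    """Record a telemetry update and hand back any queued commands."""
    ep = ENDPOINTS[endpoint_id]
    ep["last_seen"] = time.time()
    ep.setdefault("telemetry", []).append(telemetry)
    # The telemetry response doubles as the command channel,
    # so endpoints only ever need to poll.
    commands, ep["pending_commands"] = ep["pending_commands"], []
    return {"commands": commands}

def dispatch_command(endpoint_id, command):
    """Queue a command; it is delivered on the endpoint's next check-in."""
    ENDPOINTS[endpoint_id]["pending_commands"].append(command)
```

In the real system each function sits behind a Flask route; piggybacking command delivery on telemetry check-ins keeps the endpoint side simple over unreliable links.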
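On the compute side, handing a batch of sensor imagery to Slurm is mostly a matter of rendering and submitting a job script. A minimal sketch — the partition name, GPU count, and worker command are placeholders, not a real cluster configuration:

```python
import subprocess

def render_job_script(image_dir, partition="gpu", gpus=1, time_limit="01:00:00"):
    """Render an sbatch script for a GPU image-processing job.

    Partition, resource, and script names are illustrative.
    """
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --gres=gpu:{gpus}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --job-name=edge-image-batch",
        f"python process_images.py --input {image_dir}",  # hypothetical worker
    ])

def submit(script_text):
    """Pipe the script to sbatch; returns Slurm's 'Submitted batch job N' line."""
    result = subprocess.run(["sbatch"], input=script_text,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

The control plane renders and submits these scripts as sensor data lands, which is what lets batch-scale processing sit behind the same API that handles real-time traffic.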
What this enables
- Connecting many diverse endpoints to a single HPC-backed backend
- Processing sensor data with ML as it arrives — not hours later in a batch job
- Training and deploying models on the same infrastructure that runs the backend
- Coordinating groups of endpoints with shared plans and real-time updates
- Endpoints that keep working when disconnected and sync when they reconnect
- Plugging in third-party AI/ML services alongside your own models
- Fusing data from multiple sources into maps, dashboards, and analytics
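The disconnected-operation point above comes down to store-and-forward buffering on the companion computer. A minimal sketch — the capacity, drop policy, and send callback are illustrative choices, not a fixed design:

```python
from collections import deque

class EdgeBuffer:
    """Queue readings locally while the backend is unreachable,
    then flush them in arrival order on reconnect."""

    def __init__(self, send, capacity=1000):
        self._send = send                     # ships one reading upstream
        self._queue = deque(maxlen=capacity)  # oldest readings drop at capacity
        self.online = False

    def record(self, reading):
        if self.online:
            self._send(reading)
        else:
            self._queue.append(reading)

    def reconnect(self):
        """Mark the backend reachable and drain the backlog."""
        self.online = True
        while self._queue:
            self._send(self._queue.popleft())
```

Bounding the queue with `maxlen` means a long outage degrades gracefully — the endpoint keeps the freshest data rather than running out of memory.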
The specific shape of the system depends on the problem. We bring the integration experience and the HPC and AI/ML infrastructure to make it work.
Why HPC and not just cloud
Cloud VMs and managed services work fine for dashboards and CRUD APIs. They do not work when you need to run object detection on thousands of images per minute, retrain models on live data, schedule parallel jobs across GPU nodes, or fuse sensor streams from dozens of sources into something coherent. That is what Slurm clusters and GPU infrastructure are for. Our founder has spent years building exactly this kind of compute — the same tooling that runs ML training, research workloads, and production inference at scale.
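One way to picture that split is as a dispatch rule: the backend routes each workload by its latency budget, sending sub-second requests to the GPU serving tier and everything else to the pipeline or batch queue. The tier names and the one-second threshold below are illustrative, not a production policy:

```python
def route_workload(kind, latency_budget_s):
    """Route a workload to a compute tier by latency budget (sketch)."""
    if latency_budget_s < 1.0:
        return "gpu-inference"      # real-time serving: detection, classification
    if kind in {"training", "retraining"}:
        return "kubeflow-pipeline"  # ML pipelines with tracking and a registry
    return "slurm-batch"            # image batches, aggregation, analytics
```

The point is architectural rather than algorithmic: because all three tiers sit behind one backend, the same sensor stream can feed real-time inference today and a retraining job tonight.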
Where it applies
Defence and security. Disaster response. Wildfire detection. Infrastructure inspection. Agriculture. Environmental monitoring. Resource surveying. Algorithmic trading. Any domain where you have data coming in from the edge and need real compute behind it. See Sectors for how this maps to specific industries, or get in touch to talk about your problem.