Notes on systems & AI
Embedded · edge AI · robotics · networking · multi-agent systems
Forward Deployed Engineering: What the Role Actually Is
Forward Deployed Engineering isn't sales engineering with extra steps. It's the discipline of solving customer problems in the field by writing production code under time pressure. Here's what makes it different — and what makes someone good at it.
Production RAG Patterns: Beyond the Tutorial
Most RAG tutorials show toy examples that fall apart in production. Here's what actually works at scale — chunk strategy, hybrid retrieval, reranking, and the operational realities nobody mentions.
Real-Time Linux for Robotics: PREEMPT_RT in Practice
Standard Linux can have 10 ms latency spikes. Real-time robotics needs sub-millisecond latency. PREEMPT_RT bridges that gap. Here's how to actually deploy it, what changes in your code, and the gotchas nobody warns you about.
Why FieldFix Has Zero Cloud Dependencies: Designing AI for the Edge
Building an AI repair assistant that works in agricultural fields with no internet. The architectural choices for offline LLMs, deterministic safety, and why we picked Gemma 3 4B.
Inside Watchpoint: Architecture of a Robotics Incident Intelligence Platform
How Watchpoint captures robotic failures end-to-end — Go edge agent, replay bundles, rules-based RCA, and a correlation timeline. The architecture decisions that made it work in production.
Programming NVIDIA BlueField DPUs with DOCA
How to build data-plane applications on NVIDIA BlueField DPUs using the DOCA SDK — packet processing, flow steering, and running AI inference inline with network traffic.
MuJoCo Sim-to-Real: Closing the Gap for Humanoid Robots
How we used MuJoCo simulation, teleoperation data, and NVIDIA GR00T to build locomotion policies for the Unitree G1 humanoid — and what the sim-to-real gap actually looks like in practice.
TensorRT in Production: The Complete Optimization Workflow
End-to-end TensorRT optimization — from PyTorch model to INT8 engine running at 60fps on Jetson Orin. Covers ONNX export, calibration, engine building, profiling, and common pitfalls.
Building Self-Improving Multi-Agent AI Systems
How we built HydraSwarm — a 7-agent system that gets measurably better at software engineering tasks with each run, using persistent vector memory and structured agent roles.
ROS2 for Physical AI: Building Real-Time Robot Pipelines
How ROS2's DDS middleware, lifecycle nodes, and executor model enable production robotics — lessons from building humanoid teleoperation and multi-sensor fusion pipelines.
CUDA Kernel Optimization: What Actually Moves the Needle on Jetson
The profiling-driven workflow I use to squeeze real inference throughput out of Jetson Orin — memory coalescing, occupancy tuning, and why INT8 isn't always the answer.
Embedded Linux from Scratch: BSP, Kernel Config, and Device Drivers
What nobody tells you about bringing up embedded Linux on custom hardware: BSP bring-up, kernel config, writing your first character driver, and surviving the device tree.
Winning Two Awards at the Intelligence at the Frontier Hackathon
How our team won Best Overall Use of DeepLake with HydraSwarm and the Physical AI & Robotics track with a Unitree G1 humanoid pipeline — all in 36 hours.
Deploying Edge AI Inference on Jetson Orin for Industrial Logistics
How we deployed CUDA and TensorRT inference pipelines on Jetson Orin at Ciena to detect conveyor anomalies in real time, cutting unplanned downtime by 22%.
Lessons from Packet Processing and Data-Plane Engineering
What I've learned about high-performance packet processing across two roles — from DMA optimization at Ciena to P4-programmable data planes at Cisco.