Senior or Staff ML Systems Engineer, LLMs
Full-time, Staff
$200K - $275K/yr
AI Tools
Agentic Systems, LangChain, Langfuse, LlamaIndex, vLLM
Tech Stack
Python, LangChain, LlamaIndex, vLLM, MLflow, Docker, Kubernetes, Terraform, Datadog
Agent Workflow
Build modular AI infrastructure for deploying LLMs and agentic systems at scale. Integrate AI models and agents into real-time production applications, and deploy evaluation infrastructure for them.
About the Role
TRM Labs' AI Engineering team focuses on LLMs and agentic systems, building robust pipelines and infrastructure for deploying AI systems at scale.
Key Responsibilities:
- Develop CI/CD workflows for model training, evaluation, and deployment using tools like Langfuse and GitHub Actions
- Automate model versioning, approval workflows, and compliance checks
- Build modular AI infrastructure including vector databases, feature stores, and model registries
- Integrate AI models and agents into real-time production applications
- Deploy evaluation infrastructure for LLMs and agentic systems with regression testing and cost monitoring
- Enable researcher productivity through sandboxes and reproducible environments
Required Qualifications:
- High-quality Python software development
- Scalable infrastructure experience (Docker, Kubernetes, Terraform, CI/CD)
- Monitoring/logging expertise (Datadog, Prometheus, OpenTelemetry)
- MLOps best practices including model versioning and drift detection
- Production LLM/agentic workflow deployment and optimization
- Strong ownership mentality
Tech Stack: LangChain, LlamaIndex, vLLM, MLflow, BentoML, Langfuse, GitHub Actions, Docker, Kubernetes, Terraform, Datadog, Prometheus, OpenTelemetry, Triton.