Senior or Staff ML Systems Engineer, LLMs

Expired

This listing is older than 60 days and may no longer be accepting applications.

TRM Labs


$200K - $275K/yr

AI Infrastructure

Agentic Frameworks

About the Role

TRM Labs' AI Engineering team focuses on LLMs and agentic systems, building robust pipelines and infrastructure for deploying AI systems at scale.

Key Responsibilities:

  • Develop CI/CD workflows for model training, evaluation, and deployment using tools like Langfuse and GitHub Actions
  • Automate model versioning, approval workflows, and compliance checks
  • Build modular AI infrastructure including vector databases, feature stores, and model registries
  • Integrate AI models and agents into real-time production applications
  • Deploy evaluation infrastructure for LLMs and agentic systems with regression testing and cost monitoring
  • Enable researcher productivity through sandboxes and reproducible environments

Required Qualifications:

  • High-quality Python software development
  • Scalable infrastructure experience (Docker, Kubernetes, Terraform, CI/CD)
  • Monitoring/logging expertise (Datadog, Prometheus, OpenTelemetry)
  • MLOps best practices including model versioning and drift detection
  • Production LLM/agentic workflow deployment and optimization
  • Strong ownership mentality

Tech Stack: LangChain, LlamaIndex, vLLM, MLflow, BentoML, Langfuse, GitHub Actions, Docker, Kubernetes, Terraform, Datadog, Prometheus, OpenTelemetry, Triton.
