AI Engineer for LLM Ops & Evaluation (m/f/d)

Auxilius.ai

About Auxilius.ai

Auxilius.ai is an early-stage AI startup focused on Governance, Risk and Compliance (GRC) solutions, serving enterprise customers including auditors and compliance teams. We have product-market fit and need an AI Engineer to own our LLM operations pipeline end-to-end.

About the Role

This is a production-focused position: you will own the complete LLMOps pipeline at an early-stage, AI-native startup.

Responsibilities

  • Manage end-to-end LLMOps infrastructure and production integration
  • Design evaluation strategy (deterministic vs. LLM-judge tradeoffs)
  • Drive prompt optimization across our LLM pipelines
  • Establish observability, monitoring, and human-in-the-loop workflows with review queues and feedback loops
  • Manage cost/latency tradeoffs in production
  • Mentor an AI & Analytics intern

Core Requirements

  • 3+ years shipping production ML/AI systems
  • Experience building and shipping an LLM evaluation or prompt optimization pipeline
  • Strong hands-on experience with LLM-as-judge, including its variance problems and techniques to control them
  • Classical NLP/ML ops foundation (embeddings, semantic similarity, entity matching, classification)
  • Production judgment on cost/latency tradeoffs and observability
  • Strong Python; excellent English communication

Nice-to-Have

  • Observability tools (Langfuse, LangSmith, Phoenix/Arize, Helicone, Braintrust, W&B)
  • Experience with DSPy or similar prompt optimization frameworks
  • Azure OpenAI or EU-sovereign LLM providers (Mistral, Aleph Alpha)
  • Guardrails/content safety/AI governance exposure
  • Enterprise software experience
  • Java/Spring Boot, Kubernetes
  • German language
  • GRC domain knowledge

Tech Stack

Python, OpenAI, Anthropic, embeddings, semantic similarity, entity matching, classification. The backend uses Java, Spring Boot, Angular, and Kubernetes on Azure.

Benefits

  • Direct founding team collaboration
  • Hybrid model (Munich North, minimum 1 day/week office; flexible otherwise; open to strong EU remote candidates)
  • Steep learning curve at the intersection of LLM engineering, enterprise GRC, and startup operations
  • Role in shaping the AI team

Location: Munich, Germany (hybrid, minimum one day per week in-office; flexible otherwise; open to strong EU remote candidates).
