AI Engineer for LLM Ops & Evaluation (m/f/d)
About Auxilius.ai
Auxilius.ai is an early-stage AI startup focused on Governance, Risk and Compliance (GRC) solutions, serving enterprise customers including auditors and compliance teams. We have product-market fit and need an AI Engineer to own our LLM operations pipeline end-to-end.
About the Role
This is a production-focused position: you will own the complete LLMOps pipeline at an early-stage, AI-native startup.
Responsibilities
- Manage end-to-end LLMOps infrastructure, prompt optimization, and production integration
- Design evaluation strategy (deterministic vs. LLM-judge tradeoffs)
- Drive prompt optimization across our LLM pipelines
- Establish observability, monitoring, and human-in-the-loop workflows with review queues and feedback loops
- Manage cost/latency tradeoffs in production
- Mentor an AI & Analytics intern
Core Requirements
- 3+ years shipping production ML/AI systems
- Experience building and shipping an LLM evaluation or prompt optimization pipeline
- Strong hands-on experience with LLM-as-judge, including its variance problems and techniques to control them
- Classical NLP/ML ops foundation (embeddings, semantic similarity, entity matching, classification)
- Production judgment on cost/latency tradeoffs and observability
- Strong Python; excellent English communication
Nice-to-Have
- Observability tools (Langfuse, LangSmith, Phoenix/Arize, Helicone, Braintrust, W&B)
- Experience with DSPy or similar prompt optimization frameworks
- Azure OpenAI or EU-sovereign LLM providers (Mistral, Aleph Alpha)
- Guardrails/content safety/AI governance exposure
- Enterprise software experience
- Java/Spring Boot, Kubernetes
- German language
- GRC domain knowledge
Tech Stack
Python, OpenAI, Anthropic, embeddings, semantic similarity, entity matching, classification. The surrounding product stack is Java, Spring Boot, Angular, and Kubernetes on Azure.
Benefits
- Direct founding team collaboration
- Hybrid model (Munich North, minimum 1 day/week office; flexible otherwise; open to strong EU remote candidates)
- Steep learning curve at the intersection of LLM engineering, enterprise GRC, and startup operations
- Role in shaping the AI team
Location: Munich, Germany (hybrid, minimum one day per week in-office; flexible otherwise; open to strong EU remote candidates).