AI/LLM Evaluation & Alignment Software Engineer
About the Role
At LeoTech, we are passionate about building software that solves real-world problems in the Public Safety sector. Our software has been used to help fight continuing criminal enterprises and drug trafficking organizations, identify financial fraud, disrupt sex and human trafficking rings, and focus on mental health matters.
As an AI/LLM Evaluation & Alignment Engineer on our Data Science team, you will play a critical role in ensuring that our Large Language Model (LLM) and Agentic AI solutions are accurate, safe, and aligned with the unique requirements of public safety and law enforcement workflows. You will design and implement evaluation frameworks, guardrails, and bias-mitigation strategies.
Core Responsibilities:
- Build and maintain evaluation frameworks for LLMs and generative AI systems used in public safety use cases
- Design guardrails and alignment strategies to minimize bias, toxicity, and hallucinations
- Partner with AI engineers and data scientists on evaluation metrics
- Implement continuous evaluation pipelines integrated into CI/CD
- Stress test models against edge cases and adversarial prompts
- Ensure explainability, transparency, and auditability of AI outputs
- Contribute to DevOps/MLOps workflows
Requirements:
- Bachelor's or Master's degree in CS, AI, or Data Science
- 3-5+ years of ML/AI engineering experience and 2+ years working on LLM evaluation/safety
- Python proficiency with LangGraph, Strands Agents, Pydantic AI, LangChain, HuggingFace, PyTorch, LlamaIndex
- DevOps/MLOps pipeline experience (Kubernetes, Terraform, ArgoCD, GitHub Actions)
Technologies: AWS (Bedrock, SageMaker, Lambda), Azure AI, Kubernetes, HuggingFace, OpenAI API, Anthropic, LangChain, LlamaIndex, Ragas, DeepEval, Langfuse, GuardrailsAI, Python, Elasticsearch, Kafka, Airflow.
Salary: $135,000-$160,000. Location: Austin, TX (Remote).