AI/LLM Evaluation & Alignment Software Engineer

LeoTech
Full-time, Mid
$135K - $160K/yr

AI Tools

DeepEval, GuardrailsAI, LangChain, LangGraph, Langfuse, LlamaIndex, Pydantic AI, Ragas, Strands Agents

Tech Stack

Python, LangGraph, Pydantic AI, LangChain, LlamaIndex, AWS Bedrock, SageMaker, Kubernetes, Terraform, ElasticSearch, Kafka

Agent Workflow

Ensure that LLM and agentic AI solutions are accurate, safe, and aligned. Build evaluation frameworks, guardrails, and bias-mitigation strategies for agentic AI in public safety.

About the Role

At LeoTech, we are passionate about building software that solves real-world problems in the Public Safety sector. Our software has been used to help fight continuing criminal enterprises and drug trafficking organizations, identify financial fraud, disrupt sex and human trafficking rings, and focus on mental health matters.

As an AI/LLM Evaluation & Alignment Engineer on our Data Science team, you will play a critical role in ensuring that our Large Language Model (LLM) and Agentic AI solutions are accurate, safe, and aligned with the unique requirements of public safety and law enforcement workflows. You will design and implement evaluation frameworks, guardrails, and bias-mitigation strategies.

Core Responsibilities:

  • Build and maintain evaluation frameworks for LLMs and generative AI systems for public safety use cases
  • Design guardrails and alignment strategies to minimize bias, toxicity, and hallucinations
  • Partner with AI engineers and data scientists on evaluation metrics
  • Implement continuous evaluation pipelines integrated into CI/CD
  • Stress test models against edge cases and adversarial prompts
  • Ensure explainability, transparency, and auditability of AI outputs
  • Contribute to DevOps/MLOps workflows

Requirements:

  • Bachelor's or Master's degree in CS, AI, or Data Science
  • 3-5+ years of ML/AI engineering experience and 2+ years in LLM evaluation/safety
  • Python proficiency with LangGraph, Strands Agents, Pydantic AI, LangChain, HuggingFace, PyTorch, LlamaIndex
  • DevOps/MLOps pipeline experience (Kubernetes, Terraform, ArgoCD, GitHub Actions)

Technologies: AWS (Bedrock, SageMaker, Lambda), Azure AI, Kubernetes, HuggingFace, OpenAI API, Anthropic, LangChain, LlamaIndex, Ragas, DeepEval, Langfuse, GuardrailsAI, Python, ElasticSearch, Kafka, Airflow.

Salary: $135,000-$160,000. Location: Austin, TX (Remote).
