Staff AI Engineer | US | Remote (Marketing Ops)
About the Role
Grafana Labs is a remote-first, open-source powerhouse with over 20M users globally. This is a remote opportunity for candidates in the U.S.
The Opportunity:
Grafana Labs is seeking a Staff Engineer (AI & Automation) to own the AI agent infrastructure and automation platform that powers our Marketing Operations organization. You'll build multi-agent architectures, LLM integrations, and backend services that connect AI models to internal and third-party data platforms. You'll ship production systems that teams depend on daily.
This is a high-autonomy role where you own the technical direction. You'll identify the highest-leverage problems across Marketing, RevOps, and SDR teams, design the solutions, and ship them.
What You'll Be Doing:
Agentic Systems & AI Infrastructure:
- Own end-to-end development of multi-agent AI systems
- Build modular, composable agentic systems using orchestration frameworks and protocols (LangChain, CrewAI, Anthropic's Model Context Protocol (MCP), or similar)
- Develop reusable agentic skills that agents invoke across interfaces (Slack, dashboards, internal apps, CLIs)
- Implement observability and feedback loops (logging, performance metrics, prompt iteration, model evaluation, cost management)
- Establish governance and compliance standards for AI workflows
Systems Integration & Backend Services:
- Build MCP servers, APIs, CLIs, and microservices connecting AI models to business systems
- Architect data flows for retrieval-augmented generation (RAG), connecting LLMs to internal knowledge bases, customer data, and real-time business context
- Build serverless or containerized services (GCP Cloud Functions, Cloud Run)
Automation & Workflow Enablement:
- Partner with RevOps and SDR teams on high-impact automation
- Design workflows using orchestration tools with CI/CD standards
- Enable self-service systems with documentation and playbooks
Requirements:
- 8+ years of software engineering experience (backend, systems integration, or data/analytics)
- 2+ years applying LLMs/AI to production workflows
- Proficiency in Python and JavaScript/Node.js
- Hands-on experience with LLM frameworks, prompt engineering, RAG, function calling, and evaluation
- Experience building and operating multi-agent systems at scale
- Deep familiarity with Google Cloud Platform and BigQuery
- Understanding of LLM failure modes and production mitigations
Bonus:
- Vector database experience (Pinecone, Weaviate, ChromaDB, Qdrant, pgvector)
- Marketing/sales platform familiarity (Salesforce, HubSpot, Marketo)
- AI observability tooling (LangSmith, Weights & Biases)
- Workflow orchestration (n8n, Temporal, Prefect, Airflow)
- Model Context Protocol (MCP) experience
Compensation: $174,986 - $209,983 USD base, plus RSUs.
Benefits: 100% remote global culture, 30 days annual leave + 3 Grafana Shutdown Days, in-person onboarding.