Senior Lead Machine Learning Engineer, Agentic AI
AI Infrastructure
About the Role
Upwork Inc.'s (Nasdaq: UPWK) family of companies connects businesses with global, AI-enabled talent across every contingent work type including freelance, fractional, and payrolled. This portfolio includes the Upwork Marketplace and Lifted.
We're seeking a Senior Lead Machine Learning Engineer to architect, ship, and scale the next generation of agentic intelligence across Upwork. You will lead end-to-end development of AI agents and the platform that powers them, from LLM training and evaluation to runtime orchestration, safety, and developer APIs. This is a hands-on, high-impact role at the intersection of applied research and platform engineering.
Responsibilities
Build Agentic Intelligence
- Design and implement multi-agent systems (planning, tool-use, memory, debate/critique, reflection) with robust guardrails and recovery strategies
- Develop protocol-aware agents and services that interoperate cleanly with developer tooling (e.g., agent frameworks and protocols such as MCP)
- Own reliability at scale: deterministic execution where needed, idempotency, timeouts/retries, and evaluation-driven iteration on agent behavior
Train, Align, and Evaluate LLMs for Agents
- Lead data strategy and curation for agent tasks; drive SFT, DPO, RLHF/RLAIF, and safety tuning tailored to multi-tool, multi-step workflows
- Stand up evaluation harnesses for functional, task, and longitudinal metrics (success rate, time-to-completion, hallucination/escape rates, cost/latency)
- Build policy-driven guardrails; partner with Legal/Security on data governance and privacy
Engineer Agentic Platform Backend Infrastructure
- Architect low-latency inference, retrieval, and orchestration services (streaming, event-driven pipelines; scalable queues; caching; batching) with strong SLOs
- Ship production-grade services (APIs/SDKs, auth, rate limiting, observability) that make agent features easy to integrate for internal and external developers
- Optimize cost/performance via quantization, distillation, model-routing, and autoscaling
Lead, Partner, and Uplevel the Ecosystem
- Provide technical leadership across research, product, and platform teams; mentor senior ICs
What it takes to catch our eye
- 8-12+ years in applied ML/ML systems with 4+ years building LLM-powered products; proven delivery of agentic workflows in production
- Hands-on mastery of LLM adaptation (prompting, tool/function calling), data curation, and safety/guardrails
- Strong software fundamentals (distributed systems, transactions, consistency, resiliency)
- Fluency with Python; proficiency in Go, Java, or JavaScript is a plus. Experience with container orchestration, messaging/streaming, and observability stacks
- Experience designing eval suites for agents and closing the loop from evals to training to runtime policy
- Familiarity with agent frameworks and protocols (e.g., MCP) and with API/SDK design for developer productivity
This position will initially be employed through a partner to ensure a seamless hiring process while we establish the Toronto hub. Once the hub is established, there may be opportunities to transition to direct Upwork employment.