Software Engineer (Applied AI)
AI Infrastructure
Tech Stack
About the Role
About Centralize
Enterprise sales runs on relationships, and every tool built to manage them is a database from 2005. Reps lose seven-figure deals because they can't see who actually matters inside an account. We're building the system of intelligence that replaces the CRM.
Centralize is the relationship intelligence platform for enterprise revenue teams. Webflow, Intercom, Brex, Cognition, LangChain, and Cresta use us to close their largest deals. We've grown 5x since last year, and customers are pulling us forward faster than we can ship.
We just raised a Series A led by NEA, bringing our total funding to $18M+ alongside Salesforce Ventures, Y Combinator, and operators including Cal Henderson (Co-Founder, Slack), Noah Weiss (former CPO, Slack), and sales leaders from Figma, Box, Dropbox, Anthropic, and Notion.
Centralize was founded by Rachit Kataria, a founding engineer on Facebook Shops who helped scale it to 250M MAUs, and Will Wang, who led the launch of Slack Huddles, the fastest-growing product in Slack's history.
Our Bar
We stay small on purpose. No passengers, no politics, no waiting for permission or for someone else to fix what's broken.
We own the unglamorous work alongside the exciting work. The 9pm customer request, the integration buried three layers deep, the bug nobody wants to touch. You go after it because it needs to get done, and because the next thing you build is better for it.
You won't have the answers handed to you. The roadmap, the architecture, the right call on a customer request: you'll be the one figuring it out. We hire people who are energized by ambiguity, not slowed down by it.
Your work shapes what Centralize becomes.
The Role
We are hiring an applied AI engineer to own the intelligence inside Centralize. The product's value depends on AI systems that map stakeholders, analyze deal health, and turn unstructured customer conversations into actions that drive revenue. You will own those systems end to end across the full AI stack: the multi-agent architectures and LLM pipelines, the classical ML and data science work that powers ranking, scoring, and entity resolution, and the eval and data infrastructure that makes all of it better over time.
This is a production engineering role with both an LLM lens and an ML/DS lens. Some problems at Centralize are best solved with a frontier model and a well-designed agent loop. Others are best solved with a classifier, an embedding model, a custom retriever, or a feature pipeline. You'll know which is which, and you'll build whichever one moves the metric.
This role is well-suited to engineers who have shipped LLM-powered products and trained or fine-tuned models in production, who think about evals and reliability before model selection, and who can move fluidly between prompt engineering, fine-tuning, and traditional ML when the problem demands it.
What You Will Do
- Design and ship multi-agent systems that handle the hardest reasoning problems in the product: stakeholder mapping, account research, deal health analysis, conversation intelligence.
- Own the LLM pipelines end to end: prompt engineering, retrieval, tool use, structured outputs, guardrails, and the orchestration glue that ties it all together.
- Build and maintain the ML and DS work that LLMs aren't the right tool for: ranking models, classifiers, embedding models, entity resolution across messy CRM data, signal extraction from sales conversations.
- Fine-tune models when frontier APIs aren't enough. Curate training data, design eval sets, run experiments, and ship the results to production.
- Build the eval infrastructure that lets us ship AI features without breaking them. LLM-as-judge, human-in-the-loop, classical metrics for ML systems, regression suites. We grade on what works in production.
- Own the data flywheel. The product generates rich signal from customer conversations, deal outcomes, and stakeholder interactions. Turn that into training data, eval data, and the feedback loops that compound over time.
- Stay on the frontier. New models drop monthly. You'll know which ones move the needle for our use cases, when to switch, and when to wait.
- Talk to customers. Sit on calls, see what's actually broken, and translate that into the AI capabilities that matter.
What Success Looks Like
- Week 1: First eval suite shipped for an existing AI feature, with measurable accuracy improvement.
- Day 14: Owning a major AI surface end to end, including the customer conversations that scoped it.
- Day 30: A multi-agent system you architected is in production at customer scale, with the eval and observability infrastructure to keep improving it.
What We Are Looking For
- Demonstrated experience shipping LLM-powered products to production with real customers and real evals.
- Demonstrated experience training, fine-tuning, or shipping classical ML models in production. Ranking, classification, embeddings, retrieval.
- Strong fluency with multi-agent systems, tool use, function calling, RAG, and the orchestration patterns that make them reliable.
- Real expertise in evaluation across both LLM and ML systems.
- Strong backend engineering fundamentals. Python is required; familiarity with TypeScript, Postgres, queues, and AWS is a major plus.
- Sharp instinct for cost, latency, and reliability tradeoffs across the AI stack.
- Excellent written and verbal English communication.
- Demonstrated ability to operate independently.
This Role Is Not For You If
- You want to do AI research. We are an applied team. We use frontier models; we don't build them.
- You only want to work on LLMs. Some of the most important work at Centralize is classical ML, ranking, and entity resolution.
- You think evals are someone else's problem.
- You've only built demos or hackathon projects.
- You want a slower pace.
Preferred Qualifications
- Background as an MLE who has flexed into LLM application work, or as an LLM engineer with deep MLE foundations.
- Experience fine-tuning open or closed models for specific tasks.
- Experience with multi-agent orchestration frameworks (LangGraph, Mastra, custom orchestrators) at production scale.
- Experience with classical ML systems in production: ranking models, embedding models, entity resolution, recommendation systems.
- Open-source contributions, technical blog posts, or papers on applied AI or ML work.
- Direct exposure to enterprise sales cycles or B2B SaaS products.
The Team You'll Join
You'll work directly with Rachit and Will, alongside former founders and engineers from Coinbase, Gusto, Modern Treasury, and C3 AI.
Compensation and Logistics
- Location: This role is open to remote candidates in the US, with a strong preference for candidates based in or willing to relocate to San Francisco or New York City.
- Work Authorization: We are unable to sponsor visas. Candidates must have existing US work authorization.
- Compensation: $190,000 to $260,000 base salary depending on level, plus 0.20% to 0.40% equity.
Benefits
- Fully covered medical, dental, and vision insurance
- 401(k)
- Parental leave
- Unlimited PTO plus company holidays
- Quarterly offsite
- Equipment stipend