Principal AI Evaluation Engineer at Backbase

Job Description

About Backbase

As a Principal AI Evaluation Engineer, you will lead the evaluation efforts in our AI-powered SDLC team. You will own the evaluation strategy for AI assistants and agentic workflows, ensuring they are reliable, observable, and protected by strong guardrails. Beyond hands-on work, you will mentor engineers, lead triage and reporting, and make evaluation a cornerstone of release decisions.

Meet the job

Define and lead the evaluation strategy and roadmap for the AI-powered SDLC core product.
Build and oversee evaluation pipelines and guardrails.
Build and maintain evaluation datasets (synthetic and real project data) to benchmark AI behavior.
Analyze evaluation results, identify gaps, and produce clear, actionable reports for engineering and product stakeholders.
Build a culture of innovation and excellence, encouraging continuous improvement and adoption of best practices in AI evaluation and deployment.
Collaborate with cross-functional teams to integrate evaluation insights into development.

How about you?

Strong understanding of software engineering principles and the software development lifecycle (SDLC).
Hands-on experience with test design, test management, observability, and data analysis.
Proficiency in Python (or another scripting language) for automating evaluations.
Familiarity with AI Agent evaluation methods (faithfulness, answer relevancy, contextual accuracy, tool correctness).
Excellent analytical and problem-solving skills.
Strong communication and collaboration skills, with the ability to work with cross-functional teams and stakeholders.
Demonstrated ability to mentor engineering talent, fostering collaboration and technical excellence.
(Nice to have) Experience with evaluation frameworks, RAG systems, or agentic workflows.
