Overview
Who We Are
Based in the Romania Excellence Centre in Bucharest, our client is seeking experienced professionals who value teamwork, pioneering technology, and innovation. You will be part of a global and diverse team, contribute to all stages of the software development lifecycle, and lead the implementation, deployment, and testing of multi-agent systems. This role also gives you the chance to help create the next big thing in digital banking.
What You’ll Be Doing
* Design and build complex agentic systems with multiple interacting agents
* Implement robust orchestration logic (state machines / graphs, retries, fallbacks, escalation to humans)
* Implement RAG pipelines, tool calling, and sophisticated system prompts for optimal reliability, latency, and cost control
* Apply core ML concepts to evaluate and improve agent performance, including dataset curation and bias/safety checks
* Lead the development of agents using Google ADK and/or LangGraph, leveraging advanced features for orchestration, memory, evaluation, and observability
* Integrate with supporting libraries and infrastructure (e.g., LangChain/LlamaIndex, vector databases, message queues, monitoring tools) with minimal supervision
* Define success metrics, build evaluation suites for agents (automatic + human evaluation), and drive continuous improvement
* Curate and maintain comprehensive prompt/test datasets; run regression tests for new model versions and prompt changes
* Deploy and operate AI services in production, establishing CI/CD pipelines, observability, logging, and tracing
* Debug complex failures end-to-end, identifying and documenting root causes across models, prompts, APIs, tools, and data
* Work closely with product managers and stakeholders to shape requirements, translate them into agent capabilities, and manage expectations
* Document comprehensive designs, decisions, and runbooks for complex systems
What We’re Looking For
* Bachelor’s degree in Computer Science, Engineering, or related field
* At least 3 years of experience as a Software Engineer, ML Engineer, or AI Engineer, with at least 1-2 years working directly with LLMs in real applications
Programming & Software Engineering
* Strong proficiency in Python (core language features, packaging, testing, async, type hints)
* Very strong software engineering practices: version control (Git), unit/integration testing, code reviews, CI/CD
* Experience building and consuming REST/gRPC APIs and integrating external tools/services
Machine Learning (Good Understanding)
* Understanding of core ML concepts: supervised/unsupervised learning, train/validation/test splits, overfitting, regularization, and common metrics (precision, recall, F1, ROC-AUC, etc.)
* Good understanding of deep learning basics (neural networks, embeddings) and at least one ML/DL framework (e.g., PyTorch, TensorFlow, JAX, scikit-learn)
LLMs & Agentic AI (Very Strong Understanding)
* Deep practical knowledge of large language models:
  * Tokenization, context windows, temperature, top-p, system vs user prompts
  * Prompt engineering patterns (ReAct, chain-of-thought, tool-calling/tool-use)
  * Fine-tuning / adapters / instruction-tuning, or experience with RAG as an alternative
* Experience building LLM-powered applications end-to-end: from idea → prototype → production
* Familiarity with safety and reliability considerations: hallucinations, guardrails, content filtering, privacy
Agentic Frameworks (Required Understanding, Experience Preferred)
* Conceptual understanding of modern agentic frameworks and patterns (stateful graphs, multi-agent coordination, human-in-the-loop, memory, and evaluation)
* Hands-on experience with at least one of:
  * Google Agent Development Kit (ADK) – building multi-agent workflows, using its orchestration, tools, and evaluation features
  * LangGraph – designing graph-based, stateful agent workflows with cycles, branches, and durable execution
* Candidates must be able to read, reason about, and extend ADK/LangGraph-based codebases
* Direct production experience with both ADK and LangGraph is a strong plus
Data & Infra
* Experience working with vector databases (e.g., Pinecone, Weaviate, pgvector, Chroma) for retrieval-augmented generation
* Comfortable with SQL and basic data modeling
* Experience deploying on at least one major cloud platform (GCP, AWS, Azure) and using managed services (e.g., serverless runtimes, container orchestration, secrets management)
Nice-to-Have Experience With
* Vertex AI / Gemini or other hosted LLM ecosystems
* Related frameworks and tools: LangChain, LlamaIndex, semantic search, evaluation frameworks (e.g., RAGAS, custom eval harnesses)
* Monitoring and observability stacks (OpenTelemetry, Prometheus/Grafana, New Relic, Datadog, etc.)