Generative AI Engineer (Data / ML / GenAI)
We're hiring a Generative AI Engineer with 6+ years across Data / ML / GenAI who can design, build, and productionize LLM-powered systems end-to-end. You'll select and fine-tune models (OpenAI, Anthropic, Google, Meta, open-source), craft robust RAG / agentic workflows (AutoGen, LangGraph, CrewAI, LangChain / LlamaIndex), and ship secure, observable services with FastAPI, Docker, and Kubernetes. You pair strong software engineering with MLOps / LLMOps rigor: evaluation, monitoring, safety / guardrails, and cost / latency optimization.
Key Responsibilities
- Own E2E design for chat / agents, structured generation, summarization / classification, and workflow automation. Choose the right model vs. non-LLM alternatives and justify trade-offs.
- Build prompt stacks (system / task / tool), synthetic data pipelines, and fine-tune or LoRA adapters; apply instruction tuning / RLHF where warranted.
- Implement multi-agent / tool-calling workflows using AutoGen, LangGraph, CrewAI (state management, retries, tool safety, fallbacks, grounding).
- Stand up retrieval stacks with vector DBs (Pinecone / Faiss / Weaviate / pgvector), chunking and citation strategies, reranking, and caching; enforce traceability. See the retrieval sketch after this list.
- Ship FastAPI services, containerize (Docker), orchestrate (Kubernetes / Cloud Run), wire CI / CD and IaC; design SLAs / SLOs for reliability and cost. See the structured-output service sketch after this list.
- Instrument evals (unit / regression / A/B), add tracing and metrics (Langfuse, LangSmith, OpenTelemetry), and manage model / version registries (MLflow / W&B). See the eval-harness sketch after this list.
- Implement guardrails (prompt injection / PII / toxicity), policy filters (Bedrock Guardrails / Azure AI Content Safety / OpenAI Moderation), access controls, and compliance logging. See the guardrail sketch after this list.
- Build / maintain data ingestion, cleansing, and labeling workflows for model / retrieval corpora; ensure schema / version governance.
- Optimize with batching, streaming, JSON-schema / function calling, tool-use, speculative decoding / KV caching, and token budgets.
- Partner with product / engineering / DS; review designs / PRs, mentor juniors, and drive best practices / playbooks.
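The retrieval sketch referenced above shows a minimal chunk, index, and retrieve loop with citation metadata, assuming Faiss and NumPy are available. The embed() stub, the sample corpus, and the chunk sizes are illustrative placeholders, not a production design.
```python
# Illustrative only: faiss-cpu and numpy assumed installed; embed() is a stub
# standing in for a real embedding model.
import hashlib

import faiss
import numpy as np

DIM = 256  # embedding width of the (hypothetical) embedding model


def embed(texts: list[str]) -> np.ndarray:
    """Deterministic pseudo-embeddings so the sketch runs end-to-end offline."""
    vecs = []
    for t in texts:
        seed = int.from_bytes(hashlib.sha256(t.encode()).digest()[:4], "big")
        v = np.random.default_rng(seed).standard_normal(DIM).astype("float32")
        vecs.append(v / np.linalg.norm(v))  # normalize so inner product ~ cosine
    return np.stack(vecs).astype("float32")


def chunk(doc_id: str, text: str, size: int = 400, overlap: int = 50) -> list[dict]:
    """Fixed-size character chunks with overlap; offsets are kept for citations."""
    out = []
    for start in range(0, max(len(text), 1), size - overlap):
        body = text[start:start + size]
        if body.strip():
            out.append({"doc_id": doc_id, "start": start, "text": body})
    return out


# Toy corpus; in practice this comes from the ingestion / cleansing pipeline.
corpus = {
    "handbook.md": "Refunds are processed within 5 business days. " * 20,
    "policy.md": "PII must be redacted before any data leaves the VPC. " * 20,
}
all_chunks = [c for doc_id, text in corpus.items() for c in chunk(doc_id, text)]

index = faiss.IndexFlatIP(DIM)
index.add(embed([c["text"] for c in all_chunks]))


def retrieve(query: str, k: int = 3) -> list[dict]:
    """Return top-k chunks with scores and doc_id#offset citations."""
    scores, ids = index.search(embed([query]), k)
    return [
        {**all_chunks[i], "score": float(s),
         "citation": f'{all_chunks[i]["doc_id"]}#{all_chunks[i]["start"]}'}
        for s, i in zip(scores[0], ids[0]) if i != -1
    ]


if __name__ == "__main__":
    for hit in retrieve("How long do refunds take?"):
        print(round(hit["score"], 3), hit["citation"])
```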
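The structured-output service sketch referenced above is a FastAPI endpoint that asks the model for JSON, validates it against a Pydantic (v2) schema, and retries once on malformed output. call_llm(), the /v1/summarize path, and the SummaryResponse fields are hypothetical; swap in the real provider SDK and contract.
```python
# Illustrative only: FastAPI + Pydantic v2 assumed; call_llm() is a stub for
# whichever provider SDK (OpenAI, Anthropic, Bedrock, ...) the service uses.
import json

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, ValidationError

app = FastAPI()


class SummaryRequest(BaseModel):
    text: str = Field(min_length=1, max_length=20_000)


class SummaryResponse(BaseModel):
    summary: str
    key_points: list[str]
    risk_flags: list[str] = []


def call_llm(prompt: str) -> str:
    # Stub: canned JSON so the sketch runs without credentials. In production
    # this is a provider call with JSON mode / function calling and a token budget.
    return json.dumps({"summary": "stubbed summary",
                       "key_points": ["point 1"], "risk_flags": []})


def build_prompt(text: str) -> str:
    schema = json.dumps(SummaryResponse.model_json_schema())
    return ("Summarize the text below. Respond only with JSON matching this schema:\n"
            + schema + "\n\nTEXT:\n" + text)


@app.post("/v1/summarize", response_model=SummaryResponse)
def summarize(req: SummaryRequest) -> SummaryResponse:
    last_err = None
    for _ in range(2):  # one retry on schema-invalid output
        raw = call_llm(build_prompt(req.text))
        try:
            return SummaryResponse.model_validate_json(raw)
        except ValidationError as err:
            last_err = err
    raise HTTPException(status_code=502, detail=f"Model returned invalid JSON: {last_err}")
```
Saved as, say, main.py, this runs under any ASGI server (e.g. `uvicorn main:app`); the retry-on-validation-failure loop is the point, not the stubbed model call.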
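The eval-harness sketch referenced above uses a tiny golden set, substring grading, and hard gates on accuracy and latency so regressions fail CI. generate_answer(), the thresholds, and the golden examples are placeholders, not a full framework like RAGAS or promptfoo.
```python
# Illustrative only: a stdlib regression harness; generate_answer() stubs the
# real chain / agent, and the thresholds are example values.
import time

GOLDEN_SET = [
    # (question, substrings a correct answer must contain)
    ("How long do refunds take?", ["5 business days"]),
    ("Can PII leave the VPC?", ["redact"]),
]
ACCURACY_FLOOR = 0.9        # example gate, tuned per use case
LATENCY_BUDGET_S = 2.0      # example per-call budget


def generate_answer(question: str) -> str:
    # Stub so the harness runs standalone; swap in the real RAG chain or agent.
    canned = {
        "How long do refunds take?": "Refunds are processed within 5 business days.",
        "Can PII leave the VPC?": "No, PII must be redacted before data leaves the VPC.",
    }
    return canned.get(question, "I don't know.")


def run_suite() -> None:
    passed, latencies = 0, []
    for question, must_contain in GOLDEN_SET:
        start = time.perf_counter()
        answer = generate_answer(question)
        latencies.append(time.perf_counter() - start)
        if all(s.lower() in answer.lower() for s in must_contain):
            passed += 1
        else:
            print(f"FAIL: {question!r} -> {answer!r}")
    accuracy = passed / len(GOLDEN_SET)
    worst = max(latencies)  # stand-in for a real p95 on a larger suite
    print(f"accuracy={accuracy:.2f} worst_latency={worst:.3f}s")
    assert accuracy >= ACCURACY_FLOOR, "accuracy regression"
    assert worst <= LATENCY_BUDGET_S, "latency regression"


if __name__ == "__main__":
    run_suite()
```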
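The guardrail sketch referenced above applies regex-based PII redaction plus a keyword heuristic for prompt injection on the input path. The patterns and markers are illustrative; a real deployment layers a managed policy filter (Bedrock Guardrails, Azure AI Content Safety, OpenAI Moderation) and proper compliance logging on top.
```python
# Illustrative only: minimal input-side checks, not a substitute for managed
# content-safety services or a full PII detection model.
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard the system prompt",
    "reveal your system prompt",
)


def redact_pii(text: str) -> str:
    """Replace matches with typed placeholders suitable for compliance logs."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text


def looks_like_injection(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)


def sanitize(user_input: str) -> str:
    """Gate applied before user input reaches the model or any tool call."""
    if looks_like_injection(user_input):
        raise ValueError("Possible prompt injection; route to review instead of the model.")
    return redact_pii(user_input)


if __name__ == "__main__":
    print(sanitize("My email is jane@example.com, please summarize my ticket."))
```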
Preferred Qualifications
- Deeper experience with multi-agent planning / execution, tool catalogs, and failure-mode design.
- Experience with pgvector / Elasticsearch / OpenSearch; comfort with relational / NoSQL / graph stores.
- Human-in-the-loop pipelines, golden sets, regression suites, and cost / quality dashboards.
- OSS contributions, publications, talks, or a strong portfolio demonstrating GenAI craftsmanship.
Nice to Have
- Redis / Celery, task queues, and concurrency controls for bursty LLM traffic; see the concurrency sketch after this list.
- Experience with API gateways (e.g., MuleSoft), authN / authZ, and vendor compliance reviews.
- Prior work in data-heavy or regulated domains (finance / health / gov) with auditable GenAI outputs.
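A minimal sketch of the concurrency controls mentioned in the first bullet above: an asyncio semaphore caps in-flight provider calls and jittered exponential backoff absorbs rate-limit errors. call_provider(), MAX_IN_FLIGHT, and the simulated failure rate are illustrative; a production setup would typically sit behind a task queue such as Celery or an API gateway.
```python
# Illustrative only: stdlib asyncio sketch; call_provider() simulates a flaky
# upstream LLM API so the retry path is exercised.
import asyncio
import random


class RateLimitError(Exception):
    """Stand-in for a provider 429 / throttling error."""


MAX_IN_FLIGHT = 8  # example cap on concurrent upstream calls


async def call_provider(prompt: str) -> str:
    await asyncio.sleep(0.05)           # simulated network latency
    if random.random() < 0.1:           # simulated rate limiting
        raise RateLimitError("429 Too Many Requests")
    return f"answer to: {prompt}"


async def call_with_backoff(sem: asyncio.Semaphore, prompt: str, retries: int = 4) -> str:
    async with sem:                     # bound concurrency under bursty load
        for attempt in range(retries):
            try:
                return await call_provider(prompt)
            except RateLimitError:
                await asyncio.sleep((2 ** attempt) * 0.1 + random.random() * 0.1)
    raise RuntimeError(f"gave up after {retries} attempts: {prompt!r}")


async def main() -> None:
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    prompts = [f"question {i}" for i in range(50)]
    results = await asyncio.gather(*(call_with_backoff(sem, p) for p in prompts),
                                   return_exceptions=True)
    ok = sum(1 for r in results if isinstance(r, str))
    print(f"{ok}/{len(prompts)} calls succeeded")


if __name__ == "__main__":
    asyncio.run(main())
```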
Requirements
- 6+ years across Data / ML / GenAI and 12+ years of overall engineering experience, with a track record of designing and shipping LLM or GenAI apps to production.
- Strong Python and FastAPI; proven experience building secure, reliable REST services and integrations.
- Hands-on with OpenAI / Anthropic / Gemini / Llama model families and at least two of AutoGen, LangGraph, CrewAI, LangChain, LlamaIndex, Transformers.
- Practical experience implementing vector search and reranking, plus offline / online evals (RAGAS, promptfoo, custom harnesses).
- Docker, Kubernetes (or managed equivalents), and one major cloud (AWS / Azure / GCP); CI / CD and secrets management.
- Familiarity with tracing / metrics tools (Langfuse, LangSmith, OpenTelemetry) and setting SLIs / SLOs.
- Working knowledge of data privacy, PII handling, content safety, and policy / controls for enterprise deployments.
- Clear technical writing and cross-functional collaboration; ability to translate business goals into architecture and milestones.