DigiCZE is a leading digital transformation AI firm that supports multiple verticals: Finance, Healthcare, Retail, and Manufacturing.
We're hiring a dynamic Junior Agentic AI Engineer (Open-Source Spark Focus).
Overview
We're looking for an enthusiastic and driven Junior Agentic AI Engineer to join our team dedicated to advancing cutting-edge AI within the Apache Spark ecosystem. The ideal candidate has a foundational understanding of distributed systems, a passion for open-source development, and an eagerness to explore and implement agentic AI architectures (e.g., planning, tool use, memory) to enhance Spark's capabilities, particularly in automated data operations and optimization. This role is a unique opportunity to contribute directly to a globally used open-source project and work at the intersection of Big Data and next-generation AI.
Key Responsibilities
- Open Source Contribution: Develop, test, and submit high-quality code (primarily Python and/or Scala) to the Apache Spark project, specifically focused on integrating or leveraging agentic AI principles.
- Agentic AI Implementation: Design and implement proof-of-concept AI agents that can interact with Spark APIs and configurations to autonomously manage or optimize data workflows (e.g., automatic query optimization, resource tuning, self-healing pipelines).
- Tool Integration: Integrate and maintain Large Language Models (LLMs) and related tools (e.g., LangChain, LlamaIndex) with Spark components, enabling AI agents to "reason" about data processing tasks.
- Testing and Validation: Write comprehensive unit and integration tests to ensure the stability and reliability of new features and contributions within the distributed Spark environment.
- Documentation: Create clear, concise documentation for new features and agent designs, both for internal use and for the open-source community.
- Collaboration: Work closely with senior engineers and the broader open-source community to align on development standards and future roadmaps.
Essential Qualifications
- Education: Bachelor's degree in Computer Science, Data Science, or a related technical field, or a rockstar with hands-on practical experience.
- Programming Proficiency: Strong programming skills in Python (required) and familiarity with Scala (highly preferred).
- Big Data Fundamentals: Foundational understanding of Apache Spark architecture and core concepts (RDDs, DataFrames, Spark SQL). Experience running Spark jobs in a cluster environment is a plus.
- AI/ML Familiarity: Basic knowledge of Machine Learning concepts and hands-on experience with foundational LLM frameworks (e.g., OpenAI API, Hugging Face, or related libraries).
- Software Engineering: Familiarity with software development best practices, including version control (Git), code reviews, and testing methodologies.
- Open Source Mindset: Demonstrated enthusiasm for open-source software, either through personal projects or prior contributions.
Desired (Bonus) Qualifications
- Prior contributions to Apache Spark or related open-source Big Data projects.
- Experience with agent-based programming paradigms (e.g., using frameworks like LangChain, CrewAI, or similar).
- Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes for managing Spark clusters.
- Knowledge of deep learning frameworks (e.g., PyTorch, TensorFlow).