Title: Data Engineer
Location: Remote
Duration: 3-Month Contract
This is a non-exempt position.
Project: Supplier Contract Ingestion & Data Pipeline for Negotiation AI
About the Project
We're launching a focused 3-month initiative to:
1. Bulk-ingest over 50,000 supplier contracts into SAP Ariba, with metadata extraction powered by OCR.
2. Design and implement the database architecture and data flows that will feed our Negotiation AI, including contract detail extraction and supplier spend analytics.
This work currently runs separately from the Negotiation AI MVP, but must be future-ready for seamless integration.
Role Overview
As our Data Engineer, you will own the end-to-end data pipelines. This includes designing scalable databases, developing ingestion workflows, collaborating with our internal Machine Learning Engineering team, and structuring supplier spend data. You'll work closely with the Full Stack Developer to co-design the database schema for the Negotiation AI and ensure future compatibility with the ingestion pipeline.
Key Deliverables
- Ingestion Pipeline: Build and deploy a robust ETL/ELT pipeline using Azure to ingest 50,000+ contracts.
- Metadata Extraction: Configure and run OCR workflows (e.g., OlmOCR or Azure Document Intelligence) to extract key contract fields such as dates, parties, and terms (see the sketch after this list).
- Scalable Database Schema: Design and implement a schema in Azure PostgreSQL to store contract metadata, OCR outputs, and supplier spend data. Collaborate with the Software Developer to design a future-ready schema for AI consumption.
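To make the metadata-extraction deliverable concrete, here is a minimal sketch of the kind of call involved, assuming the azure-ai-formrecognizer Python SDK and its prebuilt contract model; the endpoint, key, and field names are placeholders rather than our actual configuration.

```python
# Minimal sketch: extract contract metadata with Azure Document Intelligence.
# Assumes the azure-ai-formrecognizer SDK and the prebuilt contract model;
# endpoint and key are placeholders (store real secrets in Key Vault).
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

ENDPOINT = "https://<resource>.cognitiveservices.azure.com/"  # placeholder
KEY = "<key-from-key-vault>"                                  # placeholder

client = DocumentAnalysisClient(ENDPOINT, AzureKeyCredential(KEY))

def extract_contract_metadata(pdf_path: str) -> dict:
    """Analyze one contract and return a few key fields as plain text."""
    with open(pdf_path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-contract", document=f)
    doc = poller.result().documents[0]
    # Field names are assumptions to verify against the model's output schema.
    wanted = ("Title", "Parties", "ExecutionDate")
    return {name: doc.fields[name].content for name in wanted if name in doc.fields}
```

In practice the full analysis result would also be persisted (e.g., as JSONB) so fields can be re-derived later without re-running OCR.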
Required Skills & Experience
Data Engineering & ETL/ELT
- Experience with Azure PostgreSQL or similar relational databases
- Skilled in building scalable ETL/ELT pipelines (preferably using Azure)
- Proficient in Python for scripting and automation
OCR Collaboration
- Ability to work with internal Machine Learning Engineering teams to validate and structure extracted data
- Familiarity with OCR tools (e.g., Azure Document Intelligence, Tesseract) is a plus
SAP Ariba Integration
- Exposure to cXML, ARBCI, SOAP/REST protocols is a plus
- Comfortable with API authentication (OAuth, tokens) and enterprise-grade security (see the token sketch after this list)
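For candidates less familiar with Ariba's API security model, the sketch below shows the OAuth client-credentials flow its APIs generally use; the token URL is an assumption to confirm against the Ariba developer portal for our realm.

```python
# Hedged sketch of an OAuth client-credentials token fetch for Ariba APIs.
# The token URL is an assumed endpoint; verify it for the target realm.
import requests

TOKEN_URL = "https://api.ariba.com/v2/oauth/token"  # assumed; confirm for our realm

def get_ariba_token(client_id: str, client_secret: str) -> str:
    """Client-credentials grant; returns a bearer token for subsequent API calls."""
    resp = requests.post(
        TOKEN_URL,
        auth=(client_id, client_secret),          # HTTP Basic auth with app credentials
        data={"grant_type": "client_credentials"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```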
Agile Collaboration & Documentation
- Comfortable working in sprints and cross-functional teams
- Able to use GitHub Copilot to document practices for handover
Preferred Qualifications
- Experience with large-scale contract ingestion projects
- Familiarity with procurement systems and contract lifecycle management
- Background in integrating data pipelines with AI or analytics platforms
Why Join Us?
- Focused Scope with Future Impact: Lay the foundation for an AI-driven negotiation platform
- Cutting-Edge Tools: Work with SAP Ariba, OCR, Azure, and advanced analytics
- Collaborative Environment: Partner with Software Developers and AI specialists
For the interviews, to set expectations:
Agile Sprint Breakdown
Sprint 1 (Weeks 1-2): Database & OCR Foundations
- Design scalable schema in Azure PostgreSQL for contract metadata and spend data (a starting-point sketch follows this list)
- Configure Azure Data Factory, Blob Storage, and CI/CD for pipeline deployment
- Build proof-of-concept pipeline for ingesting 500 contracts with OCR-based metadata extraction
- Collaborate with OCR team to validate extracted fields (e.g., contract dates, parties, spend amounts)
- Begin schema design collaboration with Software Developer for Negotiation AI
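As a starting point only, the sketch below shows one plausible shape for the schema; the table and column names are assumptions to refine with the Software Developer, and the connection string is a placeholder.

```python
# Illustrative schema sketch for Azure PostgreSQL, not a final design.
# Table/column names are assumptions; the DSN is a placeholder.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS suppliers (
    supplier_id    BIGSERIAL PRIMARY KEY,
    name           TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS contracts (
    contract_id    BIGSERIAL PRIMARY KEY,
    supplier_id    BIGINT REFERENCES suppliers(supplier_id),
    title          TEXT,
    effective_date DATE,
    expiry_date    DATE,
    ocr_fields     JSONB,   -- raw OCR output, kept for re-processing
    source_uri     TEXT     -- Blob Storage path of the original document
);

CREATE TABLE IF NOT EXISTS spend_records (
    spend_id       BIGSERIAL PRIMARY KEY,
    supplier_id    BIGINT REFERENCES suppliers(supplier_id),
    contract_id    BIGINT REFERENCES contracts(contract_id),
    amount         NUMERIC(18, 2),
    currency       CHAR(3),
    period_start   DATE,
    period_end     DATE
);
"""

# Placeholder DSN; the transaction commits on successful exit of the block.
with psycopg2.connect("postgresql://user:pass@host:5432/contracts") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```

Keeping the raw OCR output in a JSONB column alongside the typed columns lets the Negotiation AI re-derive fields later without re-ingesting documents.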
Sprint 2 (Weeks 3-5): Scaling & Ariba Integration
- Scale pipeline to handle 50,000+ contracts with error-handling and retry logic (see the retry sketch after this list)
- Ingest supplier spend data and link it to contract records
- Build/refine SAP Ariba integration scripts (if applicable)
- Implement robust error-reporting and logging for Ariba API calls
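A minimal sketch of the retry-with-backoff behavior expected around Ariba API calls at this scale; the attempt count and delays are illustrative assumptions, not tuned values.

```python
# Sketch: exponential-backoff retries with logging around an API call.
# Attempt counts and delays are illustrative, not tuned values.
import logging
import time
import requests

log = logging.getLogger("ingestion")

def call_with_retries(url: str, token: str, max_attempts: int = 5) -> dict:
    """GET with exponential backoff; logs each failure for the error report."""
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.get(
                url,
                headers={"Authorization": f"Bearer {token}"},
                timeout=30,
            )
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            log.warning("attempt %d/%d failed for %s: %s",
                        attempt, max_attempts, url, exc)
            if attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # 2s, 4s, 8s, ... simple exponential backoff
```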
Sprint 3 (Weeks 6-8): AI Data Alignment & UAT
- Finalize data models for Negotiation AI (e.g., legal terms, renewal triggers, spend patterns)
- Collaborate with Software Developer to document real-time or scheduled data access
- Optimize database queries and indexing for performance (an indexing sketch follows this list)
- Conduct UAT with stakeholders to validate ingestion and data quality
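An example of the indexing work this sprint covers, building on the sketch schema above; the index choices are assumptions to validate with EXPLAIN ANALYZE against real query patterns.

```python
# Sketch: indexes supporting likely access paths (per-supplier lookups,
# date-range filters, JSONB queries). Validate choices with EXPLAIN ANALYZE.
import psycopg2

INDEXES = """
-- speed up per-supplier contract lookups and expiry/renewal filters
CREATE INDEX IF NOT EXISTS idx_contracts_supplier ON contracts (supplier_id);
CREATE INDEX IF NOT EXISTS idx_contracts_expiry   ON contracts (expiry_date);
-- GIN index so the AI layer can query raw OCR fields stored as JSONB
CREATE INDEX IF NOT EXISTS idx_contracts_ocr      ON contracts USING GIN (ocr_fields);
"""

with psycopg2.connect("postgresql://user:pass@host:5432/contracts") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute(INDEXES)
```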
Sprint 4 (Weeks 9-12): Finalization & Handover
- Complete full-scale ingestion and verify metadata extraction accuracy
- Resolve final integration issues and address UAT feedback
- Document workflows, schemas, dependencies, and maintenance steps
- Deliver performance report (throughput, error rates, data quality metrics)