Remote SME Data Engineer

Shuvel Digital
Ashburn, Virginia, United States
Remote
Full-time

Job Description:

Each day U.S. Customs and Border Protection (CBP) oversees the massive flow of people, capital, and products that enter and depart the United States via air, land, sea, and cyberspace.

The volume and complexity of both physical and virtual border crossings require the application of solutions to promote efficient trade and travel.

Further, effective solutions help CBP ensure that the movement of people, capital, and products is legal, safe, and secure. CBP seeks capable, qualified, and versatile SME Data Engineers to help develop complex data analytics solutions that law enforcement personnel use to assess the risk of potential threats entering the country.

Responsibilities include, but are not limited to:

  • Design, develop, and maintain scalable data pipelines and architectures to support data extraction, transformation, and loading (ETL / ELT) processes.
  • Utilize strong SQL skills to perform complex data transformations and optimize database queries, ensuring high performance and efficiency.
  • Build comprehensive datasets by aggregating data sourced from various relational databases, enabling data analysts and data scientists to create machine learning models, reports, and dashboards.
  • Collaborate with cross-functional teams (data analysts, data scientists, and business stakeholders) to understand business requirements and translate them into technical solutions.
  • Assist with the implementation of data migration / pipelines from on-prem to cloud / non-relational storage platforms.
  • Leverage distributed computing frameworks like Apache Spark to process large volumes of data efficiently.
  • Utilize data analysis, problem-solving, investigative, and creative thinking skills to handle extremely large datasets, transforming them into various formats for diverse analytical products.
  • Respond to data queries / analysis requests from various groups within the organization. Create and publish regularly scheduled and / or ad hoc reports as needed.
  • Troubleshoot data-related issues, identify root causes, and implement solutions to ensure data integrity and accuracy.
  • Implement best practices for data governance, security, and quality supporting the core business applications.
  • Manage data engineering source code control using GitLab.

Basic Qualifications:

  • Experience with relational databases and knowledge of query tools and / or BI tools like Power BI or OBIEE and data analysis tools.
  • Extensive experience with SQL and proficiency in writing complex queries.
  • Solid understanding of data warehousing concepts and platforms such as Oracle and cloud-based solutions.
  • Strong experience in automating ETL jobs via UNIX / LINUX shell scripts and CRON jobs.
  • Demonstrate a strong practical understanding of data warehousing from a production relational database environment.
  • Strong experience using analytic functions within Oracle or similar tools within non-relational (MongoDB, Cassandra, etc.) database systems.
  • Strong understanding of distributed computing principles and experience with frameworks like Apache Spark.
  • Hands-on experience with data lake architectures and technologies in a cloud environment.
  • Experience with the Atlassian suite of tools such as Jira and Confluence.
  • Knowledge of Continuous Integration / Continuous Delivery (CI / CD) tools.
  • Must be able to multitask efficiently and progressively and work comfortably in an ever-changing data environment.
  • Must work well in a team environment as well as independently.
  • Excellent verbal / written communication and problem-solving skills; ability to communicate information to a variety of groups at different technical skill levels.

Preferred Qualifications:

  • 5+ years of experience in developing, maintaining, and optimizing complex Oracle PL/SQL packages to aggregate transactional data for consumption by data science / machine learning applications.
  • 10+ years of experience working in data engineering, with a focus on building and optimizing data pipelines and architectures. Must have full life cycle experience in design, development, deployment, and monitoring.

  • Experience with one or more relational database systems such as Oracle, MySQL, Postgres, or SQL Server, with heavy emphasis on Oracle.
  • Extensive experience with cloud platforms (e.g. AWS, Google Cloud, etc.) and cloud-based ETL / ELT tools.
  • Experience with Amazon services such as S3, Redshift, and EMR, and with Scala.
  • Experience with migrating on-prem legacy database objects and data to the Amazon S3 cloud environment.
  • Experience or familiarity with data science / machine learning, including development experience with supervised and unsupervised learning on structured and unstructured datasets.
  • Certifications in relevant technologies (e.g. AWS Certified Big Data, Google Professional Data Engineer) are a plus.