Data Engineer

OneSource Regulatory
Manitou Springs, CO, US
Full-time

Job Description

Company Introduction

OneSource Regulatory Technology hosts a number of innovative solutions to enhance job performance in the Pharmaceutical space.

OSR Technology is looking for an experienced and dedicated data engineer to join our product solutions team!

Job Description

OneSource Regulatory is seeking a full-time contractor with at least 4 years of experience to assist us with ongoing R&D projects.

We are looking for a data engineer to pull data from various sources and perform all the steps necessary to clean, normalize, possibly annotate, and finally load the data into databases.

The candidate should be able to develop and implement a strategy for testing the integrity of the collected data. This role requires extreme attention to detail, because data quality is the top priority.

Responsibilities

  • Well versed in parsing and synthesizing XML and / or JSON documents.
  • Curating data, which can involve intermediate to advanced web scraping (data may need to be fetched from the Internet via SFTP, FTP, Wget, curl, REST APIs, or GraphQL queries)
  • Proficiency with the Linux command line and simple tools such as grep, wc, sed, awk, find, ls, cat, and piped commands, and possibly some very light Bash shell scripting and setting up crontab schedules and programs
  • Must have basic knowledge of SQL with the following databases: PostgreSQL, MySQL, Google BigQuery
  • Must have basic knowledge of NoSQL databases such as MongoDB or similar
  • Familiarity with basic cloud technology, such as storage buckets and serverless functions
  • Must have experience extracting text and images from PDF files
  • Knowledge of Puppeteer or other automatable web client technologies
  • Understanding of JavaScript, HTML / CSS, and HTTP methods (to understand page structure for web scraping)

Skills

  • Solid experience with Python and Python libraries such as pandas and requests
  • Skill set should match the responsibilities listed above
  • Strong English skills (e.g. grammatical analysis and rhetorical structure)
  • Team Player
  • Great communication skills

Bonus Skills

  • Experience within the Pharmaceutical Space
  • Ability to expose data via C# / .NET Core and / or GraphQL
  • Google Cloud Platform (Cloud Buckets, Google Cloud Functions (.NET, Python, Node.js))
  • Ability to parallelize data manipulation and scraping via Python multi-threading, etc.
  • Python BeautifulSoup
  • Scrapy
  • Docker (setting up Kubernetes style processing if warranted for data scraping / data ingestion / normalization)
  • Multithreading concepts