Data Engineer

OneSource Regulatory
Manitou Springs, CO, US
Full-time

Job Description

Company Introduction

OneSource Regulatory Technology hosts a number of innovative solutions to enhance job performance in the Pharmaceutical space.

OSR Technology is looking for an experienced and dedicated data engineer to join our product solutions team!

Job Description

OneSource Regulatory is seeking a full-time contractor with at least 4 years of experience to assist with ongoing R&D projects.

We are looking for a data engineer to pull data from various sources and perform the steps needed to clean, normalize, possibly annotate, and finally load the data into databases.

The candidate should be able to develop and implement a strategy for testing the integrity of the collected data. This role requires extreme attention to detail so that data quality remains the top priority.

Responsibilities

  • Well versed in parsing and synthesizing XML and/or JSON documents
  • Curating data, which can involve intermediate to advanced web scraping (data may need to be fetched from Internet sources via SFTP, FTP, wget, curl, REST APIs, or GraphQL queries)
  • Proficiency with the Linux command line and simple tools such as grep, wc, sed, awk, find, ls, cat, and piped commands, possibly with some very light Bash shell scripting and setting up crontab schedules and programs
  • Must have basic knowledge of SQL with the following databases: PostgreSQL, MySQL, Google BigQuery
  • Must have basic knowledge of NoSQL databases such as MongoDB or similar
  • Familiarity with basic cloud technology such as storage buckets and serverless cloud functions
  • Must have experience extracting text and images from PDF files
  • Knowledge of Puppeteer or other automatable web client technologies
  • Understanding of JavaScript, HTML/CSS, and HTTP methods (for understanding page structure for web scraping)

Skills

  • Solid experience with Python and Python libraries such as pandas and requests
  • Skill set should match up with required responsibilities listed above
  • Strong English skills (e.g. grammatical analysis and rhetorical structure)
  • Team Player
  • Great communication skills

Bonus Skills

  • Experience within the Pharmaceutical Space
  • Ability to expose data via C# .NET Core and/or GraphQL
  • Google Cloud Platform (Cloud Buckets, Google Cloud Functions (.NET, Python, Node.js))
  • Ability to parallelize data manipulation and scraping via Python multi-threading, etc.
  • Python BeautifulSoup
  • Scrapy
  • Docker (setting up Kubernetes-style processing if warranted for data scraping / data ingestion / normalization)
  • Multithreading concepts