Data Engineer

OneSource Regulatory
Seattle, WA, United States
Full-time

Company Introduction

OneSource Regulatory Technology offers a number of innovative solutions that enhance job performance in the pharmaceutical space.

OSR Technology is looking for an experienced and dedicated data engineer to join our product solutions team!

Job Description

OneSource Regulatory is seeking a full-time contractor with at least 4 years of experience to assist us with ongoing R&D projects.

We are looking for a data engineer to pull data from various sources and carry out the steps needed to clean, normalize, possibly annotate, and finally load the data into databases.

The candidate should be able to develop and implement a strategy for testing the integrity of the collected data. This role requires extreme attention to detail, as data quality is the top priority.

Responsibilities

  • Well versed in parsing and synthesizing XML and / or JSON documents.
  • Curating data, which can involve some intermediate to advanced web scraping (data may need to be fetched from sources on the Internet via SFTP, FTP, wget, curl, REST APIs, or GraphQL queries)
  • Proficiency with the Linux command line and simple tools such as grep, wc, sed, awk, find, ls, and cat; piped commands; possibly some very light Bash shell scripting; and setting up crontab schedules and programs
  • Must have basic knowledge of SQL with the following databases: PostgreSQL, MySQL, Google BigQuery
  • Must have basic knowledge of NoSQL databases such as MongoDB or similar
  • Familiarity with basic cloud technology such as storage buckets and serverless cloud functions
  • Must have experience extracting text and images from PDF files
  • Knowledge of Puppeteer or other automatable web client technologies
  • Understanding of JavaScript, HTML / CSS, and HTTP methods (for understanding page structure for web scraping)

Skills

  • Solid experience with Python and Python libraries such as pandas and requests
  • Skill set should match up with required responsibilities listed above
  • Strong English skills (e.g. grammatical analysis and rhetorical structure)
  • Team Player
  • Great communication skills

Bonus Skills

  • Experience within the Pharmaceutical Space
  • Ability to expose data via C# / .NET Core and / or GraphQL
  • Google Cloud Platform (Cloud Buckets, Google Cloud Functions (.NET, Python, Node.js))
  • Ability to parallelize data manipulation and scraping via Python multi-threading, etc.
  • Python BeautifulSoup
  • Scrapy
  • Docker (setting up Kubernetes-style processing if warranted for data scraping / data ingestion / normalization)
  • Multithreading concepts