The role
We are seeking an experienced, hands-on Senior Data Specialist to join the Risk Engineering Artificial Intelligence (AI) team.
As a critical member of our Independent Risk Management function (second line of defense, 2LOD), you will play a pivotal role in developing data-intensive solutions that enable data-driven decision making for Risk.
You’ll collaborate closely with senior leadership to identify opportunities, develop innovative data and AI solutions, and deliver actionable insights that will shape our risk oversight strategies.
What you’ll do:
This critical role focuses on data acquisition (sourcing, staging, modeling, storage) and feeding data to reporting, analytics, and AI applications via APIs.
This role will be instrumental in conceptualizing, prototyping, and implementing best-in-class, end-to-end AI-based solutions that meet risk management requirements.
- Data Pipeline Development: Design, develop, and maintain scalable, efficient data pipelines to ingest, process, and store diverse data sets from multiple sources, ensuring data completeness and quality.
- Data Transformation: Perform data cleaning, validation, and transformation to prepare high-quality data sets for analytics, reporting, and machine learning.
- Data Staging, Modeling, and Storage: Build and manage data staging environments, own the associated data models, and ensure optimal storage of structured and unstructured data for downstream applications.
- API Development: Create and maintain APIs that expose data to internal teams, ensuring secure and reliable data access.
- AI/ML Collaboration: Work closely with AI/ML specialists and risk and financial reporting analysts to deploy enterprise-grade data processing pipelines to production.
- Data Quality and Governance: Implement best practices for data governance, including data lineage, versioning, and monitoring, to maintain data accuracy and reliability.
- Continuous Improvement & Support: Identify areas for process improvement, automation, and optimization within the data pipeline and implement innovative solutions; support production processes by addressing issues, finding root causes, and preventing recurrence.
What you’ll need:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
- 8+ years of experience in data engineering or a related role building analytics solutions leveraging Git workflows, modularized coding practices, API calls, and web-based solutions.
- Prior experience in working with machine learning teams and AI-based solutions deployment is preferred.
- Prior experience at banks or financial companies is preferred.
- Programming & Analysis:
- Expert/advanced proficiency in Python for data manipulation and transformation (e.g., pandas, NumPy).
- Experience with large-scale data handling, including unstructured text processing, tokenization, embeddings, and data pipelines.
- Strong experience in ad hoc querying of database schemas using advanced SQL for data manipulation and database management.
- Hands-on experience with data flow orchestration/scheduling tools for data processing workflows.
- Data Processing:
- Deep hands-on experience with core technologies such as Snowflake, Airflow, dbt, Git, Docker, Tableau, Streamlit, and SQL.
- Experience with relational and NoSQL databases (e.g., PostgreSQL, MongoDB) and data lakes for handling large amounts of text data.
- Search and Retrieval: Experience with vector databases and retrieval-augmented generation (RAG) techniques using systems like Elasticsearch, Pinecone, or FAISS for enhancing LLM performance.
- Working knowledge of large language models (LLMs) and their integration into data pipelines.
- Familiarity with common machine learning frameworks and libraries.
- Cloud Platforms:
- Experience with cloud-based machine learning and AI platforms such as AWS (SageMaker, Lambda) and Snowflake, with a focus on GenAI model training, deployment, and monitoring.
- Additional Skills:
- Strong analytical and problem-solving skills with a keen attention to detail.
- Excellent communication skills and the ability to work effectively in cross-functional teams.
- Experience with version control tools like Git and CI/CD pipelines is desirable.
Compensation and Benefits
The base pay range for this role is listed below. Final base pay offer will be determined based on individual factors such as the candidate’s experience, skills, and location.
To view all of our comprehensive and competitive benefits, visit our Benefits at SoFi page!
Pay range: $115,200.00 - $216,000.00
Payment frequency: Annual
This role is also eligible for a bonus, long term incentives and competitive benefits. More information about our employee benefits can be found in the link above.