We are looking for an experienced data engineer to design, develop, and maintain scalable data pipelines while ensuring data quality and reliability. You'll contribute significantly to data warehouse architecture and implementation (schema design, ETL, optimization). The role also involves close collaboration with analytics and BI teams to meet their data needs and support impactful reporting. This is a fully remote position.
Responsibilities:
- Design, develop, and maintain robust, scalable, and efficient ETL data pipelines using various programming languages (e.g., Python, SQL) and data processing frameworks.
- Implement data quality checks, monitoring, and alerting for all data pipelines to ensure data integrity and reliability.
- Optimize existing data pipelines for performance, cost-efficiency, and error handling.
- Contribute to the design, development, and maintenance of data warehouse and data lake solutions, including schema design, data modeling, and indexing strategies.
- Manage and tune data storage solutions (e.g., Snowflake, Redshift, BigQuery) for optimal query performance and cost-effectiveness.
- Ensure data security and compliance within the data warehouse environment.
- Work closely with data analysts, data scientists, and business intelligence developers to understand their data requirements and provide access to reliable, high-quality data.
- Assist in troubleshooting data-related issues and support data-driven decision-making.
- Develop and maintain documentation for data models, pipelines, and data sources to facilitate self-service analytics.
- Participate in code reviews, contribute to technical discussions, and provide constructive feedback to peers.
- Help mentor junior data engineers and share data engineering best practices.
- Research and evaluate new data technologies and tools to improve the data infrastructure.
What You Need to Succeed in the Role:
We're looking for a Data Engineer with:
- 3+ years of experience designing, developing, and deploying robust data pipelines and data warehouse solutions in production environments.
- Experience working with Healthcare data is a plus.
- Strong communication skills, with the ability to convey complex technical information in a straightforward, easy-to-understand manner to both technical and non-technical stakeholders.
- The ability to identify data quality issues, architectural inefficiencies, and performance bottlenecks during code reviews.
- Proficiency in designing and building data systems that are scalable and can handle future growth in data volume and complexity.
- A strong understanding of data integration methods, including APIs, webhooks, and various data ingestion techniques.
- The ability to debug and resolve complex production data issues efficiently.
- Experience working effectively within an engineering team, providing and receiving constructive feedback through structured pull requests.
- Comfort working in a fast-paced startup environment with tight deadlines.
- A highly collaborative mindset, dedication to the company's success, and eagerness to share opinions and provide thoughtful feedback.
- An understanding of project prioritization and making effective tradeoffs between essential functionalities and nice-to-have features.
- The ability to learn quickly on the job and adapt to unfamiliar tools and technologies.
- Familiarity with our specific data tech stack (e.g., Redshift, Matillion, AWS, dbt, etc.) or similar technologies is a plus.
The expected annual salary for this role is between $115,000 and $135,000. Actual starting salary will be determined on an individualized basis, based on several factors including, but not limited to, specific skill set and work experience.