Data Engineer
Key Responsibilities
• Build and maintain high-volume, scalable data pipelines using Apache Kafka and Apache Spark, supporting both real-time and batch data processing needs.
• Design, develop, and optimize data ingestion, transformation, and integration workflows across enterprise systems.
• Ensure data quality, consistency, and integrity across four (4) disparate data sources, implementing validation, cleansing, and reconciliation processes.
• Develop and maintain SQL-based data solutions, including complex queries, stored procedures, performance tuning, and data modeling.
• Collaborate with data analysts, product owners, and application teams to define data requirements and ensure alignment with business needs.
• Implement monitoring, logging, and alerting mechanisms to ensure reliability and observability of data pipelines.
• Support data architecture design and contribute to best practices for scalable and secure data engineering solutions.
• Ensure compliance with federal data governance, security, and privacy requirements.
• Participate in Agile ceremonies and support iterative development and delivery of data capabilities.
• Troubleshoot and resolve data pipeline issues, ensuring minimal disruption to downstream systems and reporting.
Minimum Requirements
• Bachelor’s degree in Computer Science, Information Systems, Engineering, Data Science, or a related field (or equivalent experience).
• 3+ years of experience in data engineering, data integration, or related technical roles.
• Strong hands-on experience with Apache Kafka for streaming data pipelines.
• Strong experience with Apache Spark for large-scale data processing (batch and/or streaming).
• Advanced SQL development experience, including complex queries, performance tuning, and data transformation logic.
• Experience integrating and managing data across multiple heterogeneous data sources.
• Experience working in the federal government or other highly regulated environments with security and compliance requirements.
• Strong understanding of data quality management, data validation, and data governance practices.
• Strong problem-solving and analytical thinking abilities.
• Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders.
• Strong attention to detail, especially in ensuring data accuracy and consistency.
• Ability to work independently in a fast-paced, mission-driven environment.
• Strong collaboration skills across cross-functional technical and business teams.
• US Citizenship or Permanent Residency required.
• Must reside in the Continental US.
• Depending on the government agency, specific requirements may include a public trust background check or a security clearance.
Preferred Qualifications
• AWS Data Engineer certification.
• Experience with cloud platforms such as AWS, Azure, or GCP (especially data services like S3, Glue, Databricks, or BigQuery).
• Familiarity with data orchestration tools (e.g., Airflow or NiFi).
• Experience supporting healthcare, insurance, or CMS-related data environments.
• Knowledge of data modeling techniques (dimensional modeling, star/snowflake schemas).
• Experience with DevOps practices, CI/CD pipelines, and infrastructure-as-code for data systems.
• Familiarity with real-time analytics and event-driven architectures.