
Synthetic Data Engineer (AI Data/Training)

Hyphenconnect
📍 Hong Kong 📅 Posted April 24, 2026

About this role

We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines and manage the quality of the data that feeds model training loops. Your expertise will directly shape data processing and model training within the organization.

Responsibilities:

• Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.

• Implement automated quality scoring and de-duplication systems.

• Manage data pipelines that feed directly into SFT and DPO training loops.

Qualifications:

• Proven experience building large-scale data pipelines (Airflow, Spark, Ray).

• Deep knowledge of prompt engineering for data generation.

• Familiarity with dataset distillation and bias mitigation.

This listing was aggregated by Perik.ai from Hyphenconnect’s public job board.