Job Description
Job Title: Senior Data Engineer
Team: ML
Location: Remote
Employment Type: Full-Time
About Us
We are a Computer Vision Product Company on a mission to dramatically increase the operational safety of critical rope applications by delivering real-time data to the right people before catastrophic failures occur.
Failures in critical rope applications often stem from the limits of visual inspection, the standard practice in industries such as Construction, Maritime Mooring, Mining, and Oil & Gas/Drilling. When these ropes fail, lives are lost and company reputations suffer.
At Scope, we are leveraging the latest advancements in technology to solve this problem. Our current focus is Electric Utility Construction and Maintenance, where we equip operators with the ability to assess the break strength of their stringing lines without destructive testing. This eliminates reliance on "educated guesses" and allows companies to confidently ensure their lines are fit for service.
What You’ll Do
Architect and build scalable data pipelines and workflows using Dagster to move, transform, and make data available for machine learning and analytics.
Design and optimize storage solutions for large-scale industrial and vision data, ensuring efficient retrieval and accessibility for ML engineers.
Develop robust data ingestion frameworks for consuming live production images, video, and metadata in an extensible and scalable manner.
Collaborate with ML engineers to ensure data (both computer vision data and ancillary metadata) is structured and processed optimally for experimentation and model training.
Work with Kubernetes-based environments to orchestrate and deploy data processing jobs.
Enhance CI/CD for data workflows, ensuring automated deployment and testing via GitLab CI/CD. We deploy on merge, and you'll make that better, faster, safer, and cheaper.
Own and maintain AWS-based data infrastructure, leveraging Terraform for Infrastructure as Code.
Implement data governance best practices, including data quality validation, lineage tracking, and metadata management.
Optimize batch and real-time processing frameworks, incorporating best practices for performance, scalability, and reliability.
Act as a technical leader in data engineering, defining best practices and guiding future scaling efforts.
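To make the pipeline and data-governance responsibilities above concrete, here is a minimal, hypothetical sketch of the kind of typed, validated ingestion step this role owns. The record fields and quality checks are illustrative only, not our actual schema or Dagster code:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterator

# Hypothetical record type for image metadata flowing into ML pipelines;
# field names are illustrative, not the production schema.
@dataclass(frozen=True)
class ImageRecord:
    image_id: str
    captured_at: datetime
    line_id: str
    byte_size: int

def validate(records: Iterator[ImageRecord]) -> Iterator[ImageRecord]:
    """Drop records that fail basic data-quality checks before they
    reach downstream training and analytics jobs."""
    for rec in records:
        if rec.byte_size <= 0:
            continue  # empty or truncated upload: skip rather than poison training data
        if rec.captured_at > datetime.now(timezone.utc):
            continue  # timestamp in the future: likely device clock skew
        yield rec

raw = [
    ImageRecord("img-001", datetime(2024, 5, 1, tzinfo=timezone.utc), "line-7", 1_048_576),
    ImageRecord("img-002", datetime(2024, 5, 1, tzinfo=timezone.utc), "line-7", 0),
]
clean = list(validate(raw))
print([r.image_id for r in clean])  # → ['img-001']
```

In production this kind of check would typically live inside a Dagster asset or op, with violations surfaced as metadata rather than silently dropped.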
What We’re Looking For
Must-Have Skills
5+ years of experience in data engineering, with a focus on scalable, production-grade data infrastructure.
Strong Python skills with an emphasis on type safety, functional programming patterns, and modern Python practices. The ideal candidate has used Rust, Scala, Kotlin, F#, and/or a Lisp dialect before.
Experience with data processing frameworks such as Pandas (with Pandera), PyArrow, or Dask.
Deep expertise in data orchestration tools, preferably Dagster (experience with Prefect, Airflow, NiFi, or similar tools is acceptable).
Experience with streaming and event-driven architectures such as Ray Core, Kafka, Kinesis, Pulsar, Storm, or Dempsy, or real-time data processing frameworks like Flink or Spark Streaming.
Hands-on experience with Kubernetes, particularly in data pipeline orchestration.
Experience deploying infrastructure via Terraform (or similar IaC tools).
Proficiency in cloud services, preferably AWS: S3, EKS, Lambda, Glue, and RDS (or equivalents on other clouds).
Strong database skills, including SQL, NoSQL, and columnar storage (e.g., Postgres, BigQuery, ClickHouse).
Experience with strongly-typed ORMs (e.g., SQLAlchemy/SQLModel, Hibernate, Diesel) and data validation frameworks (e.g., Pydantic, Great Expectations).
Comfortable with hybrid storage, combining databases and blob storage for large objects such as videos and computer vision datasets.
CI/CD expertise, preferably with GitLab for managing automated data pipeline deployments.
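Frameworks like Pandera and Pydantic automate the kind of schema enforcement this role depends on. The sketch below illustrates the underlying idea in plain Python; the column names and rules are hypothetical, chosen only to echo the rope-strength domain above:

```python
import math

# A hand-rolled column schema illustrating what Pandera/Pydantic automate:
# each column maps to a predicate every value must satisfy.
# Column names are hypothetical, not a real dataset's schema.
SCHEMA = {
    "break_strength_kn": lambda v: isinstance(v, float) and v > 0 and not math.isnan(v),
    "line_id": lambda v: isinstance(v, str) and v.startswith("line-"),
}

def check_rows(rows: list[dict]) -> list[str]:
    """Return human-readable schema violations (empty list means valid)."""
    errors = []
    for i, row in enumerate(rows):
        for col, ok in SCHEMA.items():
            if col not in row:
                errors.append(f"row {i}: missing column {col!r}")
            elif not ok(row[col]):
                errors.append(f"row {i}: bad value for {col!r}: {row[col]!r}")
    return errors

rows = [
    {"break_strength_kn": 412.5, "line_id": "line-7"},
    {"break_strength_kn": -1.0, "line_id": "line-7"},
]
print(check_rows(rows))
```

A declarative framework adds what this sketch omits: vectorized checks over DataFrames, rich error reports, and schemas that double as documentation.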
Nice-to-Have Skills
Familiarity with ML experiment tracking, metadata management, and data lineage tracking.
Understanding of ML workflows and how data engineering enables efficient model training/deployment.
Experience with embedding management, particularly for inference stores, using tools such as Chroma or pgvector.
Experience with video processing pipelines and efficient storage/retrieval of large media files.
What We Offer
A chance to own and shape the data infrastructure at a fast-growing computer vision AI company.
A highly collaborative, fast-paced environment working with cutting-edge ML and data engineering.
Competitive salary, annual incentive plan, and benefits.
Opportunities for growth and leadership as we scale our data team.