Hiring a Data Scientist to build and scale the data pipelines behind advanced chemistry and battery ML models.
You will turn raw molecular and electrochemical data into model-ready datasets used by LLMs, graph models and simulation pipelines.
What you’ll do
- Build ETL pipelines for large molecular, electrochemical and battery datasets
- Deliver APIs and data services to ML and simulation teams
- Apply QC, schemas and metadata to ensure ML-ready data
- Run EDA to surface gaps and improve model performance
- Work closely with scientists and ML researchers
What they want
- 2+ years in data science, data engineering or scientific computing
- Strong Python + SQL
- Experience with chemical, pharma, energy or materials data
- MS/PhD (or equivalent industry experience) in chemistry, materials or chemical engineering
Nice to have
- RDKit, OpenBabel, molecular search, LLM data pipelines
- Battery or electrochemistry data
- Airflow, NoSQL, chemical APIs