We’re hiring a part-time Research Data Scientist to lead end-to-end preparation of complex, large-scale health datasets for peer-reviewed publication. This role centers on cleaning, harmonizing, and structuring messy, multi-source datasets, followed by advanced statistical analysis and machine learning to generate publishable insights.
You’ll work with survey, observational, and real-world health data, building reproducible analytical workflows that meet academic research standards. This role is best suited for a PhD-trained data scientist or quantitative researcher with deep experience in machine learning, advanced statistics, and real-world data analysis.
Key Responsibilities
Data Cleaning & Harmonization
- Clean, normalize, and integrate messy datasets from multiple sources (e.g., survey data from longitudinal studies)
- Resolve inconsistencies and schema mismatches across datasets
- Design scalable approaches to dataset harmonization for cross-study comparability
Data Pipeline Development
- Build and maintain reproducible data processing workflows for large-scale datasets
- Structure datasets for downstream statistical modeling and publication-ready outputs
- Implement version-controlled workflows for data processing and analysis
Statistical Analysis & Machine Learning
- Apply advanced statistical methods (e.g., mixed-effects models, causal inference, longitudinal modeling)
- Develop, validate, and interpret machine learning models for large-scale observational data as needed
- Ensure methodological rigor aligned with peer-reviewed research standards
Research Collaboration
- Partner with researchers to refine hypotheses, define analytic strategies, and interpret findings
- Translate complex analyses into clear, defensible results for academic publication
Reproducibility & Publication Support
- Develop reproducible codebases and documentation (e.g., notebooks, pipelines)
- Prepare datasets, figures, and statistical outputs for manuscripts, abstracts, and reports
- Contribute to methodological transparency and auditability of analyses
- Write publication-ready technical text (e.g., Methods and Results sections of manuscripts); this ability is required
Requirements
Qualifications
- PhD (preferred) in Data Science, Statistics, Biostatistics, Epidemiology, Computer Science, Experimental Psychology, or a related quantitative field
- 3–5+ years of experience working with large, complex datasets in research, healthcare, or applied data science
- Strong expertise in data cleaning, preprocessing, and dataset harmonization at scale
- Advanced proficiency in Python or R (e.g., pandas, tidyverse, scikit-learn, statsmodels) or related software/programming experience
- Deep experience with machine learning and advanced statistical methods
- Strong foundation in reproducible research practices
- Ability to communicate technical findings clearly to interdisciplinary teams and to collaborate with team members on high-quality publications
Preferred
- Prior experience preparing analyses for peer-reviewed publication
- Familiarity with survey data (Qualtrics, REDCap) and/or healthcare data standards (FHIR)
- Background in public health, epidemiology, or biostatistics
- Experience with causal inference, longitudinal analysis, or real-world evidence studies
- Experience working with messy, real-world observational datasets across multiple sources
- Familiarity with cloud or distributed data tools (AWS, GCP, or Spark)
- Background in or familiarity with cannabinoid research