End-to-End Data Science Project Workflow

When you’re working on a data science project - data analysis, ML models, LLM experiments, or quick apps - it’s helpful to have a clear system for developing, publishing, and sharing your work.

This is the workflow I use to go from local notebooks to hosted apps and public writeups. It works well for both solo projects and professional prototypes, and uses open-source and free tools throughout.


1. Develop (local or remote)

  • Anaconda to manage Python environments
  • JupyterLab for notebooks and analysis
  • VS Code for scripts, apps, and package development; integrates with GitHub for version control
  • Quarto for writing reports, websites, and blog-style content (supports Jupyter notebooks for code examples)

2. Publish (code, content, apps)

Code & Reports

  • GitHub: store code, README files, and notebooks in public and private repositories
  • GitHub Pages: host your personal portfolio, Quarto-based websites, and writeups
  • nbviewer: fallback viewer for Jupyter notebooks that won’t render on GitHub

Apps & Demos

  • Streamlit: build interactive apps in pure Python and deploy to Streamlit Community Cloud (streamlit.io); a minimal example follows this list
  • Gradio: quickly wrap ML functions into web UIs for demos or prototypes; see the second sketch below
  • Hugging Face Spaces: host public demos (Gradio or Streamlit) with no infrastructure setup
  • Note: free tiers typically put idle apps to sleep, so the first load after a period of inactivity can be slow
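
To make the Streamlit step concrete, here is a minimal sketch of an app. The dataset path, column names, and title are placeholders rather than anything from a specific project:

```python
# app.py: minimal Streamlit sketch (placeholder data and names)
import pandas as pd
import streamlit as st

st.title("Penguin Explorer")  # hypothetical title

@st.cache_data  # cache the CSV load so widget interactions stay fast
def load_data() -> pd.DataFrame:
    return pd.read_csv("data/penguins.csv")  # placeholder path

df = load_data()
species = st.selectbox("Species", sorted(df["species"].unique()))
subset = df[df["species"] == species]
st.write(f"{len(subset)} rows for {species}")
st.bar_chart(subset["flipper_length_mm"].value_counts().sort_index())
```

Run it locally with streamlit run app.py; deploying amounts to pushing the repo to GitHub and pointing Streamlit Community Cloud at it.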
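
A Gradio demo is even shorter. In this sketch, predict() is a stand-in for real model inference:

```python
# app.py: minimal Gradio sketch; predict() is a placeholder, not a real model
import gradio as gr

def predict(text: str) -> str:
    # toy logic standing in for model inference
    label = "positive" if "good" in text.lower() else "neutral"
    return f"{len(text)} characters; naive sentiment guess: {label}"

demo = gr.Interface(
    fn=predict,      # any Python function can be wrapped
    inputs="text",   # shorthand for a gr.Textbox component
    outputs="text",
    title="Placeholder classifier",
)

if __name__ == "__main__":
    demo.launch()  # share=True adds a temporary public URL
```

Saved as app.py next to a requirements.txt, this same file is all a Hugging Face Space with the Gradio SDK needs, which is what makes Spaces a zero-infrastructure hosting option.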

3. Share

  • Post projects or insights on LinkedIn (pin key projects to your profile)
  • Write articles about the work on Medium, Towards Data Science, or Substack
  • Link everything back through your GitHub Pages portfolio site