## **BONUS SECTION:** Career Paths in Data Science and What You Need to Learn

As data science has grown, it has expanded into a collection of roles rather than a single job title. Different organizations emphasize different parts of the data workflow, resulting in specialized paths such as:

- **Data Scientists:** Blend programming, statistical modeling, and domain knowledge to generate insights and build predictive systems.
- **Data Analysts:** Extract patterns, summarize data, and produce reports; often work with SQL, spreadsheets, and BI tools.
- **Machine Learning Engineers:** Design, optimize, and deploy machine learning models at scale, working closely with software engineering teams.
- **Data Engineers:** Build and maintain the data infrastructure (pipelines, databases, and large-scale storage) that makes analysis possible.
- **Business Intelligence (BI) Developers:** Create dashboards and reporting systems that support operational and strategic decisions.
- **Decision Scientists:** Bridge analytics and strategy, translating analytical findings into business actions and priorities.

Early models of the field, such as **Drew Conway’s Venn diagram**, placed data science at the intersection of hacking skills, math and statistics knowledge, and substantive (domain) expertise.
<div align="center">

<table>
  <tr>
    <td align="center" style="padding-right:20px;">
      <img src="https://images.squarespace-cdn.com/content/v1/5150aec6e4b0e340ec52710a/1364352051365-HZAS3CLBF7ABLE3F5OBY/Data_Science_VD.png" width="330"><br>
      <em>Figure 10: Drew Conway's Venn Diagram</em>
    </td>
    <td align="center" style="padding-left:20px;">
      <img src="https://miro.medium.com/v2/resize:fit:1200/format:webp/0*nOyHIN3F05c6zZV_" width="380"><br>
      <em>Figure 11: The New Data Scientist Venn Diagram (Stephan Kolassa)</em>
    </td>
  </tr>
</table>

</div>



A more recent extension by **Stephan Kolassa** shifts the perspective to focus on **principles and goals** rather than just skill sets. In this framing:

- **Statistics** seeks to **explain and understand** relationships.
- **Machine Learning** focuses on **prediction and generalization**.
- **Artificial Intelligence** aims for **autonomous action or decision-making**.
- **Data Science** integrates these goals with computation, visualization, communication, and context.

This evolution reflects how different data roles prioritize different outcomes: explanatory, predictive, or operational.
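
The difference between an explanatory and a predictive goal can be sketched with a single fitted line. The example below is only an illustration under stated assumptions: the data are synthetic (a made-up relationship between hours studied and exam score), and `fit_line` and `mean_squared_error` are hypothetical helper names, not part of any library. The same model is read in two ways: its slope answers an explanatory question, while its error on held-out data answers a predictive one.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mean_squared_error(intercept, slope, xs, ys):
    """Average squared gap between observed and predicted values."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return statistics.fmean(errors)

# Synthetic data (assumption): score = 50 + 3 * hours + noise.
rng = random.Random(42)
hours = [rng.uniform(0, 10) for _ in range(200)]
scores = [50 + 3 * h + rng.gauss(0, 2) for h in hours]

# Hold out the last 50 points to check how well the model generalizes.
train_x, test_x = hours[:150], hours[150:]
train_y, test_y = scores[:150], scores[150:]

intercept, slope = fit_line(train_x, train_y)
test_mse = mean_squared_error(intercept, slope, test_x, test_y)

# Explanatory reading: how much does the score change per extra hour?
print(f"Estimated effect of one extra hour: {slope:.2f} points")
# Predictive reading: how far off are predictions on unseen data?
print(f"Mean squared error on held-out data: {test_mse:.2f}")
```

A statistician would typically spend more time on the first number (and its uncertainty), while a machine learning practitioner would typically focus on the second.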


### **What do you need to learn to become a good Data Scientist/Analyst?**

There is no single recipe, but most data professionals draw from a few core pillars:

- **Programming & Tools:** Python, R, SQL, data wrangling libraries, and automation.
- **Statistics & Math:** Describing data, testing hypotheses, measuring uncertainty, and evaluating models.
- **Domain Knowledge:** Understanding the context of the data and how decisions are made in that domain.
- **Communication:** Turning findings into stories, dashboards, recommendations, or decisions.
- **Business Thinking:** Knowing what questions matter, how tradeoffs work, and how organizations operate.

Different career paths emphasize different combinations: analysts may lean toward communication and business reasoning, ML engineers toward coding and modeling, and data scientists toward modeling and domain integration.
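
As a small illustration of the **Statistics & Math** pillar, the sketch below describes a sample and then attaches a rough measure of uncertainty to its mean using a percentile bootstrap. The numbers are invented (hypothetical daily order counts), `bootstrap_ci` is an illustrative helper rather than a library function, and the percentile bootstrap is just one of several ways to build an interval.

```python
import random
import statistics

# Hypothetical sample (assumption): orders received on ten different days.
sample = [12, 15, 9, 20, 14, 18, 11, 16, 13, 17]

# Describing the data.
mean = statistics.fmean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation

def bootstrap_ci(data, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `data`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Measuring uncertainty: a range of plausible values for the true mean.
low, high = bootstrap_ci(sample)

print(f"mean={mean:.1f}, median={median:.1f}, stdev={stdev:.2f}")
print(f"Approximate 95% interval for the mean: [{low:.1f}, {high:.1f}]")
```

Reporting the interval alongside the mean is what turns a single number into a statement a decision-maker can weigh.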

## **Important Tools to Know**

| Category                 | Tools / Libraries                     |
|--------------------------|---------------------------------------|
| **Programming**          | Python, R                             |
| **Data Manipulation**    | Pandas, NumPy, dplyr                  |
| **Databases**            | SQL (PostgreSQL, MySQL), NoSQL        |
| **Machine Learning**     | scikit-learn, XGBoost, LightGBM       |
| **Deep Learning**        | TensorFlow, Keras, PyTorch            |
| **Visualization**        | Matplotlib, Seaborn, Plotly, ggplot2  |
| **Big Data**             | Spark, Hadoop                         |
| **Deployment / APIs**    | Flask, FastAPI, Docker                |
| **Business Intelligence**| Tableau, Power BI                     |
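
To show how two rows of this table work together, here is a minimal sketch of running a SQL aggregation from Python. It uses SQLite through Python's built-in `sqlite3` module as a stand-in for a server database such as PostgreSQL or MySQL, and the `orders` table and its values are invented for illustration.

```python
import sqlite3

# An in-memory SQLite database, so nothing needs to be installed or cleaned up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")

# Hypothetical rows (assumption): a handful of orders by sales region.
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [
        ("north", 120.0),
        ("south", 80.0),
        ("north", 60.0),
        ("south", 40.0),
        ("east", 100.0),
    ],
)

# A typical analyst query: order count and revenue per region, largest first.
rows = conn.execute(
    "SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY revenue DESC"
).fetchall()
conn.close()

for region, n_orders, revenue in rows:
    print(f"{region}: {n_orders} orders, revenue {revenue:.2f}")
```

The same `GROUP BY` pattern carries over to other SQL databases, though connection setup and some syntax details differ between systems.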

The key idea is that **data science is not a single skill** but a synthesis of several. Students enter the field from computer science, statistics, economics, biology, psychology, and many other areas because it rewards curiosity, adaptability, and the ability to connect data to decisions.

------------------------
Finally, **job titles in data science are not standardized**. The responsibilities of a “Data Scientist” at a research lab can look very different from those of the same title at a startup, a healthcare company, a financial firm, or a government agency. Likewise, some companies use titles such as “Machine Learning Engineer” or “Decision Scientist,” while others bundle those tasks into “Data Scientist” or “BI Analyst.” Understanding the actual workflow and expectations of a role is often more important than the title itself.

------------------------



## Knowledge Check

<iframe src="https://docs.google.com/forms/d/e/1FAIpQLSe8OIJUMSoe4P0DnpdUngqsVGqknqTwX_6fX1FT3aVGbgWwUQ/viewform?embedded=true" width="640" height="1078" frameborder="0" marginheight="0" marginwidth="0">Loading…</iframe>
