# Data Science
Data Science is an interdisciplinary field that uses scientific methods, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It sits at the intersection of mathematics and statistics, computer science, and domain expertise.
The field combines techniques from [[Machine Learning (ML)]], statistical analysis, and [[Data Visualization]] to turn raw data into actionable understanding. While ML focuses on building predictive models, Data Science encompasses the entire pipeline: from question formulation and data collection to analysis, modeling, and communication of results.
## Core Components
| Component | Purpose |
|-----------|---------|
| **Data Collection** | Gathering raw data from databases, APIs, sensors, and web scraping |
| **Data Cleaning** | Handling missing values, outliers, inconsistencies |
| **Exploratory Data Analysis** | Understanding distributions, correlations, patterns |
| **Feature Engineering** | Creating informative variables from raw data |
| **Modeling** | Applying [[Machine Learning (ML)]] and statistical models |
| **[[Data Visualization]]** | Communicating findings through charts and dashboards |
| **Deployment** | Putting models into production systems |
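
The cleaning, feature engineering, and modeling steps in the table can be sketched end to end on a toy dataset. This is a minimal illustration, not a recommended toolchain: the records, the field names (`area`, `rooms`, `price`), and the helper functions are all hypothetical, and it uses only the Python standard library so it stays self-contained. In practice these steps would more likely be done with libraries such as pandas.

```python
from statistics import mean, median

# Hypothetical raw records; None marks a missing value.
raw = [
    {"area": 50.0, "rooms": 2, "price": 150.0},
    {"area": 80.0, "rooms": 3, "price": 240.0},
    {"area": None, "rooms": 2, "price": 180.0},
    {"area": 120.0, "rooms": 4, "price": 360.0},
    {"area": 65.0, "rooms": 2, "price": 195.0},
]


def clean(records):
    """Data cleaning: fill a missing 'area' with the median of observed areas."""
    observed = [r["area"] for r in records if r["area"] is not None]
    fill = median(observed)
    return [{**r, "area": fill if r["area"] is None else r["area"]} for r in records]


def add_features(records):
    """Feature engineering: derive area per room from two raw fields."""
    return [{**r, "area_per_room": r["area"] / r["rooms"]} for r in records]


def fit_line(xs, ys):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


rows = add_features(clean(raw))
areas = [r["area"] for r in rows]
prices = [r["price"] for r in rows]
slope, intercept = fit_line(areas, prices)


def predict(area):
    """Predict a price from an area using the fitted line."""
    return slope * area + intercept
```

Median imputation is only one of several ways to handle missing values; dropping incomplete rows or using a model-based fill are common alternatives, and the right choice depends on why the data is missing.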
## Key Skills
- Programming: [[Python]], R, SQL
- Statistics: hypothesis testing, regression, Bayesian methods
- [[Machine Learning (ML)]]: [[Supervised Learning (SL)]], [[Unsupervised Learning]], [[Deep Learning]]
- Data wrangling: pandas, dplyr, Spark
- Visualization: matplotlib, seaborn, D3.js, Tableau
- Communication: translating technical findings for stakeholders
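
As one concrete instance of the hypothesis testing skill listed above, the sketch below runs a two-sided permutation test for a difference in means. The function name `permutation_test`, its parameters, and the sample values are illustrative assumptions, and the permutation test is just one of many possible tests, chosen here because it needs nothing beyond the standard library.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    Under the null hypothesis the group labels are exchangeable, so we
    shuffle the pooled values and count how often the shuffled difference
    is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 5.4]
group_b = [5.8, 6.0, 5.7, 6.1, 5.9, 6.2]
p_value = permutation_test(group_a, group_b)
```

A small `p_value` suggests the observed difference would be unusual if both groups came from the same distribution; it does not by itself say how large or how important the difference is.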
## Data Science vs Related Fields
| Field | Focus |
|-------|-------|
| **Data Science** | End-to-end insight extraction and decision support |
| **[[Machine Learning (ML)]]** | Building predictive/generative models |
| **Data Engineering** | Building pipelines and infrastructure for data flow |
| **Data Analytics** | Descriptive and diagnostic analysis of business data |
| **[[Artificial Intelligence (AI)]]** | Building systems that exhibit intelligent behavior |
## References
- https://en.wikipedia.org/wiki/Data_science
## Related
- [[Machine Learning (ML)]]
- [[Deep Learning]]
- [[Artificial Intelligence (AI)]]
- [[Data Visualization]]
- [[Python]]
- [[Supervised Learning (SL)]]
- [[Unsupervised Learning]]