# Database A database is an organized collection of structured data, typically stored electronically. Databases are managed by [[Database Management Systems (DBMS)]] that provide mechanisms for storing, retrieving, updating, and managing data. Databases are fundamental to virtually all modern applications, from simple mobile apps to complex enterprise systems. ## Types of Databases ### Relational Databases (RDBMS) [[Relational Databases (RDBMS)]] store data in tables with rows and columns, using [[SQL]] for queries. Tables can be linked through relationships (foreign keys). Examples: - [[PostgreSQL]] - [[SQLite]] - MySQL - Microsoft SQL Server - Oracle Database ### NoSQL Databases Non-relational databases designed for specific data models and flexible schemas. **Document stores**: Store data as JSON-like documents - MongoDB - CouchDB **Key-value stores**: Simple key-value pairs, extremely fast - Redis - DynamoDB **Wide-column stores**: Store data in column families - Cassandra - HBase **Graph databases**: Store nodes and relationships - Neo4j - Amazon Neptune ### NewSQL Databases Combine relational model with NoSQL scalability: - CockroachDB - TiDB - Spanner ### Specialized Databases - **Time-series**: InfluxDB, TimescaleDB - **Search engines**: Elasticsearch, Meilisearch - **Vector databases**: Pinecone, Weaviate, pgvector - **Embedded**: [[SQLite]], LevelDB - **Embedded analytical (OLAP)**: [[DuckDB]] - **Versioned (Git-style)**: [[Dolt]] ## Key Concepts ### ACID Properties See [[ACID]] for the full treatment. - **[[Atomicity]]**: Transactions are all-or-nothing - **Consistency**: Data remains valid after transactions - **Isolation**: Concurrent transactions don't interfere - **Durability**: Committed data persists; usually via [[Write-Ahead Logging (WAL)]] ### Normalization Process of organizing data to reduce redundancy: - 1NF: Eliminate repeating groups - 2NF: Remove partial dependencies - 3NF: Remove transitive dependencies ### Indexing Data structures that speed up queries at the cost of storage and write performance. Common types: - B-tree (default for most databases) - Hash indexes - GIN/GiST (for full-text, JSON) - Bitmap indexes ### Transactions Group of operations treated as a single unit: ```sql BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = 1; UPDATE accounts SET balance = balance + 100 WHERE id = 2; COMMIT; ``` ## Choosing a Database | Need | Consider | |------|----------| | General purpose web app | [[PostgreSQL]] | | Simple local storage | [[SQLite]] | | Caching, sessions | Redis | | Document flexibility | MongoDB | | Graph relationships | Neo4j | | Time-series data | TimescaleDB, InfluxDB | | Full-text search | Elasticsearch, Meilisearch | | AI/ML embeddings | Pinecone, pgvector | ## References - https://en.wikipedia.org/wiki/Database ## Related - [[Database Management Systems (DBMS)]] - [[Relational Databases (RDBMS)]] - [[SQL]] - [[SQLite]] - [[PostgreSQL]] - [[DuckDB]] - [[Dolt]] - [[Litestream]] - [[ACID]] - [[Atomicity]] - [[Write-Ahead Logging (WAL)]] - [[Vector Store]]