Nội dung

DaiPhan

DaiPhan

Full-Stack Developer

Full-stack developer passionate about modern web technologies, best practices, and sharing knowledge with the community.

Skills & Expertise

JavaScript TypeScript React Node.js DevOps
150+
Articles
50k+
Readers
4.9
Rating

Database Design: Nguyên Tắc Thiết Kế Cơ Sở Dữ Liệu

Hướng dẫn thiết kế database hiệu quả với SQL và NoSQL

Database Design: Nguyên Tắc Thiết Kế Cơ Sở Dữ Liệu

Thiết kế database là kỹ năng quan trọng cho mọi developer. Một database tốt sẽ ảnh hưởng trực tiếp đến performance và scalability của ứng dụng.

SQL vs NoSQL

SQL (Relational Databases)

  • Ưu điểm: ACID compliance, complex queries, data integrity
  • Dùng cho: Financial systems, e-commerce, applications requiring transactions
  • Ví dụ: PostgreSQL, MySQL, SQL Server

NoSQL (Non-relational Databases)

  • Ưu điểm: Flexible schema, horizontal scaling, high performance
  • Dùng cho: Real-time applications, big data, content management
  • Ví dụ: MongoDB, Redis, Cassandra

Database Design Principles

1. Normalization

Chia nhỏ dữ liệu để tránh redundancy:

-- Bad design (redundant data)
CREATE TABLE orders (
  order_id INT,
  customer_name VARCHAR(100),
  customer_email VARCHAR(100),
  product_name VARCHAR(100),
  price DECIMAL(10,2)
);

-- Good design (normalized)
CREATE TABLE customers (
  customer_id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100)
);

CREATE TABLE products (
  product_id INT PRIMARY KEY,
  name VARCHAR(100),
  price DECIMAL(10,2)
);

CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  customer_id INT,
  product_id INT,
  FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
  FOREIGN KEY (product_id) REFERENCES products(product_id)
);

2. Indexing Strategy

-- Index for frequent queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_products_category ON products(category_id);

-- Composite index for multi-column queries
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);

3. Relationship Types

One-to-Many

-- User has many posts
CREATE TABLE users (
  id INT PRIMARY KEY,
  username VARCHAR(50)
);

CREATE TABLE posts (
  id INT PRIMARY KEY,
  user_id INT,
  title VARCHAR(200),
  FOREIGN KEY (user_id) REFERENCES users(id)
);

Many-to-Many

-- Students and courses
CREATE TABLE students (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE courses (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE student_courses (
  student_id INT,
  course_id INT,
  enrollment_date DATE,
  PRIMARY KEY (student_id, course_id),
  FOREIGN KEY (student_id) REFERENCES students(id),
  FOREIGN KEY (course_id) REFERENCES courses(id)
);

MongoDB Schema Design

// User document with embedded posts
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  username: "john_doe",
  email: "john@example.com",
  posts: [
    {
      title: "My First Post",
      content: "Hello world!",
      created_at: ISODate("2024-01-01")
    }
  ]
}

// Separate collections for better scalability
// users collection
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  username: "john_doe",
  email: "john@example.com"
}

// posts collection
{
  _id: ObjectId("507f1f77bcf86cd799439012"),
  user_id: ObjectId("507f1f77bcf86cd799439011"),
  title: "My First Post",
  content: "Hello world!",
  created_at: ISODate("2024-01-01")
}

Performance Optimization

Query Optimization

-- Bad: SELECT * (select all columns)
SELECT * FROM users WHERE email = 'user@example.com';

-- Good: Select only needed columns
SELECT id, username FROM users WHERE email = 'user@example.com';

-- Use EXPLAIN to analyze queries
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';

Connection Pooling

// Node.js with PostgreSQL
const { Pool } = require('pg');

const pool = new Pool({
  user: 'dbuser',
  host: 'localhost',
  database: 'mydb',
  password: 'password',
  port: 5432,
  max: 20, // maximum number of clients in the pool
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

Kết luận

Thiết kế database tốt đòi hỏi hiểu biết về business logic, performance requirements, và scalability needs. Luôn plan ahead và consider trade-offs giữa normalization và performance.

Bài viết này là phần đầu tiên trong series về database design. Trong các bài tiếp theo, chúng ta sẽ explore advanced topics như sharding và replication.