Top 10 Database Schema Design Best Practices

A well-designed database schema is the foundation of any robust application. Following below best practices will help you create maintainable, efficient, and reliable database structure.

The snippet uses PostgreSQL, while it's also applicable to other databases.

1. Use Appropriate Primary Keys#

Every table should have a primary key that uniquely identifies each row. While natural keys (using existing business data) can work in some cases, surrogate keys (artificial identifiers) often provide more flexibility.

-- Using a serial or identity column (PostgreSQL 10+)
CREATE TABLE customers (
    customer_id BIGSERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(100) NOT NULL
);

-- Using UUID (often better for distributed systems)
CREATE TABLE products (
    product_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10, 2) NOT NULL
);

To learn more, check out How to Choose between UUID and Auto Increment Integer.

2. Implement Proper Normalization#

Normalization reduces data redundancy and improves data integrity. Aim for Third Normal Form (3NF) in most cases, but consider the tradeoffs carefully.

Normalization Tradeoffs#

Advantages of 3NF	Disadvantages of 3NF
Minimizes data redundancy and storage requirements	Can lead to complex joins across multiple tables
Reduces update anomalies and inconsistencies	May reduce query performance for read-heavy applications
Simplifies data management and integrity	Increases the complexity of SQL for reporting and analytics
Better accommodates future data changes	Can add development overhead for simple applications
Improves data quality and consistency	May require more complex indexing strategies

-- Instead of this (violates normalization)
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_name VARCHAR(100) NOT NULL,
    customer_email VARCHAR(255) NOT NULL,
    product_name VARCHAR(100) NOT NULL,
    product_price DECIMAL(10, 2) NOT NULL,
    quantity INT NOT NULL,
    total_price DECIMAL(10, 2) NOT NULL
);

-- Use this normalized approach
CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10, 2) NOT NULL
);

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL REFERENCES customers(customer_id),
    order_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE order_items (
    order_item_id SERIAL PRIMARY KEY,
    order_id INT NOT NULL REFERENCES orders(order_id),
    product_id INT NOT NULL REFERENCES products(product_id),
    quantity INT NOT NULL CHECK (quantity > 0),
    price_at_time_of_order DECIMAL(10, 2) NOT NULL,
    UNIQUE(order_id, product_id)
);

3. Define Foreign Key Relationships#

Foreign keys enforce referential integrity and make relationships between tables explicit, preventing orphaned records. However, there are important tradeoffs to consider when deciding whether to implement them.

Advantages of Foreign Keys	Disadvantages of Foreign Keys
Enforces data integrity at the database level	Adds overhead to write operations (inserts/updates/deletes)
Makes relationships explicit in the schema	Can complicate bulk data loading processes
Prevents orphaned records automatically	May create performance bottlenecks in high-write scenarios
Self-documents schema relationships	Reduces flexibility in distributed database environments
Enables cascade operations for updates/deletes	Can complicate application development in microservices architectures
Provides guaranteed consistency	May require additional index maintenance

CREATE TABLE departments (
    department_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department_id INT NOT NULL,
    CONSTRAINT fk_department
        FOREIGN KEY (department_id)
        REFERENCES departments(department_id)
        ON DELETE RESTRICT -- Prevent deletion of departments with employees
        ON UPDATE CASCADE  -- If department_id changes, update it for all employees
);

4. Create Indexes#

Indexes dramatically improve query performance but come with storage and write overhead. Create indexes for columns frequently used in WHERE clauses, JOIN conditions, and ORDER BY clauses.

-- Index for frequently filtered columns
CREATE INDEX idx_employees_department ON employees(department_id);

-- Composite index for columns often queried together
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);

-- Index for LIKE queries (only works for patterns starting with a constant)
CREATE INDEX idx_products_name ON products(name varchar_pattern_ops);

-- Partial index for specific query conditions
CREATE INDEX idx_orders_recent ON orders(order_date)
WHERE order_date > CURRENT_DATE - INTERVAL '3 months';

5. Use Consistent Naming Conventions#

Adopt a consistent naming convention for all database objects. This makes your schema more intuitive and maintainable.

-- Table names: plural nouns
-- Column names: singular nouns, prefixed with table name for keys
-- Constraints: descriptive of their purpose
-- Indexes: prefix with idx_ followed by table and columns

CREATE TABLE payment_methods (
    payment_method_id SERIAL PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    active BOOLEAN NOT NULL DEFAULT TRUE,
    CONSTRAINT uq_payment_methods_name UNIQUE (name)
);

CREATE TABLE payments (
    payment_id SERIAL PRIMARY KEY,
    order_id INT NOT NULL,
    payment_method_id INT NOT NULL,
    amount DECIMAL(10, 2) NOT NULL,
    payment_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT fk_payments_orders FOREIGN KEY (order_id) REFERENCES orders(order_id),
    CONSTRAINT fk_payments_payment_methods FOREIGN KEY (payment_method_id) REFERENCES payment_methods(payment_method_id),
    CONSTRAINT ck_payments_positive_amount CHECK (amount > 0)
);

CREATE INDEX idx_payments_order_id ON payments(order_id);

6. Choose Appropriate Data Types#

Selecting the right data type improves storage efficiency, performance, and data integrity.

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,            -- Variable length, limited size
    description TEXT,                      -- Unlimited length text
    price DECIMAL(10, 2) NOT NULL,         -- Exact decimal values (for money)
    weight REAL,                           -- Floating point for measurements
    is_available BOOLEAN DEFAULT TRUE,     -- Boolean flag
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP, -- Time with timezone
    tags VARCHAR(50)[] DEFAULT '{}',       -- Array type
    metadata JSONB                         -- JSON data with binary storage
);

7. Implement Constraints and Validation Rules#

Use constraints to enforce data integrity at the database level.

CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL,
    date_of_birth DATE NOT NULL,
    status VARCHAR(20) NOT NULL,

    -- Constraints
    CONSTRAINT uq_users_username UNIQUE (username),
    CONSTRAINT uq_users_email UNIQUE (email),
    CONSTRAINT ck_users_email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'),
    CONSTRAINT ck_users_minimum_age CHECK (date_of_birth <= CURRENT_DATE - INTERVAL '13 years'),
    CONSTRAINT ck_users_status CHECK (status IN ('active', 'inactive', 'suspended', 'pending'))
);

8. Plan for Schema Evolution and Versioning#

Design your schema with future changes in mind. Use techniques that allow for seamless evolution.

-- For additive changes (easier to deploy)
ALTER TABLE users ADD COLUMN phone_number VARCHAR(20);

-- To make a column nullable first (when removing NOT NULL)
ALTER TABLE users ALTER COLUMN phone_number DROP NOT NULL;

-- Using a schema version table to track changes
CREATE TABLE schema_versions (
    version_id SERIAL PRIMARY KEY,
    version VARCHAR(50) NOT NULL,
    applied_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    description TEXT,
    script_name VARCHAR(255)
);

9. Document Your Schema#

Good documentation makes your database easier to understand and maintain. Use comments in your DDL scripts and maintain separate documentation.

-- Table comments
COMMENT ON TABLE customers IS 'Stores customer information for all registered users';
COMMENT ON COLUMN customers.email IS 'Primary contact email - must be unique';

10. Implement Audit Trails#

Track changes to important data using audit tables.

-- Basic audit table approach
CREATE TABLE customer_audit (
    audit_id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    action VARCHAR(10) NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
    changed_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
    changed_by VARCHAR(50) NOT NULL,
    old_data JSONB,
    new_data JSONB
);

-- Create a trigger function for auditing
CREATE OR REPLACE FUNCTION fn_audit_customer()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO customer_audit(customer_id, action, changed_by, new_data)
        VALUES (NEW.customer_id, TG_OP, current_user, to_jsonb(NEW));
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO customer_audit(customer_id, action, changed_by, old_data, new_data)
        VALUES (NEW.customer_id, TG_OP, current_user, to_jsonb(OLD), to_jsonb(NEW));
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO customer_audit(customer_id, action, changed_by, old_data)
        VALUES (OLD.customer_id, TG_OP, current_user, to_jsonb(OLD));
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Attach the trigger to the customers table
CREATE TRIGGER trg_customer_audit
AFTER INSERT OR UPDATE OR DELETE ON customers
FOR EACH ROW EXECUTE FUNCTION fn_audit_customer();

Bytebase is a database schema change and version control tool that can enhance your ability to implement the best practices outlined above:

Collaborative Review Process: Implements GitLab/GitHub-like review workflows for database changes
Multi-Environment Synchronization: Ensures consistent schema across development, testing, and production
SQL Lint: Catches potential issues in your schema design before they get deployed
Change History: Maintains a comprehensive history of all schema changes