SQL Và Một Số Câu Hỏi Thường Gặp

Oct 10, 2024

Table of Contents

1. INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN in SQL

INNER JOIN: Returns records that have matching values in both tables.
- Use case: When you only want the rows that have matching entries in both tables.
LEFT JOIN (or LEFT OUTER JOIN): Returns all records from the left table and the matched records from the right table. Unmatched rows from the right table will have NULL values.
- Use case: When you want all records from the left table, regardless of whether there’s a match in the right table.
RIGHT JOIN (or RIGHT OUTER JOIN): Returns all records from the right table and the matched records from the left table. Unmatched rows from the left table will have NULL values.
- Use case: When you want all records from the right table, regardless of whether there’s a match in the left table.
FULL OUTER JOIN: Returns all records when there is a match in either table. Rows without matches from either table will contain NULL values.
- Use case: When you want to retrieve all rows from both tables, with or without matches

2. Window Functions in SQL

Window functions perform calculations across a set of table rows that are related to the current row, but do not result in data being grouped into fewer rows, unlike aggregate functions. They allow you to compute things like running totals, ranks, or moving averages.

Example: ROW_NUMBER(), RANK(), SUM() OVER()

Use case: Window functions are useful in scenarios where you need to rank, aggregate, or perform operations on rows while retaining individual row data (e.g., when computing ranks, running totals, or partitioned aggregates).

3. Handling NULL Values in SQL

NULL in JOINs: NULLs are ignored in equality comparisons (=), which can result in missing matches. For instance, INNER JOIN won’t return rows where the join column contains NULL values.
NULL in Aggregations: Functions like COUNT(), SUM(), AVG() ignore NULLs except for COUNT(*), which counts all rows, including those with NULL values.
Implications: In JOINs, NULLs can result in missing rows, and in aggregations, NULL values can skew results if not handled correctly.

4. Indexing in SQL Databases

An index is a data structure that improves the speed of data retrieval operations on a table by allowing the database to find rows faster without scanning the entire table.

How Indexes Improve Performance: Indexes work like a table of contents in a book, pointing to data more quickly than searching every page. Common types include B-trees and hash indexes.
Trade-offs:
- Indexes speed up SELECT queries but slow down INSERT, UPDATE, and DELETE operations because the index must also be updated.
- Indexes take up additional storage space.

5. SQL Query Optimization Techniques

Use Indexes: Ensure that relevant columns in WHERE clauses and joins are indexed.
**Avoid SELECT ***: Fetch only the necessary columns.
Proper Joins: Use the right type of join for the task (INNER JOIN, LEFT JOIN, etc.).
Optimize Subqueries: Consider using JOIN or CTE instead of deeply nested subqueries.
Use Query Execution Plan: Use EXPLAIN to analyze how the database is executing the query and optimize based on insights.
Limit Rows Returned: Use LIMIT or OFFSET for large result sets when appropriate.

6. Normalization in SQL Databases

Normalization is the process of organizing data to minimize redundancy. The goal is to break down large tables into smaller, related tables.

1st Normal Form (1NF): Eliminate repeating groups; ensure each column contains atomic values.
2nd Normal Form (2NF): Ensure the table is in 1NF and all non-key attributes are fully dependent on the primary key.
3rd Normal Form (3NF): Ensure the table is in 2NF and all attributes are directly dependent on the primary key, not on other non-key attributes.

Impact: Normalization reduces data duplication but may require more complex queries with joins, which can affect performance.

7. Common Table Expressions (CTEs) in SQL

A CTE is a temporary result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. It’s defined using the WITH clause.

CTE vs Subquery:

CTE: Improves readability and is reusable in the same query.
Subquery: Usually embedded within other queries but is less readable and cannot be reused.

8. SQL vs. NoSQL Databases

SQL (Relational):
- Structured schema: Uses tables with predefined schemas.
- ACID compliance: Strong consistency, isolation, and transactions.
- Use case: When data relationships are well-defined and consistency is critical (e.g., financial data).
NoSQL (Non-Relational):
- Flexible schema: Supports key-value, document, wide-column, or graph models.
- Scalable and fast: Better for unstructured, distributed, or high-velocity data.
- Use case: When flexibility, scalability, and rapid data access are more important than strict consistency (e.g., social media, big data).

9. ACID Properties in SQL Databases

Atomicity: Ensures that each transaction is treated as a single unit, either fully completing or fully failing.
Consistency: Guarantees that a transaction takes the database from one valid state to another.
Isolation: Ensures that transactions operate independently without interference.
Durability: Once a transaction is committed, it remains so, even in case of a system failure.

These properties are critical for maintaining transactional integrity in SQL databases.

10. Stored Procedures in SQL

Stored procedures are precompiled SQL code that can be executed by the database engine. They allow you to encapsulate business logic and database operations in a reusable and centralized manner.

Advantages:

Performance: Precompiled, which can reduce execution time.
Security: Access to data can be controlled more tightly.
Reusability: Encapsulate logic that can be reused across applications.

Drawbacks:

Complexity: Can make application logic harder to maintain if too much is encapsulated in stored procedures.
Portability: Stored procedures are often specific to the database vendor (e.g., SQL Server, MySQL).

You need to login in order to like this post: click here