WHERE Clause
The WHERE clause in PostgreSQL is used to filter rows in a query based on specified conditions, enabling the retrieval of just the data that meets your needs.
It's a powerful filtering tool with a wide range of options, allowing users to combine conditions, apply logical operators, and format outputs to meet complex query requirements.
Filtering based on a single condition
SELECT product_name, price, unit
FROM sales.products
WHERE price > 50;
This query retrieves products with a price greater than 50.
Filtering based on multiple conditions (AND, OR)
SELECT customer_name, country, city, NOW() AS query_time
FROM sales.customers
WHERE country = 'Germany'
AND city = 'Berlin';
This query retrieves customers from Germany who reside in Berlin and projects a calculated column which is given the name query_time, and its value is produced by the intrinsic function NOW().
Intrinsic functions are built-in functions provided by the database system to perform specific operations, such as mathematical calculations, string manipulation, or retrieving system-generated values. In this example, NOW() is used to fetch the exact date and time the query was executed, enhancing insights into when the data was retrieved. Some other examples of intrinsic functions are CURRENT_DATE, which returns the current date, and LENGTH(), which calculates the length of a given string.
SELECT product_name, category_id, CAST(price AS VARCHAR) as price_text
FROM sales.products
WHERE category_id = 1
OR category_id = 8;
This query selects products belonging to either category 1 (Beverages) or category 8 (Seafood). Additionally, the CAST function is used to convert the price column into text format, presenting it as price_text in the results.
Filtering based on multiple conditions (IN, NOT IN)
SELECT product_name, category_id, CAST(price as VARCHAR) as price_text
FROM sales.products
WHERE category_id IN (1,8);
This query produces the same result as the previous query, but it uses the IN clause instead of OR. The IN clause filters records by checking if a column matches any value within a specified list.
SELECT customer_name, country
FROM sales.customers
WHERE country NOT IN ('USA');
This query finds customers not based in USA.
SELECT order_id, order_date, EXTRACT(MONTH FROM order_date) AS order_month
FROM sales.orders
WHERE EXTRACT(MONTH FROM order_date) = 1;
This query retrieves orders made in January by using the EXTRACT function to isolate the month from the order_date column and display it as order_month.
Using LIKE for pattern matching
SELECT customer_name, address
FROM sales.customers
WHERE customer_name LIKE 'A%';
This query retrieves customers whose names start with 'A'. The symbol % in the string literal is a wildcard used by the LIKE operator for pattern matching. It represents zero or more characters, allowing the query to match any customer_name that begins with the letter 'A' followed by any sequence of characters. For instance, names like "Alice" or "Andrew" would be included in the query results.
SELECT product_id, product_name
FROM sales.products
WHERE product_name LIKE '%Chocolate%';
This query retrieves products whose names contain the word "Chocolate". When placed before and after the word "Chocolate" in the query, it allows for matches where "Chocolate" appears anywhere in the product name—whether at the beginning, middle, or end. For instance, product names such as "Teatime Chocolate Biscuits," or "Chocolate" would be included in the query results.
Using IN and NOT IN for multiple values
SELECT customer_name AS full_name
FROM sales.customers
WHERE country IN ('Germany', 'France', 'UK');
This query retrieves a list of customers whose country is specified as either Germany, France, or the UK. By using the SQL operator IN, the query checks the country column for matches against the specified set of values ('Germany', 'France', 'UK'). If the value in the country column corresponds to any of these three options, the corresponding customer names are returned and displayed under the alias full_name.
SELECT *
FROM sales.products
WHERE category_id NOT IN (1, 3);
This query retrieves products that aren't classified under categories 1 or 3.
Using BETWEEN for range values
SELECT *
FROM sales.products
WHERE price BETWEEN 20 and 40;
This query retrieves products priced between 20 and 40 currency units, inclusive.
The BETWEEN statement in SQL is used to filter the data within a specified range of values. It's inclusive, meaning it includes the boundary values given in the range. This is useful for narrowing down datasets based on numerical, date, or even text values (if applicable). It provides a concise and readable way to define a range, as opposed to using multiple conditional operators like >= and <=.
Using IS NULL and IS NOT NULL for null values
SELECT *
FROM sales.customers
WHERE postal_code IS NULL;
This query retrieves customers with a missing postal_code.
In SQL, NULL represents missing or undefined data in a database.
It isn't equivalent to zero, an empty string, or any other type of value, but rather indicates the absence of a value.
To handle NULL, you can use filters such as IS NULL to identify records with missing values or IS NOT NULL to find entries with determined data.
These filters are useful in cleansing data, as they help segregate records for correction, exclusion, or specialized treatment.