What is Window function ?

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row β€” the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result. [1]

Let’s take an example of a products table, this table contains a number of products organised into brands

If we wanted to compare each keyboards price to the average price of thier brand, we can achive this using window function

SELECT * , AVG(price) OVER (PARTITION BY brand) FROM products

A window function call always contains an OVER clause directly following the window function’s name and argument(s). This is what syntactically distinguishes it from a regular function or aggregate function. The OVER clause determines exactly how the rows of the query are split up for processing by the window function. The PARTITION BY list within OVER specifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row.

If we do not provide any partition clause the calculation will be executed within a single partition containing all of the rows in the table

SELECT *, AVG(price) OVER () FROM products

Window Functions also support an order by clause. The order by function changes the behaviour of window frame. When order by clause is not present the default windows consists of all rows in current partition.

When order by clause is added the window consists of all rows from start of current partition upto current row (plus all row equal to current row according to specified order )

Here we can see in the first four rows, the second, third and fourth row have the same sum and not the sum of first four rows, this is because they share the same price, same behaviour wherever prices match.

Order by clause can be used in conjunction with Partition clause by. Note that partition by clause should always precede the order by clause

SELECT *, SUM(price) OVER (PARTITION BY brand ORDER BY price) from products

Window functions are executed against the virtual table produced by the queries from clause and it’s filtering clauses WHERE, HAVING and GROUP BY hence window functions cannot be part of any of those logics in those clauses and only permitted in select list and order by clause

Let’s rank products according to brand_rank

SELECT *, rank() OVER (PARTITION BY brand ORDER BY price) AS brand_rank from products 

The output is ordered by brand and then their brand ranking. We can order by ranking by passing ORDER BY clause at the end of the query

SELECT *, rank() OVER (PARTITION BY brand ORDER BY price) AS brand_rank from products ORDER by brand_rank

But if we pass Where Clause postgres returns an error

SELECT *, rank() OVER (PARTITION BY brand ORDER BY price) AS brand_rank from products WHERE brand_rank<=2  ORDER by brand_rank 

Windows function executes after where Clause hence the window function is not yet defined. To use WHERE BY clause we can achived this by using window function in sub select

SELECT * FROM (SELECT *, rank() OVER (PARTITION BY brand ORDER BY price) AS brand_rank from products) as ranked_products WHERE brand_rank <=3 ORDER BY brand_rank

References

[1] https://www.postgresql.org/docs/9.1/tutorial-window.html