UNION and UNION ALL in PostgreSQL



UNION Operator in Postgres

The UNION operator is responsible for combining result sets of two or more than two SELECT statements into a single result set. It also removes all the duplicate rows.

Syntax of UNION Operator


SELECT Col_1, Col_2
FROM Tbl_1
UNION
SLECT Col_1, Col_2
FROM Tbl_2
UNION
SELECT Col_1, Col_2
FROM Tbl_3


When using the UNION operator in PostgreSQL, we have to follow some rules.

1- All Queries (Which SELECT statements result set wan to combine) should return the same number of columns


2- The corresponding columns in the queries must have compatible data types (For example Col_1 of Tbl_1 and Col_1 of Tbl_2 should be compatible).

UNION Operator in Postgres



The UNION ALL operator is also responsible for combining the result sets of two or more than two SELECT statements into a single result set but it does not remove the duplicate row like UNION operator.

Syntax of UNION ALL operator



SELECT Col_1, Col_2
  FROM Tbl_1
  UNION ALL
SLECT Col_1, Col_2
  FROM Tbl_2
  UNION ALL
SELECT Col_1, Col_2
  FROM Tbl_3



This has the same rule of UNION operator like all queries should return the same number of columns and corresponding columns in the queries must have compatible data types.


For the example, I am creating an EMPLOYEE table which containing some columns and records such as First Name, Last Name, User Name, Password, contact, salary, Date of Joining, Manager ID, and Age.


CREATE TABLE employee
(
  empid serial not null,
  fname text,
  lname text,
  username text,
  password text,
  contact text,
  salary numeric,
  doj date,
  managerid integer,
  age integer,
  constraint pk_employee_empid Primary key(empid)
);

-- Inserting some data into EMPLOYEE table

INSERT INTO employee (fname, lname, username, password, contact, salary, managerid, age)
  VALUES('Naina','Singh', 'naina@gmail.com','test1234','9012234439',120000,0,40);

INSERT INTO employee (fname, lname, username, password, contact, salary, managerid, age)
  VALUES('Ashish','Singh', 'ashish@gmail.com','test1234','9887778876',90000,1,35);

INSERT INTO employee (fname, lname, username, password, contact, salary, managerid, age)
  VALUES('Rahul','Jain', 'rahul@gmail.com','test1234','4556676643',70000,1,30);

INSERT INTO employee (fname, lname, username, password, contact, salary, managerid, age)

  VALUES('Ravish','Jha', 'ravish@gmail.com','test1234','9897765565',40000,2,25);


UNION operator example



SELECT fname, lname, managerid
  FROM public.employee WHERE fname LIKE '%Ra%'
UNION
SELECT fname, lname, managerid
  FROM public.employee WHERE salary > 100000;               


Result:



Now we will check, it removes the duplicate records or not.


I am changing the second SELECT statement (where clause) so that it returns duplicate records that already exist in the first SELECT statement result set.


SELECT fname, lname, managerid
  FROM public.employee WHERE fname LIKE '%Ra%'
  UNION
SELECT fname, lname, managerid

  FROM public.employee WHERE salary > 30000;


Result:




There are two records are duplicate (Rahul and Ravish) but UNION operator removed the duplicate records.

UNION ALL operator examples:



If I execute the above records with UNION ALL operator, its returns the duplicate records.


SELECT fname, lname, managerid
  FROM public.employee WHERE fname LIKE '%Ra%'
UNION ALL
SELECT fname, lname, managerid

  FROM public.employee WHERE salary > 40000;


Result:


EXPLAIN ANALYZE of UNION and UNION ALL operator



EXPLAIN ANALYZE SELECT fname, lname, managerid
  FROM public.employee WHERE fname LIKE '%Ra%'
UNION
SELECT fname, lname, managerid

  FROM public.employee WHERE salary > 40000;


Result:




EXPLAIN ANALYZE SELECT fname, lname, managerid
  FROM public.employee WHERE fname LIKE '%Ra%'
UNION ALL
SELECT fname, lname, managerid

  FROM public.employee WHERE salary > 40000;


Result:




If we do the analysis of the above queries (Union and UNION ALL)

We can see UNION ALL performance is better than UNION operator, For the same query execution time is not same, UNION operator execution time is .084 ms (millisecond) while UNION ALL operator execution time is .044 ms which half almost. So use these operators carefully in your query.

JOINS in PostgreSQL


As per the requirement of the project data, normalization, and table relationships are two factors that come to our mind when creating the database schema, it means we minimize the redundancy from a relation or set of relations, and for the referential integrity between them, we draw the relationship. For the user-friendly application's interface when we obtain the records from the database sometimes we need to combine the column from one (self-join) or more tables based on the values of the common columns between the tables. Using PostgreSQL JOIN we combine the column from two or more tables, based on a related columns between them.

JOINs available in PostgreSQL are:-

1- INNER JOIN                  
2- LEFT JOIN
3- RIGHT JOIN
4- FULL OUTER JOIN
5- CROSS JOIN
6- NATURAL JOIN
7- SELF-JOIN


To understand the concept of the JOIN I am creating the following table and make a relationship between them.


-- Employee table
CREATE TABLE public.employee
(
    empid serial NOT NULL,-- primary key
    fname text,
    lname text,
    username text,
    password text,
    contact text,
    salary numeric NOT NULL DEFAULT 0.0,
    doj date NOT NULL,
    managerid bigint,
    age integer,
    CONSTRAINT employee_primarykey PRIMARY KEY (empid),
    CONSTRAINT "uk_UserName" UNIQUE (username),
    CONSTRAINT chk_employee_age CHECK (age > 18)

);



-- Address of employee

CREATE TABLE public.address
(
    id serial NOT NULL,
    empid integer NOT NULL,-- reference of employee
    address1 text,
    address2 text,
    city text,
    state text,
    country text,
    CONSTRAINT pk_address_id PRIMARY KEY (id)

);


1- INNER JOIN


INNER JOIN is used massively in PostgreSQL query statements, INNER JOIN only displays the rows that match the two joined tables on a particular column. To understand the INNER JOIN, please create the tables described above. EMPLOYEE and ADDRESS are the two tables that have a relation on id and empid column, EMPLOYEE table has a primary key on column id and ADDRESS has the foreign key of EMPLOYEE table on empid. Suppose we want to display the data of employees with their address, then we will write a PostgreSQL query statement with INNER JOIN. If we are using INNER JOIN in the statements of PostgreSQL, it means that the query will only return employee records that have address data in the ADDRESS table.

Syntax of PostgreSQL INNER JOIN



SELECT t1.column1,t1.column2,t2.column1,....
FROM table1 t1
INNER JOIN table2 t2

ON t1.matching_column = t2.matching_column;

Example of INNER JOIN

For example, we inserted some records in EMPLOYEE and ADDRESS tables.

EMPLOYEE table containing the following records


The table ADDRESS is containing the following records



Now, execute the following PostgreSQL statement


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e
  INNER JOIN address a
  ON e.empid=a.empid



Result:



table1: EMPLOYEE table
table2: ADDRESS table
matching_column: e.empid and a.empid

Here e and a are the alias of EMPLOYEE and ADDRESS table respectively.


2- LEFT JOIN


If you looked at the INNER JOIN result there are 4th row missing of EMPLOYEE table, which is because of INNER JOIN statement displays the data when data matched in both tables (EMPLOYEE and ADDRESS) on the particular columns(e.empid=a.empid). The ADDRESS table doesn't have any record for the id = 4 (Employee) that's why this record has been missed in the result-set of the INNER JOIN statement. But this is the wrong result set because it may be Admin forgot to update the address of the employee, but the employee other data (exists in EMPLOYEE table) should be displayed. We can resolve this issue using PostgreSQL LEFT JOIN.


If you want to select the row of EMPLOYEE table which may or may not have corresponding rows in the ADDRESS table, you should use the LEFT JOIN clause. In case, there is no matching row in the ADDRESS table, the values of the columns in the ADDRESS table is substituted by the NULL values.

Syntax of PostgreSQL LEFT JOIN


SELECT t1.column1,t1.column2,t2.column1,....
FROM table1 t1
LEFT JOIN table2 t2

ON t1.matching_column = t2.matching_column;

Example of LEFT JOIN


Execute the following PostgreSQL statement


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e
  LEFT JOIN address a

  ON e.empid=a.empid


Result:



table1: EMPLOYEE table
table2: ADDRESS table
matching_column: e.empid and a.empid

Here e and a are the alias of EMPLOYEE and ADDRESS table respectively.




You can see the result-set of PostgreSQL LEFT JOIN query for the employee Naina, we don't have the address records that's why the ADDRESS table's column value substituted by NULL.

3- RIGHT JOIN



RIGHT JOIN is the same as LEFT JOIN, in the above (LEFT JOIN query) if the ADDRESS table has a row which but it is not corresponding to the EMPLOYEE table. In case, there is no matching row in the EMPLOYEE table, the values of the columns in the EMPLOYEE table are substituted by the NULL values.

Example of RIGHT JOIN

Execute the following PostgreSQL statement


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e
  RIGHT JOIN address a

  ON e.empid=a.empid


Result:




table1: EMPLOYEE table
table2: ADDRESS table
matching_column: e.empid and a.empid

Here e and a are the alias of EMPLOYEE and ADDRESS table respectively.


4- FULL OUTER JOIN



If we are using PostgreSQL FULL OUTER JOIN then it will return result-set by combining the result-set of INNER JOIN, LEFT JOIN and RIGHT JOIN. We have to tables(EMPLOYEE and ADDRESS table as above), and if we applied the FULL OUTER JOIN on these two tables,  for the matching rows, it will return a single row included in the result set that contains columns populated from both joined tables. If the rows in the joined table do not match, the full outer join sets NULL values for every column of the table that lacks a matching row.

Syntax of PostgreSQL FULL OUTER



SELECT t1.column1,t1.column2,t2.column1,....
FROM table1 t1
FULL OUTER JOIN table2 t2

ON t1.matching_column = t2.matching_column;


Example of PostgreSQL FULL OUTER JOIN


Execute the following PostgreSQL statement


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e
  FULL OUTER JOIN address a

  ON e.empid=a.empid


Result:



table1: EMPLOYEE table
table2: ADDRESS table
matching_column: e.empid and a.empid

Here e and a are the alias of EMPLOYEE and ADDRESS table respectively.


5- CROSS JOIN


Joins between tables without conditions produce the Cartesian product called cross join. The size of the cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table.

For example, suppose we have tables A and B and Table A have n records and table B have m record if we write a PostgreSQL statement of CROSS JOIN for table A and table B its result-set will return the n*m rows. When I wrote about cross joining for SQL server, a lot of people asked me for a live example.


Let's explain the PostgreSQL CROSS JOIN with a live example. First of all, we will create two tables, first is PRODUCT and second is SUB PRODUCT, run the below query.


--Create product table

CREATE TABLE product
(
       id serial not null,
       product_name text,
       cost numeric,
       CONSTRAINT pk_product_id PRIMARY KEY (id)
);

--Create sub_product table

CREATE TABLE sub_product
(
       sub_id serial,
       sub_product_name text,
       cost numeric,
       CONSTRAINT pk_sub_product_id PRIMARY KEY (sub_id)

);


To insert some records run the following query.


--Insert some records into the product table

INSERT INTO product(product_name,cost)
  VALUES('DK Pizza',400);

INSERT INTO product(product_name,cost)
  VALUES('DK Burger',200);

INSERT INTO product(product_name,cost)
  VALUES('DK Pizza Puff',150);

--Insert some records into sub_product table

INSERT INTO sub_product(sub_product_name,cost)
  VALUES('Soft Drink',100);

INSERT INTO sub_product(sub_product_name,cost)

  VALUES('Garlic Bread',80);


Now, if we are selling this product online then we have to display user-friendly data so that customers can get the details of the combined product and its total cost. It makes easy to select their respective choice.

Example of PostgreSQL CROSS JOIN


Execute the following PostgreSQL statement


SELECT    p.product_name,sp.sub_product_name,(p.cost+sp.cost)asTotalCost
  FROM product p

  CROSS JOIN sub_product sp


Result:


6- NATURAL JOIN


NATURAL JOIN creates an implicit join based on the same column names in the joined table, for example, you can see the example table (EMPLOYEE and ADDRESS table), and both tables have the column empid. It means we do not need to write "ON e.empid=a.empid" statement part in the query. NATURAL JOIN can be INNER JOIN, LEFT JOIN, and RIGHT JOIN by default it works like INNER JOIN.

Syntax of NATURAL JOIN


SELECT t1.column1,t1.column2,t2.column1,....
FROM table1 t1

NATURAL < INNER, LEFT, RIGHT>JOIN table2 t2


After NATURAL keyword we have to specify one of these <INNER, LEFT, RIGHT>, by default it works like an INNER JOIN.

Example of PostgreSQL NATURAL JOIN

For example, I have considered the table EMPLOYEE and ADDRESS, Both have the common column name (empid).

a) Default NATURAL Joins



SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e

  NATURAL JOIN address a


Result:


Note: By default, it behaves like INNER JOIN

b) With LEFT keyword



SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e

  NATURAL LEFT JOIN address a


Result:


c) With RIGHT keyword


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e

  NATURAL RIGHT JOIN address a


Result:


b) With FULL OUTER keyword


SELECT e.fname,e.lname,e.username,e.contact,a.address1,a.address2,a.city,a.state,a.country
  FROM employee e

  NATURAL FULL OUTER JOIN address a


Result:


7- SELF-JOIN



A self-join is the PostgreSQL statement in which a table is joined to itself, it is useful for comparing values in a column of rows within the same table.

Syntax of PostgreSQL SELF-JOIN



SELECT Col_1, Col_2...Col_N
FROM my_table a

INNER JOIN my_table b ON a.Col_1=b.Col_2;


Example of PostgreSQL SELF-JOIN

In the above EMPLOYEE table, I want to get the corresponding reporting managers of employees, then I will use the PostgreSQL SELF-JOIN


Execute the following query to get an accurate result.


SELECT a.fname,a.lname, (b.fname ||' '|| b.lname) managername
  FROM employee a
  LEFT JOIN employee b
  ON b.empid=a.managerid


Result:


Related Posts

What is the Use of isNaN Function in JavaScript? A Comprehensive Explanation for Effective Input Validation

In the world of JavaScript, input validation is a critical aspect of ensuring that user-provided data is processed correctly. One indispensa...