Joins and Sub Queries (DML)

EQUI JOINS (Two Tables)

If we did the following query we do not get what their salary actually is - only their grade.

SELECT emp_fname, emp_sname, grade
FROM employee;

We know that the grade in the employee table links to the grade in the pay_structure table. If we refer to the salary and tell the query to include data from two tables we can retrieve not only the grade but also the salary that grade refers to.

SELECT emp_fname, emp_sname, employee.grade, salary
FROM employee, pay_structure;

Note: Why do we need to put employee.grade? Because grade is a field in both employee and pay_structure so we need to specify which table the grade we want to include is referring to.

Example 2

Suppose we want to display only the employees who have been assigned to a project (and therefore have an entry in the emp_on_project table).

SELECT emp_fname, emp_sname
FROM employee, emp_on_project
WHERE employee.emp_no = emp_on_project.emp_no;

What is the problem with this query and why does it output this result?

Try this:

SELECT DISTINCTemp_fname, emp_sname
FROM employee, emp_on_project
WHERE employee.emp_no = emp_on_project.emp_no;

Now let us refine this query further by only considering employees whose projects have duration of more than 9 months:

SELECT DISTINCT emp_fname, emp_sname
FROM employee, emp_on_project
WHERE employee.emp_no = emp_on_project.emp_no
AND months_allocated_to_project>9;

EQUI JOINS (Three Tables)

Equi joins for three tables is no different in principle or practice than equi joins for two tables.

We could output just the contents of the emp_on_project table to display what employees have been allocated to each project (or which employees are working on a particular project).

SELECT emp_no, proj_code FROM emp_on_project

However the data would be in the following format:

Emp_No / Proj_Code
12 / A21

Clearly this is not as easy to understand as it could be because it is not clear who employee 12 is or what project A21 refers to. So it would be much better if we could include the employee name and the project description.

What we need to do is include information from both the employee and project tables. Using the same method described in two table queries, our query should look for a match between the employeenoin the emp_on_project table and the emp_no in the employee table (to retrieve the name) and the proj_code in the emp_on_project table and the proj_code in the project table (to retrieve the project description) - when it finds a match the employee name and project description are outputted.

SELECT employee.emp_no, emp_fname, emp_sname, emp_on_project.proj_code, proj_desc
FROM employee, project, emp_on_project
WHERE emp_on_project.emp_no = employee.emp_no
AND emp_on_project.proj_code = project.proj_code

Note: If you do not include both conditions (the WHERE and the AND) the results will not be correct.

LEFT JOIN

The LEFT JOIN will return all the rows from the FROM table regardless of whether a match is found in the LEFT JOIN one. Here it is displaying a list of all project descriptions and codes – but it is getting the code from the emp_on_project table rather than the project table (as this table that determines whether or not a particular project has had any employees assigned to it – and therefore may not have a project code for each project).

SELECT proj_desc, emp_on_project.proj_codeFROM projectLEFT JOIN emp_on_projectON emp_on_project.proj_code = project.proj_code;

Note: The FROM table now only contains a single table

If you wanted to be more specific and only show a single (DISTINCT) instance of each project you would amend the query as follows:

SELECT DISTINCT proj_desc, emp_on_project.proj_codeFROM projectLEFT JOIN emp_on_projectON emp_on_project.proj_code = project.proj_code;

RIGHT JOIN

RIGHT JOIN will return all the rows in the table given as the RIGHT JOIN regardless of whether there is a match on the condition ON from the table specified in FROM.

Here it will display all the employees on the system and the projects they are assigned to; if a project has not been assigned to an employee it will still display their name but have NULL (empty) next to the proj_code field indicating that they haven’t got an entry in the emp_on_project table.

SELECT employee.emp_No, emp_fname, emp_sname, proj_code
FROM emp_on_project
RIGHT JOIN employee
ON employee.emp_no = emp_on_project.emp_no;

INNER JOIN

An INNER JOIN works in the same way as the examples shown above - it only produces rows where the joins can find corresponding values in both tables referred to in the ON part.

SELECT employee.emp_No, emp_fname, emp_sname, proj_codeFROM emp_on_projectINNER JOIN employeeON employee.emp_no = emp_on_project.emp_no;

Multiple JOINs

This example builds on the one above and displays not only the employee details (from employee) for all employees in the emp_on_project table - along with the project code cited - but also the project description from the project table. This requires both a JOIN to the emp_on_project table and a JOIN to the project table.

SELECT employee.emp_no, emp_fname, emp_sname, emp_on_project.proj_code, proj_desc
FROM employee
INNER JOIN emp_on_project ON emp_on_project.emp_no = employee.emp_no
INNER JOIN project ON emp_on_project.proj_code = project.proj_code

Sub Queries

Sub queries enable greater flexibility in query creation. They allow the query designer to develop specific queries to match detailed conditions which would not be possible using any other method.They involve nesting a query within a query.

This query shows all the employees who have NEVERbeen allocated to a project (in emp_on_project)

SELECT emp_no, emp_fname, emp_sname
FROM employee
WHERE emp_no NOT IN
(SELECT DISTINCT emp_no
FROM emp_on_project);

To develop these sort of queries we have to build them up in parts and test them. If we wanted to find out what the maximum number of projects anyone was working for and just display a list of all the employees with this number of projects we have three things to find:

1)To produce a count of the number of projects each employee is working on

2)Determine the highest number from this count

3)Identify all employees who are assigned a number of projects equivalent to this count

1) So to identify point one we could write and test the following:

SELECT count(emp_on_project.emp_no) AS total_proj
FROM emp_on_project
GROUP BY emp_no

2) Now to determine the highest number of projects (which we know from the list but we need a way of isolating this value). To do this we simply need to sort on the total field (the count) and extract the first record (limit the results to just one entry).

SELECT count(emp_on_project.emp_no) AS total_proj
FROM emp_on_project
GROUP BY emp_no
ORDER BY total_proj DESC
LIMIT 1

3) Now we can use this result to search all entries and find just those that match the results of this entry. We embed the query above as a sub query which is used by the other query to identify only entries from emp_on_project that match the count returned by the sub query:

SELECT emp_on_project.emp_no, count(emp_on_project.emp_no) AS total_proj
FROM emp_on_project
GROUP BY emp_no
HAVING total_proj=
(SELECT count(emp_on_project.emp_no) AS total_proj
FROM emp_on_project
GROUP BY emp_no
ORDER BY total_proj DESC
LIMIT 1)

1