17 DBMS Interview Questions | Database Management System Interview Questions

dbms interview questions

Contributed by: Nallaperumal

Some of the topics that can be concentrated while attending DBMS interview questions are depicted below. These are some of the essential questions that you come across in any assignments or if they give some problem to solve on the spot.

We will see the topics and then jump on to the problem to have a quick look at some examples.

  1. Self-Join
  2. Aggregate based on dates (Month, year, week)
  3. Cumulative sum & rolling averages
  4. Multiple join conditions
  5. Use of Analytical function
  6. Use of Window functions
  7. Cross join
  8. Use of CASE and PIVOT statements (Display rows as columns vice versa)
  9. SQL – NULL Values
  10. GROUP BY and HAVING clause (Difference between Group by and Having)
  11. Use of LIMIT Clause
  12. UNIQUE and PRIMARY Key difference
  13. Use of DISTINCT and ORDER BY
  14. Use of functions such as SUBSTRING, LENGTH, UPPER, LOWER
  15. Conditions like EXISTS and IN

Before jumping into the above topics I would like to give a quick overview on the topic CTE (Common Table Expression). This can be used as an alternative to a subquery or if you want to combine multiple tables with some unique conditions this can be leveraged. It even works well for UPDATE, DELETE and INSERT. If you also want to go for scenarios where it is recursive this can be utilized. It acts as a temporary table.

CTE (Common Table Expressions)

It starts with a WITH clause followed by a temporary table name.

With [CTE_NAME] AS
(
CTE QUERY
)

CTE NAME – temporary table name

CTE Query – Query that we are going to use

Table: Employee_details

EmpID Emp_Name  City
1 Jacob Chennai
2 Utkarsh Bangalore
3 Ritesh Kumar Hyd
4 Kanan Kochi
5 Madhan Chennai

If we want to display only the Emp Name and EmpID from the above table corresponding to city Chennai then,

Solution:

WITH
CTE AS (SELECT Emp_Name, EmpID  FROM employee WHERE city = ‘Chennai’)
select * from CTE;

Output: Here point to note is that whatever you have selected inside the CTE query only those columns will be accessible.

Emp Name  EmpID
Jacob 1
Madhan 5

We can also add more tables in the CTE by giving a comma and naming the next temporary tablename.

1. SELF-JOIN: (Manager /Employee hierarchy)

Let say the query is to find out the list of Manager hierarchy for the employee Madhan.

Table name:  Employee

EmpID Emp_Name  Manager_ID
1 Jacob NULL
2 Utkarsh 6
3 Ritesh Kumar 1
4 Kanan 2
5 Madhan 3
6 Naveen NULL

Solution:

Our thought process should be Madhan->Ritesh Kumar->Jacob 

The above all rows should be displayed. We will use a CTE here (Note the following is based on Oracle SQL) slight change of syntax might occur in other backend utilities.

WITH EMP_CTE_MNG AS
(
SELECT EmpID, Emp_Name,manager_ID FROM Employee WHERE ManagerId IS NULL
),
EMP_LOW_LEVEL(
SELECT T.EmpID,T.Emp_Name , T.manager_ID FROM Employee T
INNER JOIN Employee C ON C.EmployeeId=T.ManagerId
And T.EmpID = 5 )
SELECT a.EmpID,a.Emp_Name,a.manager_ID from  EMP_CTE_MNG 
union all 
SELECT b.EmpID,b.Emp_Name,b.manager_ID from EMP_LOW_LEVEL

Output:

EmpID Emp_Name  Manager_ID
1 Jacob NULL
3 Ritesh Kumar 1
5 Madhan 3

This type of problem is called as recursive problem statement.

2. AGGREGATE BASED ON DATES:

Let us look at a problem where we have a pictures table and say we want to display the total number of pictures sold with respect to each month and yearly wise then

Table: Pictures

Date_created No_of_pictures_Sold
1/1/2020 200
2/2/2020 400
2/3/2020 500
3/4/2020 800
3/5/2020 1200
12/5/2019 5000

Solution:

select extract(year from date_created) as year, extract(month from date_created) as month,
 sum(No_of_pictures_Sold) as Total_pictures_sold
from Pictures
group by extract(year from date_created), extract(month from date_created)
order by year,month asc;





Output:

Year  Month Total_pictures_sold
2019 12 5000
2020 1 200
2020 2 900
2020 3 2000

3. CUMMULATIVE SUM and ROLLING AVERAGE:

Write a query for the following table to find out the cumulative amount for each day.

Table: Amounts

Date Amount
1/1/2020 100
1/2/2020 -100
1/3/2020 200
1/4/2020 300
1/5/2020 500

Solution: 

Here we can solve using two ways. One main advantage of SQL is that we can solve the problem with many solutions but you have to get the problem right.

Solution a:

SELECT a.date date, SUM(b.amount) as cummulative_amount FROM amounts a
JOIN amounts b 
ON a.date >= b.date
GROUP BY a.date 
ORDER BY date ASC





Solution b: (using window functions)

SELECT date, 
 SUM(amount)OVER (ORDER BY date ASC)
 as cumulative_amount 
FROM Amounts
ORDER BY date ASC


Output:

Date cummulative_amount 
1/1/2020 100
1/2/2020 0
1/3/2020 200
1/4/2020 500
1/5/2020 1000

ROLLING AVERAGE:

The following problem is an adaptation from https://www.sisense.com/blog/rolling-average/

Write a query to get 7-day rolling (preceding) average of daily Sales amount.

Table: Sales

Date Sales_Amount
1/1/2020 20
1/2/2020 30
1/3/2020 40
1/4/2020 50
1/5/2020 60
1/6/2020 70
1/7/2020 80
1/8/2020 90
1/9/2020 100
1/10/2020 110

Solution:

SELECT a.date, AVG(b.Sales_Amount) Avg_Sales_Amount 
FROM  Sales a 
INNER JOIN Sales b ON a.date <= b.date + interval '6 days' 
AND a.date >= b.date
GROUP BY  a.date

The above solution can also be handled using analytical functions in oracle such as preceding and current rows. Say if we have month and  amount if we want to find out moving average for 3 months then,

Month Amount Moving_Avg
1 200 0.00
2 300 0.00
3 400 300.00
4 500 400.00
5 700 533.33
6 800 666.67
7 1200 900.00
8 300 766.67
9 400 633.33
10 500 400.00
11 900 600.00
12 1000 800.00

To get the moving average for the above we will use the following query:

SELECT month, SUM(amount) AS month_amount,
AVG(SUM(amount)) OVER
 (ORDER BY month ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS moving_average
FROM  sales
GROUP BY month
ORDER BY month;

4. USE OF ANALYTICAL FUNCTIONS:

There are many analytical functions which we use but here we are going to look upon two functions LEAD and LAG.

Let us see with an example supposing if we want to see the order date and when a previous order was made for a particular order with respect to same product we can leverage by making use of LAG function.

Table : Products

ORDER_DATE PRODUCT_ID QTY
9/25/2010 1000 20
9/26/2010 2000 25
9/27/2010 1000 90
9/28/2010 2000 100
9/29/2010 2000 20
9/30/2010 1000 5

Solution:

SELECT product_id, order_date,
LAG (order_date,1) OVER (PARTITION BY product_id ORDER BY order_date) AS prev_order_date
FROM orders;

Output:

PRODUCT_ID ORDER_DATE PREV_ORDER_DATE
1000 9/25/2010 NULL
1000 9/27/2010 9/25/2010
1000 9/30/2010 9/27/2010
2000 9/26/2010 NULL
2000 9/28/2010 9/26/2010
2000 9/29/2010 9/28/2010

For the same problem if we need to look at next order date then,

SELECT product_id, order_date,
LEAD (order_date,1) OVER (PARTITION BY product_id ORDER BY order_date) AS next_order_date
FROM orders;

Output:

PRODUCT_ID ORDER_DATE NEXT_ORDER_DATE
1000 9/25/2010 9/27/2010
1000 9/27/2010 9/30/2010
1000 9/30/2010 NULL
2000 9/26/2010 9/28/2010
2000 9/28/2010 9/29/2010
2000 9/29/2010 NULL

6. USE OF WINDOW FUNCTIONS:

Let us say we have a Salaries table – If we need to find out the top 3rd salary from the table

Table : Salaries

Dept No empID salary
IT Engineer 1 50000
IT Engineer 4 45000
IT Engineer 3 30000
IT Engineer 2 10000
IT Engineer 5 35000
Infrastructure 6 55000
Infrastructure 7 50000
Data Analyst 9 55000
Data Analyst 8 50000
Data Analyst 10 30000

Solution:

WITH sal_rank AS 
 (SELECT empID, 
    DENSE_RANK() OVER(ORDER BY salary DESC) Dense_Rnk  FROM  salaries)
SELECT empID FROM  sal_rank
WHERE   Dense_Rnk = 3;

Here, I would suggest using DENSE_RANK() , if we go by RANK() it will skip ranks if the values are the same that is why dense_rank is recommended here. Let us look at the output of rank and dense_rank as well.

Dept No empID salary Rank Dense Rank
IT Engineer 1 50000 3 2
IT Engineer 4 45000 6 3
IT Engineer 3 30000 9 6
IT Engineer 2 10000 10 7
IT Engineer 5 35000 7 4
Infrastructure 6 55000 1 1
Infrastructure 7 50000 3 2
Data Analyst 9 55000 1 1
Data Analyst 8 50000 3 2
Data Analyst 10 30000 9 6

Rank will skip the rows if two salaries are the same. For instance, in the above case if you see it has skipped rank 2 because the first salary which is the highest has repeated twice.

7. CROSS JOIN:

Consider a table state Channels where each row is a state and total hours watched per week for a channel.

Table : Channels

State Total_hours_watched
TN 12000
KL 14000
AP 5000
KA 5200
TA 4000
MH 3500

Question: 

Write a query to get the pairs of states with total channel hours within 2000 of each other. 

For the snippet above, we would want to see something like (we need all the list this is just a sample)

Solution:

SELECT
 a.state as State_A, 
 b.state as State_B 
 FROM    Channels  a
 CROSS JOIN Channels  b 
 WHERE ABS(a. Total_hours_watched - b. Total_hours_watched) < 2000
  AND a.state <> b.state ;

Note: In the above query if we want to eliminate the duplicates (i.e) TN and KL should appear only once in that case a minor tweak is required.

  SELECT
 a.state as State_A, 
 b.state as State_B 
 FROM    Channels  a
 CROSS JOIN Channels  b 
 WHERE ABS(a. Total_hours_watched - b. Total_hours_watched) < 2000
  AND a.state > b.state


8. USE OF CASE AND PIVOT STATEMENT:

PIVOT is usually used to convert rows to columns and vice versa. If you also want to do a mapping with respect to month wise, we can make use of pivot. 

Oracle PIVOT
Image credits:  Oracletutorial.com

Table: orders

order_id customer_ref product_id
100001 SMITHA 10
100002 SMITHA 20
100003 ANDREW 30
100004 ANDREW 40
100005 JAMES 10
100006 JAMES 20
100007 SMITHA 20
100008 SMITHA 10
100009 SMITHA 20

If we want to look at cross tabulation as in what is the number of times each products are bought by the customer.

Solution:

SELECT * FROM
(
  SELECT customer_ref, product_id
  FROM orders
)
PIVOT
(
  COUNT(product_id)
  FOR product_id IN (10, 20, 30)
)
ORDER BY customer_ref;

Output:

customer_ref 10 20 30
ANDREW 0 0 1
JAMES 1 1 0
SMITHA 2 3 0

CASE Statements:

There are some exceptional cases where CASE statements can be used as a substitute to PIVOT as well.

Let us try to look at the example:

If you want to update or display certain values or update columns based on certain conditions we can leverage CASE statements.

Table: Products

Product_ID Product_Name Amount
1 PC 30000
2 Iphone 50000
3 Moto 12000
4 Speaker 1200
5 Book 500

Say if the ask is to find to label the items as luxury,economy,mass based on the amount range say

0-100 – Cheap
101-1500 – Mass
1501-30000-economy
>30000 as Luxury 

Then, CASE statement would come in handy.

Solution:

SELECT
  Product_ID,
  Product_Name,
  Amount ,
  CASE
  WHEN Amount between 0 and 100 THEN 'Cheap'
  WHEN Amount between 101 and  1500 THEN 'Mass'
  WHEN Amount between 1501 and  30000   THEN 'Economy'
  WHEN Amount >30000  THEN 'Luxury'
  ELSE 'Not Available'
  END product_group
FROM  products ;


One advise here would be CASE always works in first come first in basis – Meaning (If a conditions satisfies for a row it will not go for next one ) 

Supposing in the above say if we have amount = 0 as ‘NA’ but in your CASE statement if the first condition is Amount <=100 and last condition is amount = 0. It will take up first and will ignore the last.

9. SQL-NULL Values:

With respect to NULL values always remember if you are going to check for a condition in a column which has null values then it should be of the form ‘…WHERE COLUMN_NAME IS NULL

One more point to note here is that NULL is never equal to NULL***.

10. GROUP BY AND HAVING clause:

Say you have a table named Employees

Table Name: Employee

EmpID Department Salary
1 IT 55000
2 IT 60000
3 IT 40000
4 Sales 35000
5 Sales 44000
6 IT 35000

If our ask is to find out the number of people with respect to each department who are getting salary greater than or equal to 40000 and having more than 2 person (in nos) in the respective department.

Solution:

SELECT department, COUNT(*) AS "Number of employees"
FROM employees
WHERE salary >= 40000
GROUP BY department
HAVING COUNT(*) > 2;

Output:

Department Number of employees
IT 3

The above GROUP BY can also be used to find out aggregate functions (such as SUM,MIN,MAX) etc.

One  pro tip here would be – Always it is a good practice to write to the columns selected in SELECT clause to incorporate that in the GROUP BY clause (though in some SQL IDEs it will accept other columns) but it is ideal to have the columns that are provided in SELECT clause.

If you are going to use an AGGREGATE function then GROUP is recommended most of the times.

WHERE Clause HAVING Clause
It is used prior to GROUP BY It is used after GROUP BY
Mostly if you want a pre-filter go for WHERE clause If you want a post-filter after GROUP BY go for HAVING
It is not generalized to GROUP BY It works only with GROUP BY

The above differences can be related to query used before.

11. LIMIT clause:

As the name suggests, if you want to LIMIT the result that is displayed in the SQL result. You can make use of this LIMIT functionality.

This LIMIT is not supported by all SQL versions in some of them it may not work.

Lets say you have products table as shown below

Table Name: Products

ORDER_DATE PRODUCT_ID QTY
9/25/2010 1000 20
9/26/2010 2000 25
9/27/2010 1000 90
9/28/2010 2000 100
9/29/2010 2000 28
9/30/2010 1000 35

Our ask is to find out the products where the qty is greater than 25 and to display only top3 items.

Solution:

SELECT product_id, qty
FROM products
WHERE qty > 25
ORDER BY qty DESC
LIMIT 3;

Output: It will display only top 3 rows

PRODUCT_ID QTY
2000 100
1000 90
1000 35

To Generalize the GROUP BY, HAVING and LIMIT clause,

I have portrayed a picture to depict as how you should ideally look at them. Note that the following is just for your better understanding. In real-time along with the following the execution steps goes through a series of process but here we are just outlaying the overall idea.

12. Difference between PRIMARY and UNIQUE KEY:

PRIMARY KEY UNIQUE KEY
The main purpose of the primary key is to provide a means to identify each record in the table. Unique can be used to ensure rows are unique with respect to the particular column
There can be only one primary key for a table There can be multiple UNIQUE keys for a table
It should always be NOT NULL and should not get repeated Column can contain NULL
It results in CLSUTERED index by default. It is used when we want our columns to be unique

13. Use of DISTINCT and ORDER BY:

If we want to display only unique values for a column, we use DISTINCT.

Consider the table films:

ID LENGTH FILM
1 220 FilmName1
2 220 FilmName2
3 230 FilmName3
4 220 FilmName4
5 240 FilmName5
6 230 FilmName6
7 120 FilmName7

If we want to find out unique length of films then,

Solution:

SELECT DISTINCT length 
FROM films;

It shows the unique value in the table.

14. Use of functions such as SUBSTRING, LENGTH, UPPER, LOWER:

Subbstr() is used to extract a certain number of letters from the column.

For example:

It starts with 1 as index:

Substr(column_name,start_position,length)

SUBSTR(‘Great Lakes’, 1, 5)

Result: Great

SUBSTR(‘Great Lakes’, 7, 2)

Result: La

UPPER and LOWER: These are used to convert column names into capital letters or lower case.

15. Difference between EXISTS and IN:

EXISTS clause IN clause
The EXISTS operator stops scanning rows once the subquery returns the first row because it can determine the result The IN operator must scan all rows returned by the subquery to conclude the result
IN clause can’t compare anything with NULL values EXISTS clause can compare everything with NULL values
The EXISTS operator is faster than IN operator when the result set of the subquery is large The IN operator is faster than EXISTS operator when the result set of the subquery is small

16. Query to find monthly change in a column from table A 

Select month,
          Column_A,
         Column_A - LAG(Column_A,1) Over (Order By month) As Month_on_Month_change
From table_X ;

17. Query to find Running total by Business Units for a specific company 

Select year,
 Business_Unit,
Amount_utilized,
Sum(Amount_utilized) 
Over (Partition By 
Business_Unit Order by 
Year) As RunningTotal
From 
Table_companies;

Inspired by Eric Weber.

Some Pro tips for Interviews:

  • Understand the problem clearly, determine the main objective of the problem, try to visualize the output once.
  • Understand the ER diagram provided and see how the mapping is done.
  • Try to identify the type of JOIN you want to use (INNER, CROSS, LEFT, RIGHT). If it is a self—join problem, handle them carefully.
  • If it is on the spot problem – try to verbalize your assumptions and by clarifying the same with the interviewer as well so that in case if you see some difference – you can correct it.
  • Most interviewers will be looking upon the logic you are going to use. Be precise and clear of what is expected from the question asked.
  • Practice is the only key for any language and with respect to SQL – it is 100% true.
  • Try to do some exercise from hacker rank.
  • Lastly with respect to SQL there can be multiple approaches to solve a problem, just see to it that the end results are met (not only for specific cases but also for some corner cases as well).

I would like to conclude by saying that– 

“The difference between ordinary and extraordinary is practice” –Vladimir Horowitz

This brings us to the end of the blog on DMBS Interview Questions. We hope that these DBMS Interview Questions help you. If you wish to learn more such concepts, you can join Great Learning Academy’s Free Online Courses and learn the most in-demand skills today.

Frequently asked Questions on DBMS

What are the 4 types of DBMS?

The main Four Types of Database Management System are:

  • Hierarchical database: Hierarchical database model is organized in a tree structure and represented using a parent-child relationship; In which a parent may have many children but a child has only one parent.
  • Network database: Network database model also represents the parent-child relationship in which a child has multiple parents.
  • Relational database: This is one of the most widely used models, where it can normalize data in the rows and columns of the tables.
  • Object-Oriented database: In object-oriented databases, data is stored in the form of many objects.

What are the important topics in DBMS?

Important topics in DBMS are as follows:

  • Entity-Relationship Model
  • Relational Model
  • Relational Algebra
  • Functional Dependencies
  • Normalisation
  • Transactions and Concurrency Control
  • Indexing, B and B+ trees
  • File Organization
  • Advanced Topics
  • SQL
  • Quick Links

How do I prepare for a database interview?

  • Hop on over to the MySQL Basics course or Introduction to SQL on Great learning academy.
  • Once you’ve got that under your belt, spend some time studying the SQL interview questions that are likely to be asked in your interview.
  • Now keep practising SQL queries, understand SQLs place in the world, be honest about your skill, get real-world experience in SQL and crack the interview with confidence. 

What are the examples of DBMS?

Some of the real-time DBMS examples are : 

Sector Use of DBMS
Financial sector/Bank For account activities, payments, loans, etc.
College/Schools For student information and other activities
Telecommunication For keeping call records, maintaining balances, etc.
Marketing and Sales Use for storing customer, product & sales information.
Manufacturing For the management of the supply chain and for tracking the production of items.
HR Management Used to track the information about employees, salaries, payroll, deduction and many more

What is data integrity in DBMS?

Data Integrity in DBMS is defined as the accuracy and consistency of data throughout its design, implementation and usage stages. Here all the data must be accurate and reliable and data characteristics must be correct – including business rules, relations, dates, definitions and lineage for data to be complete.

What is the purpose of DBMS? 

The purpose of DBMS is to collect the programs that enable the user to create and maintain the database. 

What are the advantages of DBMS?

Advantages of Database Management System are: 

  • Improved data integrity.
  • Improved data security.
  • Improved data sharing.
  • Better data integration.
  • Improved decision making.
  • Reduced data redundancy.
  • Reduced updating errors and increased consistency.
  • Reduced data entry, storage, and retrieval costs.

Is DBMS easy?

Yes, DBMS is easy to learn and understand.

Make sure you improve your skills with the help of the DBMS course and understand how to generate, insert, update and delete data from a table.

One needs to start with a simple table and

  • Try to modify the table in different ways
  • Make your own logical queries with satisfiable conditions
  • With the same table try to retrieve the data in different formats

What are the components of DBMS?

Components of DBMS

DBMS has several components and each of the components perform significant tasks in the DBMS environment.

  • Software: DBMS software component is used to control and manage the overall database. 
  • Hardware: DBMS hardware component consists of a set of physical electronic devices such as computers, I/O devices, to provide the interface between computers and real-world systems.
  • Data: DBMS is all about collecting, storing, processing and accessing the data. So, this is one of the important components in  DBMS.
  • Procedures: The procedure component is used in designing, running the database and to guide the users that operate and manage it.
  • Database Access Language: This component is used to access the data, enter the new data, update the data, and retrieve the data. 
  • Query Processor: This component is used to transform the user queries into a series of low-level queries and translates them into an efficient series of operations.
  • Run Time Database Manager: This component handles the database at run-time and converts operations into queries. 
  • Data Manager: This component is also called cache manager which is responsible for providing recovery to the system that allows it to recover the data after a failure.
  • Database Engine: This component is a core service in DBMS that manages the 
  • storing, processing, and securing data. 
  • Data Dictionary:  This component is a reserved place in DBMS where it stores information about itself.
  • Report Writer:  This component is responsible for generating reports. 

What are the features of DBMS?

DBMS Features are: 

  • Data Availability.
  • Minimized Redundancy.
  • Database Customization.
  • Data Accuracy, Consistency and Relevance.
  • Data Structuring.
  • Improved Data Security.
  • Easiness in Data Management.
  • File Consistency.

2

Related Articles

Leave a Reply

Back to top button