这两道题都基于hr.employees表,表的结构如下:
题一:注意count的坑
1、编写一个查询显示员工总数以及其中在2004年、2005年、2006年和2007年聘用的员工数。创建合适的列标题。(注:分别使用横表、纵表)
横表
横表的目标查询结果如下:
这道题我思考了半个多小时。一开始,我的想法是:
select count(employee_id) as "total",
count(extract(year from hire_date)=2004) as "2004",
count(extract(year from hire_date)=2005) as "2005",
count(extract(year from hire_date)=2006) as "2006",
count(extract(year from hire_date)=2007) as "2007"
from hr.employees;
但是结果是这样的:
于是,我想着先把count
去掉,看看每一列的内容究竟是什么。当然,这样的话就得暂时先使用group by
了:
select count(employee_id) as "total",
extract(year from hire_date)=2004 as "2004",
extract(year from hire_date)=2005 as "2005",
extract(year from hire_date)=2006 as "2006",
extract(year from hire_date)=2007 as "2007"
from hr.employees
group by "2004", "2005", "2006", "2007";
显示的结果如下:
因此,我便产生了在count
中增加判断结果为true
的想法,也就是:
select count(employee_id) as "total",
count((extract(year from hire_date)=2004)=true) as "2004",
count((extract(year from hire_date)=2005)=true) as "2005",
count((extract(year from hire_date)=2006)=true) as "2006",
count((extract(year from hire_date)=2007)=true) as "2007"
from hr.employees;
但是,结果依然不对!结果所显示的依旧是:
这让我感到十分疑惑。于是,我去查询了count()
函数的用法才发现,count()
函数不会统计null
值,但是会统计 bool 为false
的值。这是问题的核心。因此,我需要去手动地把false
转为null
,因此我决定去使用case...when
语法,果然奏效。
最终的sql语句为:
select count(extract(year from hire_date)) as "total",
count(case when extract(year from hire_date)=2004 then 1 else null end) as "2004",
count(case when extract(year from hire_date)=2005 then 1 else null end) as "2005",
count(case when extract(year from hire_date)=2006 then 1 else null end) as "2006",
count(case when extract(year from hire_date)=2007 then 1 else null end) as "2007"
from hr.employees;
结果为:
纵表
纵表的目标查询结果为:
这里,我觉得难点是对“总数”这一行的实现。如果没有这里的“总数”的话,我可以这样来实现:
select extract(year from hire_date) as "年份", count(employee_id) as "雇佣员工数"
from hr.employees
where extract(year from hire_date) in (2004, 2005, 2006, 2007)
group by "年份";
这样写的话“年份”这一列的数据类型就是 double precision。而题目已经提示我说“年份”这一列的数据类型为 text,因此在琢磨一番后,我采用了如下的写法:
select '2004' as "年份", count(*) as "雇佣员工数" from hr.employees where extract(year from hire_date)=2004
UNION ALL
select '2005' as "年份", count(*) as "雇佣员工数" from hr.employees where extract(year from hire_date)=2005
UNION ALL
select '2006' as "年份", count(*) as "雇佣员工数" from hr.employees where extract(year from hire_date)=2006
UNION ALL
select '2007' as "年份", count(*) as "雇佣员工数" from hr.employees where extract(year from hire_date)=2007
UNION ALL
select '总数' as "年份", count(extract(year from hire_date)) as "雇佣员工数" from hr.employees;
我这样处理的思路是:逐行判断,一行一个 where 来约束条件。尤其是 select '总数' as "年份"
,这让我在年份这列中新建了一个自定义的行。
这样就完美达到了结果:
题二:交叉表
2、编写一个矩阵查询显示department_id为20、50、80和90的职务(job_id)、基于department_id的职务平均薪金、以及该职务的总薪金,并为每个列给出合适的标题。
目标查询结果为:
一开始我也有点被吓到了,后来转念一想,这和上面的case...when
实际上是一个套路!只不过,这里要注意一下when
后的then
。
我的写法为:
select job_id as "Job",
round(avg(case department_id when 20 then salary else null end), 2) as "Dept 20",
round(avg(case department_id when 50 then salary else null end), 2) as "Dept 50",
round(avg(case department_id when 80 then salary else null end), 2) as "Dept 80",
round(avg(case department_id when 90 then salary else null end), 2) as "Dept 90",
sum(salary) as "Total"
from hr.employees
group by job_id;
结果为:
完美达到了预期。