generate_series to cover for missing data in reports
Problem: Generating a nice report grouped by date while some dates are missing data. (I admit the topic indicates something a little bit more exciting...)
Previously I have solved this kind of problem in the client code or by having some sort of calendar table to join with. This time I don't have any client code (yet) and I have no other tables to use. Lucky me I'm using PostgreSQL since it has the generate_series function.
So basically all I have to do is a RIGHT OUTER JOIN with generate_series as dates:
select date '2009-02-01' + generate_series(0 span>,6);
An full example to play with:
drop table if exists mytable; create table mytable(id serial, day date, metric int); insert into mytable(day, metric) values span>('2009-02-01', random()*10), ('2009-02-01', random()*10), ('2009-02-01', random()*10), ('2009-02-03', random()*10), ('2009-02-03', random()*10), ('2009-02-03', random()*10), ('2009-02-03', random()*10), ('2009-02-04', random()*10), ('2009-02-04', random()*10), ('2009-02-05', random()*10), ('2009-02-05', random()*10), ('2009-02-05', random()*10), ('2009-02-05', random()*10), ('2009-02-07', random()*10), ('2009-02-07', random()*10);
Now try it out:
select coalesce(mytable.day, foo.day) as day, coalesce(mytable.sum, foo.sum) as sum from (select day, sum(metric) from mytable group by day) as mytable right outer join (select date '2009-02-01' + foo.day as day, 0 as sum from generate_series(0,6) as foo(day)) as foo on mytable.day=foo.day order by day;
As stated earlier I have usually solved this problem in client code. I recall doing so both when using MSSQL and Oracle although I haven't worked with them for a long while now. I can't recall ever having done any similar reports while using MySQL.
I'm curious about the current state of Oracle, MSSQL and MySQL since I'm not actively using them. Is there something similar as generate_series? What about DB2?
I don't think there is one built in in any of these other databases. I still use SQL Server 2005 and above a lot and at least I don't know of one.
Simon Greener created one for MSSQL and Oracle.
SQL Server 2008: http://www.spatialdbadvisor.com/sqlserverblog/86/generateseries-for-sql-server-2008
Oracle: http://www.spatialdbadvisor.com/oraclespatialtipstricks/82/generateseries-an-oracle-implementation-in-light-of-sql-design-patterns
In Oracle there is an easy way to generate rows on the fly by using a recursive query on dual. The following query returns 1..10:
SELECT level FROM dual CONNECT BY level <= 10
Incidentally I wrote an article about the same topic a couple of days ago: http://pgtuning.net/archives/7-Generate-rows-that-arent-there.html
In Oracle, there is no direct equivalent that I am aware. Instead, the tip I've seen is to "use" the ALLOBJECTS "table" (a part of system catalog which always exists and has many rows). By that approach, the right join subquery would be:
SELECT DATE '2009-02-01' + rownum-1 AS day
FROM allobjects
WHERE rownum <= 6
In MySQL you need to fake rows with SELECT ... UNION SELECT ... or Cartesian combinations thereof. Or pick series from existing tables with enough sequential ids.