On the job nifty hacks – Postgres


Setting up a Docker environment

The easiest way to get up and running with these simple hacks is to install Docker and run a postgres container, we can then connect to the container using a simple IDE – personally I use PSequel, its free and easy to use with my Mac.

Postgres hacks

1 ) plpgsql function for when GreenPlum doesn’t like Postgres

While developing a pipeline we wanted to keep all of our hard codes dependencies in one table which we could access as developers and the business could access through automated documentation that is produced from the table itself.

This way we can have a quality assurance that goes along with our output and make the development more transparent. To do this we thought we should make some useful SQL functions that make the script less verbose and save our fingers!

First we created a table to accept the data we wanted to store in it:

The dummytable we will use in this example is:

The initial function was as follows:

Now if we try to use this function (using GreenPlum):

We would expect this table:

namedescrcodescodes_descr
animal_codesCodes of the animals used in the Zoo pipeline{"CHI", "BA", "COW", "MON"}{"Chickens are on the run", "The bats have escaped", NULL, NULL}

But instead we get an error:

ERROR function cant execute on segment because it accesses relation “dummytable”

Solution

Now we get the required solution, simply by adding the insert into the function!

namedescrcodescodes_descr
animal_codesCodes of the animals used in the Zoo pipeline{"CHI", "BA", "COW", "MON"}{"Chickens are on the run", "The bats have escaped", NULL, NULL}

2 ) One case statement with multiple columns on output

Say for example, we have a single condition where we want to display information in our table as follows:

A timestamp of the application, a description of the application outcome and the number of days since that outcome was decided. This is easy enough to do in something like Python where we would just give a conditional statement and make all the changes we needed to for multiple columns. However in Postgres this is a little tricky, the solution comes in the form of making our own data type with the data types of the new columns we want to output as a record type.

So for our example, we would want a new data type that holds a timestamp, an integer and text. We can our new data type a very imaginitive name of floatytextdate. Lets also make some dummy data to do with applications of visas and passports:

Now time for the cheeky hack!

Notice how we only used a single case statement but ended up with three columns with different data types outputed! Pretty nifty.

Result

days_sincedays_since_reasonrelevent_date
251Days since most recent passport has been valid2017-05-23 00:00:00
1961Days since most recent passport has been valid2012-09-16 00:00:00
1823Days since most recent visa expired2013-02-01 00:00:00
251Days since most recent visa has been valid2017-05-23 00:00:00
1823Days since most recent passport expired2013-02-01 00:00:00

Leave a comment

Your email address will not be published. Required fields are marked *