PostgreSQL
The world's most advanced open source database

..::Planet PostgreSQL::..

September 01, 2010

Selena Deckelmann

Explaining MVCC in Postgres: system defined columns

I’m playing around with some diagrams for explaining MVCC that I’ll be posting here over the next few days. Not sure if I’ll end up giving up on slides and just use a whiteboard for the talk. I made an illustrated shared buffers deck to go along with Greg Smith’s excellent talk on shared buffers a while back. This is the beginning of a talk that I hope will emulate that.

Here are my first few slides, showing the system-defined columns. The next few slides will describe optimizations PostgreSQL has for managing the side effects of our pessimistic rollback strategy, and reducing IO during vacuuming and index updates.

Related posts:

  1. Postgres mailing list traffic over time
  2. FSM, visibility map and new VACUUM awesomeness
  3. PgCon 2010 – PL/Parrot, Simulated Annealing, Exclusion Constraints, Postgres-XC

by selena at September 01, 2010 03:00 PM

August 31, 2010

Selena Deckelmann

Using logger with pg_standby

Piping logs to syslog is pretty useful for automating log rotation and forwarding lots of different logs to a central log server.

To that end, the command-line utility ‘logger’ is nice for piping output from utilities like pg_standby without having to add syslogging code to the utility itself. Another thing is that logger comes by default with modern packages of syslog.

Here’s an easy way to implement this:


restore_command = 'pg_standby -d -s 2 -t /pgdata/trigger /shared/wal_archive/ %f %p %r 2>&1 | logger -p local3.info -t pgstandby'

Related posts:

  1. Customizing the RPMs from pgrpms.org
  2. Snow Leopard and PostgreSQL: installation help links
  3. Greg’s THREE talks at PostgreSQL Conference East

by selena at August 31, 2010 11:49 PM

Andreas Scherbaum

PostgreSQL @ FrOSCamp 2010 in Zurich

Author
Andreas 'ads' Scherbaum

There's a new FOSS conference taking place at the ETH in Zurich, called FrOSCamp. It's a two day conference but unlike many others it's on friday and saturday, september 17th/18th. FUDCon, the Fedora conference, is running parallel to FrOSCamp. Same time, same location.

The PostgreSQL guys from Switzerland and myself submitted a project booth and several talks and workshops. Four talks and workshops as well as the booth got accepted. You can attend:

  1. PostGIS/PostgreSQL for interactive maps (talk, speaker: Stefan Keller)
  2. Replication with PostgreSQL 9.0 (workshop, speaker: Andreas Scherbaum)
  3. PostgreSQL administration and optimization (talk, speaker: Stephan Wagner)
  4. What's new in PostgreSQL 9.0? (talk, speaker: Andreas Scherbaum)

Here is the full schedule.

Come and visit us in Zurich!


by nospam@example.com (Andreas 'ads' Scherbaum) at August 31, 2010 06:54 AM

Selena Deckelmann

Variable substitution with psql

Updated: Thanks @johto for s/:bar/:foo/. :)

A coworker asked about variable substitution with psql using \set, and so I looked into it a bit further.

You definitely can do things like this:

16:55 sdeckelmann@[local]:5432|postgres=> \set test 'select * from :foo limit 10;'
16:56 sdeckelmann@[local]:5432|postgres=> \set foo 'test'
16:56 sdeckelmann@[local]:5432|postgres=> :test
myint
-------
1
2
3
4
5
6
7
8
9
10
(10 rows)

But, what about something like this:


=> \set test 'select * from :var limit 10;'
=> :test mytable

Unfortunately, this isn’t supported.

The best you could do is something pathological like:

=> \set s 'select * from '
=> \set pr ' limit 10;'
=> :s mytable :pr
=> :s test :pr
myint
-------
1
2
3
4
5
6
7
8
9
10
(10 rows)

Related posts:

  1. Quick start on Hot Standby
  2. psql and file, CSV exports
  3. an example for upcoming very brief normalization talk (part 1)

by selena at August 31, 2010 01:53 AM

August 30, 2010

Dimitri Fontaine

Happy Numbers

After discovering the excellent Gwene service, which allows you to subscribe to newsgroups to read RSS content (blogs, planets, commits, etc), I came to read this nice article about Happy Numbers. That's a little problem that fits well an interview style question, so I first solved it yesterday evening in Emacs Lisp as that's the language I use the most those days.

A happy number is defined by the following process. Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers, while those that do not end in 1 are unhappy numbers (or sad numbers).

Now, what about implementing the same in pure SQL, for more fun? Now that's interesting! After all, we didn't get WITH RECURSIVE for tree traversal only, did we?

Unfortunately, we need a little helper function first, if only to ease the reading of the recursive query. I didn't try to inline it, but here it goes:

create or replace function digits(x bigint)
  returns setof int
  language sql
as $$
  select substring($1::text from i for 1)::int
    from generate_series(1, length($1::text)) as t(i)
$$;

That was easy: it will output one row per digit of the input number — and rather than resorting to powers of ten and divisions and remainders, we do use plain old text representation and substring. Now, to the real problem. If you're read what is an happy number and already did read the fine manual about Recursive Query Evaluation, it should be quite easy to read the following:

with recursive happy(n, seen) as (
    select 7::bigint, '{}'::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @> array[sum(d*d)]
)
  select * from happy;
  n  |       seen
-----+------------------
   7 | {}
  49 | {49}
  97 | {49,97}
 130 | {49,97,130}
  10 | {49,97,130,10}
   1 | {49,97,130,10,1}
(6 rows)

Time: 1.238 ms

That shows how it works for some happy number, and it's easy to test for a non-happy one, like for example 17. The query won't cycle thanks to the seen array and the having filter, so the only difference between an happy and a sad number will be that in the former case the last line output by the recursive query will have n = 1. Let's expand this knowledge into a proper function (because we want to be able to have the number we test for happiness as an argument):

create or replace function happy(x bigint)
  returns boolean
  language sql
as $$
with recursive happy(n, seen) as (
    select $1, '{}'::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @> array[sum(d*d)]
)
  select n = 1 as happy
    from happy
order by array_length(seen, 1) desc nulls last
   limit 1
$$;

We need the desc nulls last trick in the order by because the array_length() of any dimension of an empty array is NULL, and we certainly don't want to return all and any number as unhappy on the grounds that the query result contains a line input, {}. Let's now play the same tricks as in the puzzle article:

=# select array_agg(x) as happy from generate_series(1, 50) as t(x) where happy(x);
              happy
----------------------------------
 {1,7,10,13,19,23,28,31,32,44,49}
(1 row)

Time: 24.527 ms

=# explain analyze select x from generate_series(1, 10000) as t(x) where happy(x);
                      QUERY PLAN
----------------------------------------------------------------------------------------
 Function Scan on generate_series t  (cost=0.00..265.00 rows=333 width=4)
                          (actual time=2.938..3651.019 rows=1442 loops=1)
   Filter: happy((x)::bigint)
 Total runtime: 3651.534 ms
(3 rows)

Time: 3652.178 ms

(Yes, I tricked the EXPLAIN ANALYZE output so that it fits on the page width here). For what it's worth, finding the first 10000 happy numbers in Emacs Lisp on the same laptop takes 2830 ms, also running a recursive version of the code.

by Dimitri Fontaine at August 30, 2010 08:00 AM

Pavel Stehule

performance issue of SQL functions

I though so sql functions is always faster than plpgsql function. But it's not true. Only when sql function is inlined, then function is faster, else plpgsql can be faster. Now I know, so sql functions where parameter is twice and more used are not inlined - because it means double parameter evaluation.you can see:

CREATE OR REPLACE FUNCTION is_empty(text)
RETURNS bool AS $$
SELECT $1 IS NULL OR $1 = ''
$$ LANGUAGE sql;
This function returns true when parameter is empty string or is null. The first parameter is used twice and then the body of this function isn't inlined.

postgres=# EXPLAIN VERBOSE SELECT count(is_empty(CASE WHEN random() 0.5 THEN NULL ELSE 'x' END)) FROM generate_series(1,100000);
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Aggregate (cost=12.50..12.77 rows=1 width=0)
Output: count(is_empty(CASE WHEN (random() 0.5::double precision) THEN NULL::text ELSE 'x'::text END))
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=0)
Output: generate_series
Function Call: generate_series(1, 100000)
(5 rows)
The execution time is 458ms. When I modify the body for one time paremeter using:

CREATE OR REPLACE FUNCTION is_empty(text)
RETURNS bool AS $$
SELECT COALESCE($1,'') = ''
$$ LANGUAGE sql;
Then function is inlined:

postgres=# EXPLAIN VERBOSE SELECT count(is_empty(CASE WHEN random() 0.5 THEN NULL ELSE 'x' END)) FROM generate_series(1,100000);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=12.50..12.52 rows=1 width=0)
Output: count((COALESCE(CASE WHEN (random() 0.5::double precision) THEN NULL::text ELSE 'x'::text END, ''::text) = ''::text))
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=0)
Output: generate_series
Function Call: generate_series(1, 100000)
(5 rows)
and execution time is only 68ms (5 x faster than not inlined function).Attention: this note is valid for some more complex parameters - like volatile functions. The life isn't simple - for basic parameters (like columns or simple expression), the most simple SQL function (in my article function with OR operator and twice used parameter) can be better - wirking better with planner.

by Pavel Stěhule (noreply@blogger.com) at August 30, 2010 12:38 AM

August 29, 2010

PostgreSQL Weekly News

PostgreSQL Weekly News August 29th 2010

PostgreSQL 9.0 RC1 out soon. The more you test, the sooner 9.0.0 comes out! http://www.postgresql.org/developer/beta/

August 29, 2010 08:00 AM

August 28, 2010

Selena Deckelmann

Online aggregation paper from 1997 and PSU’s database reading group

A couple weeks ago, Mark Wong and I took a field trip over to the Database Reading Group at Portland State University. It’s a group of students and professors that meet weekly throughout the school year to go over research papers. The papers are picked by the participants, and vary in topic from obscure to very practical.

This week’s paper reading was led by Professor Len Shapiro, and titled “Online Aggregation“. The paper is considered a foundational paper about SQL aggregates (like COUNT() or AVERAGE), and was published in 1997 by researchers from UC Berkeley and IBM. It’s also precursor to research into query parallelization and streaming databases. It was also awarded the SIGMOD “Test of Time” award in 2007, and is cited by over 170 other papers in the ACM archive.

The basic idea behind the paper centered around how to improve user experience in reporting results of aggregate queries – asking questions about how to solve three key problems when solving aggregates: blocking, fairness and control (from a user’s perspective). Roughly: Blocking is what happens when some part of the system waits and doesn’t return results to the user as a result of the waiting. Fairness concerns whether certain types of operations prevent certain groups of data from being processed (the example given had to do with GROUP BY and groups being processed one at a time). Control concerns whether or not a user can exert control over the speed of computation applied to a group (example given being a lever that “speeds up” processing of a set).

One insight from the paper is how online aggregates should be treated differently than traditional query processing – which might favor expensive plans involving sorts so that the output is ordered. When you’re dealing with online aggregates, you prefer unordered, or ideally random order, because your intermediate results will be more representative of the ultimate result. I guess that’s probably obvious once you think about it, but the paper provided some concrete examples.

Another interesting thought experiment involving the planner is how you pick plans that favor non-blocking, fairness and user control. Each of those properties is not narrowly defined, and changes based on individual user expectation. Professor Kristen Tufte mentioned that she’d be interested in how the ideas presented in this paper would be applied today, and Professor David Meier brought up that we might most be interested in applications involving managing Hadoop.

Prof Meier also brought up an interesting paper involving alternating nested loop joins during a discussion about optimizing JOIN algorithms for online aggregates. Another cool thing about the paper is that it involved modifications to Postgres! Granted, it was Postgres95, which doesn’t resemble the modern PostgreSQL 9.0 very much. But it was nice to revisit research that used Postgres that’s still relevant today.

Related posts:

  1. User Group Idea: The After-party
  2. User Group Idea: Patch Review Party
  3. Women in Open Source: a focus group in March

by selena at August 28, 2010 11:03 PM

August 26, 2010

Dimitri Fontaine

Playing with bit strings

The idea of the day ain't directly from me, I'm just helping with a very thin subpart of the problem. The problem, I can't say much about, let's just assume you want to reduce the storage of MD5 in your database, so you want to abuse bit strings. A solution to use them works fine, but the datatype is still missing some facilities, for example going from and to hexadecimal representation in text.

create or replace function hex_to_varbit(h text)
 returns varbit
 language sql
as $$
  select ('X' || $1)::varbit;
$$;

create or replace function varbit_to_hex(b varbit)
 returns text
 language sql
as $$
  select array_to_string(array_agg(to_hex((b << (32*o))::bit(32)::bigint)), '')
    from (select b, generate_series(0, n-1) as o
            from (select $1, octet_length($1)/4) as t(b, n)) as x
$$;

To understand the magic in the second function, let's walk through the tests one could do when wanting to grasp how things work in the bitstring world (using also some reading of the fine documentation, too).

=# select ('101011001011100110010110'::varbit << 0)::bit(8);
   bit
----------
 10101100
(1 row)

=# select ('101011001011100110010110'::varbit << 8)::bit(8);
   bit
----------
 10111001
(1 row)

=# select ('101011001011100110010110'::varbit << 16)::bit(8);
   bit
----------
 10010110
(1 row)

=# select * from *TEMP VERSION OF THE FUNCTION FOR TESTING*
 o |                b                 |    x
---+----------------------------------+----------
 0 | 10101100101111010001100011011011 | acbd18db
 1 | 01001100110000101111100001011100 | 4cc2f85c
 2 | 11101101111011110110010101001111 | edef654f
 3 | 11001100110001001010010011011000 | ccc4a4d8
(4 rows)

What do we get from that, will you ask? Let's see a little example:

=# select hex_to_varbit(md5('foo'));
                                                          hex_to_varbit
----------------------------------------------------------------------------------------------------------------------------------
 10101100101111010001100011011011010011001100001011111000010111001110110111101111011001010100111111001100110001001010010011011000
(1 row)

=# select md5('foo'), varbit_to_hex(hex_to_varbit(md5('foo')));
               md5                |          varbit_to_hex
----------------------------------+----------------------------------
 acbd18db4cc2f85cedef654fccc4a4d8 | acbd18db4cc2f85cedef654fccc4a4d8
(1 row)

Storing varbits rather than the text form of the MD5 allows us to go from 6510 MB down to 4976 MB on a sample table containing 100 millions rows. We're targeting more that that, so that's a great win down here!

In case you wonder, querying the main index on varbit rather than the one on text for a single result row, the cost of doing the conversion with varbit_to_hex seems to be around 28 µs. We can afford it.

Hope this helps!

by Dimitri Fontaine at August 26, 2010 02:45 PM

Leo Hsu and Regina Obe

Explain Plans PostgreSQL 9.0 - Part 2: JSON and JQuery Plan Viewer

In part 1 of this series on PostgreSQL 9.0 planner outputs, we demonstrated how to render explain plans in YAML, JSON, and XML using the new explain features in PostgreSQL 9.0. In this second part, we'll demonstrate how to build a user interface that allows you input a JSON formatted explain plan and have it render into a printable, navigateable display using JQuery, javascript and a little bit of HTML coding. In part 3 we'll do something similar using XML and XSLT programming.

For those who aren't familiar with JQuery, it is an MIT licensed javascript library that is fairly light weight and allows you to inspect and change html elements with fairly intuitive syntax, has some nice ajax methods and tools for converting xml/json to native objects that can be manipulated. You can check it out at JQUERY.

We are not experts in JQuery, but from what we have used of it, we really like it and the fact the base package is MIT licensed, fairly light weight and lots of plugins available for it are real pluses.

The most difficult thing I think most people find about reading explain plans is that they are upside down; it starts with a conclusion and backtracks how to arrive at it. Humans by nature think about planning steps from start to finish. In order to make an explain plan understandable to mere mortals, we generally display them upside down or having the child-nodes shown left most. We shall follow that approach.


Continue reading "Explain Plans PostgreSQL 9.0 - Part 2: JSON and JQuery Plan Viewer"

by nospam@example.com (Leo Hsu and Regina Obe) at August 26, 2010 08:13 AM

August 25, 2010

Joshua D. Drake

PgWest: 2010 Call for Papers (2nd call)

Yes, it is the second call. That means some of you haven't submitted after the first call. Of course, I haven't submitted mine either; so it is time for everyone to get on it. West is just around the corner and from all observations this West stands to be the largest PostgreSQL Conference, ever. (O.k. we might not over take Brazil).

Here is the announcement for everyone to review, enjoy and click on the CFP link:

Following on the smashing success of PostgreSQL Conference East, PostgreSQL Conference West, The PostgreSQL Conference for Decision Makers, End Users and Developers, is being held at the Sir Francis Drake Hotel in San Francisco from November 2nd through 4th 2010. Please join us in making this the largest PostgreSQL Conference to date!

  • Main conference site
  • Call for Papers

    Thank you to our sponsors: Founding: Command Prompt Diamond: EnterpriseDB

    Time line:

    July 14th: Talk submission opens Sept 5th: Talk submission closes Sept 10th: Speaker notification
    This year we will be continuing our trend of covering the entire PostgreSQL ecosystem. We would like to see talks and tutorials on the following topics:

          * General PostgreSQL: 
                  * Administration 
                  * Performance 
                  * High Availability 
                  * Migration 
                  * GIS 
                  * Integration 
                  * Solutions and White Papers 
          * The Stack: 
                  * Python/Django/Pylons/TurboGears/Custom 
                  * Perl5/Catalyst/Bricolage 
                  * Ruby/Rails 
                  * Java (PLJava would be great)/Groovy/Grails 
                  * Operating System optimization
                    (Linux/FBSD/Solaris/Windows) 
                  * Solutions and White Papers 
    

  • August 25, 2010 07:14 PM