::Planet PostgreSQL::

July 20, 2008

Peter Eisentraut

New Job at Sun

On July 22nd, 2008, I will be joining Sun Microsystems as PostgreSQL software engineer. Sun has been a valuable contributor to the PostgreSQL project for a
number of years now, and I am looking forward to joining them in this effort.
I am glad that I will be able to continue my personal role in the PostgreSQL
project with the support of the great resources that Sun provides.

I would like to take this opportunity as well to thank my former colleagues at
credativ GmbH for their support of the PostgreSQL project and my own career. I wish them much success in their continued development.

So, I expect that I will have more time to contribute to PostgreSQL
development from now on, and both Sun and I have a sizeable backlog of
projects and ideas that we would like to realize. Time to get started!

July 20, 2008 09:30 PM

Jignesh K. Shah

Peter Eisentraut joins Sun Microsystems


Just read the news on Sun's On The Record blog entry that Peter Eisentraut is joining Sun Microsystems. [Read More]

July 20, 2008 09:21 PM

Josh Berkus

Sun Rise, Sun Set

After two years as Sun's PostgreSQL Lead, I'm leaving to pursue other opportunities. This does not mean that Sun is dropping PostgreSQL; far from it. Instead my fellow core team member Peter Eisentraut is taking over my role leading the PostgreSQL team at Sun.

July 20, 2008 06:22 PM

July 19, 2008

Leo Hsu and Regina Obe

More Database Comparisons

As many people who know us know we sit on several camps especially when it comes to databases. The camps we sit on are growing rather than shrinking. While we do have our favorites, we understand that peoples needs and comfort levels are different from ours and we try to take that into consideration when making recommendations to people. The only thing that is generally true about the clientele we consult for is that they fit one of the following features:

It has come up as a topic of discussion, now that SQL Server 2008 is coming out soon and with its new fangled geodetic spatial support, how does this change things?

The short answer is - not much except to increase awareness of spatial databases and to give us more business. As part of our due diligence work we have put together a comparison of the 3 databases spatial functionality - Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6 to compliment our Cross Compare of SQL Server, MySQL, and PostgreSQL


Continue reading "More Database Comparisons"

July 19, 2008 11:00 PM

Josh Berkus

State Of Lightning Talks 2008 "Final" Program

For three years now, OSCON has provided a 90-minute session for open source projects to update OSCON attendees on "what's going on this year." If you're new to open source, it's a great way to get an idea of the scope of open source software out there. If you're an OSS geek, it's a chance to see what's new with your favorite projects.

Most importantly, with each presentation strictly limited to 5 minutes (using the "Hourglass of Death") and some "zinger" presentations, SOLT is

July 19, 2008 03:21 AM

July 17, 2008

Josh Berkus

PostgreSQL at OSCON

As always, we're going to have a lot of PostgreSQL activity around OSCON. Here's my list of things to see and do ...

July 17, 2008 08:06 PM

July 16, 2008

Greg Sabino Mullane

check_postgres 2.0.1 is released

I just released version 2.0.1 of the check_postgres.pl program. While in the past this script was strictly for use as a Nagios plugin, it now features two additional output modes. While Nagios is still the default, you can now ask for MRTG and "simple" output by using the --output= option. The MRTG option outputs four lines of information, with the first always being a simple number. The "simple" output mode simply outputs the first line of the MRTG output - in other words, a single number, nothing more and nothing less. This is very useful for times when you just want to check the state of something without worrying about setting warning limits, querying exit codes, etc. See the the documentation for complete details.

This version also greatly speeds up the "check_bloat" action, which gives a rough estimate of how bloated the tables and indexes in your database are. Rather than use a large regex to parse the output, it now simply does a split. Kudos goes to the new version of the Devel::NYTProf module, which pointed out that particular bottleneck quite easily (and colorfully!).

New ideas and patches always welcome, just mail the list or play around with the git repository.

July 16, 2008 10:42 PM

July 14, 2008

Leo Hsu and Regina Obe

SQL Idiom Design Patterns

Programming Design Patterns define recommended approaches of solving common application problems. Within design patterns is a subset of design patterns called Idioms. Idioms you can think of as a strategy for expressing recurring constructs or if you will sub-problems and often take advantage of the special features of a language. They tend to be specific to a programming language and can not be reused in other languages they were not specifically designed for. To demonstrate the differences lets compare two design patterns we commonly use.


Continue reading "SQL Idiom Design Patterns"

July 14, 2008 12:19 PM

David Fetter

PostgreSQL Weekly News - July 13 2008

PostgreSQL Weekly News - July 13 2008
PostgreSQL Product News
Open Technology Group has created a high-availability training course.
http://www.otg-nc.com/training-courses/coursedetail.php?courseid=65&cat_id=8
Continue reading "PostgreSQL Weekly News - July 13 2008"

July 14, 2008 02:34 AM

Robert Hodges

PostgreSQL Gets Religion About Replication

The PostgreSQL community is getting really serious about replication. On Thursday May 29th, Tom Lane issued a manifesto concerning database replication on behalf of the PostgreSQL core team to the pgsql-hackers mailing list. Tom's post basically said that lack of easy-to-use, built-in replication is a significant obstacle to wider adoption of PostgreSQL and proposed a technical solution based on log shipping, which is already a well-developed and useful feature.

What was the reaction? The post generated close to 140 responses within the next two days, with a large percentage of the community weighing in. It's one of the most significant announcements on the list in recent history. There is pent up demand for this feature and within a few hours people were already deep into the details of the implementation.

The basic idea comes from an excellent presentation by Takahiro Itagaki and Masao Fujii of NTT at PGCon 2008 in Ottawa. They have developed a system that replicates database log records synchronously to a standby database. The standby can recover quickly and without data loss, which makes it a good availability solution. The core team manifesto proposes to integrate this into the PostgreSQL core and add the ability to open the standby for reads.

So, is this the end of the story on replication? I don't think so. There's no question that synchronous log shipping with reads would be a great feature. Basic availability is the first problem users run into when setting up production systems and this feature looks considerably better than alternatives for other databases like MySQL. It will help if NTT donates their code to the community, but still the whole effort will take considerable time. Adding the ability to open a standby for reads is at least a version out (read: up to 2 years).

More importantly, log shipping is most useful for availability. It does not help you replicate across database versions (nice for upgrades), between different databases, from a master to large numbers of slaves, or bi-directionally between databases. Finally, it's a less than ideal solution for clustering data between sites, something that is rapidly becoming one of the most important overall uses of replication. For these and other cases you need logical replication, which turns log records into SQL statements and applies them using a client.

I'm therefore starting an effort to get logical replication hooks included as a parallel effort. If you are interested in this let me know. Meanwhile, stay tuned. Tom's message represents a real change of heart for the PostgreSQL community. Accepting the important of replication opens up the doors for a new round of innovation in scale-out based on PostgreSQL. It could not come at a better time.

July 14, 2008 12:03 AM

Myosotis Connector: a Fast SQL Proxy for MySQL and PostgreSQL

SQL proxies have been very much in the news lately, especially for open source databases. MySQL Proxy and PG-Pool are just two examples. Here is another proxy you should look at: Myosotis.

Myosotis is a 'native-client' to JDBC proxy for MySQL and PostgreSQL clients. We originally developed it to allow clients to attach to our Java-based middleware clusters without using a JDBC driver. Myosotis parses the native wire protocol request from the client, issues a corresponding JDBC call, and returns the results back to the client. As you can probably infer, it's written in Java. "Myosotis" incidentally is the scientific name for "Forget-Me-Not," a humble but strikingly beautiful flower.

Myosotis is still rather simple but it already has a couple of very interesting features. First, it works for both MySQL and PostgreSQL. That's a good start. Wire protocols are very time-consuming to implement. Another feature is that Myosotis is really fast. This deserves explanation and some proof.

As other people have discovered, proxying is very CPU-intensive. It also involves a lot of concurrency, since a proxy may have to manage hundreds or even thousands of connections. Java is already fast in single threads--after a few runs through method invocations, the JVM has compiled the bytecodes down to native machine code. In addition, Java uses multiple CPUs relatively efficiently. Myosotis uses a thread per connection. Java automatically schedules these on all CPUs and optimizes of memory access in multi-core environment.

We can show Myosotis throughput empirically using Bristlecone, an open source test framework we wrote to measure performance of database clusters. We test proxy throughput by issuing do-nothing queries as quickly as possible with varying numbers of threads. The following run compares Myosotis against a uni/cluster 2007.1 process (a much more complex commercial middleware clustering software) and MySQL Proxy 0.6.1 running without Lua scripts. The proxy test environment is a Dell SC 1425 with 4 cores running CentOS5 and MySQL 5.1.23.

The results are striking. Myosotis gets between 3000 and 3500 queries per second when 8 threads are simultaneously running queries. To demonstrate processor scaling, run htop when the Myosotis Connector is being tested. You see something like this--a nice distribution across 4 cores.
Myosotis is a very simple proxy now but it has the foundation to create something great. We have big plans for Myosotis--it's a key part of our Tungsten architecture for database scale-out, which we will be rolling out later in the summer. The next step is to add routing logic so that we can implement load balancing and failover. We'll be doing that over the next few months. Meanwhile, if you want to see how fast Java proxies for SQL can be, check us out at at http://myosotis.continuent.org.

p.s., If you want to repeat the test shown here on your own proxy, download Bristlecone and try it out. I used the ReadSimpleScenario test, which is specifically designed to check middleware latency.

July 14, 2008 12:02 AM

July 12, 2008

Hubert Lubaczewski

Suppressing CONTEXT lines in psql

CONTEXT messages are quite informative, but when you don’t need them, they can be real pain in the eyes: psql:trees.sql:68: NOTICE: Adding tree structures to table public.testit. CONTEXT: SQL function "add_tree_structure" statement 1 psql:trees.sql:68: NOTICE: Tree table will be public.testit_tree. CONTEXT: SQL function "add_tree_structure" statement 1 psql:trees.sql:68: NOTICE: Primary key in source table is: id CONTEXT: [...]

July 12, 2008 12:40 PM

July 10, 2008

Selena Deckelmann

PDXPUG Day on July 20 - Register now!

pgday 2007

photo courtesy of Dan Browning

Registration for PDXPUG Day on July 20, 2008 is open! Please sign up and let us know what size t-shirt you’d like. We’re requesting a $20 donation (by cash or check) at the door. All proceeds to to Software in the Public Interest, a 501(c)3 organization that is used to fund PostgreSQL advocacy.

Registration for OSCON is not required to attend.

Registering also gets you in the door at the Gotham Tavern, our after-party location close to the convention center!

Our line-up of talks includes:

PostgreSQL Unit Testing with pgTAP - David Wheeler
Inside the PostgreSQL Shared Buffer Cache - Greg Smith
Muldis D - Portable Databases At Full Power - Darren Duncan
A Streaming Database Talk - Rafael J. Fernández-Moctezuma
Using GLORP to connect Squeak Smalltalk to PostgreSQL - RandalSchwartz
Fighting Disease with PostgreSQL Full Text Search and JRuby on Rails - Mike Herrick
All Your GIS Are Belong to You - Abe Gillespie
What’s PgUS - Joshua Drake

Sign up today!

July 10, 2008 02:01 PM

Andrew Dunstan

More fun with encoding

This time my client needed a way to fix invalid UTF8 encoding in place, without dumping cleaning and reloading.

I lightheartedly promised to do this in a plperl function. It proved to be somewhat trickier than I imagined. However, after a little research and a couple of false starts, I managed to come up with this plperlu function.
Continue reading "More fun with encoding"

July 10, 2008 01:07 PM

July 08, 2008

Josh Berkus

Happy birthday, Elephant!

The PostgreSQL project is 12 years old today.

July 08, 2008 06:16 PM

July 07, 2008

David Fetter

PostgreSQL Weekly News - July 06 2008

PostgreSQL Weekly News - July 06 2008
The July CommitFest has begun. Start reviewing :)
http://wiki.postgresql.org/wiki/CommitFest?:2008-07
Continue reading "PostgreSQL Weekly News - July 06 2008"

July 07, 2008 04:05 AM

July 06, 2008

Leo Hsu and Regina Obe

YUM 2: Almost Idiot's Guide to upgrade from PostgreSQL 8.3.1 to 8.3.3

In our April Issue An Almost Idiot's Guide to PostgreSQL YUM we covered using the new PostgreSQL Yum repository to install the PostgreSQL 8.3.1 release on Fedora, RedHat Enterprise, and CentOS. We also received numerous useful feedback from others on issues they ran into and how they overcame them. The blog comments are definitely worth a read.

Now that 8.3.3 has come out, many of you should be considering upgrading if you haven't already since there are a couple of bug fixes as outlined in http://www.postgresql.org/docs/8.3/static/release-8-3-2.html, http://www.postgresql.org/docs/8.3/static/release-8-3-3.html, and for those running 8.3.0 you will need to reindex your tables after as noted in http://www.postgresql.org/docs/8.3/static/release-8-3-1.html. If you are running version 8.3.1 and above then this is a fairly painless upgrade that just requires you to backup your data as a precautionary measure, but doesn't require a dump reload.


Continue reading "YUM 2: Almost Idiot's Guide to upgrade from PostgreSQL 8.3.1 to 8.3.3"

July 06, 2008 10:34 AM

July 04, 2008

Robert Treat

Index pruning techniques

Index pruning

Last week I ran across two different blog posts discussing removing duplicate and useless indexes from a database. Coincidentally, I have a nagging TODO item to clean up some indexes in the schema on one of the applications we have been developing. So, with inspiration in hand, here are some techniques to clean up your indexes.

Continue reading "Index pruning techniques"

July 04, 2008 03:48 PM

Francisco Figueiredo Jr

Finally post my PGCon2008 pictures!!

Hi all, I finally post my pictures from PGCon2008. You can see them on my picasa web album.

July 04, 2008 12:19 AM

July 03, 2008

Devrim Gündüz

Looking for a presentation regarding HOT?


In case you missed this, here is a very good presentation about HOT. Oscar goes to Pavan Deolasse. (I prepared a Turkish presentation regarding MVCC,Vacuum,Bgwriter and HOT, and you can also find it here, if you know Turkish ;-) )

...and yes, I'm enjoying a perfect holiday -- I have always made good decisions while swimming.

July 03, 2008 10:23 PM

Kenny Gorman

Python script showing PostgreSQL objects in linux memory: pg_osmem.py

I got some email and comments about the code I used for my post; Mapping Linux memory for PostgreSQL using fincore so I thought I would post the code I am using. Nothing too fancy here, it needs a config file and some more bits. I highly recommend someone do this in perl vs python and incorporate fincore as a module instead. I used psycopg2, other than that, all the other modules are stock with python 2.5.2. If someone wants to show me how to query the data directory PostgreSQL is currently using and incorporate that into the script vs the static string: mydir = “/data/mydb/base”, that would be great. In order to use this script, you must change the variables for fincore, and mydir below.

 
#!/home/postgres/python/bin/python
#
# script to find show memory usage of PG buffers in OS cache
# 2008 kcg
#
import os
import psycopg2
import commands
import re
import sys
from optparse import OptionParser
 
parser = OptionParser()
parser.add_option("-u","--username",dest="username",help="username for PostgreSQL")
parser.add_option("-m","--machine",dest="machine",help="machine to connect to.. aka: hostname")
parser.add_option("-d","--dbname",dest="dbname",help="database name to connect to")
parser.add_option("-p","--password",dest="password",help="password for PostgreSQL")
(options, args) = parser.parse_args()
 
osmem   ={}
# change these to match actual locations
fincore ="/home/kgorman/fincore.pl"
mydir   = "/data/mydb/base"
 
# get list of dbs on host, and return dictionary of db=oid sets
def lookup_dbs():
 dbs={}
 connectstr="host="+options.machine+" dbname="+options.dbname+" user="+options.username+" port=5432 password="+options.password
 handle=psycopg2.connect(connectstr)
 curs=handle.cursor()
 sql="select datname,oid from pg_database where datname = '"+options.dbname+"' and datname not like '%template%'"
 curs.execute(sql)
 for d in curs.fetchall():
   dbs[d[0]]=d[1]
 return dbs
 
# get object
def lookup_oid(oid,dbname):
 connectstr="host="+options.machine+" dbname="+dbname+" user="+options.username+" port=5432 password="+options.password
 handle=psycopg2.connect(connectstr)
 curs=handle.cursor()
 sql="select relname from pg_class where oid = "+oid
 curs.execute(sql)
 for d in curs.fetchall():
  return d[0]
 
dbs=lookup_dbs()
for v, i in dbs.iteritems():
  for ii in os.listdir(mydir+"/"+str(i)):
     p = re.compile('\d')
     if p.match(ii):
     	#print ii
        rel=lookup_oid(ii,v)
        fullpath=mydir+"/"+str(i)+"/"+ii
	cmd=fincore+" "+fullpath
	#print cmd
	pages=commands.getstatusoutput(cmd)
	#print pages
	n=pages[1].split(' ')
        size=n[1]
	if p.match(size):
           if rel:
	     osmem[v+":"+rel]=(int(size)*1024)
 
# sort and output
sdata=sorted(osmem.iteritems(), key=lambda (k,v): (v,k), reverse=True)
a=0
print "OS Cache Usage:"
while a < len(sdata):
  print sdata[a][0]+":"+str(sdata[a][1])
  a=a+1

July 03, 2008 01:21 AM

July 02, 2008

Jignesh K. Shah

wal_buffers and PostgreSQL 8.3.3

Recently Felix Malinkevich joined the ISV-E Open Source Team at Sun as an Intern working on PostgreSQL [Read More] 

July 02, 2008 09:11 PM

Robert Hodges

What's Your Favorite Database Replication Feature?

Replication is one of the most flexible technologies available for databases. We are implementing a new open-source, database-neutral replication product that works with MySQL, Oracle, and PostgreSQL. Naturally we've done a lot of thinking about the feature set. It's tough to pick any single feature as the most important, but one that really stands out is optional statement replication. Here's why.

Database replication products tend to replicate row changes and DDL. However, Mark Callaghan has a great example of why you want to replicate statements as well--it enables Maatkit distributed consistency checking to work. If you dissect the mk-table-checksum --replicate command you will see that it uses a nice trick. The SQL queries generate checksums into the master table and then replicate as statements rather than row updates out to slaves. That way the slaves recompute the checksum locally at the same point in the overall transaction history. Very elegant!

Replicated consistency checks are a wonderful feature for large systems that can't afford to stop in order to compare tables between servers. However, you cannot use it if your database cannot replicate statements. As Mark points out, not even all MySQL engines do this. The proposed replication additions for PostgreSQL won't support it either.

Optional statement replication is really the best kind of feature: it is useful on its own, but also enables features like consistency checking and other nice administrative tricks. We're going to put a "worm-hole" in our replication engine that allows applications to invoke statement replication at the SQL level. Can you guess how we are going to do it? If not, you'll have to wait until we release. :)

So what's your favorite database replication feature?

July 02, 2008 04:40 PM

Hubert Lubaczewski

Who/what is trashing db performance?

Every so often I need to find who exactly is damaging database performance. I mean - I have db connection which does something strange/wrong with the database, but what exactly is responsible? Switching to separate user for every program, script and developer would solve the problem, but it is not always an option. So, what should [...]

July 02, 2008 01:40 PM

Robert Hodges

MySQL, Sun, and the Future of Open Source Databases

So what's it like now that Sun now owns MySQL? The executive summary: a little weird. I was at the MySQL User Conference a couple of weeks ago and had a chance to talk with a lot of people in the community as well as many MySQL folks. Marten Mickos is now the head of database products at Sun. It's not very hard to figure out what Sun will do with MySQL products for the near future--pretty much what MySQL was doing already.

The real question for a lot of people is what will happen with databases like PostgreSQL and Derby. Sun has invested heavily in both of them, and PostgreSQL in particular is now quite fast. With the MySQL acquisition, Sun has an opportunity to run the table with multiple offerings that cover both enterprise applications as well as web and embedded. However, that would mean cutting down the MySQL roadmap to concentrate, for example, on scale-out rather than scale-up. It would also require thinking big to combine with other vendors in order to disrupt the market leader Oracle. Done right, there's a chance to upend the industry in a way that has not occurred since Microsoft muscled into databases in the early 1990s using code bought from Sybase.

Based on talks from people like Rich Green and Marten Mickos, it's hard to see this happening. Sun is taking a hands-off approach to MySQL *and* giving MySQL management control of overall database strategy. A disruptive change therefore seems unlikely. In fact, the more likely result is stagnation, now that MySQL no longer has to fight for its existence. The MySQL roadmap is still pretty diffuse and there has been little product movement since the 2007 User Conference. MySQL 5.1 is still not out the door. Falcon is likely to show up ready for production use around the time the Boeing Dreamliner rolls out. MySQL is still working on multiple storage engines (2 new ones plus NDB and MyISAM, to name a couple.) There's not even a glimmer of a date for cool new replication features like a pluggable replication interface. In short, not much evidence for radical changes of any kind.

Also, there must be the awful temptation to focus on vertical scaling so that MySQL can work on Sun hardware with large numbers of cores. I asked Marten Mickos specifically about the choice between scaling up and scaling out but didn't get a very clear answer. Personally I think for MySQL to concentrate very hard on vertical scaling would be a strategic error. The community that made MySQL great is into commodity hardware and scale-out in a big way. First rate support for highly scaled SMP architectures is going to be a long slog that will compromise delivery of many other features.

Given all of this it's hard not to see innovation, particularly in problems like scale-out, shifting away from MySQL to other databases as well as middleware. This would be a great time for the PostgreSQL community to get really serious about data replication. MySQL won't fade--it's already a great database. But there's likely to be a crowd of people in the MySQL community eying other solutions. It's going to be interesting to see what they come up with.

July 02, 2008 11:38 AM

Tungsten Scale-Out Stack Presentation from MySQL Conference

There have been a number of requests for copies of the slides to the Tungsten Scale-Out Stack talk I gave at the MySQL Conference in April. Here they are courtesy of the nice folks at O'Reilly who organized the conference.

Tungsten is our codename for a set of technologies to raise database performance and availability using scale-out. In the database world scale-out is a term of art that means spreading data across servers on multiple systems. With data in multiple places you are less subject to failures--when one copy crashes you just use the others. Similarly, if your application runs a lot of queries, you can spread them over different machines, which makes for faster and more stable response times.

So database scale-out sounds great (and is too), but getting it to work properly is harder than you would think. Along with practical issues like management, there are theoretical barriers. Let's say you are creating a product catalog service using database replicas on different hosts. Applications connect to any replica to get information. Your manager, a guy with pointed hair, tells you to make sure of the following:

1. The catalog service is always available.
2. The service keeps working even if you get a network partition between hosts.
3. The copies are always consistent (e.g., you can go to any copy and get the same data).

Here's an ugly surprise. It turns out your data service can only have two of the three properties at any given time, a result that was proven only recently and is now called the CAP Principle. If you want to be available and handle network partitions, you must accept that data will sometimes be inconsistent. Your manager is going to be very disappointed.

That's where we get back to Tungsten and the Scale-Out Stack. We realized a while back that you can't think in terms of a single product or even family of products to solve scale-out in a general way. It's better to design a flexible set of technologies with different strengths and weaknesses that users choose based on what's important to them. If you need to cluster over a WAN, use master/slave replication. If you don't want master failures, use synchronous replication in middleware.

Read the slides to learn more about the thinking. Database scale-out is a fascinating problem and we are looking forward to making it much easier to handle. Please stay tuned! I'll be writing more about this in the weeks and months to come.

July 02, 2008 11:37 AM

Webinar: The Coolest Scale-Out Projects on the Planet

My company Continuent sponsors Continuent.org, an open source site that contains some of the coolest scale-out projects around. You may have heard of Sequoia, which implements middleware clustering of any database that has a JDBC driver. However, Sequoia is really just the beginning.

We have several other projects that offer interesting scale-out technologies. Myosotis implements fast SQL proxying, Hedera provides wrappers for group communications, and Bristlecone has tools for performance testing of scale-out architectures. This summer we will add projects for database neutral master/slave replication as well as cluster management. In short, there's a lot to look at.

If you would like a closer look at Continuent.org, I'm doing a Webex webinar to talk about the overall technology stack and project roadmaps. It's scheduled for Thursday June 26th at 10am EDT. You can sign up here to see what's going on.

Don't worry if you miss the presentation--I'll post slides here and will be doing a series of blog entries on each of the projects in the coming weeks.

July 02, 2008 11:36 AM

Leo Hsu and Regina Obe

How to Inherit, Unherit and Merge Inherit

A lot of this information is nicely tucked away in the PostgreSQL docs in http://www.postgresql.org/docs/8.3/interactive/ddl-inherit.html, but since the docs are so huge and rich, one may tend to miss these things.

While there are numerous interesting use cases for the PostgreSQL inheritance structure, one of the key reasons people use it is for table partitioning strategies.

How do you make a stand-alone table a child of another table?

The first question that comes to mind is why would you ever need a table to adopt another table. There are 2 reasons that come to mind.

This fits into one of the categories of things that PostgreSQL lets you do that PgAdmin III doesn't have a graphical way to let you do it. If you try to inherit in PgAdmin III from a table that already exists, that option is just greyed out. So you have to resort to DDL SQL statements. Luckily its fairly trivial. Well this really only works for PostgreSQL 8.2+. I don't think PostgreSQL 8.1 and below supported INHERIT/NO INHERIT in the ALTER TABLE statement.


Continue reading "How to Inherit, Unherit and Merge Inherit"

July 02, 2008 07:37 AM

June 30, 2008

David Wheeler

CITEXT Patch Submitted to PostgreSQL Contrib

On Friday, I submitted a patch to add a locale-aware case-insensitive text type as a PostgreSQL contrib module. This has been among my top requests as a feature for PostgreSQL ever since I started using it. And as I started work on yet another application recently, I decided to look into what it would take to just make it happen myself. I'm hopeful that everyone will be able to benefit from this bit of yak shaving.

Read More »

June 30, 2008 07:49 PM

Bruce Momjian

Attending OSCON

My son Matthew, 15, is attending OSCON this year, but not to be with me, but rather to attend the conference tutorials and sessions --- imagine that.

June 30, 2008 04:00 PM

pgpool PG Pool Party

I am having a pool party and barbecue at my house on Saturday, August 9, for Postgres community members in the Philadelphia area and nearby states. Families are welcome. Food and drink will be provided, and, of course, swimming is encouraged. Please come anytime between 2pm and 7pm (directions).

Read More »

June 30, 2008 04:00 PM

David Fetter

PostgreSQL Weekly News - June 29 2008

PostgreSQL Weekly News - June 29 2008
July's commit-fest is starting soon.

Continue reading "PostgreSQL Weekly News - June 29 2008"

June 30, 2008 05:21 AM

June 29, 2008

Robert Treat

reconnoiter goes... alpha

I'm not sure if this is official, but I noticed that reconnoiter got it's own fully fledge project page on the labs site. What is reconnoiter? It's a statistical graphing package that can be used for monitoring and trending various pieces of information about your server architecture. Think cacti, except that it has a more component driven architecture, and that it uses an RDBMS as the data store, rather than rrd. This means you can not only do standard graphical layouts, but use full fledged SQL to query your collected data set in new and different ways (which can be real handy when you're trying to investigate anomalies common in todays web ecosystem. It's still pretty new, but worth a look if you need these types of services (and likely you do, even if you don't know it).

UPDATE: Theo has done a more comprehensive explanation of reconnoiter that is worth a look. It even includes some pretty pictures. :-)

June 29, 2008 02:21 AM

June 28, 2008

Devrim Gündüz

Time for a vacation


It has been almost 2 years since I spent time for a vacation -- After I got married about 2 years ago,we had a 1-week honeymoon -- Then, last year, my son was born, and we found no time for a holiday.

Everyone near me, except me ( ;-) ) said that I'm tired -- so I was forced for a 2-week holiday. It is hard to be away from my laptop, but still :-) See you two weeks later.

June 28, 2008 07:52 PM

June 25, 2008

Gevik Babakhani

Compiling Qt with native PostgreSQL drivers on Windows XP

I spent the last three hours looking and Googling to solve a very annoying problem in order to compile PostgreSQL drivers natively with Qt 4.4.0 framework. After a lot of frustration I discovered that one has to provide the path to libpq.lib and the PostgreSQL include dir in UNIX style or 8.3 short filenames to get the qmake.exe to read them correctly.

After running configure.exe, I had to modify .qmake.cache like to following to make it work:

QMAKE_QT_VERSION_OVERRIDE = 4
LIBS           += C:\PROGRA~1\POSTGR~1\8.3\lib\libpq.lib
LIBPATH        += C:\PROGRA~1\POSTGR~1\8.3\lib
INCLUDEPATH    += C:\PROGRA~1\POSTGR~1\8.3\include
OBJECTS_DIR     = tmp\obj\debug_shared
MOC_DIR         = tmp\moc\debug_shared
RCC_DIR         = tmp\rcc\debug_shared
sql-drivers    += odbc psql
sql-plugins    += sqlite
....
....
....
QMAKE_LIBDIR_QT = $$QT_BUILD_TREE\lib

June 25, 2008 11:05 PM

Robert Treat

Could OSDB replace the SQL Standard Committe?

Lukas' recent post on solving the prepared statement problem got wondering, if open source databases implement something, how likely is it to become a defacto standard? The best example of this is the LIMIT clause, which both MySQL and PostgreSQL support, and that anyone who has used either database will tell you is a far superior tool to what it is available in other systems (don't get me wrong, rownum() has it's place, but for what LIMIT does it's not as good). And while LIMIT is not standard SQL, can you imagine if the SQL Standard Committee were to adopt a syntax that conflicted with the Postgres/MySQL implementations?

Take this with a grain of salt, but really there are only 5 databases that matter anymore, Oracle, DB2, MS SQL, MySQL, and Postgres. Sure, there are many other successful database systems, and also some niche databases that are big players in their segment, but those 5 are the ones that set the tone. Even with Oracle far out pacing MySQL on dollars, the number of people who have joined the tech industry within the last 5 years who have hands on experience with MySQL high enough that it is not uncommon for people to look for their syntax in other database products (for better or worse). And remember, Postgres is even more popular than MySQL in some places; one should expect the impact of open source databases to rise in this area faster than just by watching market share.

So, what kind of problems could be solved with new syntax? Honestly I am not really sure. One piece of syntax I always though SQL got wrong was the UPDATE statement. I've always thought that the syntax should have been UPDATE table WHERE conditions SET column=data, if only because it would have saved us from countless unconstrained updates from junior DBA's. Other examples might be to take something like MySQL's ON DUPLICATE KEY syntax, clear up any sanity issues with implementation, and then implement it in Postgres. (Yes, this examples are questionable, since there are SQL Standard ways of doing this stuff, but the syntax of this command sure seems more user friendly).

Of course, this whole idea overlooks the notion that the SQL Standard committee has often been used as a tool between database vendors to make life tougher for their competitors, so it might be unlikely that the committee would ever adopt the syntax from the FLOSS community if it meant giving them some type of advantage, but this is also why FLOSS databases could give hope, since as more user oriented projects, if they worked together you should really see syntax that makes everyones lives easier. Imagine Postgres, MySQL, Firebird, SQLite, and Ingres all working together to create their own, user driven syntaxes.

June 25, 2008 10:47 PM

June 24, 2008

Bruce Momjian

Upcoming Presentations

I am speaking at two new venues; one is tonight in New York City, the other is in Massachusetts in July; see my events page for more details.

June 24, 2008 04:00 PM

June 23, 2008

Devrim Gündüz

Added Fedora 9 support to RPM repository


Sorry for the delay, folks -- I promise that it won't happen again. A few days ago, I added Fedora-9 supporto to PostgreSQL RPM Repository. 8.3 support is complete. I added 7.4, 8.0, 8.1 and 8.2 packages, too, but did not add other PostgreSQL related packages to those branches, yet.

June 23, 2008 04:14 PM

Leif B. Kristensen

Code prettification

Inevitably, as you’re learning a new skill, such as a programming language, you may want to revisit your old work and see if you can do it better. I had this pair of functions to fetch the previous and the next “page” of a source collection, based on the sort order of the source. In plain text, in order to get to the previous “page” I want the source with the maximum sort order smaller than the present one, and with the same parent id. Here is my brute-force approach from a couple of years ago:

CREATE OR REPLACE FUNCTION get_prev_page(INTEGER) RETURNS INTEGER AS $$
DECLARE
    self_page INTEGER;
    prev_page INTEGER;
    prev_src INTEGER;
    par_id INTEGER;
BEGIN
    SELECT parent_id FROM sources INTO par_id WHERE source_id = $1;
    SELECT sort_order FROM sources INTO self_page
        WHERE source_id = $1;
    SELECT MAX(sort_order) FROM sources INTO prev_page
        WHERE parent_id = par_id AND sort_order  self_page;
    SELECT source_id FROM sources INTO prev_src
        WHERE parent_id = par_id AND sort_order = prev_page;
    RETURN COALESCE(prev_src,0);
END;
$$ LANGUAGE plpgsql STABLE;

Ugly, ugly, ugly. I remember what was the main stumbling block here: You can’t use an aggregate function such as MAX() in a WHERE clause. The thing is that you don’t need all those assignments. A little code folding, replacing variables with sub-selects, takes you a long way:

CREATE OR REPLACE FUNCTION get_prev_page(INTEGER) RETURNS INTEGER AS $$
DECLARE
    pp INTEGER;
BEGIN
    SELECT source_id INTO pp FROM sources
        WHERE parent_id = (SELECT parent_id FROM sources WHERE source_id = $1)
        AND sort_order  (SELECT sort_order FROM sources WHERE source_id = $1)
        ORDER BY sort_order DESC LIMIT 1;
    RETURN COALESCE(pp, 0);
END
$$ LANGUAGE plpgsql STABLE;

The “ORDER BY sort_order DESC LIMIT 1″ is a great idiom whenever you need to use an extreme value as part of a WHERE clause.

Even if the two versions of the code above are functionally equivalent, and the PostgreSQL planner probably will rewrite the query to something like version 1, most programmers would prefer version 2. Why? I think there’s a lot to the concept that “coding is poetry”. Version 2 is more esthetically pleasing, because it conveys its inner meaning in a much more succinct way, as in the Merriam-Webster definition of the word succinct: “marked by compact precise expression without wasted words”. At least to me it does. And that’s probably some of the essence of poetry.

June 23, 2008 11:51 AM

David Fetter

PostgreSQL Weekly News - June 22 2008

PostgreSQL Weekly News - June 22 2008
New Survey: What TODO would you most like?
http://www.postgresql.org/community/
Continue reading "PostgreSQL Weekly News - June 22 2008"

June 23, 2008 02:23 AM

Ow Mun Heng

Automatic Raid Array Rebuilding

Hi guys, long time no post. Last post was at March and it's now already June.

Been busy as usual, however, not been dabbling as much as I "should" as I've been busy with other NON-FOSS related stuffs. (psst: I'm now heavily into photography. Went to shoot some Japan GT queens!! Kawaaiii)

Anyway, since this is a (nearly) purely an FOSS based blog, I'm gonna talk about my automatic Raid Rebuilding script.

You see, what happens is this, my postgresql box, (celeron 2x500GB in Raid 1) has a tendency to keep dieing once in a while for X reasons. (I have till now, been unable to locate the reason why it's dieing so often) I've tried to the write-all, read-all using dd but thus far, has not seen errors being thrown out. So, it's been a manual instance of...

go to work. see the email : Your raid has Died!
log onto the box, do the rebuild.

After a while, this just becomes tiring and I decided to fsck it and make it automatic.

Here's the script

#!/bin/bash

FAIL_DRV=`mdadm --detail /dev/md0 | grep faulty | awk '{print $6}'`

if [ -n "$FAIL_DRV" ]
then
  echo "Detected degraded array : $FAIL_DRV"
  echo "Starting automated array rebuild process"
  mdadm /dev/md0 --fail $FAIL_DRV --remove $FAIL_DRV --add $FAIL_DRV
else
  echo "Nothing to do"
fi


Simple eh..

So, now I don't have to come to work to see it all wonky because it'll automatically rebuild itself.

Some of you may ask, how come I don't just replace the drive? Because I can't find any replacement drive which is a PATA connection and at 500GB capacity! The largest I can find are 160GB.

Bummer

June 23, 2008 01:32 AM

June 22, 2008

Robert Treat

Dealing with out of disk issues for the abusive dba

When you work at a technology consulting firm, you're going to eventually be involved in fixing someone's broken database. Disaster recovery is one of those things that get's easier with experience, but of course if you're doing things correctly, those chances don't come up that often. ("Luckily" that state of commodity hardware still give's us some opportunities...)

One way to gain experience in this area is to abuse the hell out of your development systems. That's not to mean that you should try and break things on purpose, but occasionally mucking around with the underside of things can sometimes be an interesting experience. Of course, sometimes abusing your database can cause you trouble...


Continue reading "Dealing with out of disk issues for the abusive dba"

June 22, 2008 11:03 AM