PostgreSQL version 9.0 release date prediction
So when will PostgreSQL version 9.0 come out? I decided to “run the numbers” and take a look at how the Postgres project has done historically. Here’s a quick graph showing the approximate number of days each major release since version 6.0 took:
Some interesting things can be seen here: there is a rough correlation between the complexity of a new release and the time it takes, major releases take longer, and the trend is gradually towards more days per release. Overall the project is doing great, releasing on average every 288 days since version 6. If we only look at version 7 and onwards, the releases are on average 367 days apart. If we look at just version 7, the average is 324 days; if we look at just version 8, it is 410 days. Since the last major version came out on July 1, 2009, these numbers predict 9.0 will be released on July 3, 2010, based on the combined version 7 and 8 average, or on August 15, 2010, based on the version 8 average alone. However, this upcoming version has two very major features, streaming replication (SR) and hot standby (HS). How those will affect the release schedule remains to be seen, but I suspect the 9.0 to 9.1 window will be short indeed. …
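As a quick sanity check of those projected dates, you can let Postgres do the date arithmetic itself. This is just a sketch, and assumes psql can connect to some running database; it simply adds the averages above to the 8.4 release date:

psql -c "SELECT date '2009-07-01' + 367"   # 2010-07-03 (version 7 and 8 average)
psql -c "SELECT date '2009-07-01' + 410"   # 2010-08-15 (version 8 average only)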
community database open-source postgres
LCA2010: Postgres represent!
I had the pleasure of attending and presenting at LinuxConf.AU this year in Wellington, NZ. LinuxConf.AU is an institution whose friendliness and focus on the practical business of creating and sustaining open source projects were truly inspirational.
My talk this year was “A Survey of Open Source Databases”, for which I actually created a survey and asked over 35 open source database projects to respond. I have received about 15 responses so far, and also did my own research on the more than 50 projects I identified. I created a placeholder site for my research at ossdbsurvey.org. I’m hoping to revise the survey (make it shorter!!) and get more projects to provide information.
Ultimately, I’d like the site to be a central location for finding information on and comparing the different projects. Performance of each is a huge issue, and there are a lot of individuals constructing good (and bad) systems for comparing them. I don’t think I want to dive into that pool yet, but I would like to start collecting the work others have done in a central place. Right now it is far too difficult to find all of this information.
Part of the talk was also a foray into the dangerous world of classification. …
postgres
Automatic migration from Slony to Bucardo
About a month ago, Bucardo added an interesting set of features in the form of a new script called slony_migrator.pl. In this post I’ll describe slony_migrator.pl and its three major functions.
The Setup
For these examples, I’m using the pagila sample database along with a set of scripts I wrote and made available here. These scripts build two different Slony clusters. The first is a simple one, replicating the pagila database from a database called “pagila1” on one host to a database called “pagila2” on another host. The second is more complex: its single master node replicates the pagila database to two slave nodes, one of which replicates it again to a fourth slave using Slony’s FORWARD option, as described here. I implemented this setup on two FreeBSD virtual machines, known as myfreebsd and myfreebsd2. The reset-simple.sh and reset-complex.sh scripts in the script package I’ve linked to will build all the necessary databases from one pagila database and do all the Slony configuration.
Slony Synopsis
The slony_migrator.pl script has three possible actions, the first of which is to connect to a running Slony cluster and print a synopsis of the Slony setup it …
postgres bucardo replication
PostgreSQL tip: using pg_dump to extract a single function
A common task that comes up in PostgreSQL is the need to dump or edit a specific function. While ideally you’d be using DDL files and version control (hello, git!) to manage your schema, you don’t always have the luxury of working in such a controlled environment. Recent versions of psql have the \ef command to edit a function from within your favorite editor, but this is available only from version 8.4 onward.
An alternate approach is to use the following invocation:
pg_dump -Fc -s | pg_restore -P 'funcname(args)'
The -s flag is the short form of --schema-only; i.e., we don’t care about wasting time/space with the data. The -P flag tells pg_restore to extract only the function with the given signature.
As always, there are some caveats: the function name must be spelled out explicitly, using the full type names as they occur in the dump’s custom format (e.g., you must use ‘foo_func(integer)’ instead of ‘foo_func(int)’). You can always see a list of all of the available functions by using the command:
pg_dump -Fc -s | pg_restore -l | grep FUNCTION
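Putting it together, a hypothetical end-to-end invocation (the database name “mydb” and the function signature here are purely illustrative) that saves the function definition to a file might look like:

pg_dump -Fc -s mydb | pg_restore -P 'foo_func(integer)' > foo_func.sql

Because pg_restore writes the CREATE FUNCTION statement to standard output when no target database is given, you can redirect it to a file, edit it, and apply the result with psql.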
postgres
Postgres: Hello Git, goodbye CVS
It looks like 2010 might be the year that Postgres officially makes the jump to Git. Currently, the project uses CVS, with a script that moves things to the now-canonical Postgres Git repo at git.postgresql.org. This script has been causing problems, and continues to do so, as CVS is not atomic. Once the project flips over, CVS will still be available, but CVS will be the slave and Git the master, to put things in database terms. The conversion from Git to CVS is trivial compared to the other way around, so there is no reason Postgres cannot continue to offer CVS access to the code for those unwilling or unable to use Git.
On that note, I’m happy to see that the number of developers and committers who are using Git—and publicly stating their happiness with doing so—has grown sharply in the last couple of years. Peter Eisentraut (with some help from myself) set up git.postgresql.org in 2008, but interest at that time was not terribly high, and there was still a lingering question of whether Git was really the replacement for CVS, or if it would be some other version control system. There is little doubt now that Git is going to win. Not only for the Postgres project, but …
git open-source postgres
Slony: Cascading Subscriptions
Sometimes you run into a situation where you need to replicate one dataset to many machines in multiple data centers, with different costs associated with sending to each (either real costs, as in bandwidth, or virtual costs, as in the amount of time it takes to transmit to each machine). Defining a Slony cluster to handle this is easy, as you can specify the topology and the paths taken to replicate any changes.
Basic topology:
- Data center A, with machines A1, A2, A3, and A4.
- Data center B, with machines B1, B2, B3, and B4.
- Data center C, with machines C1, C2, C3, and C4.
Figure 1: Non-cascaded Slony replication nodes/pathways.
Node A1 is the master, which propagates its changes to all other machines. In the simple setup, A1 would push all of its changes to each node; however, if data centers B and C have high costs associated with transfers to their nodes, you end up transferring 4x the data each data center actually needs. (We are assuming that traffic on the local subnet at each data center is cheap and fast.)
The basic idea, then, is to push the changes only once to each data center, and let the “master” machine in that data center push the changes out to the others, as sketched below. This …
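As a very rough slonik sketch of that cascade, showing only the subscription step and assuming the cluster, nodes, paths, and replication set have already been created; the node numbers, set id, and conninfo strings below are made-up placeholders, not the actual configuration:

slonik <<'_EOF_'
cluster name = mycluster;
# Illustrative node numbering: 1 = A1 (origin), 2 = B1, 3 = B2
node 1 admin conninfo = 'dbname=mydb host=a1';
node 2 admin conninfo = 'dbname=mydb host=b1';
node 3 admin conninfo = 'dbname=mydb host=b2';

# B1 subscribes directly to the origin A1 and is allowed to forward the set...
subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);

# ...so B2 (and likewise B3 and B4) can pull the same set from B1 over the
# cheap local subnet instead of crossing data centers again.
subscribe set (id = 1, provider = 2, receiver = 3, forward = no);
_EOF_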
postgres scalability
Blog versus Forum, Blogger versus WordPress in Ecommerce
Today, Chris sent me an email with two questions for one of our ecommerce clients:
- For ecommerce client A, should a forum or blog be added?
- For ecommerce client A, should the client use Blogger or WordPress if they add a blog?
These questions are relevant to all of our clients, because forums and blogs can provide value to a static site or an ecommerce site. I answered Chris’s questions and thought I’d expand on them a bit in a brief article.
First, a rundown comparing the pros and cons of blog versus forum:
Table 1: Pros and cons of a blog versus a forum.
ecommerce seo
SEO 2010 Trends and Strategies
Yesterday I attended SEOmoz’s webinar titled “SEO Strategies for 2010”. It presented some interesting factoids, comments, and resources for SEO in 2010 that I thought I’d highlight:
- Mobile browser search
  - Mobile search and ecommerce will be a large area of growth in 2010.
  - Google Webmaster Tools allows you to submit mobile sitemaps, which can help battle duplicate content between non-mobile and mobile versions of site content. Another way to handle duplicate content is to write semantic HTML that lets sites serve separate non-mobile and mobile CSS.
- Social Media: Real Time Search
  - Real time search marked its presence in 2009. The involvement of Twitter in search is evolving.
  - Tracking and monitoring of URL shortening services should be set up to measure traffic and benefit from Twitter.
  - Dan Zarrella published research on The Science of Retweeting, an interesting resource with fascinating statistics on retweets.
- Social Media: Facebook's Dominance
  - Recent research by comScore has shown that 5.5% of all time on the web is spent on Facebook.
  - Facebook has very affordable advertising. Facebook has so much demographic and psychographic data that it allows sites to deliver …
ecommerce seo