End Point speakers at PGCon 2009
PGCon is the annual conference for PostgreSQL users and developers, and PGCon 2009 in Ottawa, Canada, is now only about 3 weeks away. The schedule of presentations looks excellent, and I’m excited to have three of my co-workers presenting talks there. Here’s a quick rundown of those talks.
Power psql by Greg Sabino Mullane: The psql command-line interface to PostgreSQL is extremely powerful and versatile. While it’s easy to get started with, investing a little time in learning its many features really pays off in improved productivity. Greg will explore some corners and features you might not have known about, and also delve a little into its history and, more importantly, its future.
VACUUM Strategy by Selena Deckelmann: VACUUM is an important topic for both new and seasoned users of Postgres. Selena’s talk will focus on changes in Postgres from version 8.0 on, tuning configuration parameters related to VACUUM for best performance, autovacuum, the updated Free Space Map in 8.4, and the brand new Visibility Map.
Writing a Procedural Language by Josh Tolley: Stored procedures and user-defined functions offer a lot of power, and PostgreSQL already allows developing such code in many …
conference postgres
Inside PostgreSQL - Data Types and Operator Classes
Two separate posts taken from two separate mailing lists I’m on have gotten me thinking about PostgreSQL data types and operator classes today. The first spoke of a table where the poster had noticed that there was no entry in the pg_stats table for a particular column using the point data type. The second talks about Bucardo failing when trying to select DISTINCT values from a polygon type column. I’ll only talk about the first, here, but both of these behaviors stem from the fact that the data types in question lack a few things more common types always have.
The first stems from the point type’s lack of a default b-tree operator class and lack of an explicitly-declared analyze function. What are those, you ask? In the pg_type table, the column typanalyze contains the OID of a function that will analyze the data type in question, so when you call ANALYZE on a table containing that data type, that function will be run. In a default installation of PostgreSQL, all rows contain 0 in this column, meaning use the default analyze function.
This default analyze function tries, among other things, to build a histogram of the data in the column. Histograms depend on the values in a table …
postgres bucardo
Why OpenAFS?
Once you’ve understood what OpenAFS is, you might ask “Why use OpenAFS?” There are several very good reasons to consider OpenAFS.
First, if you need a cross-platform network filesystem, OpenAFS is a solid choice. While CIFS is the natural choice on Windows, and NFS is a natural choice on Unix, OpenAFS gives a hetergeneous choice (and it works on Mac OS X, too).
Setting aside which filesystem is natural for a given platform, though, OpenAFS has a strong advantage with respect to remote access. While it’s common to access systems remotely via a Virtual Private Network (VPN), Secure Shell (SSH), or Remote Desktop, OpenAFS allows the actual files themselves to be shared across a WAN, a dialup link, or a mobile device (and since OpenAFS is cross platform, the issue of which remote sets of remote access software to support is lessened). Having files appear to be local to the device reduces the need for remote access systems and simplifies access. The big win, though, is that OpenAFS’ file caching helps performance and lessens bandwidth requirements.
Another reason to use OpenAFS is if you need your network filesystem to be secure. While both CIFS and NFS have secure versions, in …
openafs
SEO Ecommerce
I recently read an article that discusses Magento SEO problems and solutions. This got me to think about common search engine optimization issues that I’ve seen in e-commerce. Below are some highlighted e-commerce search engine optimization issues. The Spree Demo, Interchange Demo, and Magento Demo are used as references.
Duplicate Home Pages (www, non-www, index.html)
Duplicate home pages can come in the form of a homepage with www and without www, a homepage in the form of http://www.domain.com/ and a homepage with some variation of “index” appended to the url, or a combination of the two. In the Interchange demo, http://demo.icdevgroup.org/i/demo1 and http://demo.icdevgroup.org/i/demo1/index.html are duplicate, http://demo.spreecommerce.com/ and http://demo.spreecommerce.com/products/ in the Spree demo, and finally http://demo.magentocommerce.com/ and http://demo.magentocommerce.com/index.php in the Magento demo.
External links positively influence search engine performance more if they are pointing to one index page rather than being divided between two or three home pages. Since the homepage most likely receives the most external links, this issue can be more problematic than …
ecommerce seo cms magento spree
In Interchange, You Might Need to [try] [goto]. What’s the [catch]?
Interchange provides tags that allow error trapping and handling within ITL—[try] and [catch]—that can be thought of as analogous to Perl’s eval {}
followed by if ($@) {}
. However, as I discovered the hard way, the analogy is not perfect.
I set up a block of ITL within [try] that had two major actions, with the 2nd depending on the success of the first. In particular, these two actions were a credit card authorization, followed by a capture of that auth as long as (a) the authorization succeeded, and (b) the merchant’s internal rules for analyzing order content compared to AVS results “passed”. (b) was necessary as a fraud-protection measure, tightening up the impact of AVS results based on the historic tendency of certain products to be targeted by crooks. In the event that the auth succeeded, but the tests from (b) failed, it is very important that the capture never be attempted because, to the gateway, the auth is entirely valid and the catpure attempt would succeed.
The code that assesses whether AVS passes is done in its own [calc]. From within the code, if the assessment does not pass, the code issues a die(), which in fact does trigger [try] to log the error that becomes …
interchange perl
That Feeling of Liberation? It’s Git.
In the last few weeks, a few of us have been working on a project for Puppet involving several lines of concurrent development. We’ve relied extensively on the distributed nature of Git and the low cost of branching to facilitate this work. Throughout the process, I occasionally find myself pondering a few things:
- How do teams ever coordinate work effectively when their version control system lacks decent branching support?
- The ease with which commits can be sliced and diced and tossed about (merge, rebase, cherry-pick, and so on) is truly delightful
- It is not unreasonable to describe Git as “liberating” in this process: here is a tool with which the the logical layer (your commit histories) largely reflect reality, with which the engineer is unencumbered in his/her ability accomplish the task at hand, and from which the results’ cleanliness or messiness is the product of the engineering team’s cleanliness or messiness rather than a by-product of the tool’s deficiencies
The current process, in accordance with practices in use within the Puppet project itself, basically involves:
- One “canonical” branch in a particular repository, into which all work is merged by a single …
git
Messaging, Information, and Making Assertions About Stuff You Cannot See
My coworker/friend/distractobot Jon pointed me at this most interesting public HTML resource:
http://www.unlimitednovelty.com/2009/04/twitter-blaming-ruby-for-their-mistakes.html
We’ve done a fair amount of investigation in recent years of the free, open messaging field to try to identify the “best” free/open messaging solution. “Best” in quotes because, in software, the belief that one has found the “best” solution for given problem X often says a lot more about the person holding the belief than it says about the solution itself or problem X.
In a survey of messaging options done for a client last year, we determined (at that time) that ActiveMQ was likely a good solution for the client’s needs. For simple deployments/usage, setup is quite straightforward. There’s good cross-language client support (particularly thanks to native STOMP support). There’s positive feedback out in the community about ActiveMQ. It’s an active project that’s been making decent progress over the years. Etc.
But then the little horror stories pop up. You hear/read that ActiveMQ falls down in various situations. Without getting any visibility into the specifics, it’s impossible to know what the problem …
rails
OFFSET 0, FTW
A query I worked with the other day gave me a nice example of a useful PostgreSQL query planner trick. This query originally selected a few fields from a set of inner-joined tables, sorting by one particular field in descending order and limiting the results, like this:
SELECT <some stuff> FROM table_a INNER JOIN table_b ON (...)
INNER JOIN table_c ON (...) WHERE table_a.field1 = 'value'
ORDER BY table_a.field2 DESC LIMIT 20
The resulting query plan involved a bunch of index scans on the various tables, joined with nested loops, all based on a backward index scan of an index on the table_a.field2 column, looking for rows that matched the condition in the WHERE clause. PostgreSQL likes to choose backward index scans when there’s a LIMIT clause and it needs result sorted in reverse order, because although backward index scans can be fairly slow, they’re easy to interrupt when it finds enough rows to satisfy the LIMIT. In this case, it figured it could search backward through the index on table_a.field2 and quickly find 20 rows where table_a.field1 = ‘value’ is true. The problem was that it didn’t find enough rows as quickly as it thought it would.
One way …
postgres