Tail_n_mail and the log_line_prefix curse
One of the problems I had when writing tail_n_mail (a program that parses log files and mails interesting lines to you) was getting the program to understand the format of the Postgres log files. There are quite a few options inside of postgresql.conf that control where the logging goes, and what it looks like. The basic three options are to send it to a rotating logfile with a custom prefix at the start of each line, to use syslog, or to write it in CSV format. I’ll save a discussion of all the logging parameters for another time, but the important one for this story is log_line_prefix. This is what gets prepended to each log line when using ‘stderr’ mode (e.g. regular log files and not syslog or csvlog). By default, log_line_prefix is an empty string. This is a very useless default.
What you can put in the log_line_prefix parameter is a string of sprintf style escapes, which Postgres will expand for you as it writes the log. There are a large number of escapes, but only a few are commonly used or useful. Here’s a log_line_prefix I commonly use:
log_line_prefix = '%t [%p] %u@%d '
This tells Postgres to print out the timestamp, the PID aka process id (inside of square …
monitoring open-source perl postgres
A WordPress Migration Quick Tip
This morning Chris Kershaw asked me about a WordPress migration issue he was experiencing. Chris dumped the database from the current live server and imported it to another server with a temporary domain assigned and then tried to access the blog. Whenever he would attempt to visit the login (wp-admin) page, he would be redirected to the live domain admin login URL instead of the temporary domain. Luckily, there’s a quick fix for this.
The simplified explanation for this is that throughout the WordPress code, there are various places where the base URL is retrieved from the database. There is a table (wp_options by default) that includes option settings and a function to retrieve data from the wp_options table (get_option). You’ll see something similar to the following two lines scattered throughout the WordPress source to retrieve the base URL to be used in redirects or link creation. In Chris’ case, my guess is that the wp-admin page was attempting to redirect him to the secure login page, which uses one of the examples below to get the base URL.
$home = parse_url( get_option('siteurl') );
$home_path = parse_url(get_option('home'));
If we take a look at the …
php
Distributed Transactions and Two-Phase Commit
The typical example of a transaction involves Alice and Bob, and their bank. Alice pays Bob $100, and the bank needs to debit Alice and credit Bob. Easy enough, provided the server doesn’t crash. But what happens if the bank debits Alice, and then before crediting Bob, the server goes down? Or what if they credit Bob first, and then try to debit Alice only to find she doesn’t have enough funds? A transaction allows the debit and credit operations to happen as a package (“atomically” is the word commonly used), so either both operations happen or neither happens, even if the server crashes halfway through the transaction. That way the bank never credits Bob without debiting Alice, or vice versa.
That’s simple enough, but the situation can become more complex. What if, for instance, for buzzword-compliance purposes, the bank has “sharded” its accounts database by splitting it in pieces and putting each piece on a different server (whether this is would be smart or not is outside the scope of this post). The typical transaction handles statements issued only for one database, so we can’t wrap the debit and credit operations within a single BEGIN/COMMIT if Alice’s account information …
database open-source postgres ruby scalability
Spree: Gift Certificates and Coupons
In a recent Spree project, I’ve been working with Bill Bennett to add gift certificate functionality. According to the Spree documentation, gift certificate functionality is trivial to implement using the existing coupon architecture. Here are some of the changes we went through as we tried to use the coupon architecture for gift certificate implementation—we found that it wasn’t so simple after all.
Here is a very simplified visualization of the coupon and adjustment data model in Spree. Coupons use polymorphic calculators to compute the applicable discount.
First, Bill and I brainstormed to come up with an initial set of changes required for implementing gift certificates as coupons after we reviewed the data model shown above:
- Add logic to create a coupon during checkout finalization, which was done with the following:
# coupon object class method
def self.generate_coupon_code
# some method to generate an unused random coupon code beginning in 'giftcert-'
end
# inside order model during checkout finalization
line_items.select { |li| li.variant.product.is_gift_cert? }.each do |line_item|
line_item.quantity.times do
coupon = Coupon.create(:code => Coupon …
ecommerce rails spree
Efficiency of find -exec vs. find | xargs
This is a quick tip for anyone writing a cron job to purge large numbers of old files.
Without xargs, this is a pretty common way to do such a purge, in this case of all files older than 31 days:
find /path/to/junk/files -type f -mtime +31 -exec rm -f {} \;
But that executes rm once for every single file to be removed, which adds a ton of overhead just to fork and exec rm so many times. Even on modern operating systems that are so efficient with fork, it can easily increase the I/O and load and runtime by 10 times or more than just running a single rm command with a lot of file arguments.
Instead do this:
find /path/to/junk/files -type f -mtime +31 -print0 | xargs -0 -r rm -f
That will run xargs once for each very long list of files to be removed, so the overhead of fork & exec is incurred very rarely, and the job can spend most of its effort actually unlinking files. (The xargs -r option says not to run the command if there is no input to xargs.)
How long can the argument list to xargs be? It depends on the system, but xargs –show-limits will tell us. Here’s output from a RHEL 5 x86_64 system (using findutils 4.2.27):
% xargs --show-limits
Your environment …
hosting optimization
PostgreSQL: per-version .psqlrc
File this under “you learn something new every day.” I came across this little tidbit while browsing the source code for psql: you can have a per-version .psqlrc file which will be executed only by the psql associated with that major version. Just name the file .psqlrc-$version, substituting the major version for the $version token. So for PostgreSQL 8.4.4, it would look for a file named .psqlrc-8.4.4 in your $HOME directory.
It’s worth noting that the version-specific .psqlrc file requires the full minor version, so you cannot currently define (say) an 8.4-only version which applies to all 8.4 psqls. I don’t know if this feature gets enough mileage to make said modification worth it, but it would be easy enough to just use a symlink from the .psqlrc-$majorversion to the specific .psqlrc file with minor version.
This seems of most interest to developers, who may simultaneously run many versions of psql which may have incompatible settings, but also could come in handy to regular users as well.
postgres
PostgreSQL: Dynamic SQL Function
Sometimes when you’re doing something in SQL, you find yourself doing something repetitive, which naturally lends itself to the desire to abstract out the boring parts. This pattern is often prevalent when doing maintenance-related tasks such as creating or otherwise modifying DDL in a systematic kind of way. If you’ve ever thought, “Hey, I could write a query to handle this,” then you’re probably looking for dynamic SQL.
The standard approach to using dynamic SQL in PostgreSQL is plpgsql’s EXECUTE function, which takes a text argument as the SQL statement to execute. One technique fairly well-known on the #postgresql IRC channel is to create a function which essentially wraps the EXECUTE statement, commonly known as exec(). Here is the definition of exec():
CREATE FUNCTION exec(text) RETURNS text AS $$ BEGIN EXECUTE $1; RETURN $1; END $$ LANGUAGE plpgsql;
Using exec() then takes the form of a SELECT query with the appropriately generated query to be executed passed as the sole argument. We return the generated query text as an ease in auditing the actually executed results. Some examples:
SELECT exec('CREATE TABLE partition_' || generate_series(1,100) || ' (LIKE …
postgres
Localize $@ in DESTROY
I have been conditioned now for many years in Perl to trust the relationship of $@ to its preceding eval. The relationship goes something like this: if you have string or block eval, immediately after its execution, $@ will either be false or it will contain the die message of that eval (or the generic “Died at …” message if none is provided). Implicit here is that evals contained within an eval have their effects on $@ concealed, unless the containing eval “passes on” the inner eval’s die.
To quickly demonstrate:
use strict;
use warnings;
eval {
print "Some stuff\n";
eval {
die 'Oops. Bad inner eval';
};
printf '$@ in outer eval: %s', $@;
};
printf '$@ after outer eval: %s', $@;
print $/;
produces the following output:
[mark@sokt ~]$ perl demo.pl
Some stuff
$@ in outer eval: Oops. Bad inner eval at demo.pl line 7.
$@ after outer eval:
[mark@sokt ~]$
Only if the containing eval itself dies do we find any data in $@:
use strict;
use warnings;
eval {
print "Some stuff\n";
eval {
die 'Oops. Bad inner eval';
};
printf '$@ in outer eval: %s', $@;
die 'Uh oh. Bad …
perl tips