Apache RewriteRule to a destination URL containing a space
Today I needed to do a 301 redirect for an old category page on a client’s site to a new category whose filename contained spaces. The solution seemed like it would be easy and straightforward, and maybe it is to some, but I found it tricky, as I had never escaped a space in the destination of an Apache RewriteRule.
The rewrite rule needed to rewrite:
/scan/mp=cat/se=Video Games.html
to:
/scan/mp=cat/se=DS Video Games.html
I was able to get the first part of the rewrite rule quickly:
^/scan/mp=cat/se=Video\sGames\.html$
The issue was figuring out how to properly escape the space in the destination URL. A literal space, %20, and \s all failed to work properly. Jon Jensen took a look and suggested the standard Unix escape of ‘\ ’ (a backslash followed by a space), and that worked. Sometimes a solution is right under your nose, and it’s obvious once you step back or ask another engineer for help. Googling for the issue did not turn up such a simple solution, hence this blog post.
The final rule:
RewriteRule ^/scan/mp=cat/se=Video\sGames\.html$ http://www.site.com/scan/mp=cat/se=DS\ Video\ Games.html [L,R=301]
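As a quick sanity check outside Apache, the pattern half of the rule can be exercised in Python. This is only a sketch: Python's `re` module stands in for Apache's regex engine (the two agree on `\s`, `\.` and the anchors used here), and the helper names `matches_old_url` and `encoded_destination` are made up for illustration. The second helper shows how the space in the destination path is normally percent-encoded when it appears in a URL.

```python
import re
from urllib.parse import quote

# Pattern copied from the rule above. Python's `re` is a stand-in for
# Apache's regex engine, which is an assumption of this sketch.
PATTERN = re.compile(r"^/scan/mp=cat/se=Video\sGames\.html$")

# Destination path from the rule, with the backslash escapes removed.
DESTINATION = "/scan/mp=cat/se=DS Video Games.html"


def matches_old_url(path: str) -> bool:
    """Return True if the request path would match the rule's pattern."""
    return PATTERN.match(path) is not None


def encoded_destination(path: str = DESTINATION) -> str:
    """Percent-encode the path, leaving '/' and '=' untouched."""
    return quote(path, safe="/=")


if __name__ == "__main__":
    print(matches_old_url("/scan/mp=cat/se=Video Games.html"))
    print(encoded_destination())
```

Note that `\s` matches any whitespace character, so the pattern is slightly more permissive than a literal space.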
hosting seo
Passenger and SELinux
We recently ran into an issue when launching a client’s site using Phusion Passenger: it would not function with SELinux enabled. The cause turned out to be Apache’s ability to read from and write to the Passenger sockets. In researching the issue we found that another engineer had reported the problem, and there was discussion about making the socket location configurable. That would allow someone to place the sockets in a directory other than /tmp, set the context on that directory so that sockets created within it inherit the same context, and then grant httpd the ability to read/write sockets with that specific context. This is a win over granting httpd the ability to read/write all sockets in /tmp, since many other services place their sockets there and you may not want httpd to be able to read/write those.
End Point had planned to take on the task of patching Passenger and submitting the patch. While collecting information about the issue this morning to pass to Max, I found this in the issue tracker for Passenger:
Comment 4 by honglilai, Feb 21, 2009 Implemented.
Status: Fixed
Labels: Milestone-2.1.0
Excellent! We’ll be …
environment rails
Search Engine Optimization Thoughts
Search engine optimization and search engine marketing cover a wide range of opportunities to improve a website’s search engine traffic. When performing a search engine site review, here are several questions that I would investigate. Note that the questions are biased towards technical search engine optimization efforts. Some of the questions provide links to help define common search engine optimization terms. Although this is not typical End Point blog fashion, the answers to these questions can potentially lead to search engine optimization improvements.
Technical Topics
- Do the pages indexed by major search engines accurately represent the site’s content?
- Are there duplicate index pages indexed?
- Are there old index pages or domains that aren’t redirected?
- Are there pages missing from major search engine indexes?
- Are there too many pages in major search engine indexes?
- Are 301 redirects used permanently on the site?
- Can rel=“canonical” or the use of 301s be applied as a temporary solution to fix duplicate content issues?
- Is there low hanging fruit to fix duplicate content issues?
- Is there low hanging fruit to fix duplicate content generated by external links?
- Are …
seo
osCommerce dead and reborn
Here’s some interesting history of osCommerce. Sounds like it’s been quite the ride for its users. Congrats to those who finally forked it and started making releases.
ecommerce
Bare Git repositories and newspapers
During a recent discussion about Git, I realized yet again that previous knowledge of a version control system (VCS) actively hinders understanding of Git. This is especially challenging when trying to understand the difference between bare vs non-bare repositories.
An analogy might be helpful: Assume a modern newspaper, where the actual contents of the physical pages are stored in a database; i.e., the database might store contents of articles in one table, author information in another, page layout information in yet another table, and information on how an edition is built in yet another table, or perhaps in an external program. Any particular edition of the paper just happens to be a particular instantiation of items that live in the database.
Suppose an editor walks in and tells the staff “Create a special edition that consists of the front pages of the past week’s papers.” That edition could easily be created by taking all the front page articles from the past week from the database. No new content would be needed in the content tables themselves, just some metadata changes to label the new edition and description of how to build it.
One could consider the database, then, to …
git
Puppet PDX meeting
Even zombies like puppets!
If you live in or near Portland, OR, come join us for our first Puppet PDX meeting. We’re meeting from 6-7pm, at Paddy’s on SW 1st and Yamhill in Downtown Portland. It’s right on the MAX line.
Inspired by Puppet and the Reductive Labs team, we’re gathering people interested in all things related to configuration management. (Not sock puppet-making, sorry!) Cfengine user? Thinking about trying AutomateIt? Just have a pile of obsessively managed scripts? Come on down! We’ll discuss tools, best practices, and generally how to make your systems run so well you can get to the pub by 4 o’clock.
This is the first meetup. Hopefully we can get organized, get to know each other, and decide on what the goals of the group should be.
So if you are interested in automation, configuration management, cloud computing, and large scale computing environments, come join us for a few drinks and some lively chatter.
Please RSVP by sending an email to puppetpdx@reductivelabs.com so that we can get a bigger room if needed.
devops community puppet
Replicate only parts of my table
A day or two ago in #slony, someone asked if Slony would replicate only selected columns of a table. A natural response might be to create a view containing only the columns you’re interested in, and have Slony replicate that. But Slony is trigger-based—the only reason it knows there’s something to replicate is because a trigger has told it so—and you can’t have a trigger on a view. So that won’t work. Greg chimed in to say that Bucardo could do it, and mentioned a Bucardo feature I’d not yet noticed.
Bucardo is trigger-based, like Slony, so defining a view won’t work. But it allows you to specify a special query string for each table you’re replicating. This query is called a “customselect”, and can serve to limit the columns you replicate, transform the rows as they’re being replicated, etc., and probably a bunch of other stuff I haven’t thought of yet. A simple example:
- Create a table in one database as follows:
CREATE TABLE synctest (
id INTEGER PRIMARY KEY,
field1 TEXT,
field2 TEXT,
field3 TEXT
);
Also create this table in the replication destination database; Bucardo won’t replicate schema changes or database structure.
-
Tell Bucardo about the table. I won’t …
postgres bucardo replication
Announcing Release of PostgreSQL System Impact (PGSI) Log Analyzer
The PostgreSQL System Impact (PGSI) log analyzer is now available at https://bucardo.org/Pgsi/.
System Impact (SI) is a measure of the overall load a given query imposes on a server. It is expressed as a percentage: a query’s average duration divided by its average interval between successive calls.
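To make the definition concrete, here is a minimal Python sketch of the ratio described above. It is not PGSI's own code: the function name `system_impact` and its inputs (a list of per-call durations and a list of call start times, both in seconds) are assumptions made for illustration, and the real tool derives its numbers from PostgreSQL logs.

```python
def system_impact(durations, start_times):
    """Average duration as a percentage of the average interval between calls.

    `durations` holds per-call run times and `start_times` holds the sorted
    start timestamps of those calls (hypothetical inputs, both in seconds).
    """
    if not durations or len(start_times) < 2:
        raise ValueError("need at least one duration and two start times")

    avg_duration = sum(durations) / len(durations)

    # Gaps between each call and the next one.
    intervals = [later - earlier
                 for earlier, later in zip(start_times, start_times[1:])]
    avg_interval = sum(intervals) / len(intervals)

    return 100.0 * avg_duration / avg_interval


if __name__ == "__main__":
    # A query averaging 0.5 s that is called every 2 s.
    print(system_impact([0.25, 0.75, 0.5], [0.0, 2.0, 4.0]))
```

Under this reading, a value above 100 means that on average a new call starts before the previous one has finished.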
Queries are collected into canonical form with respect to literals and bind params; further, IN lists of varying cardinality are collapsed. Thus, queries that differ only in argument composition will be collected together in the evaluation. However, logically equivalent queries that differ in any other manner of structure (say, two comparisons joined by AND that are transposed) will be seen as distinct.
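The grouping behaviour can be illustrated with a rough Python sketch. This is an assumption-laden approximation, not PGSI's actual normalization: the function name `canonicalize` and the regular expressions are invented here, and they handle only simple quoted strings, `$n` bind parameters, plain numbers, and IN lists.

```python
import re


def canonicalize(query: str) -> str:
    """Reduce a query to a rough canonical form (illustrative only)."""
    q = re.sub(r"'(?:[^']|'')*'", "?", query)      # quoted string literals
    q = re.sub(r"\$\d+", "?", q)                   # bind params such as $1
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)       # numeric literals
    # Collapse IN lists of any length to a single placeholder.
    q = re.sub(r"\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "IN (?)", q,
               flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", q).strip()          # normalize whitespace


if __name__ == "__main__":
    a = canonicalize("SELECT * FROM t WHERE id IN (1, 2, 3) AND name = 'bob'")
    b = canonicalize("SELECT * FROM t WHERE id IN (7) AND name = 'al'")
    c = canonicalize("SELECT * FROM t WHERE name = 'al' AND id IN (7)")
    print(a == b)  # differ only in arguments
    print(a == c)  # conditions transposed
```

The first two queries differ only in their arguments and reduce to the same string, while the third, with its conditions transposed, stays distinct.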
The goal of SI is to identify those queries most likely to cause performance degradation on the database during heaviest traffic periods. Focusing exclusively on the least efficient queries can hide relatively fast-running queries that saturate the system more because they are called far more frequently. By contrast, focusing only on the most-frequently called queries will tend to emphasize small, highly optimized queries at the expense of slightly less popular queries that spend much more …
postgres