
Extending Your Jetty Distribution’s Capabilities

By Kürşat Kutlu Aydemir
March 31, 2022

Jetty Logo

What is Jetty?

“Jetty is a lightweight highly scalable Java-based web server and servlet engine.” (Jetty Project)

Jetty can run standalone or embedded in a Java application, and the details about running a Jetty web server can be found in the Jetty project Git repository and documentation. The Jetty project has been hosted at the Eclipse Foundation since 2009 (Jetty, Eclipse).

Know Your Jetty

Many legacy environments using the Jetty web server run an older version of Jetty. If you know the version of the Jetty distribution in your environment, you can find its source code in the Jetty project GitHub repo. Some distributions are published as project releases, but most of them can also be found among the Git tags.

For instance, the jetty-9.4.15.v20190215 distribution can be found in the Jetty project tags at this URL: https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.15.v20190215

When you clone the jetty.project Git repo, you can then easily switch to any specific release tag:

$ git clone git@github.com:eclipse/jetty.project.git
$ git checkout jetty-9.4.15.v20190215

Then you can add your custom code and build that version.

Extending Your Jetty Capabilities

The reason you might want to build Jetty yourself is that you have a specific Jetty version in your environment and want to add custom handlers or wrappers to give it additional capabilities.

Jetty is written in Java, so you can add new features or patch your own fork just as with other open-source Java projects.

Build

Once you have the code base for your target version, you can work on it directly. This is one way to add new features to your Jetty distribution.

After you add your custom code you’ll need to build it. You can find the build instructions on the Jetty project’s GitHub home page; the build is simply:

$ mvn clean install

If you want to skip the tests the option below is your friend:

$ mvn clean install -DskipTests

Compile Classes Individually

This is a trickier way to inject newly created custom classes into your Jetty distribution. Instead of building the whole Jetty project, you create individual custom Java classes that consume the Jetty libraries and compile them manually. You don’t need the whole project this way.

Coming back to the question: what new features would I want to add to my new or ancient local Jetty distribution? Well, that really depends on the issues you face or the improvements you need to make.

For one of our customers, we once needed to log request and response headers in Jetty. We couldn’t find an existing way to do that, so I decided to create a custom RequestLog handler class and inject it into the Jetty deployment we already had, rather than building the whole project.

Even if you don’t build the whole project, it is still useful and handy to have the whole project’s code at hand so you can refer to the existing code and learn how things are already done in the project.

I found the RequestLog interface in the jetty-server sub-project, under the org.eclipse.jetty.server package. There is also a class RequestLogCollection at the same level implementing RequestLog, which may give you some idea about the implementations.

So I followed that structure, created my custom handler at the same level, and implemented RequestLog. Below is a part of my CustomRequestLog class:

package org.eclipse.jetty.server;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.eclipse.jetty.http.pathmap.PathMappings;
import org.eclipse.jetty.util.component.ContainerLifeCycle;
import org.eclipse.jetty.util.log.Log;
import org.eclipse.jetty.util.log.Logger;

import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.*;

public class CustomRequestLog extends ContainerLifeCycle implements RequestLog
{
    protected static final Logger LOG = Log.getLogger(CustomRequestLog.class);

    private static ThreadLocal<StringBuilder> _buffers = ThreadLocal.withInitial(() -> new StringBuilder(256));

    protected final RequestLog.Writer _requestLogWriter;

    private String[] _ignorePaths;
    private transient PathMappings<String> _ignorePathMap;

    public CustomRequestLog(RequestLog.Writer requestLogWriter)
    {
        this._requestLogWriter = requestLogWriter;
        addBean(_requestLogWriter);
    }

    /**
     * Is logging enabled
     *
     * @return true if logging is enabled
     */
    protected boolean isEnabled()
    {
        return true;
    }

    /**
     * Write requestEntry out. (to disk or slf4j log)
     *
     * @param requestEntry the request entry
     * @throws IOException if unable to write the entry
     */
    public void write(String requestEntry) throws IOException
    {
        _requestLogWriter.write(requestEntry);
    }

    private void append(StringBuilder buf, String s)
    {
        if (s == null || s.length() == 0)
            buf.append('-');
        else
            buf.append(s);
    }

    /**
     * Writes the request and response information to the output stream.
     *
     * @see RequestLog#log(Request, Response)
     */
    @Override
    public void log(Request request, Response response)
    {
        try
        {
            if (_ignorePathMap != null && _ignorePathMap.getMatch(request.getRequestURI()) != null)
                return;

            if (!isEnabled())
                return;

            StringBuilder buf = _buffers.get();
            buf.setLength(0);

            Gson gsonObj = new GsonBuilder().disableHtmlEscaping().create();
            Map<String, Object> reqLogMap = new HashMap<String, Object>();
            Map<String, String> reqHeaderMap = new HashMap<String, String>();
            // epoch timestamp
            reqLogMap.put("timestamp_epoch", System.currentTimeMillis());

            // timestamp
            DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
            String nowAsString = df.format(new Date());
            reqLogMap.put("timestamp", nowAsString);

            // request headers
            List<String> reqHeaderList = Collections.list(request.getHeaderNames());
            for(String headerName : reqHeaderList) {
                reqHeaderMap.put(headerName.toLowerCase(), request.getHeader(headerName));
            }
            reqLogMap.put("request_headers", reqHeaderMap);

            // response headers
            Map<String, String> resHeaderMap = new HashMap<String, String>();
            for(String headerName : response.getHeaderNames()) {
                resHeaderMap.put(headerName.toLowerCase(), response.getHeader(headerName));
            }
            reqLogMap.put("response_headers", resHeaderMap);

            // http method
            reqLogMap.put("http_method", request.getMethod());

            // original URI
            reqLogMap.put("original_uri", request.getOriginalURI());

            // protocol
            reqLogMap.put("protocol", request.getProtocol());

            // http status
            reqLogMap.put("http_status", response.getStatus());

            // query string
            reqLogMap.put("query_string", request.getQueryString());

            String reqJSONStr = gsonObj.toJson(reqLogMap);
            buf.append(reqJSONStr);

            String log = buf.toString();
            write(log);
        }
        catch (IOException e)
        {
            LOG.warn(e);
        }
    }
}

In this custom RequestLog class the most interesting part is the public void log(Request request, Response response) method, where the logging operation is actually done. You can simply override the existing logging behaviour and log anything you want. Here I added the raw request and response headers coming into and going out of the Jetty server.

Now it is time to compile this class. You can find many tutorials about compiling a single Java class using classpath. Here’s how I did it:

$ javac -cp ".:$JETTY_HOME/lib/jetty-server-9.4.15.v20190215.jar:$JETTY_HOME/lib/jetty-http-9.4.15.v20190215.jar:$JETTY_HOME/lib/jetty-util-9.4.15.v20190215.jar:$JETTY_HOME/lib/servlet-api-3.1.jar:$JETTY_HOME/lib/gson-2.8.2.jar" CustomRequestLog.java

If you look at my classpath, you’ll see that I even added a third-party library, gson-2.8.2.jar, since I used it in my custom code. Remember to put that JAR in your $JETTY_HOME/lib directory (as referenced in the classpath) as well.

The command above generates the CustomRequestLog.class file which is now available to be injected. So where do you need to inject this?

Since I followed the location and packaging of the RequestLog interface, the best place to inject this class is the same project JAR file, jetty-server.jar, which in my environment is jetty-server-9.4.15.v20190215.jar. I also added the other required dependencies to the classpath to compile this code.

Now I want to inject CustomRequestLog.class into jetty-server-9.4.15.v20190215.jar. I copied this JAR into a temporary directory and extracted its contents there using this command:

$ jar xf jetty-server-9.4.15.v20190215.jar

This command extracts all the contents of the JAR file, including resource files and the classes in their corresponding directory structure, org/eclipse/jetty/server. You will see RequestLog.class also extracted in this directory.

So now what we need to do is simply copy our CustomRequestLog.class into the extracted org/eclipse/jetty/server directory and pack up the JAR file again by running this command:

$ jar cvf jetty-server-9.4.15.v20190215.jar org/ META-INF/

This command re-bundles the compiled code along with the other extracted resources (in this case only the META-INF/ directory) and creates our injected JAR file. It’s best to create this injected JAR in the temp directory so that you can keep a backup of the existing original JAR file.
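As an example, a backup-and-replace step might look something like this (the paths and file names are assumptions from my environment; adjust them for yours):

$ cp $JETTY_HOME/lib/jetty-server-9.4.15.v20190215.jar /tmp/jetty-server-9.4.15.v20190215.jar.orig
$ cp jetty-server-9.4.15.v20190215.jar $JETTY_HOME/lib/

Then restart Jetty so it picks up the modified JAR.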

For this specific case I added this custom RequestLog handler to my Jetty config file, jetty.xml. That may not be the case for all the custom changes or extensions you’d add to your Jetty instance.

Here is an example RequestLog config entry for this custom handler:

<Set name="RequestLog">
  <New id="RequestLog" class="org.eclipse.jetty.server.CustomRequestLog">
    <!-- Writer -->
    <Arg>
      <New class="org.eclipse.jetty.server.AsyncRequestLogWriter">
        <Arg>
          <Property name="jetty.base" default="." />/
          <Property>
            <Name>jetty.requestlog.filePath</Name>
            <Default>
              <Property name="jetty.requestlog.dir" default="logs"/>/yyyy_mm_dd.request.log
            </Default>
          </Property>
        </Arg>
        <Arg/>
        <Set name="filenameDateFormat">
          <Property name="jetty.requestlog.filenameDateFormat" default="yyyy_MM_dd"/>
        </Set>
        <Set name="retainDays">
          <Property name="jetty.requestlog.retainDays" default="90"/>
        </Set>
        <Set name="append">
          <Property name="jetty.requestlog.append" default="false"/>
        </Set>
        <Set name="timeZone">
          <Property name="jetty.requestlog.timezone" default="GMT"/>
        </Set>
      </New>
    </Arg>
  </New>
</Set>

That’s all.


java jetty development

Working around SPF problems delivering to Gmail

By Jon Jensen
March 30, 2022

Hand-drawn signs reading “Someplace”, “Any pla…”, “No place”, with arrows pointing variously, attached to a leaning signpost in front of a high mountain desert scene with snow-topped peaks and sagebrush Photo by Garrett Skinner

Email deliverability

Legitimate email delivery keeps getting harder. Spammers and phishers never stop flooding everyone’s inboxes with unwanted and harmful email, so automated defenses against junk mail are necessary. But they are not perfect, and good email sometimes gets flagged as spam.

When sending important “transactional” email such as for account confirmations, password resets, and ecommerce receipts, it is often worth using a paid email delivery service to increase deliverability. Those typically cost a flat amount per month for up to a certain quota of outgoing email, with overage charges for messages beyond that.

Many of our clients use one of those services and generally they have all worked well and differ mostly in pricing and feature set. Popular choices include SendGrid, Mandrill, Postmark, Mailgun, and Amazon SES.

We continue to have many cases where we want to be able to send potentially large amounts of automated email to ourselves, our clients, or our systems. This is usually for testing, notifications, or internal delivery to special mailboxes separate from our main mailboxes.

These other uses for sending email keep us involved in the fight for good email deliverability from our own servers, which we have worked at over many years, long predating these paid email delivery services.

Sender Policy Framework

One of the longest-running tools to fight spam is SPF, the Sender Policy Framework.

SPF is an open standard that provides a way for a receiving mail server to verify that the sending server is authorized to send email for the message’s “envelope” sender domain. The envelope sender address or “return-path” is not normally seen by email recipients, but is used behind the scenes by servers. It may or may not be the same as the sender seen in the “From” header.

The SPF policy for each domain is set in a special DNS TXT record for that domain.

The important thing is that each sender’s email belongs to a domain with a valid SPF record showing that the sending servers are allowed to send for that domain, and that all other servers should not be allowed to send email for that domain.

For example, our endpointdev.com domain currently has this TXT record to define its SPF policy:

v=spf1 a:maildrop.endpointdev.com include:_spf.google.com include:servers.mcsv.net -all

Let’s look at each of those space-separated elements:

  • v=spf1 designates this TXT record as an SPF policy, version 1 (the only one so far).
  • a:maildrop.endpointdev.com means to allow the A (IPv4) and/or AAAA (IPv6) IP address(es) of hostname maildrop.endpointdev.com as a valid source.
  • include:_spf.google.com means to look up another DNS TXT record at _spf.google.com (for Gmail, our main email provider here) and add its SPF policy to ours.
  • include:servers.mcsv.net is the same thing, but for servers.mcsv.net (for Mailchimp, to allow it to deliver email newsletters for our domain).
  • -all means to disallow any other senders.
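If you want to see which SPF policy a domain actually publishes, you can query its TXT records directly, for example with dig:

dig +short txt endpointdev.com

The SPF policy is the TXT record beginning with v=spf1.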

With such a policy, receiving mail servers can immediately reject any incoming email claiming to be sent by us for our domain if it didn’t come from one of our designated servers.

This obviously doesn’t stop all spam, but it stops a whole class of forged senders, which is very helpful.

The key point to note is that SPF applies to the sending email server at the moment it connects to the receiving email server. It doesn’t deal with anything else.

One other point to note is that SPF policies are limited to a small total number of DNS lookups (10, per the SPF specification) across mechanisms such as include, so we can’t endlessly add new valid sending servers to our list.

Email server trails

Based on the above SPF policy, if we want to send email from address notifier@endpointdev.com, it will have to be sent through maildrop.endpointdev.com, Gmail, or Mailchimp. Messages coming from any other sending server should be rejected by the receiving server. They don’t have to behave that way, but it is in their interest to do so if they don’t like spam.

We have an internal server we’ll call dashboard.endpointdev.com, which sends email notifications from address notifier@endpointdev.com.

Since we don’t want to bloat our SPF policy, we’ll have our server dashboard.endpointdev.com route its outgoing email through our mail forwarding service called maildrop, which lives on two or more servers behind the DNS name maildrop.endpointdev.com.

This is a good idea for several reasons:

  • It keeps all our outgoing email flowing through a few places so we can easily monitor them for any problems.
  • We don’t need to have SMTP daemons running on all our servers just to send outbound email.
  • We don’t need to worry about the quotas or pricing of commercial emailing services when sending less-important or internal-only email.

Since SPF is designed for a receiving email server to check that the server connecting to it to send email is authorized to do so for that email address’s domain, it shouldn’t matter what server the email originated on.

Gmail misuses header information in SPF checks

We recently discovered that Gmail has been misusing email header information in its SPF checks.

When one of our outgoing emails originated from server dashboard.endpointdev.com and was then forwarded to maildrop.endpointdev.com which then delivered it to Gmail, Gmail looked at the earliest sender server it could find in the Received headers of the email message, found dashboard.endpointdev.com, and flagged it as an SPF failure because our SPF policy didn’t include dashboard.endpointdev.com [206.191.128.233].

This can be seen in this excerpt of relevant email headers. (Some specific details here were changed to protect the innocent.) Note that email headers appear in reverse chronological order, so the most recent events are at the top:

Received: from maildrop14.epinfra.net (maildrop14.epinfra.net. [69.25.178.35])
        by mx.google.com with ESMTPS id l20si5561179oos.78.2022.01.25.10.52.05
        for <notifications@endpointdev.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 25 Jan 2022 10:52:05 -0800 (PST)
Received-SPF: fail (google.com: domain of notifier@endpointdev.com does not designate 206.191.128.233 as permitted
    sender) client-ip=206.191.128.233;
Authentication-Results: mx.google.com;
       dkim=pass header.i=@endpointdev.com header.s=maildrop header.b=hR445V77;
       spf=fail (google.com: domain of notifier@endpointdev.com does not designate 206.191.128.233 as permitted
    sender) smtp.mailfrom=notifier@endpointdev.com
Received: from dashboard.endpointdev.com (dashboard.endpointdev.com [206.191.128.233])
    by maildrop14.epinfra.net (Postfix) with ESMTP id A2AA03E8A7
    for <notifications@endpointdev.com>; Tue, 25 Jan 2022 18:52:05 +0000 (UTC)
To: <notifications@endpointdev.com>

That is wrong! The SPF check should have been done against maildrop14.epinfra.net [69.25.178.35] because that is the IP address that actually connected to Gmail to send the email. That server is one of our infrastructure hostnames allowed to send email as part of the maildrop.endpointdev.com DNS record, so checking it would have led Gmail to give a passing SPF result.

Why did Gmail do this? I don’t know, and at the time didn’t find any public discussion that would explain it. I suspect it has something to do with Gmail’s internal systems being comprised of many, many servers, and the SPF check being done long after the email was passed on from the initial receiving point through various other servers. Then Gmail parses the headers to find out who the sender was, and gets confused.

Don’t share TMI

We can avoid this problem by not having maildrop mention our original sending server dashboard.endpointdev.com at all.

Why should it mention it in the first place? It’s helpful for tracing problems when debugging, but really is TMI (too much information) for normal email sending, and exposes internal infrastructure details that would be better omitted anyway.

Since dashboard.endpointdev.com is running the very flexible and configurable Postfix email server, we can direct it to remove any Received headers that mention our internal hostnames.

By default Postfix in /etc/postfix/main.cf has the header_checks directive set to look at a table to match regular expressions and take specified actions.
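In our case the main.cf setting pointing at that table looks something like this (shown as a regexp lookup table; your path or table type may differ):

header_checks = regexp:/etc/postfix/header_checks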

So we added a regular expression to match those headers, with the action IGNORE, to the file /etc/postfix/header_checks:

/^Received:\ (from|by)\ .*(epinfra\.net|endpointdev\.com|localhost|localdomain)/  IGNORE

Then we update the map database file so that it takes immediate effect for new email flowing through Postfix:

postmap /etc/postfix/header_checks

When we sent another notification email from dashboard.endpointdev.com and received it in Gmail we saw the email’s headers look like this:

Received: from maildrop14.epinfra.net (maildrop14.epinfra.net. [69.25.178.35])
        by mx.google.com with ESMTPS id g72si1894187vke.271.2022.01.25.11.03.37
        for <notifications@endpointdev.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 25 Jan 2022 11:03:37 -0800 (PST)
Received-SPF: pass (google.com: domain endpointdev.com configured 69.25.178.35 as internal address)
Authentication-Results: mx.google.com;
       dkim=pass header.i=@endpointdev.com header.s=maildrop header.b=qkccUkkU;
       spf=pass (google.com: domain endpointdev.com configured 69.25.178.35 as internal address)
    smtp.mailfrom=notifier@endpointdev.com
To: <notifications@endpointdev.com>

There is no more mention of dashboard or its IP address, so Gmail runs its SPF check against the proper IP address 69.25.178.35 which belongs to server maildrop14.epinfra.net which is part of the maildrop.endpointdev.com DNS name. Gmail now validates that IP address is allowed to send for the endpointdev.com domain and gives a “pass” result for its SPF check.

Perhaps this will help your legitimate email delivery too!


email sysadmin hosting

Code Reviews

By Kevin Campusano
March 21, 2022

Winter scene with pine trees behind snow on tall grasses around a winding stream crossed by a primitive bridge of 8 logs, below blue sky with white clouds

Last week, a few End Point team members and I came together to prepare a presentation on code reviews for the whole company. We went through the basics of “what”, “why”, and “how”.

We also, and perhaps most interestingly, made several recommendations that we’ve discovered after years of doing code reviews in a variety of teams and project sizes. A series of “lessons learned” so to speak.

I thought it’d be useful to capture that discussion in written form. Let’s start with the basics.

What is a code review?

Wikipedia’s article on code reviews says that a code review is…

A software quality assurance activity in which one or several people check a program mainly by viewing and reading parts of its source code, and they do so after implementation or as an interruption of implementation.

That is a precise but frankly wordy way to say “having somebody look at the code you’ve written”. This definition, however, touches on a few aspects that give us good insight into what code reviews are and what their purpose is.

First up, it tells us that code reviews are a software quality assurance activity. That is, their main goal is to make sure that the code that’s being produced is of good quality.

Second, it tells us that they are carried out by one or several people, revealing that code reviews are a team exercise. It’s the opposite of coding in isolation. Coding becomes a communal task, with input from other team members.

It also tells us, maybe unsurprisingly, that the main focus of the review is the code itself. As the main deliverable artifact of the software development process, we should strive to make it as good as possible.

Finally, it tells us when code reviews should happen: when implementation is done, or there’s a logical interruption of it. Meaning, once a feature is done, a user story is complete, a bug fixed. That is, when there’s a cohesive chunk of code that has been written.

Why should we do code reviews?

So why are code reviews important? They provide many benefits.

First and foremost, code reviews can help improve the code’s internal quality. Productive discussion around an implementation can help improve maintainability and readability of the code when reviewers, with a fresh set of eyes, spot the potential for such improvements where the original author may have missed them.

Also, and just as important, external quality of the code can be improved. Reviewers can help find bugs or other types of defects like security and performance issues.

Code reviews can also serve as a knowledge sharing tool. Code written by team members of more seniority or who are more knowledgeable about the code area, business domain, or a specific tool or library, can be exemplary for other team members. They can learn from the code when conducting reviews. This has the added perk that it reduces the situations where a single person holds all the knowledge of a given system component or code area. Likewise, code review feedback provided by such an expert can have the same effect.

Another great benefit they can bring to the table is the distribution of code ownership among all of the team members, not only the author of the code in question. When projects have strong code review habits, the code becomes something that the whole team is producing, as every line that gets to production has been seen by, and incorporated input from, many of the members of the team. Everybody owns and can feel proud of the final product.

Finally, reviewers can sometimes just come up with better and/or simpler solutions than the ones the original author implemented. They can come up with these given their fresh perspective and maybe their experience with similar problems in other domains. Code reviews allow for these to be incorporated before the time comes to ship the code.

What should reviewers look for?

Simply put, reviewers should look at every aspect of the code and offer suggestions for improvements where they see fit. Google’s recommended practices compile a somewhat comprehensive list of the elements that reviewers should look for:

  • Design: Is the code well-designed and appropriate for the system?
  • Functionality: Does the code implement the requirements correctly?
  • Complexity: Could the code be made simpler? Is it understandable?
  • Tests: Does the code have correct and well-designed automated tests?
  • Naming: Are clear names for variables, classes, methods, etc. being used?
  • Comments: Are the comments clear, useful and necessary?
  • Style: Does the code follow the project’s style guide?
  • Documentation: Did the relevant documentation also get updated? Any public interface documentation or OpenAPI files for example.

When it comes to code style, something to note is that this is where tools like linters and automatic code formatters (Prettier, for example) can be brought in to reduce human labor. Early in the project, if the team decides on the style, such a tool can be set up to automate the process of making sure that all code that gets written complies with the style guide. Some code repositories even allow for such tools to be automatically run upon every push. This makes style guide compliance not even a concern for the reviewers, because the tooling always makes sure that the code does comply.

Who should review code?
Who should ask for their code to be reviewed?

Everybody in the team should be regularly reviewing code and having their code reviewed. Regardless of seniority or experience in the specific project area, domain, framework, or language. More “junior” team members benefit from reviewing code by learning new techniques, principles, technologies, and the code base itself. More “senior” team members can provide valuable input that improves the code base and other team members' skills.

We also have to realize that the distinction between “junior” and “senior” is often blurry. Most teams have people with a variety of skill sets and experience; so everybody has the ability to offer good insight. One can always be a “senior” in one aspect of the project, and a “junior” in another.

Even if a single reviewer can be good enough, it is beneficial to include as many reviewers as possible, lest we fall into the trap of overloading a small number of individuals by having them be in charge of most or all of the reviews. Also, like I mentioned before, two great benefits that code reviews offer are knowledge sharing and code ownership. The more people you have regularly reviewing code, the bigger the impact will be in these two aspects.

That said, it is always better to have somebody more experienced in the area of the code that’s changing be among the reviewers.

When should code be reviewed?

As Wikipedia’s definition revealed, it is ideal for code reviews to be done when implementation is done on a feature; before merging the new code into the main development branch. Most modern software development that uses a Git-based code repository uses something like GitHub’s Pull Request mechanism. (GitLab uses Merge Request instead of Pull Request: same concept, different terminology.)

The developer creates a Pull Request when they are done implementing a feature or fixing a bug. The PR compiles all the changes (a series of commits) and turns them into a nice digestible package. This adds great visibility to the changes and makes them very easy to review before merging. Ideally, no patch makes it into the system without first being reviewed.

Code reviews can also happen before the implementation is done. Maybe the developer wants to get the team’s input on a specific function, method, class, component or approach. If the developer asks for it specifically, it can happen at any point in their development process.

Some recommendations

Now that we’ve gone over the basics, let’s discuss some recommendations and pitfalls to avoid, inspired by our experience in conducting code reviews over the years.

Keep pull requests to a manageable size

Bigger patches make code reviews harder to perform because of the sheer volume of code that the reviewers need to work through. More code to read makes it tempting to skim over it rather than reading in depth, and makes it hard to gain a thorough understanding of the changes and offer good insight.

So we feel like it is best to keep them as small as possible. This desire to keep things small may need to affect the overall software development process upstream. For example, making sure user stories or change requests are granular enough so that they can be fulfilled with a reasonable amount of code changes. Consider splitting bigger features into smaller, bite-sized issues to make this possible.

Make the pull request cohesive

PRs are better when their size is manageable, but also we need to make sure that they contain all they need to allow the reviewers to understand them completely and as a whole. There’s no need to split the changes artificially (for example, between back-end and front-end) if ultimately, they need to be pushed together to fulfill the requirement that’s being worked on, and leave the system in a working state.

If you would like a preliminary review, ask specific questions

Sometimes we want to get early reviews even before the implementation is complete or we’re at a logical interruption point. For such cases, a good practice is to come to the reviewers with a specific question that we’d like them to focus on.

It can waste time if the intent of the review is not explicitly communicated: The reviewer could mistakenly do a thorough review and leave feedback on tiny details of code that’s not yet ready and not even address the specific larger aspects that the developer wants help with.

Pair programming has the first code review baked in

Pair programming can expedite the review process by closing the code review feedback loop, compressing it to its fastest form. In pair programming, code effectively gets reviewed as soon as it is written, bit by bit.

If the team has more than two developers, though, there’s still great value in having the other team members, the ones not involved in the pair programming activity, review the code. They will approach it later with fresh eyes and without the shared mental context the pair had.

Code reviews work well asynchronously

In general, don’t make code reviews a synchronous process. Publicly available Git repository cloud hosting services like GitHub and GitLab include great tools for reviewing pull/​merge requests. We should use them to their full potential. There’s no need for a conference call or an in person meeting where everybody blocks a chunk of time to dedicate it to reviewing a PR. Everybody can do it on their own at their convenience.

But if a member of the team is new to the process of reviewing code, it can be good to work through a few pull requests at the same time, together, till they get the hang of it.

Give code reviews high priority

Give code reviews a high priority within your daily tasks. It is counterproductive to let pull requests sit for a long time when a few minutes to an hour of our time can mean that a user story/​ticket can move forward through the process. If you work by organizing your increments via sprints, remember that the goal is to complete the most stories as a team. Reviewing a pull request is actively supporting that goal, even if it isn’t one of the stories/​issues you’re working on yourself.

Make sure your PRs get the attention they need

Don’t just “fire and forget” a pull request. If one of our patches is taking too long to be seen by other people, we shouldn’t just abandon it and assume that it is somebody else’s problem now that our work is done. In these cases we should feel free to reach out to the reviewers and bring the PR to their attention.

To this end, we should leverage all the communication tools available, even outside of the code repository or code review tool, by chat, phone, issue tracking system, etc.

Get as many reviewers as you can

A single reviewer on a pull request can be enough. However, it is always beneficial to try to get as many eyes as possible onto a change. It improves ownership and it allows for more effective knowledge sharing. It also has the potential for more improvements on the code base, as more people, with varying strengths, look at the code and offer their feedback.

Also try to avoid having a single person be the gatekeeper of merges. Like I said, everybody can and should participate in the activity of code reviews. Sometimes having a gatekeeper may be desirable if, for example, CI/CD is in place in such a way that merges produce automatic production deployments. But we can always try to make sure that code reaching the gatekeeper has already been reviewed by other team members by the time it gets there. That way we avoid overloading them and eliminate the bottleneck or single point of failure.

GitHub, GitLab, and similar tools provide settings to limit which users can commit to certain branches. If a process like that is needed, it can be done with help of such tools while still having PRs that many team members can review and discuss.

Don’t let the perfect be the enemy of the good

If the PR isn’t “perfect” or not “as good as it could be”, but it does not worsen the code base, and implements the changes competently, maybe there’s no need to block it. Code reviews are important, but it also is important to try to maintain momentum of delivery and avoid making developers feel nitpicked. So consider that when reviewing. There is a balancing act between deadlines, paying off and incurring technical debt, and what is “good enough”.

In the same train of thought, it is useful to clearly label code review comments, specifying whether the reviewer considers each comment a question, a simple nitpick, or an actually important change.

Top-down imposition of process is often inconvenient

Code reviews don’t need to include a ton of software process overhead. Regardless of your process style, it can be a very lightweight practice and done at the developers’ discretion. It can be as simple as sending a diff file to a fellow developer and asking them for feedback. The tools available today make them very accessible and easy to do.

As such, they can become very effective when the team itself manages and conducts them; as opposed to having it come to them as a predetermined specific process from management. Even a practice as beneficial as code reviews can be soured by a bad, overly strict, or dogmatic implementation.

That’s all for now

At the end of the day, the concept of code reviews is simple: To have somebody else look at our code, in hopes that what eventually makes it to production is as good quality as possible.

In this article we’ve discussed many more aspects to clearly explain the promise of code reviews. We’ve given some details on how to conduct them, who is involved, and what benefits we can get from them. We also captured a series of recommendations that we have learned through experience.

If you’re not into the habit yet, hopefully this article convinced you to give it a try!


development culture

Using pgTAP to automate database testing

By Josh Tolley
March 16, 2022

Old piano outdoors, focused on keyboard with most keytops missing and some snow on it Photo from PxHere

Recently I started learning to tune pianos. There are many techniques and variations, but the traditional method, and the one apparently most accepted by ardent piano tuning purists, involves tuning one note to a reference, tuning several other notes in relation to the first, and testing the results by listening closely to different combinations of notes.

The tuner adjusts each new note in relation to several previously tuned notes. Physics being what it is, no piano can play all its tones perfectly, and one of the tricks of it all is adjusting each note to minimize audible imperfections. The tuner achieves this with an exacting series of musical intervals tested against each other.

Databases need tests too

One of our customers needed to add security policies to their PostgreSQL database, to limit data visibility for certain new users. This can quickly become complicated and ticklish, ensuring that the rules work properly for the affected users while leaving other users unmolested.

This struck me as an excellent opportunity to create some unit tests, not that there’s any short supply of good opportunities to add unit tests! This is not just because it helps prove that these security policies really do work properly, but because (confession time) I recently did a similar project for a different database without the help of unit tests, and it wasn’t much fun.

So this seemed like a good time to use pgTAP, a set of database functions designed to allow writing unit tests within the database itself. They produce “Test Anything Protocol” (TAP) output, a simple protocol that displays unit test results in an easily understood report.

What to test?

A good first step in writing unit tests is deciding on something to test. In my case, I figured I should make sure row-level security policies were turned on for the tables I was interested in, which is available from the rowsecurity field in the pg_tables view:

select ok(rowsecurity, tablename || ' has row security enabled') from pg_tables
    where schemaname = 'public'
        and (
            tablename in (
                -- ... some hard coded table names
            ) or tablename like 'some_other_tables_%'
        );

The ok() function comes from pgTAP. It counts as one test each time it’s called; the test passes when the first argument is true, and fails when the argument is something else. The second argument is an optional comment describing what’s being tested. Following a pretty common TAP-related naming convention, I put this in a file called 00-test.sql in a directory under the root of my project, simply called t.

A more complicated set of tests could include several different files, where the numeric part of the name helps sort the tests in the desired run order, and the rest of the filename describes the subject of the tests within. But this will do just to get started. I can run it with pg_prove, included with the pgTAP package:

pg_prove -d mydatabase t/00-test.sql

Iteratively improving

This fails, for several reasons.

First, I haven’t yet installed the pgTAP extension in my database, with CREATE EXTENSION pgtap.

I also haven’t actually done anything in my test to run the code I’m testing. The actual code in this project consists of some database functions, which we need to run to create the database security policies, and I haven’t run any of them yet.

And finally, pgTAP requires me to “plan” my tests first, or in other words, I need to inform pgTAP how many tests I plan to run, before I run them. It’s also nice to call finish() so pgTAP can clean up after itself.

I installed the pgTAP extension in my database, and modified the test as follows:

begin;

\i create_policy.sql

select plan(1);  -- plan for a single test

select ok(rowsecurity, tablename || ' has row security enabled') from pg_tables
    where schemaname = 'public'
        and (
            tablename in (
                -- ... some hard coded table names
            ) or tablename like 'some_other_tables_%'
        );

select finish();

rollback;

This wraps my test in a transaction, so that I can roll everything back to leave the database essentially as I found it. It also calls the actual code I’m testing, in create_policy.sql, and plans one test. And it gives me this new failure:

t/m.sql .. All 1 subtests passed

Test Summary Report
-------------------
t/m.sql (Wstat: 0 Tests: 226 Failed: 225)
  Failed tests:  2-226
Parse errors: Bad plan.  You planned 1 tests but ran 226.
Files=1, Tests=226,  1 wallclock secs ( 0.04 usr  0.00 sys +  0.03 cusr  0.01 csys =  0.08 CPU)
Result: FAIL

The problem here is that each call to ok() counts as one test, and my test apparently found 226 tables to check for row-level security. I can improve the planning like this:

select plan(count(*)::integer)
    from pg_tables where schemaname = 'public'
        and (
            tablename in (
                -- ... some hard coded table names
            ) or tablename like 'some_other_tables_%'
        );

count() returns a bigint, and plan() expects integer, so this requires a typecast, but is otherwise pretty simple. And now my tests pass:

josh@here:~dw$ pg_prove -d nedss t/00-test.sql
t/00-test.sql .. ok
All tests successful.
Files=1, Tests=226,  1 wallclock secs ( 0.03 usr  0.01 sys +  0.03 cusr  0.01 csys =  0.08 CPU)
Result: PASS

Looking back and ahead

Suffice it to say that pgTAP includes many functions similar to ok(), to test various aspects of the database, its structure, and its behavior, and I’d recommend interested users review the documentation for more details. I intended this post only as an introduction.
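For a small taste of what is available, here is a sketch using a few of pgTAP’s schema-testing functions (the table and column names are hypothetical):

begin;
select plan(3);

select has_table('cases', 'cases table exists');
select has_column('cases', 'case_id', 'cases table has a case_id column');
select col_is_pk('cases', 'case_id', 'case_id is the primary key');

select finish();
rollback;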

In its completed state, my test suite comprised several tests ensuring various required preliminaries were in place, a few tests like the one above that check for necessary table-specific settings, others that ensure the affected roles were created, and finally some which create some sample data and use SET ROLE to test the data visibility directly for roles with various policies applied.
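A simplified sketch of one of those visibility tests might look like this (the role, table, and expected count are hypothetical):

set role restricted_reporter;
select is(count(*), 0::bigint, 'restricted_reporter sees no protected rows')
    from protected_table;
reset role;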

And to be honest, I was surprised at the sense of security that came over me with this completed test suite. As I mentioned, I’d done similar work previously, and knew that although I was confident in the code when it was written, that confidence came only through fairly extensive manual testing. I know very well the struggles of bit rot, and I knew it would be at least as hard to repeat that testing regimen by hand sometime down the road after a year or two.

I also recognized that if I ever needed to set up similar policies again, I could use these tests themselves as a reference, because they show exactly how to run the code in question. Though of course I included that information in the project’s associated README file as well … right?

Let us know if you’ve used pgTAP, and what effect it has had on your database development.


sql postgres database testing security

Automating reading the screen and interacting with GUI programs on X Window System

Metal tower with cables in front of overcast sky and muted sun

A while back, Google Earth made some changes to the layer select menu in the sidebar, which broke a program that toggles the 3D imagery on VisionPort systems. These run the X Window System (also known as X11, or just X) on Ubuntu Linux.

In looking for a workaround, I found that shell scripts can fully interact with GUI apps using the xdotool, xwd, and convert commands. This script would send a series of keystrokes to open the sidebar, navigate the layers menu, and toggle the box for the 3D buildings layer.

Changing the series of keystrokes to match the new number of layers should have fixed the issue, but there was more to this script. The next part of the script would take a screenshot, crop the checkbox, and compare it to saved files of other cropped boxes. Fixing this part of the script required correcting the positions of the captures and replacing the reference files with ones that pictured the updated Google Earth checkbox states.

Here I will explain how the script works and how we changed it so that it no longer needs these reference files and ultimately runs faster.

Overview of how the script works

xwd takes a screenshot of a window on the screen. convert transforms the pixel data in the image into lines of text with location and color that we can easily search and read with grep, sed, or similar. xdotool interacts with GUI windows. It can find, focus, and send keystrokes and mouse commands, among other things.

Preparing

To illustrate, let’s make a simple case of looking for a button and clicking on it from the terminal. We will skip making a script for this example as most of this can be accomplished on a single command line with xdotool. The commands here are for the Ubuntu operating system and may be a little different on other systems.

If you would like to try it as you read along, you will need an image editor to quickly look up pixel positions and colors. GIMP works great and is pictured in this example.

You will also need to install some packages. On a terminal:

sudo apt install xdotool x11-apps imagemagick

Working through an example

1. To know which window to interact with, xdotool needs to know the window’s name. So let’s open a browser and navigate to endpointdev.com and then note the page title in the browser tab, “Secure Business Solutions”:

Screenshot of endpointdev.com website home page loaded in a browser

2. On the terminal:

xdotool search "secure business solutions"

Defaulting to search window name, class, and classname
69206019

This reminds us that it can search for windows in many ways and returns a window ID, in this example 69206019. Search results also get stored on the “window stack” and can be referenced as %1, %2, and so on.

3. Use this ID in the next terminal command. If you have more than one ID, you may need to refine your search:

xwd -id 69206019 -out endpointdev.xwd

xwd creates a screenshot of the window that we want to see and saves it to the filename passed with the -out argument.

4. Then run a convert command:

convert endpointdev.xwd endpointdev.txt

By converting the image to text we can use any text tools like grep, cut, diff, or sed to find colors and coordinates on the images. We just need to know how to read it. The first few lines of the endpointdev.txt file look like this:

# ImageMagick pixel enumeration: 1156,638,65535,srgb
0,0: (11565,11565,11565)  #2D2D2D  srgb(45,45,45)
1,0: (10537,10537,10537)  #292929  grey16
2,0: (4369,4369,4369)  #111111  srgb(17,17,17)
3,0: (0,0,0)  #000000  black
4,0: (0,0,0)  #000000  black
5,0: (0,0,0)  #000000  black

Each line represents a pixel:

  • The first two numbers are the x and y positions,
  • in parentheses are the decimal color values per RGB channel,
  • after # is the HTML RGB color value which we will be using in our example,
  • and last the srgb code or name for the color.

We can use any of these color codes for matching. To decode these color codes visually we use an image editor.

5. Open the endpointdev.xwd file that we created in an image editor. Here we will use GIMP:

Screenshot of GIMP image editor open with a screenshot of endpointdev.com website home page loaded in a browser

6. Select the zoom tool (see the mouse pointer in the screenshot above to know which that is). Use it to draw a box around the VisionPort button or something else that we want our script to “see” and zoom in on it:

Screenshot of GIMP image editor zoom tool selected on a screenshot of a tiny part of the endpointdev.com website with a VisionPort link

7. Next, use the color picker tool (check the mouse pointer in the screenshot above for this tool) and click on the circle, or any other spot you want to pick for your script:

Screenshot of GIMP image editor color picker tool selected on a screenshot of a tiny part of the endpointdev.com website with a VisionPort link

8. Notice by the pointer above how the color picker box changed colors. Now clicking on this box brings up a new window:

Screenshot of GIMP image editor “Change Foreground Color” dialog box

Here we can find the HTML notation of the color we selected: 00ffcc (see the pointer in the screenshot above). We’ll call this “VisionPort button green”.

In my experience, even areas that look like a single flat color usually contain a pixel with a unique shade, caused by a slightly different tone on an edge or something similar. Our old script compared a 25×25 pixel image to guarantee that what it found was a check mark, but finding one of these colors unique enough to use as a key ensures that the check mark is next to it, so we can check the state on a single pixel too.

If one pixel is not unique enough, we can look for a pattern of pixels and still store them as variables in the script to avoid saving and comparing files.

9. With this HTML hex RGB color code we go back to the terminal and search our file for it. Use capital letters for the color or else grep -i to make it case-insensitive:

grep 00FFCC endpointdev.txt

This will return a list of pixels with this color, such as:

827,107: (0,65535,52428)  #00FFCC  srgb(0,255,204)
839,107: (0,65535,52428)  #00FFCC  srgb(0,255,204)
853,107: (0,65535,52428)  #00FFCC  srgb(0,255,204)
878,107: (0,65535,52428)  #00FFCC  srgb(0,255,204)
883,107: (0,65535,52428)  #00FFCC  srgb(0,255,204)

10. For our purpose we can use any of those, so let’s try the first result. In the terminal run:

xdotool windowfocus 69206019 mousemove 827 107 click 1

Your browser should then follow the link to visionport.com and load that website’s home page in your browser:

Screenshot of visionport.com website loaded in a web browser

Did it work for you too? 🎉

That is all we need to build a script that interacts with GUI apps. If you would like to practice, take another screenshot and use xdotool to look for and click the “contact” button. 😁

How it works in a little more detail

The last xdotool command chain’s steps are:

  • windowfocus 69206019 selects the ID that we searched for before,
  • mousemove moves the mouse pointer to the specified absolute coordinates, in this case the first pixel that matched the VisionPort button green color in the endpointdev.txt file,
  • click 1 sends a mouse left click at the mouse’s current position.

Updating our script to use this instead of the series of keystrokes took some investigation and testing but made it faster in the end.

One of the issues to overcome was that the VisionPort custom layers capture xdotool’s window and mouse commands, so to target a specific window on the VisionPort, we have to offset the coordinates of our commands according to the position of the window on the screen.

Chaining xdotool actions

Here is another example of chaining several xdotool actions in a single command:

xdotool search "secure business solutions" windowfocus key Alt+F4

That should close the browser window on most systems. If any of these commands fail, try replacing windowfocus with windowactivate; some xdotool commands depend on system support. Check xdotool’s manual for more options.

Command chain steps:

  • search <name>: Finds windows with that name, and stores the results in the “window stack” memory.
  • windowfocus: Selects a window. Defaults to the first window from xdotool’s window stack; no need to supply the window ID if following a search.
  • key Alt+F4: Sends the Alt+F4 keypress to the selected window. The + here means at the same time. Use spaces to separate a list of keys. By default there is a delay between keystrokes, and that is configurable.

Use the standard Unix sleep <seconds> in between other xdotool commands that may need delays.
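For example, to give the browser a moment to react between steps (a sketch; ctrl+l as the address bar shortcut is an assumption about your browser):

xdotool search "secure business solutions" windowfocus key ctrl+l
sleep 1
xdotool type "https://www.visionport.com/"
xdotool key Return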

xdotool naturally takes a list of commands, which can even be stored in and read directly from text files. Such files can also be executed directly by making them executable (with chmod +x) and setting the shebang line (the first line of the file) to:

#!/usr/bin/xdotool

Such executable scripts are commonly named using a .xdo extension.
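For instance, the window-closing chain above could be saved in a file named close-browser.xdo (a hypothetical name), marked executable, and run on its own. A minimal sketch:

#!/usr/bin/xdotool
search --name secure.business.solutions
windowfocus
key Alt+F4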

Fine-tuning

While we could convert the whole screenshot to text and search it as we did in the example above, the files are large, so cropping a smaller section before converting to text is recommended.

We can crop a section, save it to a file, and compare it to another file, as our old script did. We can also crop an image down to one pixel wide across a menu bar, or across a column of icons in our case, to find all the button positions without much overhead. Doing this instead of only checking a hard-coded check box area gave our script the ability to find the check box by the icon next to it, even when its position changed because other tabs expanded in the menu.
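For example, commands like these (the geometry numbers here are made up for illustration) crop a small region, or a one-pixel-wide strip, and convert it straight to text:

convert endpointdev.xwd -crop 200x25+800+100 +repage button-area.txt
convert endpointdev.xwd -crop 1x200+827+0 +repage icon-column.txt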

It is also possible to use the multiprocessing features of shell scripts: when working on multiple window screenshots or crops, we can run them all at the same time using the & operator at the end of each command, then use the wait command to let them all finish before performing the text searches, which are very fast on small cropped files.
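A minimal sketch of that pattern, with hypothetical file names:

# crop two window screenshots at the same time
convert window1.xwd -crop 1x200+827+0 +repage crop1.txt &
convert window2.xwd -crop 1x200+827+0 +repage crop2.txt &
wait

# the small text files are then fast to search
grep -l 00FFCC crop1.txt crop2.txt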

Learning more

The xdotool documentation has information on other ways to identify and interact with windows and is a great place to go to learn more.


development testing automation graphics

Database Design: Using Documents

By Emre Hasegeli
March 9, 2022

Angle view of square paving stones in two colors, in a pattern

Using documents in relational databases is increasingly popular. This technique can be practical and efficient when used in fitting circumstances.

Example

Let’s start with an example. Imagine we are scraping web sites for external URLs and store them in a table. We’ll have the web sites table to store the scrape timestamp and another table to store all of the references.

CREATE TABLE web_sites (
    web_site_domain text NOT NULL,
    last_scraped_at timestamptz,

    PRIMARY KEY (web_site_domain)
);

CREATE TABLE refs (
    web_site_domain text NOT NULL,
    ref_location text NOT NULL,
    link_url text NOT NULL,

    PRIMARY KEY (web_site_domain, ref_location, link_url),
    FOREIGN KEY (web_site_domain) REFERENCES web_sites
);

We do not need to bother adding an id to the web_sites table, because we assume there won’t be too many of them. The domain is small and more practical to use as an identifier. If you are curious about advantages of using natural keys, see my previous article.

Normalized Tables

There may be many thousands of unique URLs for a single web site and other web sites may refer to the same URLs. To try to minimize the storage, we can keep the locations and the URLs in separate tables, give them integer identifiers, and have another table for the many-to-many relations.

CREATE TABLE locations (
    location_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
    web_site_domain text NOT NULL,
    ref_location text NOT NULL,

    PRIMARY KEY (location_id),
    UNIQUE (web_site_domain, ref_location),
    FOREIGN KEY (web_site_domain) REFERENCES web_sites
);

CREATE TABLE links (
    link_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
    link_url text NOT NULL,

    PRIMARY KEY (link_id),
    UNIQUE (link_url)
);

CREATE TABLE locations_links (
    location_id bigint NOT NULL,
    link_id bigint NOT NULL,

    PRIMARY KEY (location_id, link_id),
    FOREIGN KEY (location_id) REFERENCES locations,
    FOREIGN KEY (link_id) REFERENCES links
);

The idea here is to keep our biggest table narrow. It pays off to refer to the locations and links with just integer identifiers when we have very many of them.

Table Sizes

We’ll have many web sites and many URLs, so our lookup tables would be big and the relation table would be even bigger. That is going to be a major problem for many reasons, one of them being space efficiency. Narrow tables with many rows are not very space efficient, especially on Postgres, where there is a 24-byte per-row overhead. To demonstrate how significant this is, let’s add one row to each of the tables and compare the row sizes with the overhead.

INSERT INTO web_sites (web_site_domain)
VALUES ('example.com');

INSERT INTO refs (web_site_domain, ref_location, link_url)
VALUES ('example.com', '/source', 'http://example.net/target.html');

INSERT INTO locations (web_site_domain, ref_location)
VALUES ('example.com', '/source');

INSERT INTO links (link_url)
VALUES ('http://example.net/example.html');

INSERT INTO locations_links
VALUES (1, 1);

-- pgstattuple() comes from the pgstattuple extension
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT table_name, tuple_len, (100.0 * 24 / tuple_len)::int AS overhead_perc
FROM information_schema.tables, LATERAL pgstattuple(table_name)
WHERE table_schema = 'public';

   table_name    | tuple_len | overhead_perc
-----------------+-----------+---------------
 web_sites       |        36 |            67
 refs            |        75 |            32
 locations       |        52 |            46
 links           |        64 |            38
 locations_links |        40 |            60

With just one row and such short domains and URLs, we wouldn’t be gaining anything by having the links in a lookup table. In fact, it’s a lot less efficient. However, the situation may change with many rows.

Composite Types

We can combine multiple fields in a column to avoid the overhead of many rows in narrow tables. Postgres has good support for composite types and arrays over them. They are useful if you need strict data type checking.

CREATE TYPE web_site_page AS (
    ref_location text,
    link_urls text[]
);

ALTER TABLE web_sites ADD COLUMN pages web_site_page[];

UPDATE web_sites
SET pages = ARRAY[ROW(ref_location, ARRAY[link_url])::web_site_page]
FROM refs
WHERE refs.web_site_domain = web_sites.web_site_domain;

SELECT pg_column_size(pages) FROM web_sites;

 pg_column_size
----------------
            117

As you can see, the new column is not exactly small, because composite types and arrays come with their own overheads. Still, it could be much better when a web site has many references.

JSON Document Column

Composite types are not very easy to work with, so we can store the pages in another format. JSON is the most popular one nowadays. There are multiple options for storing JSON in a column in Postgres. Let’s start with the text-based data type.

ALTER TABLE web_sites ADD COLUMN pages_json json;

UPDATE web_sites SET pages_json = to_json(pages);

SELECT pg_column_size(pages_json) FROM web_sites;

 pg_column_size
----------------
             76

What is perhaps surprising is that the JSON comes out smaller than the array of the composite type. This is because Postgres composite types and arrays are a bit wasteful: they pad values for alignment and store the OIDs of the enclosed data types repeatedly.

Binary JSON

Another JSON data type, jsonb, is available in Postgres. It is a binary structure that allows faster access to the values inside the document.

ALTER TABLE web_sites ADD COLUMN pages_jsonb jsonb;

UPDATE web_sites SET pages_jsonb = to_jsonb(pages);

SELECT pg_column_size(pages_jsonb) FROM web_sites;

 pg_column_size
----------------
             98

As you can see, the text-based JSON is still smaller. This is due to the offsets stored in the binary structure for faster access to the fields. The text-based JSON also compresses better.

Size Comparison

Very simple single-row data shows the text-based json as the smallest, binary jsonb a bit larger, and the composite type as the largest. However, the differences become a lot smaller with more realistic sample data containing many items inside the documents. I generated some data and gathered these results (sizes excluding the indexes):

       model       |   size
-------------------+---------
 single table      | 436 MiB
 normalized tables | 387 MiB
 composite type    | 324 MiB
 json              | 318 MiB
 jsonb             | 320 MiB

You might expect the normalized tables to be the smallest, since otherwise the same values are stored repeatedly in the columns. However, the results reveal it’s the other way around. This is because of the per-row overhead in Postgres, which becomes noticeable in narrow tables like the ones in this example.

It’s also useful to notice that most of the size is occupied by the TOAST table when the data is stored in a single column. TOAST is a mechanism in Postgres that kicks in when a column value is large enough, which will often be the case when you design your tables this way. Here, the table excluding the TOASTed part is pretty small, which helps any query that doesn’t touch the large column.
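
A quick way to see this is to compare the size of the table’s main relation with its size including the TOAST table, using the standard size functions (just a sanity check on the example table):

SELECT pg_size_pretty(pg_relation_size('web_sites')) AS main_table,
       pg_size_pretty(pg_table_size('web_sites')) AS including_toast;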

Usability

What matters more than size is the usability of having it all in a single column. It is really practical to send and receive this column as a whole, all the way from the database to the frontend. Compare this with dealing with the many small objects the normalized tables require.
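
When you do need to take the document apart on the database side, that’s still possible. For example, with the pages_jsonb column populated above, an ad-hoc report could be sketched like this:

SELECT web_site_domain,
       page ->> 'ref_location' AS ref_location,
       jsonb_array_length(page -> 'link_urls') AS link_count
FROM web_sites
CROSS JOIN LATERAL jsonb_array_elements(pages_jsonb) AS page;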

It’s also possible in Postgres to index the document columns to allow efficient searches inside them. There are many options to do so, each with its own advantages and disadvantages. This post is getting long, though, so I’ll leave indexing to another one.


database development performance postgres sql

Using a YubiKey as authentication for an encrypted disk

Zed Jensen

By Zed Jensen
March 7, 2022

Keys hanging on a wall Image by Silas Köhler on Unsplash

Recently I built a small desktop computer to run applications that were a bit much for my laptop to handle, intending to bring it with me when I work outside my apartment. However, there was an immediate issue with this plan. Because this computer was intended for use with sensitive information/source code, I needed to encrypt the disk, which meant that I’d need to enter a passphrase before I could boot it up.

I didn’t really want to haul a keyboard and monitor around with me, so I came up with an alternative solution: using a YubiKey as my method of authentication. This allowed me to avoid the need to type a password without giving up security. In this post I’ll show you how you can do the same.

Preparation

First off, you need a YubiKey, if you don’t have one already. I ended up getting the YubiKey 5C NFC.

While I waited for my YubiKey to arrive, I installed Ubuntu 20.04 with full-disk encryption (using the default option of LUKS, or Linux Unified Key Setup) on the computer. I set a passphrase like normal—the process I describe in this post allows access with either this passphrase or the YubiKey.

Next, there were two packages that I needed to configure everything:

  • yubikey-personalization allows you to change the settings on your YubiKey. I installed it from the Ubuntu repository and had no problems.
  • yubikey-luks is what lets you use the YubiKey as an authentication method for a LUKS setup. I initially installed this from Ubuntu’s repository as well, but the version they’ve got is fairly out of date and required both a YubiKey and passphrase instead of just the YubiKey. As I mentioned earlier, the main objective of setting this up was booting without a keyboard, so I installed the tool from source as detailed in its README.

Setup

Once you’ve got the above libraries installed, setup is simple. Step by step:

1. Configure your YubiKey to use challenge-response mode

A YubiKey has at least 2 “slots” for keys, depending on the model.

We will change only the second YubiKey slot so you will still be able to use your YubiKey for two-factor auth like normal.

Plug in your YubiKey and run the following command:

ykpersonalize -2 -ochal-resp -ochal-hmac -ohmac-lt64 -oserial-api-visible

2. Find a free LUKS slot to use for your YubiKey

LUKS also allows for multiple key slots so that you can have different passphrases to unlock the encrypted data. Up to 8 key slots are available for LUKS1, and up to 32 for LUKS2.

Most setups only use the first slot for the main passphrase, but we can check by following these steps:

  • First run lsblk and figure out the name of your LUKS-encrypted disk partition. Mine was nvme0n1p3.
  • Now run sudo cryptsetup luksDump /dev/nvme0n1p3. The output should look something like this:
LUKS header information
Version:        2
Epoch:          11
Metadata area:  [a smallish number] [bytes]
Keyslots area:  [a medium number] [bytes]
UUID:           [a UUID]
Label:          (no label)
Subsystem:      (no subsystem)
Flags:          (no flags)

Data segments:
  0: crypt
        offset: [a big number] [bytes]
        length: (whole device)
        cipher: aes-xts-plain64
        sector: 512 [bytes]

Keyslots:
  0: luks2
        [Lots of information about this slot]
Tokens:
Digests:
  0: pbkdf2
        Hash:       sha256
        Iterations: 370259
        Salt:       [A bunch of bytes in hex format]
        Digest:     [A bunch of bytes in hex format]

You’re looking specifically for a free key slot. The output here only shows slot 0 in use, so slot 1 should be free.

3. Assign your YubiKey to a free slot

You can do this with the following command (substituting in your own partition name and slot number):

sudo yubikey-luks-enroll -d /dev/nvme0n1p3 -s 1

This command will ask you for a passphrase. It doesn’t need to be a particularly complex one, because it’ll only work with your YubiKey.
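
If you want to double-check that the enrollment worked, you can dump the LUKS header again; the new key slot (slot 1 in this example) should now be listed under Keyslots:

sudo cryptsetup luksDump /dev/nvme0n1p3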

4. Update crypttab and ykluks.cfg

Now you need to add keyscript=/usr/share/yubikey-luks/ykluks-keyscript to /etc/crypttab. For example, mine started as:

nvme0n1p3_crypt UUID=[uuid-here] none luks,discard

After the change, it should look like this:

nvme0n1p3_crypt UUID=[uuid-here] none luks,discard,keyscript=/usr/share/yubikey-luks/ykluks-keyscript

Finally, you need to configure yubikey-luks to give the passphrase you just set so you don’t have to. Open /etc/ykluks.cfg and add the line

YUBIKEY_CHALLENGE="[your new passphrase here]"

Once you’ve added this line, run sudo update-initramfs -u and you’re done!

Conclusion

Now if you shut your machine off, plug in your YubiKey, and turn it on, it should boot all the way without needing a passphrase. If you forget to plug in the YubiKey before turning the computer on, you’ll probably need to hold the contact button on it for a second or two and then it should boot just the same.

And there you go! A YubiKey provides a neat way to securely start up a computer with an encrypted disk without needing to type a passphrase.


security sysadmin tips

Optimizing media delivery with Cloudinary

Juan Pablo Ventoso

By Juan Pablo Ventoso
March 1, 2022

Beautiful cloudy mountain scene with river flowing by lush banks with people swimming, relaxing, and walking towards multistory buildings

I remember how we needed to deal with different image formats and sizes years ago: from using the WordPress-style approach of automatically saving different resolutions on the server when uploading a picture, to using a PHP script to resize or crop images on the fly and return the result to the frontend. Of course, many of those approaches were expensive and not fully optimized for different browsers or device sizes.

With those experiences in mind, it was a nice surprise for me to discover Cloudinary when working on a new project a couple of months ago. It’s basically a cloud service that stores and delivers media content with a lot of transformation and management options for us to use. There is a free tier with a usage limit: up to 25K transformations or 25 GB of storage/bandwidth, which should be enough for most non-enterprise websites. The cheapest paid plan is $99 per month.

Here’s a list of the image features we used on that project. I know they offer many other things that can be used as well, but I think this is a good start for anyone who hasn’t used this service yet:

Resizing and cropping

When you make a request for an image, you can instruct the Cloudinary API to retrieve it with a given size, which will trigger a transformation on their end before delivering the content. You can also use a cropping method: fill, fill with padding, scale down, etc.
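
For example, a delivery URL asking for a 400×300 fill crop could look something like this (the cloud name and image ID are placeholders):

https://res.cloudinary.com/<your-cloud-name>/image/upload/w_400,h_300,c_fill/sample.jpg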

Gravity position

When we specify a gravity position to crop an image, the service will keep the area of the image we choose as the focal point. We can pick a corner (for example, top left), but we can also specify “special positions”, which is probably one of the most interesting capabilities of this service: by using machine learning, we can instruct Cloudinary to detect faces, or even focus on other objects, like an animal or a flower in the picture.
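
In URL terms this is the gravity parameter, for example g_face for face detection or g_auto to let Cloudinary pick the main subject (again with placeholder cloud name and image IDs):

https://res.cloudinary.com/<your-cloud-name>/image/upload/w_400,h_300,c_fill,g_face/portrait.jpg
https://res.cloudinary.com/<your-cloud-name>/image/upload/w_400,h_300,c_fill,g_auto/garden.jpg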

Automatic format

Another cool feature is the automatic format, which uses your request headers to find the most efficient image format for your browser type and version. For example, if the browser supports it, Cloudinary will return the image in WebP format, which is generally more efficient than standard JPEG, as End Point CTO Jon Jensen demonstrates in his recent blog post.
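
This is the f_auto parameter, often combined with q_auto for automatic quality selection (placeholder cloud name and image ID once more):

https://res.cloudinary.com/<your-cloud-name>/image/upload/f_auto,q_auto/sample.jpg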

Screenshot of Chrome browser dev tools showing network response for a WebP image
Automatic format in action: Returning a WebP image in Chrome

Other features

There are many other options to choose from when querying their API, like setting up a default placeholder for when we don’t have an image, applying color transformations, or removing red eyes. The Transformation reference page in their documentation is a great resource.

NuxtJS integration

The project I mentioned above was a NuxtJS application with a Node.js backend. And since there’s a NuxtJS module for Cloudinary, it made sense to use it instead of building the queries to the API from scratch.

The module works great, except for one bug we found that didn’t allow us to fully use its image component with server-side rendering enabled. Between that drawback and some issues trying to use the lazy-loading setting, we ended up creating our own Vue component that used a standard image tag instead. But we still used their component to generate most of the API calls and render the results.

Below is an example of using the Cloudinary Image component on a Vue template:

<template>
  <div>
    <cld-image
      :public-id="publicId"
      width="200"
      height="200"
      crop="fill"
      gravity="auto:subject"
      radius="max"
      fetchFormat="auto"
      quality="auto"
      alt="An image example with Cloudinary"
    />
  </div>
</template>

Alternatives

Of course, Cloudinary is not the only image processing and CDN company out there: there are other companies offering similar services, like Cloudflare Images, Cloudimage, or imagekit.io.

Do you know any other good alternatives, or have you used any other Cloudinary feature that is not listed here? Feel free to add a comment below!


compression graphics browsers optimization saas