<p>Containerizing Magento with Docker Compose: Elasticsearch, MySQL and Magento. Kevin Campusano, End Point Dev, 2020-08-27. <a href="https://www.endpointdev.com/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/">https://www.endpointdev.com/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/</a></p>
<p><img src="/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/banner.jpg" alt="Banner"></p>
<p><a href="https://business.adobe.com/products/magento/open-source.html">Magento</a> is a complex piece of software, and as such, we need all the help we can get when it comes to developing customizations for it. A fully featured local development environment can do just that, but these can often be very complex as well. It’d be nice to have some way to completely capture all the setup for such an environment and be able to get it all up and running quickly, repeatably… even with a single command. Well, <a href="https://www.docker.com/">Docker</a> containers can help with that. And they can be easily provisioned with the <a href="https://docs.docker.com/compose/">Docker Compose</a> tool.</p>
<p>In this post, we’re going to go in depth into how to fully containerize a Magento 2.4 installation for development, complete with its other dependencies <a href="https://www.elastic.co/">Elasticsearch</a> and <a href="https://www.mysql.com/">MySQL</a>. By the end of it, we’ll have a single command that sets up all the infrastructure needed to install and run Magento, and develop for it. Let’s get started.</p>
<h3 id="magento-24-application-components">Magento 2.4 application components</h3>
<p>The first thing that we need to know is what the actual components of a Magento application are. Starting with 2.4, <a href="https://devdocs.magento.com/guides/v2.4/install-gde/prereq/elasticsearch.html">Magento requires access to an Elasticsearch</a> service to power catalog searches. Other than that, we have the usual suspects for typical PHP applications. Here’s what we need:</p>
<ol>
<li>MySQL</li>
<li>Elasticsearch</li>
<li>A web server running the Magento application</li>
</ol>
<p>In terms of infrastructure, this is pretty straightforward. It would cleanly translate into three separate machines talking to each other via the network, but in the Docker world, each of these machines becomes a container. Since we need multiple containers for our infrastructure, a tool like Docker Compose can come in handy to orchestrate the creation of all that. So let’s get to it.</p>
<h3 id="creating-a-shared-network">Creating a shared network</h3>
<p>Since we want to create three separate containers that can talk to each other, we need to ask the Docker engine to create a network for them. This can be done with this self-explanatory command:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker network create magento-demo-network
</code></pre></div><p><code>magento-demo-network</code> is the name I’ve chosen for my network but you can choose whatever is most appropriate.</p>
<p>You can run the following command to check your newly created network:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker network ls
</code></pre></div><p>Output usually looks like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">$ docker network ls
NETWORK ID     NAME                   DRIVER    SCOPE
bd562b9cf5a4   bridge                 bridge    local
adb9ec2365c5   host                   host      local
2dba8d97410e   magento-demo-network   bridge    local
c3473c60ed52   none                   null      local
</code></pre></div><p>There’s our <code>magento-demo-network</code> network among other networks that Docker creates by default.</p>
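<p>Running <code>docker network create</code> a second time with the same name fails because the network already exists. Here is a minimal sketch of a wrapper that only creates the network when it’s missing. The <code>ensure_network</code> function name is my own invention for illustration; the <code>docker network inspect</code> and <code>docker network create</code> subcommands are the real ones.</p>

```shell
# Create the network only if it does not exist yet.
# Argument: network name (defaults to the one used in this post).
ensure_network() {
    network_name="${1:-magento-demo-network}"
    if docker network inspect "$network_name" >/dev/null 2>&1; then
        echo "network $network_name already exists"
    else
        docker network create "$network_name" >/dev/null
        echo "network $network_name created"
    fi
}
```

<p>Calling <code>ensure_network</code> with no arguments uses <code>magento-demo-network</code>, so it’s safe to run at the top of a setup script every time.</p>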
<h3 id="containerizing-mysql">Containerizing MySQL</h3>
<p>Getting a MySQL instance up and running is super easy these days thanks to Docker. There’s already <a href="https://hub.docker.com/_/mysql">an official image for MySQL</a> in <a href="https://hub.docker.com/">Docker Hub</a> so we will use that. We can set it up with this command:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker run -d \
--name magento-demo-mysql \
--network magento-demo-network \
--network-alias mysql \
-p 3306:3306 \
-v magento-demo-mysql-data:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=password \
-e MYSQL_USER=kevin \
-e MYSQL_PASSWORD=password \
-e MYSQL_DATABASE=magento_demo \
mysql:5.7
</code></pre></div><p>And just like that, we have a running MySQL instance. Running <code>docker ps</code> can get you a list of currently running containers. The one we just created should show up there.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">$ docker ps
CONTAINER ID   IMAGE       COMMAND                  CREATED          STATUS          PORTS                               NAMES
b73739ad5d66   mysql:5.7   "docker-entrypoint.s…"   22 seconds ago   Up 21 seconds   0.0.0.0:3306->3306/tcp, 33060/tcp   magento-demo-mysql
</code></pre></div><p>Let’s go through each one of the options from that command now to understand it better.</p>
<ul>
<li><code>docker run -d</code>: Runs the container in detached mode. This means that it’s run in the background as a daemon. Control is returned to the console immediately.</li>
<li><code>--name magento-demo-mysql</code>: This is the name of our container. Normally, Docker will generate random names for containers. In this case, we want to give it a name to refer to it with other Docker commands.</li>
<li><code>--network magento-demo-network</code>: Tells Docker to run the container as part of the <code>magento-demo-network</code> network that we created earlier. This is the network that we will use for all of our containers.</li>
<li><code>--network-alias mysql</code>: This is the name of this container within the network. This is how other containers in the network will be able to reference it. We’ll see that come to life a bit later.</li>
<li><code>-p 3306:3306</code>: Publishes the container’s port <code>3306</code>, MySQL’s default port, on port <code>3306</code> of the host machine. This basically says “requests coming to port <code>3306</code> of the host are going to be handled by the service inside this container that listens on port <code>3306</code>”. That service happens to be MySQL. Other containers in <code>magento-demo-network</code>, such as the one that will run Magento, can reach MySQL on port <code>3306</code> through the network itself; publishing the port is what lets us connect from the host machine.</li>
<li><code>-v magento-demo-mysql-data:/var/lib/mysql</code>: Creates a Docker volume. Specifically, we’re setting this one up to store the data files from MySQL. We need to do this so that the data stored in our MySQL container is persisted across shutdowns. <code>magento-demo-mysql-data</code> is the name of the volume and <code>/var/lib/mysql</code> is the directory within the MySQL container where that volume is mounted. In other words, any files stored in that directory are going to be stored within the volume instead. The volume is stored by Docker in the host machine, outside the container. <code>/var/lib/mysql</code> is the default directory where MySQL stores databases.</li>
<li><code>-e MYSQL_ROOT_PASSWORD=password</code>: Sets the password for MySQL’s root user. This is passed into the containerized MySQL via an environment variable, hence the <code>-e</code> option.</li>
<li><code>-e MYSQL_USER=kevin</code>: Creates a new login in MySQL with <code>kevin</code> as its username.</li>
<li><code>-e MYSQL_PASSWORD=password</code>: Sets the word <code>password</code> as the password for that <code>kevin</code> user.</li>
<li><code>-e MYSQL_DATABASE=magento_demo</code>: Creates a database named <code>magento_demo</code>.</li>
<li><code>mysql:5.7</code>: This is the image that we’re using for our container. <code>5.7</code> specifies the version that we want to run. <a href="https://hub.docker.com/_/mysql">The <code>mysql</code> image in Docker Hub</a> contains a few more versions. Or “tags”, in Docker words.</li>
</ul>
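<p>One thing <code>docker ps</code> doesn’t tell us is whether MySQL has finished initializing; on its first start, the container can be “Up” for a while before the server accepts connections. Below is a hedged sketch of a polling helper. The <code>wait_for_mysql</code> name, the retry count and the two second pause are all my own assumptions; <code>mysqladmin ping</code> is the standard client command, which should be available in the official image.</p>

```shell
# Poll the MySQL container until the server answers over TCP.
# Arguments: container name (default magento-demo-mysql), max attempts (default 30).
wait_for_mysql() {
    mysql_container="${1:-magento-demo-mysql}"
    max_attempts="${2:-30}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Connecting over TCP (instead of the socket) is meant to skip the
        # temporary server that the image runs during first-time setup.
        if docker exec "$mysql_container" \
            mysqladmin ping -h 127.0.0.1 --protocol=tcp --silent >/dev/null 2>&1; then
            echo "MySQL is ready after $attempt attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 2
    done
    echo "MySQL did not become ready in time" >&2
    return 1
}
```

<p>Something like <code>wait_for_mysql &amp;&amp; echo "go ahead"</code> right after the <code>docker run</code> command is the intended use.</p>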
<h3 id="connecting-to-this-container">Connecting to this container</h3>
<p><code>docker ps</code> showed us that our container was running. We can also interact with it. Here are a couple of ways of doing it:</p>
<h4 id="connecting-from-within-the-container">Connecting from within the container</h4>
<p>The easiest way of connecting to the MySQL instance is by running <code>mysql</code> CLI client from within the container itself. You can do that with:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker exec -it magento-demo-mysql mysql -u kevin -p
</code></pre></div><p>Here’s how that command works:</p>
<ul>
<li><code>docker exec -it</code> is used to run commands inside a container in interactive mode. Just what we need here in this case because we’re running <code>mysql</code>, which is an interactive CLI.</li>
<li><code>magento-demo-mysql</code> is the name we gave our container in the <code>docker run</code> command from before via the <code>--name magento-demo-mysql</code> option. This is why it’s useful to give names to containers: so we can use them in commands like this.</li>
<li><code>mysql -u kevin -p</code> is the command that’s run within the container. This is just the usual way of connecting to a MySQL server instance using the <code>mysql</code> CLI client. We use <code>kevin</code> because that’s what we set <code>MYSQL_USER</code> to when we created our container before.</li>
</ul>
<p>After running the previous command, the console will ask you for your password. We set that to <code>password</code> via <code>MYSQL_PASSWORD</code> so that’s what we need to type in. This will eventually result in the <code>mysql</code> prompt showing up. Run <code>show databases</code> to confirm that the <code>magento_demo</code> database that we specified via <code>MYSQL_DATABASE</code> got created.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| magento_demo       |
+--------------------+
2 rows in set (0.00 sec)
</code></pre></div><p>You can <code>Ctrl + D</code> your way out of that when you’re done exploring the containerized MySQL instance.</p>
<h4 id="connecting-directly-from-the-host-machine">Connecting directly from the host machine</h4>
<p>We can also connect to the MySQL instance running in the container, directly from our host machine. We can use:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">mysql -h localhost -P 3306 --protocol=tcp -u kevin -p
</code></pre></div><blockquote>
<p>Note that it is required that the <code>mysql</code> CLI client is installed in the host machine for this to work.</p>
</blockquote>
<p>Same as before, <code>mysql</code> will ask you for the password and, once typed in, it will give you its prompt.</p>
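<p>For scripts, the interactive password prompt gets in the way. Here’s a small sketch that runs a single statement through the client inside the container instead. The <code>mysql_query</code> function is a made-up helper, and it assumes the credentials from our <code>docker run</code> command; <code>MYSQL_PWD</code> is an environment variable that the <code>mysql</code> client reads the password from, which is fine for a throwaway development setup but not something to use with real credentials.</p>

```shell
# Run one SQL statement in the MySQL container without the interactive prompt.
# MYSQL_USER and MYSQL_PASSWORD may be set in the calling shell; the defaults
# are the values used in this post.
mysql_query() {
    docker exec \
        -e MYSQL_PWD="${MYSQL_PASSWORD:-password}" \
        magento-demo-mysql \
        mysql -u "${MYSQL_USER:-kevin}" --batch --skip-column-names -e "$1"
}
```

<p>For example, <code>mysql_query 'SHOW DATABASES'</code> should list the same databases that we saw at the <code>mysql</code> prompt.</p>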
<h3 id="containerizing-elasticsearch">Containerizing Elasticsearch</h3>
<p>Like MySQL, there’s an official <a href="https://hub.docker.com/_/elasticsearch">Elasticsearch Docker image up in Docker Hub</a>. As a result, getting a working Elasticsearch installation is a piece of cake. It’s done with a command like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker run -d \
--name magento-demo-elasticsearch \
--network magento-demo-network \
--network-alias elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
elasticsearch:7.8.1
</code></pre></div><p>You can validate that Elasticsearch is running with <code>curl localhost:9200/_cat/health</code>. That should return something like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">$ curl localhost:9200/_cat/health
1597622135 23:55:35 docker-cluster green 1 1 0 0 0 0 0 0 - 100.0%
</code></pre></div><p>Alright! That was easy enough. Again, thanks to Docker, we have an application that’s somewhat complex to set up, up and running in a matter of seconds.</p>
<p>Like before, let’s dissect that command that we used. Very similar to the MySQL one, only with some Elasticsearch specific settings:</p>
<ul>
<li><code>docker run -d</code>: Same as with the MySQL container, runs it in detached mode.</li>
<li><code>--name magento-demo-elasticsearch</code>: Gives the container a friendly name.</li>
<li><code>--network magento-demo-network</code>: Puts the container in the same network as the rest of our infrastructure.</li>
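<li><code>--network-alias elasticsearch</code>: Is the name by which other containers in the network can refer to this container.</li>
<li><code>-p 9200:9200</code>: Publishes port <code>9200</code>, where Elasticsearch serves its HTTP API, on the host machine. This is what makes the <code>curl localhost:9200</code> check work from the host. Other containers within the network can reach this port through the network alias, even without this option.</li>
<li><code>-p 9300:9300</code>: Same thing for port <code>9300</code>, which Elasticsearch uses for communication between nodes.</li>
<li><code>-e "discovery.type=single-node"</code>: Sets up the <code>discovery.type</code> environment variable that the image uses to configure Elasticsearch with. The <code>single-node</code> value tells Elasticsearch to run on its own instead of looking for other nodes to form a cluster with.</li>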
<li><code>elasticsearch:7.8.1</code>: Specifies that our container will be running version <code>7.8.1</code> of Elasticsearch.</li>
</ul>
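<p>Elasticsearch also takes a little while to start. As a rough sketch, the cluster health API can be asked to block until the cluster reaches a given status. The <code>wait_for_elasticsearch</code> helper below is hypothetical; the <code>_cluster/health</code> endpoint and its <code>wait_for_status</code> and <code>timeout</code> parameters are part of Elasticsearch’s API. I’m assuming here that version 7.x answers with an HTTP error status when the wait times out, which is what <code>--fail</code> relies on.</p>

```shell
# Ask Elasticsearch to block until the cluster is at least "yellow".
# Arguments: base URL (default http://localhost:9200), timeout (default 30s).
wait_for_elasticsearch() {
    es_url="${1:-http://localhost:9200}"
    es_timeout="${2:-30s}"
    # --fail makes curl return non-zero on HTTP error responses. curl also
    # returns non-zero when nothing is listening on the port yet.
    curl --silent --fail \
        "$es_url/_cluster/health?wait_for_status=yellow&timeout=$es_timeout" \
        >/dev/null
}
```

<p>If the container has only just been started, the first call may fail with a refused connection, so wrapping it in a short retry loop is probably a good idea.</p>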
<h3 id="containerizing-magento">Containerizing Magento</h3>
<p>Now this is the step where things get a little bit more involved. Nothing crazy however, so let’s get into it.</p>
<h3 id="the-dockerfile">The Dockerfile</h3>
<p>There’s no image of Magento 2 that would be able to get us up and running as quickly as with MySQL or Elasticsearch, at least not that I could find, so we’re going to have to create our own. We can create our own images with the help of <a href="https://docs.docker.com/engine/reference/builder/">Dockerfiles</a>. A Dockerfile is a file that contains all the instructions needed to build an image. The Docker engine uses it to create images, which can then be used as the basis for running containers.</p>
<p>Here’s a Dockerfile for Magento 2.4 that I came up with:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># /path/to/project/Dockerfile
# Our image is based on Ubuntu.
FROM ubuntu
# Here we define a few arguments to the Dockerfile. Specifically, the
# user, user id and group id for a new account that we will use to work
# as within our container.
ARG USER=docker
ARG UID=1000
ARG GID=1000
# Install PHP, composer and all extensions needed for Magento.
RUN apt-get update && apt-get install -y software-properties-common curl
RUN add-apt-repository ppa:ondrej/php
RUN apt-get update && apt-get install -y php
RUN apt-get update && apt-get install -y \
php-mysql php-xml php-intl php-curl \
php-bcmath php-gd php-mbstring php-soap php-zip \
composer
# Install Xdebug for a better developer experience.
RUN apt-get update && apt-get install -y php-xdebug
RUN echo "xdebug.remote_enable=on" >> /etc/php/7.4/mods-available/xdebug.ini
RUN echo "xdebug.remote_autostart=on" >> /etc/php/7.4/mods-available/xdebug.ini
# Install the mysql CLI client.
RUN apt-get update && apt-get install -y mysql-client
# Set up a non-root user with sudo access.
RUN groupadd --gid $GID $USER \
&& useradd -s /bin/bash --uid $UID --gid $GID -m $USER \
&& apt-get install -y sudo \
&& echo "$USER ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/$USER \
&& chmod 0440 /etc/sudoers.d/$USER
# Use the non-root user to log in as into the container.
USER ${UID}:${GID}
# Set this as the default directory when we connect to the container.
WORKDIR /workspaces/magento-demo
# This is a quick hack to make sure the container has something to run
# when it starts, preventing it from closing itself automatically when
# created. You could also remove this and run the container with `docker
# run -t -d` to get the same effect. More on `docker run` further below.
CMD ["sleep", "infinity"]
</code></pre></div><p>Feel free to go through the comments in the file above for more details, but essentially, this Dockerfile describes what a machine ready to run Magento should look like. It’s got PHP and all the necessary extensions, <a href="https://xdebug.org/">Xdebug</a>, and <a href="https://getcomposer.org/">Composer</a>. It also includes the <code>mysql</code> CLI client.</p>
<p>Importantly, it allows for creating a user account with sudo access. Later, we’ll use this capability to create a user account, inside the container that mimics the one we’re using in our host machine, effectively using the same user both inside and outside the container. The purpose of this is to make it possible to work on the Magento source code files from inside the container without having to deal with Linux permission issues when we try to do the same from outside the container (that is, directly via the host machine).</p>
<h3 id="the-image">The image</h3>
<p>Alright, now that we have our image defined in the form of our Dockerfile, let’s create it. To do that, we go into our project directory, create a new file named <code>Dockerfile</code>:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">cd /path/to/project
touch Dockerfile
</code></pre></div><p>Then use a text editor to save the contents from above into it, and finally run this command:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker build \
--build-arg USER=kevin \
--build-arg UID=$(id -u) \
--build-arg GID=$(id -g) \
-t magento-demo-web .
</code></pre></div><p>Here’s what this all means:</p>
<ul>
<li><code>docker build</code>: Is the command to build images from Dockerfiles.</li>
<li><code>--build-arg USER=kevin</code>: Specifies the username for the account with sudo access that we will log into our container as. I’ve chosen <code>kevin</code> here but you should use the one you’re logged in as on your machine.</li>
<li><code>--build-arg UID=$(id -u)</code>: Uses <code>id -u</code> to pass in the ID of the currently logged in user.</li>
<li><code>--build-arg GID=$(id -g)</code>: Uses <code>id -g</code> to pass in the group ID of the currently logged in user.</li>
<li><code>-t magento-demo-web .</code>: Specifies the name of the resulting image to be <code>magento-demo-web</code>. The <code>.</code> is a reference to the current working directory from where we’re running the command, which is where our Dockerfile is located.</li>
</ul>
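<p>If you’d rather not type your username by hand, all three build arguments can be derived from the current session. This is just a sketch; <code>magento_build_args</code> is a name I made up, and <code>id -un</code> prints the current username in the same way that <code>id -u</code> and <code>id -g</code> print the numeric IDs.</p>

```shell
# Print the --build-arg flags for the currently logged in user.
magento_build_args() {
    printf '%s %s %s\n' \
        "--build-arg USER=$(id -un)" \
        "--build-arg UID=$(id -u)" \
        "--build-arg GID=$(id -g)"
}
```

<p>It can then be used as <code>docker build $(magento_build_args) -t magento-demo-web .</code>. Note that this relies on the shell splitting the output into words, so it won’t work for usernames that contain spaces.</p>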
<p>Run <code>docker image ls</code> and you should see our new home grown <code>magento-demo-web</code> image along with the other ones that we’ve downloaded from Docker Hub:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">REPOSITORY         TAG      IMAGE ID       CREATED          SIZE
magento-demo-web   latest   90d311df434f   22 minutes ago   452MB
mysql              5.7      718a6da099d8   12 days ago      448MB
ubuntu             latest   1e4467b07108   3 weeks ago      73.9MB
elasticsearch      7.8.1    a529963ec236   3 weeks ago      811MB
</code></pre></div><h3 id="the-container">The container</h3>
<p>Ok, now that we have an image that’s capable of running Magento, let’s put it to work by creating a container based on it. We do that with:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker run -d \
--name magento-demo-web \
--network magento-demo-network \
--network-alias web \
-p 5000:5000 \
-v ${PWD}:/workspaces/magento-demo \
magento-demo-web
</code></pre></div><p>Line by line, this is telling the Docker engine to:</p>
<ul>
<li><code>docker run -d</code>: Run the container in detached mode. You could also add the <code>-t</code> argument, which allocates a pseudo-terminal so that a shell such as <code>bash</code> keeps waiting for input instead of exiting right away. We don’t need that in this case though, because we defined our Dockerfile with that nifty <code>sleep infinity</code> command.</li>
<li><code>--name magento-demo-web</code>: Set the name of our container to <code>magento-demo-web</code>.</li>
<li><code>--network magento-demo-network</code>: Make our container part of the same network as the MySQL and Elasticsearch ones.</li>
<li><code>--network-alias web</code>: Set our container’s name within the network.</li>
<li><code>-p 5000:5000</code>: Open port <code>5000</code> to access our soon-to-be-running Magento app.</li>
<li><code>-v ${PWD}:/workspaces/magento-demo</code>: Create a bind mount that maps our current working directory to the <code>/workspaces/magento-demo</code> directory within the container. Unlike the named volume that we used for MySQL, this one points at a directory that we choose on the host machine. This is where we’ll store all the Magento files. Binding these directories makes it possible to access and modify the Magento files both from the container and from the host machine. This just makes things easier and more convenient for development purposes.</li>
<li><code>magento-demo-web</code>: Use this image.</li>
</ul>
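<p>Since the whole point of the custom user is to avoid permission issues on the mounted directory, it can be worth checking that the IDs really do line up. Here’s a hedged sketch, assuming a Linux host where bind mounts keep the host’s numeric owner; <code>check_mount_owner</code> is a hypothetical helper, and <code>stat -c '%u'</code> is the GNU coreutils way of printing a file owner’s numeric ID.</p>

```shell
# Compare the host user's ID with the owner of the mounted directory as seen
# from inside the container.
# Argument: container name (default magento-demo-web).
check_mount_owner() {
    web_container="${1:-magento-demo-web}"
    host_uid="$(id -u)"
    mount_uid="$(docker exec "$web_container" stat -c '%u' /workspaces/magento-demo)"
    if [ "$host_uid" = "$mount_uid" ]; then
        echo "OK: UID $host_uid owns the workspace on both sides"
    else
        echo "Mismatch: host UID $host_uid, workspace owner UID $mount_uid" >&2
        return 1
    fi
}
```

<p>A mismatch usually means that the image was built with different <code>UID</code> and <code>GID</code> build arguments than the ones of the user running the container.</p>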
<p>Running <code>docker container ls</code> will show a list of all running containers, including the one we just created:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker container ls
CONTAINER ID   IMAGE                 COMMAND                  CREATED         STATUS         PORTS                                            NAMES
4af35c42e0bb   magento-demo-web      "/bin/bash"              5 minutes ago   Up 5 minutes   0.0.0.0:5000->5000/tcp                           magento-demo-web
6c5ea65a7bd6   elasticsearch:7.8.1   "/tini -- /usr/local…"   2 hours ago     Up 2 hours     0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp   magento-demo-elasticsearch
b73739ad5d66   mysql:5.7             "docker-entrypoint.s…"   3 hours ago     Up 3 hours     0.0.0.0:3306->3306/tcp, 33060/tcp                magento-demo-mysql
</code></pre></div><h3 id="connecting-to-the-container">Connecting to the container</h3>
<p>With the container up and running, we can connect to it with:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker exec -it magento-demo-web bash
</code></pre></div><p>You may remember this as the same command we used before to connect to the MySQL container. This time, however, we’re using it to connect to our <code>magento-demo-web</code> container, referenced by the name we gave it, and running <code>bash</code> on it in order to open a shell.</p>
<p>After that, a prompt like this should show up:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">kevin@4af35c42e0bb:/workspaces/magento-demo$
</code></pre></div><p>We’re now inside our container. Notice how we’re automatically taken to <code>/workspaces/magento-demo</code>. This is just like we specified in our Dockerfile with the <code>WORKDIR</code> command. Feel free to run <code>php -v</code> or <code>composer -V</code> to validate that the setup from our Dockerfile got all the way into our container:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">kevin@4af35c42e0bb:/workspaces/magento-demo$ php -v
PHP 7.4.9 (cli) (built: Aug 7 2020 14:30:01) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Zend OPcache v7.4.9, Copyright (c), by Zend Technologies
with Xdebug v2.9.6, Copyright (c) 2002-2020, by Derick Rethans
kevin@4af35c42e0bb:/workspaces/magento-demo$ composer -V
Composer 1.10.1 2020-03-13 20:34:27
</code></pre></div><h3 id="talking-to-other-containers-in-the-network">Talking to other containers in the network</h3>
<p>We also need to validate that our containers are actually able to talk to each other via the network that we set up. If all went according to plan, still from within our <code>magento-demo-web</code> container, this command should open a <code>mysql</code> session:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">mysql -h mysql -u kevin -p
</code></pre></div><p>Notice how this time we don’t use <code>localhost</code> or <code>127.0.0.1</code> to connect to our MySQL instance. This time, we use <code>mysql</code>. This is the network alias we gave our MySQL container, so this is how our <code>magento-demo-web</code> sees it. To <code>magento-demo-web</code>, the MySQL container is just another machine in the same network.</p>
<p>Same deal for the Elasticsearch container. We can do something like this to talk to it:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">curl elasticsearch:9200/_cat/health
</code></pre></div><p>Again, from the perspective of <code>magento-demo-web</code>, this is just another machine in the network which it can reach by using the <code>elasticsearch</code> network alias that we gave it when creating it.</p>
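<p>If one of those connections fails, a quick way to tell a networking problem from a service problem is to check whether the alias resolves at all. The sketch below is meant to be run from inside the <code>magento-demo-web</code> container; <code>check_aliases</code> is a made-up name, and <code>getent hosts</code> is the standard libc tool for looking up a hostname, which I’m assuming is present in the Ubuntu base image.</p>

```shell
# Run inside the magento-demo-web container.
# Report whether each given network alias resolves to an address.
check_aliases() {
    failed=0
    for alias_name in "$@"; do
        if getent hosts "$alias_name" >/dev/null; then
            echo "$alias_name resolves"
        else
            echo "$alias_name does NOT resolve"
            failed=1
        fi
    done
    return "$failed"
}
```

<p>For our setup that would be <code>check_aliases mysql elasticsearch</code>. An alias that doesn’t resolve usually means that the container in question isn’t running or isn’t attached to <code>magento-demo-network</code>.</p>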
<h3 id="installing-magento-in-our-container">Installing Magento in our container</h3>
<p>Now that we have our environment ready for Magento, let’s install it. First order of business is to create the Composer project:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">composer create-project --repository-url=https://repo.magento.com/ magento/project-community-edition ./install
</code></pre></div><p>If you’re familiar with Composer, then this should look very familiar to you. This command will download all the Magento files as specified by the <code>magento/project-community-edition</code> project from the <code>https://repo.magento.com/</code> repository. There are a few gotchas though:</p>
<ol>
<li>First, Magento is not openly available to download just like that. As such, Composer will ask for authentication in order to do so. Follow <a href="https://devdocs.magento.com/guides/v2.4/install-gde/prereq/connect-auth.html">this guide</a> to obtain the authentication keys from the Magento Marketplace. When Composer asks for a username, type in the public key; when it asks for password, type in the private key.</li>
<li>Second, you’ll notice that I specified <code>./install</code> at the end of that command. This is where all the files will be downloaded. I’ve chosen this (an <code>install</code> directory inside our current one) because <code>composer create-project</code> will refuse to download the files in a directory that’s not empty. Ours isn’t, because we’ve got our Dockerfile in it. But that’s nothing to worry about. Once Composer finishes downloading everything, we’ll just move the files over to their rightful location at <code>/workspaces/magento-demo</code>. You can do so with some Linux sorcery like this:</li>
</ol>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">(shopt -s dotglob; mv -v ./install/* .)
</code></pre></div><p>The Composer operation will take a good while. Once it’s done and all the contents of <code>./install</code> have been moved into <code>/workspaces/magento-demo</code>, we need to actually install Magento:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">bin/magento setup:install \
--base-url=http://localhost:5000 \
--db-host=mysql \
--db-name=magento_demo \
--db-user=kevin \
--db-password=password \
--admin-firstname=admin \
--admin-lastname=admin \
--admin-email=admin@admin.com \
--admin-user=admin \
--admin-password=admin123 \
--language=en_US \
--currency=USD \
--timezone=America/New_York \
--use-rewrites=1 \
--elasticsearch-host=elasticsearch \
--elasticsearch-port=9200
</code></pre></div><p>Even if you have never installed Magento before, the command above should be pretty straightforward. An interesting thing to note is how we’ve set up our database and Elasticsearch settings here:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> --db-host=mysql \
--db-name=magento_demo \
--db-user=kevin \
--db-password=password \
</code></pre></div><p>and</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"> --elasticsearch-host=elasticsearch \
--elasticsearch-port=9200
</code></pre></div><p><code>--db-host</code> is the hostname of the machine where the MySQL server is running. We use our container’s network alias here. <code>--db-name</code> is the name of the database we created when initializing our container via the <code>MYSQL_DATABASE</code> environment variable. <code>--db-user</code> and <code>--db-password</code> are the credentials for the login that we created in the same manner. <code>--elasticsearch-host</code> is the network alias of our Elasticsearch container, and finally <code>--elasticsearch-port</code> is the port that Elasticsearch listens on.</p>
<p>As you can see, these are the same settings that we used to configure our MySQL and Elasticsearch containers. So make sure to do the same if you’ve been following along and decided to go with different values.</p>
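<p>One way to keep these values from drifting apart is to define them once and build the installer flags from them. This is only a sketch: the variable names and the <code>magento_connection_flags</code> function are my own, and the defaults simply repeat the values used in this post.</p>

```shell
# Connection settings shared with the MySQL and Elasticsearch containers.
# Each one can be overridden from the environment before sourcing this.
: "${DB_HOST:=mysql}" "${DB_NAME:=magento_demo}"
: "${DB_USER:=kevin}" "${DB_PASSWORD:=password}"
: "${ES_HOST:=elasticsearch}" "${ES_PORT:=9200}"

# Print the connection-related flags for bin/magento setup:install,
# one per line.
magento_connection_flags() {
    printf '%s\n' \
        "--db-host=$DB_HOST" \
        "--db-name=$DB_NAME" \
        "--db-user=$DB_USER" \
        "--db-password=$DB_PASSWORD" \
        "--elasticsearch-host=$ES_HOST" \
        "--elasticsearch-port=$ES_PORT"
}
```

<p>The install command would then start with <code>bin/magento setup:install $(magento_connection_flags)</code>, followed by the remaining options. As with any unquoted command substitution, values containing spaces would be split apart, so this only suits simple values like ours.</p>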
<p>Once that command is done, we’re ready. We have a working Magento. Try it out by running this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">php -S 0.0.0.0:5000 -t ./pub/ ./phpserver/router.php
</code></pre></div><p>Then navigate to <code>localhost:5000</code> in your browser. You should see your empty Magento homepage.</p>
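<p>The same check can be done from a second terminal on the host, since the built-in PHP server keeps the first one busy. Here’s a minimal sketch, where <code>check_storefront</code> is a hypothetical helper and I’m assuming that the homepage answers with a plain <code>200</code> status; <code>--write-out '%{http_code}'</code> is curl’s way of printing just the response status.</p>

```shell
# Fetch the storefront and report whether it answers with HTTP 200.
# Argument: base URL (default http://localhost:5000).
check_storefront() {
    store_url="${1:-http://localhost:5000}"
    status="$(curl --silent --output /dev/null --write-out '%{http_code}' "$store_url")"
    if [ "$status" = "200" ]; then
        echo "storefront is up"
    else
        echo "unexpected HTTP status: $status" >&2
        return 1
    fi
}
```

<p>The very first request can be slow while Magento warms up its caches, so give it some time before deciding that something is broken.</p>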
<h3 id="optional-installing-the-sample-data">Optional: Installing the sample data</h3>
<p>If you’re planning some custom extension, or to just play with Magento to get to know it better, you may want to add some sample data. Luckily, the Magento devs have graciously provided such a thing in the form of a Composer package. If you want, you can install it with this recipe:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">bin/magento sampledata:deploy
bin/magento setup:upgrade
bin/magento indexer:reindex
bin/magento cache:flush
</code></pre></div><p><code>bin/magento sampledata:deploy</code> will also ask you for your Magento Marketplace keys, so have them ready.</p>
<p>So turn off the built-in PHP server, run these, wait a good while, and fire up the built-in server once more. Your Magento app should now have a catalog and all sorts of other data loaded in.</p>
<h3 id="composing-it-all-together">Composing it all together</h3>
<p>Now that was a lot. It was much easier than having to set everything up from scratch without Docker, but still, I promised a minimal setup overhead. A single command. With Docker Compose we can do just that.</p>
<p>For containers, the usual workflow is a three step process:</p>
<ol>
<li>Create the Dockerfile (sometimes omitted if we have a readily available image, as was the case with MySQL and Elasticsearch).</li>
<li>Create or download an image.</li>
<li>Run the container.</li>
</ol>
<p>Docker Compose can help us by capturing all the settings needed to create containers in a single YAML file, which a CLI tool (<code>docker-compose</code>) can then use to set up the complete infrastructure. This single file is named <code>docker-compose.yml</code>, and this is what it may look like for our current setup:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml"><span style="color:#b06;font-weight:bold">version</span>:<span style="color:#bbb"> </span><span style="color:#d20;background-color:#fff0f0">"3.8"</span><span style="color:#bbb">
</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#888"># Listing our three containers. Or "services", as known by Docker Compose.</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#b06;font-weight:bold">services</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># Defining our MySQL container.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># "mysql" will be the network alias for this container.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">mysql</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">image</span>:<span style="color:#bbb"> </span>mysql:5.7<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">container_name</span>:<span style="color:#bbb"> </span>magento-demo-mysql<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">networks</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- magento-demo-network<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">ports</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- <span style="color:#d20;background-color:#fff0f0">"3306:3306"</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">volumes</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- magento-demo-mysql-data:/var/lib/mysql<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">environment</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">MYSQL_ROOT_PASSWORD</span>:<span style="color:#bbb"> </span>password<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">MYSQL_USER</span>:<span style="color:#bbb"> </span>kevin<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">MYSQL_PASSWORD</span>:<span style="color:#bbb"> </span>password<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">MYSQL_DATABASE</span>:<span style="color:#bbb"> </span>magento_demo<span style="color:#bbb">
</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># Defining our Elasticsearch container</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># "elasticsearch" will be the network alias for this container.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">elasticsearch</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">image</span>:<span style="color:#bbb"> </span>elasticsearch:7.8.1<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">container_name</span>:<span style="color:#bbb"> </span>magento-demo-elasticsearch<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">networks</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- magento-demo-network<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">ports</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- <span style="color:#d20;background-color:#fff0f0">"9200:9200"</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span>- <span style="color:#d20;background-color:#fff0f0">"9300:9300"</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">environment</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">discovery.type</span>:<span style="color:#bbb"> </span>single-node<span style="color:#bbb">
</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># Defining our custom Magento 2 container.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># "web" will be the network alias for this container.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">web</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># The build section tells Docker Compose how to build the image.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#888"># This essentially runs a "docker build" command.</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">build</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">context</span>:<span style="color:#bbb"> </span>.<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">dockerfile</span>:<span style="color:#bbb"> </span>Dockerfile<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">args</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">USER</span>:<span style="color:#bbb"> </span>kevin<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">UID</span>:<span style="color:#bbb"> </span><span style="color:#00d;font-weight:bold">1000</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">GID</span>:<span style="color:#bbb"> </span><span style="color:#00d;font-weight:bold">1000</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">container_name</span>:<span style="color:#bbb"> </span>magento-demo-web<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">networks</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- magento-demo-network<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">ports</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- <span style="color:#d20;background-color:#fff0f0">"5000:5000"</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">volumes</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- .:/workspaces/magento-demo<span style="color:#bbb">
</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#888"># The volume that is used by the MySQL container</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#b06;font-weight:bold">volumes</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">magento-demo-mysql-data</span>:<span style="color:#bbb">
</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#888"># The network where all the containers will live</span><span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#b06;font-weight:bold">networks</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#b06;font-weight:bold">magento-demo-network</span>:<span style="color:#bbb">
</span></code></pre></div><p>As you can see, most of <code>docker-compose.yml</code> is more or less a rewrite of the <code>docker run</code> commands in YAML format. The exception is the <code>web</code> container/service, which includes a <code>build</code> section that reflects the <code>docker build</code> command we used to turn the Dockerfile into an image.</p>
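<p>To make that mapping concrete, here is roughly what the <code>mysql</code> service would look like as a single <code>docker run</code> command. This is a sketch reconstructed from the YAML above, not a copy of the commands used earlier, so the exact flags may differ. It also assumes that the <code>magento-demo-network</code> network already exists.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># container_name -> --name          networks    -> --network / --network-alias
# ports          -> -p              volumes     -> -v
# environment    -> -e              image       -> last argument
docker run -d \
  --name magento-demo-mysql \
  --network magento-demo-network \
  --network-alias mysql \
  -p 3306:3306 \
  -v magento-demo-mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=password \
  -e MYSQL_USER=kevin \
  -e MYSQL_PASSWORD=password \
  -e MYSQL_DATABASE=magento_demo \
  mysql:5.7
</code></pre></div><p>The Compose file captures the same settings declaratively, for all three containers at once.</p>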
<p>If you want to try it out, make sure to remove all the infrastructure we’ve created, to avoid any conflicts. You can do so from your host machine with these commands:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker container rm -f magento-demo-web magento-demo-elasticsearch magento-demo-mysql
docker image rm magento-demo-web
docker network rm magento-demo-network
docker volume rm magento-demo-mysql-data
</code></pre></div><p>Make sure you’re in the directory where the Dockerfile lives in the host machine. Then create a new <code>docker-compose.yml</code> file and put all the content above into it. Finally, run:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker-compose up -d
</code></pre></div><p>This will take a little while, but by the end of it, you’ll have a complete infrastructure with the three containers that we’ve created step by step throughout this article. With the <code>docker-compose.yml</code> file, <code>docker-compose up</code> essentially takes care of running all of our <code>docker build</code> and <code>docker run</code> commands.</p>
<p>The <code>-d</code> option means that the command will run in the background and give you back control of your console. You can also omit it if you want the console to show the logs from the containers.</p>
<p>You can still see the logs even in detached mode with:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker-compose logs
</code></pre></div><p>You can also inspect the running containers. For that, you can use:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker-compose ps
</code></pre></div><p>Output will look something like this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">$ docker-compose ps
Name Command State Ports
--------------------------------------------------------------------------------------------------------------------
magento-demo-elasticsearch /tini -- /usr/local/bin/do ... Up 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp
magento-demo-mysql docker-entrypoint.sh mysqld Up 0.0.0.0:3306->3306/tcp, 33060/tcp
magento-demo-web sleep infinity Up 0.0.0.0:5000->5000/tcp
</code></pre></div><p>Notice how <code>docker-compose ps</code> gives us our container names just as we specified them in the <code>docker-compose.yml</code> file.</p>
<p><code>docker-compose</code> has many other utilities. Check them out with <code>docker-compose --help</code>.</p>
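<p>A few of them tend to come up often in day-to-day work with a setup like this one. The selection below is just a sketch of what may be useful; run the commands from the directory that contains <code>docker-compose.yml</code>.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># Validate docker-compose.yml without printing the resolved configuration.
docker-compose config -q

# Follow the logs of just the "web" service, starting from its last 50 lines.
docker-compose logs -f --tail=50 web

# Stop the containers but keep them around, then start them again later.
docker-compose stop
docker-compose start

# Remove the containers and the network.
# Adding -v would also delete the named MySQL volume, and the database with it.
docker-compose down
</code></pre></div>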
<p>Now, same as before, we still need to open a terminal into our Magento container to run some installation commands on it. To do so, we can run the following command:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker-compose exec web bash
</code></pre></div><p>Notice how with <code>docker-compose</code> we refer to the container via its service name. That is, the name we gave the container under the <code>services</code> section of <code>docker-compose.yml</code>.</p>
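<p>The same service name also works for one-off commands, when a full interactive shell is more than we need. As a small sketch, this asks the container for its PHP version:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># Run a single command in the "web" service and return to the host shell.
docker-compose exec web php --version
</code></pre></div>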
<p>Of course, we can still use the same command that we used before, when we created our container directly with <code>docker</code>:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">docker exec -it magento-demo-web bash
</code></pre></div><p>Now, once inside our container we need to install Magento again. Remember that we wiped out all the infrastructure we created manually, so these are fresh new containers; akin to new machines.</p>
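<p>Since the containers are brand new, it may be worth confirming that the other two services can be reached through their network aliases before reinstalling. The checks below are a sketch to be run from inside the <code>web</code> container. They assume that <code>curl</code> and PHP’s <code>pdo_mysql</code> extension are present in the image, and they use the credentials from our <code>docker-compose.yml</code>. MySQL can take a little while to initialize on its first start, so a failure right after <code>docker-compose up</code> may simply mean it is not ready yet.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain"># Elasticsearch should answer with a JSON document describing the node.
curl -s http://elasticsearch:9200

# MySQL should accept the login defined in docker-compose.yml.
# PDO throws an exception if the connection fails, so the message only prints on success.
php -r 'new PDO("mysql:host=mysql;dbname=magento_demo", "kevin", "password"); echo "MySQL connection OK\n";'
</code></pre></div>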
<p>If you were running this from scratch you would just go ahead and do…</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">composer create-project --repository-url=https://repo.magento.com/ magento/project-community-edition ./install
</code></pre></div><p>and</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">(shopt -s dotglob; mv -v ./install/* .)
</code></pre></div><p>In this case, however, we already have all the Magento files in our directory, so we can save time and skip this step. We can reuse these files and just run <code>bin/magento setup:install</code>.</p>
<p>But since this is a new Magento installation, we do need to remove the config file before <code>setup:install</code>’ing. So go ahead and…</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">rm app/etc/env.php
</code></pre></div><p>…then:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">bin/magento setup:install \
--base-url=http://localhost:5000 \
--db-host=mysql \
--db-name=magento_demo \
--db-user=kevin \
--db-password=password \
--admin-firstname=admin \
--admin-lastname=admin \
--admin-email=admin@admin.com \
--admin-user=admin \
--admin-password=admin123 \
--language=en_US \
--currency=USD \
--timezone=America/New_York \
--use-rewrites=1 \
--elasticsearch-host=elasticsearch \
--elasticsearch-port=9200
</code></pre></div><p>After a while, Magento will be fully installed in our new infrastructure created by Docker Compose and ready to be fired up via the PHP built-in server:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-plain" data-lang="plain">php -S 0.0.0.0:5000 -t ./pub/ ./phpserver/router.php
</code></pre></div><h3 id="bonus-interactive-debugging-with-visual-studio-code">Bonus: Interactive debugging with Visual Studio Code</h3>
<p>So this is a fully functioning Magento installation with files that we can edit to our heart’s content. In terms of a “fully featured” development environment, however, we need to spruce it up a bit.</p>
<p>So install VS Code from <a href="https://code.visualstudio.com/">https://code.visualstudio.com/</a> and install the <a href="https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack">Remote Development plugin</a>.</p>
<p>Open a new VS Code window and open the command palette with <code>Ctrl + Shift + P</code>. In there, type in <code>Remote-Containers: Attach to Running Container...</code> and press <code>Enter</code>. In the menu that shows up, select our <code>magento-demo-web</code> container.</p>
<p>That will result in a new VS Code instance that is connected to the container. Open an integrated terminal in VS Code and you’ll see:</p>
<p><img src="/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/vscode.png" alt="VS Code with Remote Development"></p>
<p>Now, install the <a href="https://marketplace.visualstudio.com/items?itemName=felixfbecker.php-debug">PHP Debug extension</a> so that we can take advantage of that Xdebug that we installed in our container via our Dockerfile.</p>
<p>Create a new launch configuration for interactive debugging with PHP by clicking on the “Run” button in the action bar to the left (<code>Ctrl + Shift + D</code> also works). Click the “create a launch.json file” link in the pane that appears. Then, in the resulting menu at the top of the window, select the “PHP” option. Here’s a screen capture for guidance:</p>
<p><img src="/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/opening_debug.jpg" alt="Setting up debugging in VS Code"></p>
<p>That will result in a new <code>.vscode/launch.json</code> file created that contains the launch configuration for the PHP debugger.</p>
<p>Now let’s put a breakpoint anywhere, like on line 13 of the <code>pub/index.php</code> file. Press the “Start debugging” button in the “Run” pane, near the top left of the screen, making sure that the “Listen for XDebug” option is selected. Then start up the PHP built-in server from VS Code’s integrated terminal with <code>php -S 0.0.0.0:5000 -t ./pub/ ./phpserver/router.php</code>. Now navigate to <code>localhost:5000</code> in your browser and enjoy VS Code’s interactive debugging experience:</p>
<p><img src="/blog/2020/08/containerizing-magento-with-docker-compose-elasticsearch-mysql-and-magento/debugging.png" alt="Debugging Magento in VS Code"></p>
<h3 id="summary">Summary</h3>
<p>Whew! That was quite a bit. In this blog post, we’ve done a deep dive into how to set up all the pieces of a Magento application using Docker containers: MySQL, Elasticsearch, and Magento itself. Then, we captured all that knowledge in a single <code>docker-compose.yml</code> file which can be run with a single <code>docker-compose up</code> command to provision all the infrastructure on our local machine. As a cherry on top, we set up interactive debugging of our brand new Magento application with VS Code. Thanks to the safety net provided by these tools, I feel like I’m ready to really dig into Magento and start developing customizations, or debugging existing websites. If you’ve been following along this far, dear reader, I hope you do too.</p>
Riding the Elasticsearch River on a CouchDB: Part 1https://www.endpointdev.com/blog/2015/01/riding-elasticsearch-river-on-bigcouch/2015-01-12T00:00:00+00:00Brian Gadoury
<p>As you may have guessed from my perfect tan and rugged good looks, I am Phunk, your river guide. In this multi-part series, I will guide us through an exploration of <a href="https://www.elastic.co/">Elasticsearch</a>, its <a href="https://github.com/elastic/elasticsearch-river-couchdb">CouchDB/BigCouch River plugin</a>, its source, the <a href="http://couchdb.apache.org/">CouchDB document store</a>, and the surrounding flora and fauna that are the Ruby on Rails based tools I created to help <a href="https://dp.la/">the DPLA</a> project manage this ecosystem.</p>
<p>Before we get our feet wet, let’s go through a quick safety briefing to discuss the terms I’ll be using as your guide on this trip.</p>
<ul>
<li>Elasticsearch: A schema-less, JSON-based, distributed RESTful search engine.</li>
<li>The River: An Elasticsearch plugin that automatically indexes changes in your upstream (heh) document store, in real-time.</li>
<li>CouchDB: The fault-tolerant, distributed NoSQL database / document store.</li>
<li>DPLA: The Digital Public Library of America open source project for which all this work was done.</li>
</ul>
<p>Let’s put on our flotation devices, don our metaphor helmets and cast off.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="/blog/2015/01/riding-elasticsearch-river-on-bigcouch/image-0-big.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="/blog/2015/01/riding-elasticsearch-river-on-bigcouch/image-0.jpeg"/></a></div>
<p>In an Elasticsearch + River + CouchDB architecture, all things flow from the CouchDB. For the DPLA project, we wanted to manage (create, update and delete) documents in our CouchDB document repository and have those changes automagically reflected in our Elasticsearch index. Luckily, CouchDB publishes a real-time stream (heh) of updates to its documents via its cleverly named “_changes” feed. Each change in that feed is published as a stand-alone JSON document. We’ll look at that feed in more detail in a bit.</p>
<p>The River bridges (heh) the gap between CouchDB’s _changes feed and the Elasticsearch index. The plugin runs inside Elasticsearch, and makes a persistent TCP connection to CouchDB’s _changes endpoint. When a new change is published to that endpoint, the River passes the relevant portions of that JSON up to Elasticsearch, which then makes the appropriate change to its index. Let’s look at a simple timeline of what the River would see from the _changes feed during the creation of a new document in CouchDB, and then an update to that document:</p>
<p>When a document is created in CouchDB, the _changes feed emits:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript">{
<span style="color:#d20;background-color:#fff0f0">"seq"</span>:<span style="color:#00d;font-weight:bold">1</span>,
<span style="color:#d20;background-color:#fff0f0">"id"</span>:<span style="color:#d20;background-color:#fff0f0">"test1"</span>,
<span style="color:#d20;background-color:#fff0f0">"changes"</span>:[{<span style="color:#d20;background-color:#fff0f0">"rev"</span>:<span style="color:#d20;background-color:#fff0f0">"1-967a00dff5e02add41819138abb3284d"</span>}],
<span style="color:#d20;background-color:#fff0f0">"doc"</span>:{
<span style="color:#d20;background-color:#fff0f0">"_id"</span>:<span style="color:#d20;background-color:#fff0f0">"test1"</span>,
<span style="color:#d20;background-color:#fff0f0">"_rev"</span>:<span style="color:#d20;background-color:#fff0f0">"1-967a00dff5e02add41819138abb3284d"</span>,
<span style="color:#d20;background-color:#fff0f0">"my_field"</span>:<span style="color:#d20;background-color:#fff0f0">"value1"</span>
}
}
</code></pre></div><p>When that same document is updated in CouchDB, the _changes feed emits:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript">{
<span style="color:#d20;background-color:#fff0f0">"seq"</span>:<span style="color:#00d;font-weight:bold">2</span>,
<span style="color:#d20;background-color:#fff0f0">"id"</span>:<span style="color:#d20;background-color:#fff0f0">"test1"</span>,
<span style="color:#d20;background-color:#fff0f0">"changes"</span>:[{<span style="color:#d20;background-color:#fff0f0">"rev"</span>:<span style="color:#d20;background-color:#fff0f0">"2-80647a2a9498f5c124b1b3cc1d6c6360"</span>}],
<span style="color:#d20;background-color:#fff0f0">"doc"</span>:{
<span style="color:#d20;background-color:#fff0f0">"_id"</span>:<span style="color:#d20;background-color:#fff0f0">"test1"</span>,
<span style="color:#d20;background-color:#fff0f0">"_rev"</span>:<span style="color:#d20;background-color:#fff0f0">"2-80647a2a9498f5c124b1b3cc1d6c6360"</span>,
<span style="color:#d20;background-color:#fff0f0">"my_field"</span>:<span style="color:#d20;background-color:#fff0f0">"value2"</span>
}
}
</code></pre></div><p>It’s tough to tell from this contrived example document, but the _changes feed actually includes the entire source document JSON for creates and updates. (I’ll talk more about that in part 2.) From the above JSON examples, the River would pass the inner-most document containing the _id, _rev and my_field data up to Elasticsearch. Elasticsearch uses that JSON to update the corresponding document (keyed by _id) in its search index and voila, the document you updated in CouchDB is now updated in your Elasticsearch search index in real-time.</p>
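<p>If you want to watch that feed yourself, you can ask CouchDB for it directly. The request below is a sketch rather than what the River does internally: it assumes CouchDB is listening on its default port on <code>localhost</code>, and <code>my_db</code> is a placeholder database name.</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash"># Keep the connection open and print each change as it happens.
# include_docs=true adds the full source document under the "doc" key.
# -G sends the -d values as query string parameters on a GET request.
curl -G 'http://localhost:5984/my_db/_changes' \
  -d feed=continuous \
  -d include_docs=true
</code></pre></div><p>Create or update a document in another terminal and a new line of JSON, shaped like the examples above, should show up in the stream.</p>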
<p>We have now gotten our feet wet with how a document flows from one end to the other in this architecture. In part 2, we’ll dive deeper into the DevOps-heavy care, feeding, monitoring and testing of the River. We’ll also look at some slick River tricks that can transform your documents before Elasticsearch gets them, and any other silly River puns I can come up with. I’ll also be reading the entire thing in my best <a href="http://www.bbc.co.uk/nature/collections/p0048522">David Attenborough</a> impression and posting it on SoundCloud.</p>
Elasticsearch: Give me object!https://www.endpointdev.com/blog/2013/04/elasticsearch-object-mapping-eof-400/2013-04-30T00:00:00+00:00Miguel Alatorre
<p>I’m currently working on a project where Elasticsearch is used to index copious amounts of data with sometimes deeply nested JSON. A recurring error I’ve experienced is caused by a field not conforming to the type listed in the mapping. Let’s reproduce it on a small scale.</p>
<p>Assuming you have Elasticsearch installed, let’s create an index and mapping:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">$ curl -XPUT <span style="color:#d20;background-color:#fff0f0">'http://localhost:9200/test'</span> -d <span style="color:#d20;background-color:#fff0f0">'
</span><span style="color:#d20;background-color:#fff0f0">{
</span><span style="color:#d20;background-color:#fff0f0"> "mappings": {
</span><span style="color:#d20;background-color:#fff0f0"> "item": {
</span><span style="color:#d20;background-color:#fff0f0"> "properties": {
</span><span style="color:#d20;background-color:#fff0f0"> "state": {
</span><span style="color:#d20;background-color:#fff0f0"> "properties": {
</span><span style="color:#d20;background-color:#fff0f0"> "name": {"type": "string"}
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0">}
</span><span style="color:#d20;background-color:#fff0f0">'</span>
{<span style="color:#d20;background-color:#fff0f0">"ok"</span>:true,<span style="color:#d20;background-color:#fff0f0">"acknowledged"</span>:true}
</code></pre></div><p>Since we’ve defined properties for the “state” field, Elasticsearch will automatically treat it as an object.* Let’s now add some documents:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">$ curl -XPUT <span style="color:#d20;background-color:#fff0f0">'http://localhost:9200/test/item/1'</span> -d <span style="color:#d20;background-color:#fff0f0">'
</span><span style="color:#d20;background-color:#fff0f0">{
</span><span style="color:#d20;background-color:#fff0f0"> "state": {
</span><span style="color:#d20;background-color:#fff0f0"> "name": "North Carolina"
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0">}
</span><span style="color:#d20;background-color:#fff0f0">'</span>
{<span style="color:#d20;background-color:#fff0f0">"ok"</span>:true,<span style="color:#d20;background-color:#fff0f0">"_index"</span>:<span style="color:#d20;background-color:#fff0f0">"test"</span>,<span style="color:#d20;background-color:#fff0f0">"_type"</span>:<span style="color:#d20;background-color:#fff0f0">"item"</span>,<span style="color:#d20;background-color:#fff0f0">"_id"</span>:<span style="color:#d20;background-color:#fff0f0">"1"</span>,<span style="color:#d20;background-color:#fff0f0">"_version"</span>:1}
</code></pre></div><p>Success! Let’s now get into trouble:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">$ curl -XPUT <span style="color:#d20;background-color:#fff0f0">'http://localhost:9200/test/item/2'</span> -d <span style="color:#d20;background-color:#fff0f0">'
</span><span style="color:#d20;background-color:#fff0f0">{
</span><span style="color:#d20;background-color:#fff0f0"> "state": "California"
</span><span style="color:#d20;background-color:#fff0f0">}
</span><span style="color:#d20;background-color:#fff0f0">'</span>
{<span style="color:#d20;background-color:#fff0f0">"error"</span>:<span style="color:#d20;background-color:#fff0f0">"MapperParsingException[object mapping for [state] tried to parse as object, but got EOF, has a concrete value been provided to it?]"</span>,<span style="color:#d20;background-color:#fff0f0">"status"</span>:400}
</code></pre></div><p>The solution: check any non-objects in your data against your mapping schema and you’ll be sure to find a mismatch.</p>
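<p>For our small example, that means sending “state” as an object with a “name” key, just like the first document. A sketch of the corrected request, against the same local <code>test</code> index:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash"># "state" is now an object, which matches the mapping defined earlier.
curl -XPUT 'http://localhost:9200/test/item/2' -d '
{
  "state": {
    "name": "California"
  }
}
'
</code></pre></div>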
<p>One thing to note is that the explicit creation of the mapping is unnecessary since Elasticsearch creates it using the first added document. Try this:</p>
<div class="highlight"><pre tabindex="0" style="background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">$ curl -XPUT <span style="color:#d20;background-color:#fff0f0">'http://localhost:9200/test2/item/1'</span> -d <span style="color:#d20;background-color:#fff0f0">'
</span><span style="color:#d20;background-color:#fff0f0">{
</span><span style="color:#d20;background-color:#fff0f0"> "state": {
</span><span style="color:#d20;background-color:#fff0f0"> "name": "North Carolina"
</span><span style="color:#d20;background-color:#fff0f0"> }
</span><span style="color:#d20;background-color:#fff0f0">}
</span><span style="color:#d20;background-color:#fff0f0">'</span>
{<span style="color:#d20;background-color:#fff0f0">"ok"</span>:true,<span style="color:#d20;background-color:#fff0f0">"_index"</span>:<span style="color:#d20;background-color:#fff0f0">"test2"</span>,<span style="color:#d20;background-color:#fff0f0">"_type"</span>:<span style="color:#d20;background-color:#fff0f0">"item"</span>,<span style="color:#d20;background-color:#fff0f0">"_id"</span>:<span style="color:#d20;background-color:#fff0f0">"1"</span>,<span style="color:#d20;background-color:#fff0f0">"_version"</span>:1}
$ curl -XGET <span style="color:#d20;background-color:#fff0f0">'http://localhost:9200/test2/_mapping'</span>
{
<span style="color:#d20;background-color:#fff0f0">"test2"</span>: {
<span style="color:#d20;background-color:#fff0f0">"item"</span>: {
<span style="color:#d20;background-color:#fff0f0">"properties"</span>: {
<span style="color:#d20;background-color:#fff0f0">"state"</span>: {
<span style="color:#d20;background-color:#fff0f0">"dynamic"</span>:<span style="color:#d20;background-color:#fff0f0">"true"</span>,
<span style="color:#d20;background-color:#fff0f0">"properties"</span>: {
<span style="color:#d20;background-color:#fff0f0">"name"</span>: {<span style="color:#d20;background-color:#fff0f0">"type"</span>:<span style="color:#d20;background-color:#fff0f0">"string"</span>}
}
}
}
}
}
}
</code></pre></div><p>So, this stays true to the statement: <a href="https://elasticsearch.com/products/elasticsearch/">“Elasticsearch is schema-less, just toss it a typed JSON document and it will automatically index it.”</a> You can throw your car keys at Elasticsearch and it will index them. However, as noted above, just be sure to keep throwing nothing but car keys.</p>
<p>*Anything with one or more nested key-value pairs is considered an object in Elasticsearch. For more on the object type, see <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html">here</a>.</p>