Introduction to Docker and Dockerizing Node.js application
According to the official documentation, Docker is an open platform for developing, shipping, and running applications. It enables us to separate applications from infrastructure so that we can deliver software quickly. It is one of the most popular containerization platforms and provides the ability to package and run an application in a loosely isolated environment called a container.
You can click on the following link to get in-depth knowledge about Docker and its architecture.
The easiest way to install Docker on our local machine is Docker Desktop. It is available on all major platforms. It is a tool with a graphical user interface that is very easy to use and can be used to manage images, containers, volumes, and more. It includes the following components:
- Docker Engine
- Docker CLI client
- Docker Compose
- Credential Helper
Click on the following link to download and install the package according to your machine's operating system:
To install on Linux machines:
To install on macOS machines:
To install on Windows machines:
After installing Docker, verify the installation by issuing the following command:
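A common way to verify the installation is to print the installed version:

```shell
docker --version
```

If the installation succeeded, this prints a line starting with "Docker version" followed by the version number and build identifier.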
The Docker Desktop dashboard looks like below:
The following are some of the most important and frequently used components of Docker.
Docker Hub is the public registry of Docker images, containing hundreds of thousands of container images. It is the world's largest library and community for container images. Here, you can search for almost any kind of container image, and chances are you will find it most of the time. Docker Hub offers a free tier as well as paid plans. The free tier has limitations that we need to consider, but for individuals and small teams these usually have no impact. With paid plans, we can create private repositories.
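Docker Hub can also be searched from the command line. For example, to look for Node.js images:

```shell
docker search node
```

This queries Docker Hub and lists matching repositories along with their descriptions and star counts.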
A Docker image contains all the necessary software packages, code, and associated dependencies, from which we can create a container.
A container is a loosely isolated environment which enables us to run Docker images containing all the project code and its entire set of dependencies. A container is a running instance of an image. A container has its own file system and its own networking.
With the help of Docker networks, containers can communicate with each other and with the host machine. When we start a container, Docker automatically attaches it to a default network named bridge.
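As an illustrative sketch (the network name my-network is just an example), we can list the existing networks and create a user-defined one:

```shell
# list available networks; bridge, host and none exist by default
docker network ls

# create a user-defined bridge network named my-network
docker network create my-network

# containers started with --network my-network can then reach
# each other by container name, e.g.:
#   docker run -d --network my-network --name web IMAGE_NAME
```

User-defined bridge networks are generally preferred over the default bridge because they provide automatic DNS resolution between containers.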
We can use volumes to persist data generated by a Docker container. We can create a volume using the docker volume create volumeName command, and it is stored within a directory on the Docker host.
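For example (app-data is a hypothetical volume name), a volume can be created, inspected, and then mounted into a container with the -v flag:

```shell
# create a named volume
docker volume create app-data

# list volumes and inspect where the data lives on the host
docker volume ls
docker volume inspect app-data

# mount the volume at /data inside a container:
#   docker run -v app-data:/data IMAGE_NAME
```

Data written to /data inside the container survives container removal, because it lives in the volume on the host.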
A Dockerfile is a text document which contains the instructions to assemble a Docker image. When we issue the docker build command, Docker reads the instructions from the Dockerfile line by line and executes them to create an image with all the necessary resources and dependencies required to start an application.
By default, when we run the docker build command, it looks for a file called Dockerfile in the directory from which the build command is executed, and this whole project directory acts as the build context for Docker. So, let's create a file named Dockerfile in the root of our project directory.
If we want to use some other name for the Dockerfile, then we need to specify the filename when issuing the docker build command. Suppose we have a Dockerfile named travel_app.Dockerfile:
docker build -t travel-app -f travel_app.Dockerfile .
- -f specifies the filename of the Dockerfile.
- -t specifies the tag for the image, a human-readable identifier that stays constant across builds.
Let's create a Dockerfile. The first thing we need to add to the file is the base image. For this application, we have used Alpine Linux. We could have chosen any other Linux distribution, but Alpine Linux has many benefits over the others: it is simple, secure, and extremely lightweight, and it has one of the fastest boot times of any operating system. For production systems, Alpine is a popular choice.
As of writing this article, 3.16 is the latest version of the alpine image hosted on Docker Hub. Specifying a particular version rather than the latest tag when using base images has many benefits, as it can prevent unexpected errors due to version mismatches or upgrades.
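The base image is declared with the FROM instruction, which must be the first instruction in our Dockerfile:

```dockerfile
FROM alpine:3.16
```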
Let's define some variables. In docker, we can declare variables using ARG instruction.
ARG WORKING_DIR=$HOME/travel-app/
ARG PORT=3000
The WORKING_DIR variable points to the directory that will hold all of our application code, from which we will start the Node.js server.
PORT is the port number that our application uses to listen for client requests.
Next, we need to install Node.js and the npm package manager.
RUN apk add --update-cache nodejs npm
We then need to define a working directory using the WORKDIR instruction, which tells Docker to use the given path as the default location from which subsequent instructions are executed.
We are assigning a variable to the WORKDIR instruction. All the instructions in the Dockerfile after it will use $HOME/travel-app/ as the base directory.
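Matching the final Dockerfile shown later in this chapter, the instruction looks like:

```dockerfile
WORKDIR $WORKING_DIR
```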
Copy everything from the current build context to the working directory defined by the $WORKING_DIR variable. To copy, we can use the COPY instruction.
COPY . $WORKING_DIR/
Install all the required npm packages using npm install command:
RUN npm install
Then, let's use the EXPOSE instruction to inform Docker that the container listens on the specified port at runtime. The TCP protocol is used by default. Note that the EXPOSE instruction does not actually publish the port; we need to publish the specified port with the docker run command when starting a container. It functions as documentation between the person who builds the image and the person who runs the container.
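Using the PORT variable we declared earlier, the instruction is:

```dockerfile
EXPOSE $PORT
```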
Finally, we need to provide the command that will start the Node.js server, using the CMD instruction. The CMD instruction provides defaults for an executing container. In our case, once the container starts, it will execute the node index.js command.
CMD [ "node", "index.js" ]
Our Dockerfile is ready. Before building a Docker image based on it, we need to know something about the .dockerignore file as well.
Just like with .gitignore, which we discussed in detail in an earlier chapter, we do not want Docker to copy everything from the build context when building an image. To instruct Docker to ignore certain files and folders, we need a file called .dockerignore, which we should create in the same directory as the Dockerfile.
Copy the following contents into the .dockerignore file so that the listed files and folders are not copied into the Docker image. Here, we don't want the node_modules directory to be copied, since we can recreate it easily with the npm install command. Copying code-editor-specific files and folders is also unnecessary for running our project in a Docker container.
# Dependency directories
node_modules

### VisualStudioCode ###
.vscode/*
!.vscode/settings.json
!.vscode/tasks.json
!.vscode/launch.json
!.vscode/extensions.json
!.vscode/*.code-snippets
Now our directory structure looks like below:
Our Dockerfile content with a very basic implementation looks like below:
FROM alpine:3.16
ARG WORKING_DIR=$HOME/travel-app/
ARG PORT=3000
RUN apk add --update-cache nodejs npm
WORKDIR $WORKING_DIR
COPY . $WORKING_DIR/
RUN npm install
EXPOSE $PORT
CMD [ "node", "index.js" ]
Building Docker Images
Now, it's time to build a Docker image using the above Dockerfile. First, let's check the existing Docker images in the system:
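The command to list local images is:

```shell
docker images
```

It prints a table with the repository, tag, image ID, creation time, and size of each image on the host.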
If this is the first time you are using Docker on your system, then there will not be any images listed for the above command. If you have already used Docker to build images, then you will see a list of images with various information.
To build a docker image, we need to navigate to the project directory where Dockerfile resides and then issue the following command:
docker build .
Here, the dot (.) denotes the current directory as the build context for the docker command. The above command creates an image, but without any tag that we could use to identify this particular image. In this scenario, we can interact with this image only by its Image ID, as shown in the image below. The Image ID can change on every docker build if changes are detected in the working directory, so it is not user friendly to use the Image ID as a unique identifier for interacting with images.
Let's see the detailed information about this image:
docker image inspect IMAGE_ID
Here, we have to replace IMAGE_ID with the real one.
To build a Docker image with a tag, which in our case is travel-app, issue the following command:
docker build -t travel-app .
We can interact with this image using the travel-app tag, which stays constant no matter how many times you build the image.
Run Docker Container
Now that our Docker image is ready, we can use it to create a container. To start a container from the above Docker image, we can run the following command:
docker run -p HOST_PORT:CONTAINER_PORT IMAGE_NAME
Here, -p, short for --publish, is used to publish a port from the container to the host. We will use 3000 for both the host port and the container port; you can modify the host port as per your needs. Replace IMAGE_NAME with the name of the Docker image.
docker run -p 3000:3000 travel-app
This will run the Node.js server inside the Docker container, which then starts listening for client requests on port 3000, as specified in our Dockerfile. Run the following command to list the running Docker containers:
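The command to list running containers is:

```shell
docker ps
```

It shows each running container's ID, image, command, uptime, published ports, and name. Adding the -a flag also lists stopped containers.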
As you can see, the name of the container is a random value. As with images, it is easier to interact with containers using a specific name. Let's first kill (forcefully stop) the existing container using its container ID:
docker kill CONTAINER_ID
Run the following command to start the container in detached mode using a specific name:
docker run -d --name CONTAINER_NAME -p HOST_PORT:CONTAINER_PORT IMAGE_NAME
Replace CONTAINER_NAME with travel-app as docker container name and also replace IMAGE_NAME with travel-app as image name:
docker run -d --name travel-app -p 3000:3000 travel-app
Run docker ps command to check the updated list of docker containers:
We can use the following command to check the logs from a Docker container in follow mode; replace CONTAINER_NAME with travel-app in our case:
docker logs -f CONTAINER_NAME
To stop a container:
docker stop CONTAINER_NAME
To start a container:
docker start CONTAINER_NAME
To restart a container:
docker restart CONTAINER_NAME
To get a shell inside of the Docker container:
docker exec -it CONTAINER_NAME /bin/sh
To remove a container, it first needs to be stopped and can then be removed using the following command:
docker rm CONTAINER_NAME
After removing a container, we can also remove docker image using following command:
docker rmi IMAGE_NAME
The above Dockerfile implementation is a very basic one. We can make a lot of enhancements in terms of performance and security. Let's implement some of them in this chapter.
As you are well aware, running the npm install command is one of the most time-consuming tasks when dealing with Node.js applications. It downloads all the project dependencies, which can be huge in size. So, let's focus our enhancements on this part.
1> To run the application inside a Docker container, we only need the production npm dependencies. We can safely ignore the dev dependencies, and to do that we can issue the following command:
RUN npm install --only=production
2> As of now, when you change anything in the project directory and execute the docker build command, you will see the npm install command always running, even though no npm packages were added or removed. And we just discussed that npm install is one of the most time-consuming actions. In large projects, this is a major problem, as it can dramatically slow down the build process of an application. So, to solve this issue, we need to figure out a way to separate the packages from the code while copying in a Dockerfile.
Every instruction in a Dockerfile creates a new intermediate layer, and each layer is automatically cached after a build. Docker will reuse an existing cached layer whenever possible if no changes are detected in the associated files.
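We can inspect the layers that make up an image with the docker history command (here applied to our travel-app image from earlier):

```shell
docker history travel-app
```

Each row corresponds to one Dockerfile instruction, showing the layer's creation time and size, which makes it easy to see which instructions produce large layers.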
Using the concept of docker layer caching, we can do certain modifications in our Dockerfile which will increase the performance of our docker build process massively.
First, copy only the package.json and package-lock.json files, which can be matched by package*.json (* matches anything between package and the .json extension). Then run the npm install command:
COPY ./package*.json $WORKING_DIR/
RUN npm install --only=production
After this, we copy the rest of the files from the build context into the image working directory.
COPY . $WORKING_DIR/
This way, the npm install command will be executed only if changes are detected in the package.json or package-lock.json files; otherwise, docker build will use the cached layer for that step. If changes are detected in any other files, only the copy step is re-executed. After making these changes, the final version of our Dockerfile looks like below:
FROM alpine:3.16
ARG WORKING_DIR=$HOME/travel-app/
ARG PORT=3000
RUN apk add --update-cache nodejs npm
WORKDIR $WORKING_DIR
COPY ./package*.json $WORKING_DIR/
RUN npm install --only=production
COPY . $WORKING_DIR/
EXPOSE $PORT
CMD [ "node", "index.js" ]
Another best practice is to group related instructions as much as possible to reduce the number of intermediate layers. Also, there is a much better way to build images in Docker, called the multi-stage build process, which we will discuss in later chapters. In upcoming chapters, we will also discuss security best practices for the Dockerfile.
In our next chapter, we will discuss cloud deployment architecture for web applications.