Mobile First To AI First: Node JS Must DO in Production Tips

Credit goes to the original author.

At Hashnode we heavily use Node.js. I am a big fan of it and have learned a lot of things while running Hashnode. When I hang out with other developers, I notice that many people don't utilise Node to its full potential and do certain things the wrong way. So, this article is going to be about things which you shouldn't do while running Node in production. Let's get started!

Not using Node.js Cluster

Node.js runs your JavaScript on a single thread, so a single process can't take advantage of multiple CPU cores on its own. Each process is also bound by V8's default heap limit (historically around 1.5GB on 64-bit systems, adjustable with the --max-old-space-size flag). The good news is that the Cluster module lets you fork multiple child processes that use IPC to communicate with the parent process. The master process controls the workers, and incoming connections are distributed across the workers in a round-robin fashion (the default on every platform except Windows).

Clustering improves your app's performance and lets you achieve zero-downtime (hot) deployments easily (more on this later). Also keep in mind that the number of workers you can create is not limited by the number of CPU cores on the machine.

I feel clustering is a must-have for any production Node.js app and there is no reason not to use it.
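As a rough illustration of the fork-and-share pattern described above, here is a minimal sketch using Node's built-in cluster module. The helper name `startClustered`, the default port 3000 and the restart-on-exit policy are assumptions made for this example, not something the original article prescribes.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: in the master process, fork `workerCount` workers;
// in a worker process, start an HTTP server on the shared port.
function startClustered(workerCount = os.cpus().length, port = 3000) {
  // cluster.isMaster is the long-standing name; newer Node versions also expose isPrimary.
  if (cluster.isMaster) {
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    // Fork a replacement whenever a worker dies, so capacity stays constant.
    cluster.on('exit', (worker, code, signal) => {
      console.error(`worker ${worker.process.pid} exited (${signal || code}), forking a new one`);
      cluster.fork();
    });
  } else {
    // Every worker listens on the same port; the master distributes connections.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}
```

Calling `startClustered()` at the bottom of your entry file starts one worker per CPU core. Passing a larger number is allowed, since the worker count is not tied to the core count.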

Performing heavy lifting inside web servers

Node/Express servers aren't meant to perform heavy, computationally intensive tasks. For instance, a typical web app may have to send bulk emails to users. Although you can perform this task in the Node.js web server itself, it will degrade performance significantly. It is always better to break these heavy tasks out into microservices and deploy them as separate Node apps. You can then use a message queue like RabbitMQ to communicate with these microservices.

When you post something on Hashnode and tag it with various nodes, we insert it into the feeds of thousands of users. This is a heavy task. Until a few months back, the feed insertion task was handled by the web servers themselves. As the traffic increased, we started to see performance bottlenecks. Back then, if you tagged a question with "General Programming" or "JavaScript" and posted it, the whole website would become unresponsive for a few seconds. Proper investigation revealed that the feed insertion activity was the cause. The solution was to move that particular module onto a different machine and initiate the task via a message queue.

So, the key takeaway is that Node.js is best suited for event handling and non-blocking I/O. Any task that takes a long time to complete should be handled by a separate process.
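The hand-off described above can be sketched roughly as follows. Everything here is illustrative: `InMemoryQueue` is a hypothetical stand-in for a real message queue client (such as one talking to RabbitMQ), and the queue name `feed-insertion` and the function names are assumptions, not Hashnode's actual code.

```javascript
// Hypothetical in-memory stand-in for a message queue client.
// A real client would send the message over the network to a broker.
class InMemoryQueue {
  constructor() {
    this.messages = [];
  }
  publish(queueName, payload) {
    // Serialize so that only plain data crosses the process boundary.
    this.messages.push({ queueName, body: JSON.stringify(payload) });
  }
}

// Called from the web request handler: cheap, and returns immediately.
function enqueueFeedInsertion(queue, post) {
  queue.publish('feed-insertion', { postId: post.id, tags: post.tags });
}

// Meant to run in a separate worker process, never in the web server.
// `insertIntoFeeds` is whatever function performs the heavy work.
function handleFeedInsertion(messageBody, insertIntoFeeds) {
  const job = JSON.parse(messageBody);
  return insertIntoFeeds(job.postId, job.tags);
}
```

The web server only publishes a small message, so the request returns quickly no matter how many feeds have to be updated afterwards.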

Not using a process manager

While it's obvious that process managers have a lot to offer, many first time Node users deploy their apps to production without a process manager. At Hashnode we have been using pm2, a powerful process manager for Node.js.

Also, if you use pm2 you can start your app in cluster mode very easily.

pm2 start app.js -i 2

In the above example, -i specifies the number of workers you want to run in cluster mode. The best part is that you can now reload the workers one after another, so your app doesn't suffer any downtime during deployment. The following command does it:

pm2 reload app

If you happen to use pm2, do check out Keymetrics which is a monitoring service for Node.js (based on pm2).

Not using a reverse proxy

I have seen developers running Node-based apps on port 80 and serving static files through them. Running a Node app directly on port 80 is not a good idea: on most systems, binding to a port below 1024 requires root privileges, and running your app as root is dangerous. Instead, you should run the app on a different port like 3000 and use nginx (or something like HAProxy) as a reverse proxy in front of the Node.js app.

The above setup protects your application servers from direct exposure to internet traffic and helps you scale the servers and load balance them easily.

Lack of monitoring

Bad things like unexpected errors and exceptions will keep happening all the time. You know what's worse? Not knowing that something bad happened in your Node process. Now that you are using a process manager, your Node process will be restarted whenever an unhandled exception occurs. So, unless you check the logs, you won't find out about the issue. The solution is to use a monitoring service and have it alert you via email or SMS whenever your process gets killed and restarted.
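One way to make sure a crash doesn't go unnoticed is to report it before the process exits. The following is only a sketch: `buildCrashReport` and `sendAlert` are hypothetical names, and `sendAlert` just writes to stderr where a real setup would call a monitoring service.

```javascript
const os = require('os');

// Build a small, serializable report describing a fatal error.
function buildCrashReport(err) {
  return {
    host: os.hostname(),
    pid: process.pid,
    time: new Date().toISOString(),
    message: err && err.message ? err.message : String(err),
    stack: err && err.stack ? err.stack : null,
  };
}

// Hypothetical alert hook: replace the body with a call to your monitoring service.
// It is synchronous here; an asynchronous call would need to finish before exiting.
function sendAlert(report) {
  console.error('[ALERT]', JSON.stringify(report));
}

// Report the error, then exit so the process manager can start a clean process.
process.on('uncaughtException', (err) => {
  sendAlert(buildCrashReport(err));
  process.exit(1);
});
```

Exiting after an uncaught exception is deliberate: the process may be in an inconsistent state, and the process manager will bring up a fresh one.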

Not removing console.log statements

While developing an app, we use console.log statements to test things out. But sometimes we forget to remove these log statements before going to production, where they consume CPU time and waste resources. The best way to avoid this is to use the debug module: unless you start your app with the DEBUG environment variable set, nothing will be printed to the console.
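The idea of logging that stays silent unless an environment variable turns it on can be sketched without any dependency. Note the swap: this example uses Node's built-in `util.debuglog`, which is controlled by NODE_DEBUG, rather than the debug package from npm, which reads DEBUG. The section name "app" and the `handleRequest` function are assumptions for illustration.

```javascript
const util = require('util');

// util.debuglog returns a logger that prints to stderr only when the
// NODE_DEBUG environment variable lists this section name ("app" here).
const debug = util.debuglog('app');

// Hypothetical request handler used only to show where a debug call would go.
function handleRequest(userId) {
  // Silent by default; visible when started with `NODE_DEBUG=app node server.js`.
  debug('handling request for user %s', userId);
  return { ok: true, userId };
}
```

With the debug package the call sites look much the same, except that the logger is created with `require('debug')('app')` and enabled through DEBUG.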

Maintaining global states inside the Node web processes

Sometimes developers store things like session IDs and socket connections in the process's memory. This is a big NO and should be avoided at all costs. If you store session IDs in memory, your users will be logged out as soon as you restart the server. It also causes problems when you scale the app and add more servers. Your web servers should just handle web traffic and should not maintain any kind of state in memory.
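A rough sketch of what keeping session state out of the web process can look like is shown below. `FakeExternalStore` is a hypothetical stand-in for a client of an external store (a Redis or database client, for example); it lives in memory only so that the example runs on its own, and the `session:` key prefix is likewise an assumption.

```javascript
// Hypothetical stand-in for an external store client.
// In production this would talk to a separate service, not to local memory.
class FakeExternalStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  }
  async set(key, value) {
    this.data.set(key, value);
  }
}

// The web process holds no session data itself; it only knows how to reach the store.
function createSessionService(store) {
  return {
    async save(sessionId, session) {
      await store.set(`session:${sessionId}`, JSON.stringify(session));
    },
    async load(sessionId) {
      const raw = await store.get(`session:${sessionId}`);
      return raw === null ? null : JSON.parse(raw);
    },
  };
}
```

Because every web process creates its own service object around the same store, a restarted or newly added server can load sessions that another process saved.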

Not using SSL

For a user-facing website, there is no reason not to force SSL by default. Sometimes I also see developers reading SSL keys from a file and using them in the Node process. Instead, you should always put a reverse proxy in front of your Node.js app and install the SSL certificate there.

Also, keep checking for the latest SSL vulnerabilities and apply fixes ASAP.

Lack of basic security measures

Security is always important, and it's good to be paranoid about your app's security. In addition to the basic security checks, you should use something like NSP (the Node Security Platform, whose checks have since moved into npm audit) to discover known vulnerabilities in your project's dependencies.

Also, don't use outdated versions of Node and Express in production. They are no longer maintained and don't receive security updates.

Not using a VPN

Always deploy your app inside a private network, so that only trusted clients can communicate with your servers. Often while deploying, people forget this simple thing and face a lot of problems later. It's always a good practice to think of the infrastructure and architecture in advance, before deploying your app.

For instance, if your Node server runs on port 8080 and you have set up nginx as a reverse proxy, it's important to make sure that only nginx can connect to your app on that port. It should be isolated from the rest of the world.
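When nginx runs on the same machine as the Node app, one simple way to get this isolation is to bind the app to the loopback interface only. This is a sketch under that assumption; the helper name `listenLocally` is made up for the example. If the proxy lives on another machine, firewall or private-network rules have to do the job instead.

```javascript
const http = require('http');

// Hypothetical helper: start a server so that it is reachable only from the same machine.
function listenLocally(server, port, onReady) {
  // Passing '127.0.0.1' binds to the loopback interface only; omitting the host
  // would accept connections on every network interface.
  return server.listen(port, '127.0.0.1', onReady);
}
```

For the setup described above, you would create the server with `http.createServer(...)` and call `listenLocally(server, 8080)`, then point nginx at 127.0.0.1:8080.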

A few more mistakes worth avoiding:

Serving static assets that are not minified and gzipped (see grunt)
Disallowing local caching for those static assets
Keeping access logging enabled
Storing uploaded files in disk folders instead of a virtual file system like GridFS

Discussion:

Not using a reverse proxy

Most reverse proxies are not able to fully leverage the power of HTTP/2 and other modern standards. Sometimes there is also trouble with reverse proxies and (web)socket connections. I think there are cases where you really want to expose your Node.js application to the internet.

Maintaining global states inside Node web processes

How to maintain state really depends on the kind of application you develop. I work on an application which keeps alive websocket connections with some 500 devices. They have to be manageable over a separate GUI, so I do not have any other choice than to store the connections, if I want to reuse them (and send commands) based on the device-name.

Comments:

Yeah, that might be one solution, but why not just write everything you need directly in Node.js? Even with Go, the service will not perform any better, as Node.js will still be the slowest part in the chain.

Using Cluster:

I'm building a Node app which is completely stateless and uses JWT for authentication. I'm planning to host the app on 10 servers, each having only 1 CPU core (as it is stateless, there is no need to bother about session sync). Each server has nginx in front of the Node app on that particular server, and all these servers will sit behind 1 load balancer. So my question is: in my scenario, do I really need the cluster module?

Comments:

Even with only 1 CPU core, cluster still gives you the benefit of no downtime when reloading the application on a specific server. Yeah, I know... you already have the other 9 servers to take over while 1 server is restarting the app. It may be a "micro-benefit", but it is still a benefit (especially if you start with one server alone before adding the others).

Reference:

https://nodejs.org/api/cluster.html#cluster_how_it_works
