Web Application Performance Tuning

September 12th, 2020 - Vincent Luciani
These best practices will help you tune the web performance of your web application. Use the quick links below to jump directly to the type of information you are interested in.
Choose a chapter below:

General

Use of caching
Implement application caching for your APIs:
- if the same API call was already made and the cache has not expired, answer with the cached content.
- otherwise, build the answer using the API's logic and save the answer in the cache.
- implement an API call to flush the cache. This should be accessible to your company only.
- implement an API call to get information about the cache. This should be accessible to your company only.
Regarding cache expiration (how long the cache is valid): determine the maximum time you can wait for fresh information and set up the cache expiration accordingly.
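The steps above can be sketched as a minimal in-process TTL cache. This is an illustration, not a full caching layer: the names (`cached_call`, `fetch_fresh`, `TTL_SECONDS`) are assumptions, and in a real API the flush and info functions would sit behind admin-only endpoints.

```python
import time

TTL_SECONDS = 60          # how long a cached answer stays valid (assumption)
_cache = {}               # key -> (stored_at, value)

def cached_call(key, fetch_fresh):
    """Return the cached answer if present and not expired, else rebuild it."""
    entry = _cache.get(key)
    if entry is not None:
        stored_at, value = entry
        if time.time() - stored_at < TTL_SECONDS:
            return value                    # cache hit, skip the API logic
    value = fetch_fresh()                   # build the answer with the API's logic
    _cache[key] = (time.time(), value)      # save it for the next call
    return value

def flush_cache():
    """An admin-only endpoint would call this to empty the cache."""
    _cache.clear()

def cache_info():
    """An admin-only endpoint: basic information about the cache."""
    return {"entries": len(_cache)}
```

A framework cache (or Redis, discussed later in this article) replaces the module-level dict in production; the expiry check stays the same.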
Different types of caching
If two actions need refreshing at different intervals, put them on two separate API calls so that you can:
- set a different cache expiration for each of them.
- flush one cache without flushing the other.
Example: translations of labels expire after one month, product data expires after one day or twice a day.
The aim is to cache each element as long as possible without impacting the business.
Hosting provider bandwidth
Before committing to a service provider, test that the provider's network is well connected to the outside world:
- check the guaranteed connection speed in the offer.
- purchase a server for the shortest period possible (or with a money-back guarantee) and install a sample application.
- test the sample with webpagetest.org.
- place the server near your visitors (ask the provider specifically).
- if visitors come from all around the world, use load balancing based on location.
Less information exchanged
Minimize the amount of data sent at each request between the servers.
Server location
Check that you did not put a component of the application in the wrong location: for example, the database in Asia and the API in US-east.
If your company uses VPCs, make sure all components are on the same VPC. If that is not possible, check that there is no firewall between them and check the connection speed between the two VPCs (is there a peering?).
Testing
Test access to the application from different parts of the world (using webpagetest.org).
Check the latest hardware offers from your cloud provider
Be aware of new types of hardware offered by your cloud provider. For example, they may offer a new generation of CPU that is more stable, more efficient, or faster, but you need to specifically request to move your servers to the new server type.
If possible, get the following information from the cloud provider:
- RAM: amount, type (SRAM - static RAM - is faster), speed (DDR is double data rate and is faster).
- CPU: speed, number of cores, L1 and L2 cache (L1 cache is stored on the chip, L2 outside the chip), architecture (64-bit is faster than 32-bit), etc.
- Disk: speed, type of disk (obviously SSD, but be careful: some providers may still offer HDD).
Autoscaling
Find out which autoscaling functionality is offered by the cloud provider. Examples: Elastic Beanstalk or Lambda for AWS.
Check that autoscaling is ramping up correctly, ramping down correctly, and not taking too many resources.
Application using more than one disk in parallel
Especially interesting for databases or processes heavily writing to disk: have more than one disk and parallelize the writing.
Use multicore
If you have more than one core, make sure you actually use them. If the server software runs on a single thread, run several server processes in parallel.
Software as a service on the cloud
Especially interesting for databases, but can be used for APIs too. Look for "best databases as a service" on a search engine. Use software as a service instead of installing the application server on your own server.
Load balancing
Balance the load between several servers. Be careful if your application has a login: in that case, you need to synchronize the sessions between the servers.
Use the correct version of software
If you use a serverless architecture or have your database on the cloud, this is not relevant. But if you are running the application on your own, make sure your server software is up to date (web server, database server, runtime environment, programming language).
Check the influence of antimalware or other services
Make sure you choose an antimalware that does not kill your application when running on the same server.
Monitor how much CPU, RAM, and disk your antimalware is using.
Check whether other services running on the same server can be removed.
Check that mechanisms to connect to the backend are in place and tuned
Set up timeouts to avoid blocking the server with transactions that take too long. It is tempting to use a long timeout to avoid returning errors, but by doing this you make all other calls late. In other words, sacrifice the calls that take too long to avoid slowing down all calls.
Add a retry mechanism with a maximum number of retries to avoid blocking the server.

Make sure the timeout on the application is synchronized with the timeout on the backend: the timeout of the calling party should be greater than the timeout of the called party (otherwise the caller will time out while the called party keeps working).
Fine-tune the max pool size when using connection pools: the maximum number of connections that can be created to satisfy client requests.
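The timeout-plus-bounded-retry idea can be sketched as a small helper. This is a generic sketch: `operation` stands in for any backend call that accepts a timeout and may raise on failure, and the parameter values are illustrative, not recommendations.

```python
import time

def call_with_retry(operation, max_retries=3, timeout_seconds=2.0, backoff=0.1):
    """Call operation(timeout) at most max_retries times, then give up.

    Bounding both the timeout and the retry count keeps one slow backend
    from blocking the whole server.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return operation(timeout_seconds)
        except Exception as exc:
            last_error = exc
            time.sleep(backoff * (2 ** attempt))  # exponential backoff between tries
    # Sacrifice this call rather than slow down all the others.
    raise last_error
```

In a real client, the timeout would be passed through to the HTTP or database driver, and kept smaller than the caller's own timeout, as discussed above.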
Asynchronous versus synchronous
Everything that does not require an immediate, calculated answer should be done asynchronously. When a transaction is made synchronously, your application does not perform any further activity until it gets the answer from the server. This is necessary when you are executing payments with a debit card, but for some activities it is not. When you think about it, most things can be done asynchronously without affecting the user experience. When users execute an action that earns them points, they can get a congratulations message immediately, but the actual update can happen a few seconds later, asynchronously. Some actions can even wait until the evening.
Use of asynchronous programming
Use asynchronous calls when the processing involves a lot of waiting:
- when you need to execute database queries or updates.
- when you need to call external APIs.
- when you are copying, downloading, or uploading data.
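A minimal sketch of the idea with Python's asyncio: the three waits overlap instead of running back to back. The function names are illustrative stand-ins for real database, API, and download calls, with the waiting simulated by `asyncio.sleep`.

```python
import asyncio

async def fetch_resource(name, delay):
    # Stand-in for a database query, external API call, or download:
    # while this coroutine waits, the event loop runs the others.
    await asyncio.sleep(delay)
    return f"{name}:done"

async def main():
    # The three waits overlap, so the total wall time is roughly one
    # delay, not the sum of the three.
    return await asyncio.gather(
        fetch_resource("db_query", 0.05),
        fetch_resource("external_api", 0.05),
        fetch_resource("file_download", 0.05),
    )
```

Run with `asyncio.run(main())`. The same pattern applies with async database drivers and HTTP clients in place of `asyncio.sleep`.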

Use of parallel processing in batch, interpreted languages
Use parallel processing in batch jobs when the work is mainly CPU-intensive:
- string operations
- graphics processing
- number-crunching algorithms
If you plan to use an interpreted language for this, do it only if it offers a compiled library (for example a C-compiled library in Python) that can execute the work fast. Another example: an awk script can process a file extremely fast, but a batch script executing actions in a loop is slower.
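As a minimal sketch of parallelizing CPU-bound batch work in Python, `multiprocessing.Pool` spreads the chunks over several cores (processes, not threads, since CPython threads do not parallelize CPU work). The vowel-counting task is a toy stand-in for real string processing.

```python
from multiprocessing import Pool

def count_vowels(text):
    # Toy CPU-bound string work standing in for a real processing step.
    return sum(text.count(v) for v in "aeiou")

def parallel_count(chunks, workers=2):
    """Process a list of text chunks on several cores and sum the results."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_vowels, chunks))
```

For genuinely heavy work, the per-chunk function should itself lean on a compiled library, as the paragraph above recommends; otherwise process startup and pickling overhead can eat the gain.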
Batch processing with queues
If your process is separated into steps (for example download, then parse, then upload), use queues: each time one document, record, or chunk of information is processed by a step, the result of that step is kept in a queue (if possible in an in-memory database like Redis). As a result, instead of processing, for example, the first step for all records during 10 hours and then the second step during another 10 hours, each item processed in the first step goes to the second step instead of waiting for all items to pass the first step.
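A small sketch of a two-step pipeline with a queue between the steps, using an in-process `queue.Queue` and threads for illustration (a production setup could keep the intermediate queue in Redis, as suggested above). The arithmetic steps are placeholders for real download/parse/upload work.

```python
import queue
import threading

def run_pipeline(items):
    """Each item flows to step 2 as soon as step 1 finishes with it,
    instead of waiting for the whole batch to pass step 1."""
    q = queue.Queue()
    results = []
    DONE = object()  # sentinel marking the end of the stream

    def step1():                 # e.g. download/parse each item
        for item in items:
            q.put(item * 2)      # placeholder transformation
        q.put(DONE)

    def step2():                 # e.g. upload each processed item
        while True:
            item = q.get()
            if item is DONE:
                break
            results.append(item + 1)  # placeholder transformation

    t1 = threading.Thread(target=step1)
    t2 = threading.Thread(target=step2)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results
```

With a shared queue service like Redis, the two steps can even run as separate processes or on separate machines.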
Massive database actions using bulk and chunks
When loading a significant volume of data into a database, find a command to bulk load data into this database instead of using multiple insert or update commands.
Also try to execute these loading actions in chunks. For example, separate your data into files of 50,000 entries and bulk load each file one by one.
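A minimal sketch of chunked bulk loading using Python's built-in `sqlite3` (an in-memory database and an illustrative `products` table stand in for your real DBMS, which will have its own bulk-load command such as `COPY` or `LOAD DATA`):

```python
import sqlite3

def chunked(seq, size):
    """Yield successive chunks of `size` items from a sequence."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def bulk_load(rows, chunk_size=50000):
    """Insert rows one chunk at a time, one bulk statement and one
    transaction per chunk, instead of one INSERT per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table products (id integer, name text)")
    for chunk in chunked(rows, chunk_size):
        with conn:  # commit once per chunk, not once per row
            conn.executemany("insert into products values (?, ?)", chunk)
    count = conn.execute("select count(*) from products").fetchone()[0]
    conn.close()
    return count
```

One transaction per chunk keeps transaction logs and lock durations bounded while still avoiding per-row round trips.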
Blue-green deployment
When you need to execute massive update actions on the database used to answer read requests, have two sets of databases instead of one: database A where the reads are done and database B where the updates are done.
The process looks like this:
- clone database A to create database B.
- make the massive updates to database B (as a result, you do not affect the performance of reads on database A).
- test database B to validate it is ready to go, if possible by running regression tests on it.
- make the application point to B instead of A.
- run regression tests to check the application is still functioning.
- delete A, and repeat the operation, except that this time A and B are inverted.
Use an in-memory database
Use an in-memory database like Redis instead of keeping information in large files (useful for batch processing, storing large amounts of settings - like translations - or caching at application level).
Use of zip when processing files
This applies to batch processing: when keeping information for further processing, zip it when storing it and unzip it when reading it.
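A minimal sketch of the store-zipped/read-zipped round trip using Python's built-in `gzip` module (function names are illustrative):

```python
import gzip

def store_compressed(path, text):
    # Zip intermediate batch data when writing it to disk...
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(text)

def read_compressed(path):
    # ...and unzip it transparently when reading it back for the next step.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return f.read()
```

The trade is CPU time for disk space and I/O: for large, repetitive intermediate files the smaller reads and writes usually win.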
Use of S3
Try to serve static content with S3 buckets or equivalent. If you have a mix of static and non-static content, you have the following options:
- have a reverse proxy that redirects to the S3 bucket or the application server.
- serve static content on your main host name and dynamic content on a subdomain. You then have to be careful to set up your dynamic server to accept CORS requests from the main domain.
Logging too much information
Double-check that you set the logging level on your production server to error or critical, and not a lower level.
Make sure you are logging information only in case of error, on the error and critical levels.
Garbage collection
If you are using Java, try to find the garbage collector that is the most efficient for your implementation.

Website

Use of a CDN to cache your static files
Cache all static files using a CDN (Content Delivery Network). As a result:
- static files will be cached all over the world.
- you may prevent some DDoS (Distributed Denial of Service) attacks.
- some CDNs will stream the videos present on your web pages (206 return code) instead of downloading them (200 return code).
In case of a CMS, make sure the web pages are pre-cached
There must be a mechanism on the CMS to pre-cache the web pages after creating the corresponding content.
Use of HTTP/2
Use a server that supports the HTTP/2 protocol.
Use of AMP
Use AMP for content pages (articles, presentations).
AMP pages and their normal page equivalents must be linked together using a meta tag.
AMP pages have no custom JavaScript.
Answers from the server are zipped
The response header should have: Content-Encoding: gzip
The request header should have: Accept-Encoding: gzip, deflate, br
This is set up on the application server.
Lazy loading of images
Put in place a JavaScript mechanism that loads images only once the page is loaded (onload).
Converting your images to progressive images
Convert your images to progressive images (for example, conversion to the WebP format).
Lazy loading of everything under the fold
Separate what is under the fold (images, but also CSS), meaning not seen by visitors before scrolling. Load first what is above the fold, and only then what is under the fold.
Use of videos
If you choose to use videos on your website:
- if they are long (over 10 seconds), use a streaming service to display the video on your page.
- if they are short, the video must be streamed (returns a 206 HTTP code and starts playing before it is fully downloaded) instead of downloaded (200 HTTP code, the video is fully downloaded before it starts playing). With some CDN providers, mp4 videos cached by the CDN are streamed instead of downloaded without any effort on your side.
Reduce the size of images
I use, for example, tinyPNG.com.
Have smaller images, no videos on mobile
Reduce the height and width of images on mobile (load smaller images on mobile than on desktop).
Try to avoid videos on mobile (except for YouTube videos). To avoid mp4 on mobile, load videos with JavaScript only if the width of the screen is greater than a certain threshold.
Google Fonts: do not use the entire set of files
When using Google Fonts, you are offered a link to a file that links to several other files. Check which files and which characters you really need, and link your page to only the necessary URLs.
For example, I use https://fonts.googleapis.com/css?family=Nunito&text=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.-:%3E
instead of https://fonts.googleapis.com/css?family=Nunito, which loads 5 woff2 files, including files for Vietnamese and Cyrillic characters.
Google Fonts: CSS part in your CSS, woff2 file on your server
The Google Fonts link points to a file containing CSS fragments, which in turn point to URLs of woff2 files.
To gain time:
- put the CSS fragment (starting with @font-face) into your application's CSS.
- open the URL(s) of the woff2 file(s) in your browser, which will download them. Put these files on your server and make the fragment in the CSS point to them instead of pointing to the Google Fonts URL.
Regarding the second technique, check with a site like webpagetest.org whether it is actually faster (with the Google Fonts URL to the woff2, you lose time with the DNS connection, as it is not on the same domain as your application, but it may download faster). If it is not faster, put back the original URL in the CSS fragment.
Minify scripts
Minify JavaScript files, CSS files, and HTML files.
In the case of JavaScript files, uglify (which among other things replaces variable and function names with letters to gain even more space) and minify (this can be done using gulp).
File expiration
Set up file expiration so that files can stay in the visitor's browser cache for a while - then when they use the website another time, it will load faster for them.
In .htaccess for Apache:
ExpiresActive On
# Set up 1 week caching on static files
<FilesMatch "\.(xml|txt|html|js|css)$">
ExpiresDefault A604800
Header append Cache-Control "proxy-revalidate"
</FilesMatch>
Fewer requests sent to the server
Bundle JS files together.
Bundle image files together using sprites.
Use SVG instead of images in the case of small icons.
Web server technology
Assess which web server technology is the fastest for your use case. Example of options for static content: S3 bucket versus Apache versus nginx.
Testing
webpagetest.org
https://yellowlab.tools/

SQL

Index
An index:
- speeds up retrieval.
- slows down updates.
- has less effect if a great percentage of rows have the same value for the indexed column.
Tune queries included in functions
Check the queries used by functions and tune them as well.
Use indexes when joining two tables
If you are joining two tables in a query, create an index on both columns used to join these two tables.
Composite index: do not omit the first column
If your index is using two columns:
- do not omit the first column in the order by clause.
- list the columns in the right order in your order by clause.
Clustered index for ranges
Create a clustered index on the column most used when selecting ranges. This physically orders the data in the table by this column (you can see it when executing a select without order by).
You can only have one clustered index per table.
A clustered index created as unique does not allow duplicates.
Use joins instead of correlated subqueries
This does not apply to all DBMSs: some optimize correlated subqueries into joins automatically.
Do not use non-deterministic functions on the left-hand side of a comparison
Query before tuning:
select * from client_status where datediff(day, status_date, getdate()) > 50
This query cannot use an index on status_date, because the function has to be applied to every row.
After tuning:
declare @date_threshold date
select @date_threshold = dateadd(day, -50, getdate())
select * from client_status
where status_date < @date_threshold
Add a column containing reversed values when looking for a string at the end of a column
Create a new column in the same table (let us call it column B).
Use a trigger so that each time you insert or update a row, you put in column B the reversed value of the column you want to use in the where clause (let us call it column A).
When looking for a string at the end of column A, match the reverse of the string you are looking for against the beginning of column B.
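A minimal sketch of this reversed-column technique, using Python's built-in `sqlite3` so the trigger and query can be shown end to end. The table and column names (`clients`, `name`, `name_reversed`) are illustrative, and SQLite lacks a built-in string-reverse function, so one is registered from Python; most server DBMSs have a native REVERSE() instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table clients (name text, name_reversed text)")
# SQLite has no REVERSE(); register one (server DBMSs provide it natively).
conn.create_function("reverse", 1, lambda s: s[::-1])
conn.execute("""
    create trigger fill_reversed after insert on clients
    begin
        update clients set name_reversed = reverse(new.name)
        where rowid = new.rowid;
    end
""")
conn.executemany("insert into clients (name) values (?)",
                 [("Alice Smith",), ("Bob Jones",)])

def ends_with(suffix):
    # Search the END of `name` by matching the reversed suffix at the
    # START of `name_reversed` - a prefix match an index can serve.
    pattern = suffix[::-1] + "%"
    return [row[0] for row in conn.execute(
        "select name from clients where name_reversed like ?", (pattern,))]
```

An `update` trigger (omitted here for brevity) would keep column B in sync when rows change, as described above.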
Do not select all columns
In your query, select only the columns that are necessary for further processing.
Execute updates, inserts, and deletes in batches
Separate the full list of updates, inserts, or deletes into chunks (of 5,000 operations, for example), and process chunk by chunk.
Split voluminous tables into several tables
If you have a table storing a huge amount of information, you can split this table's data between several tables. Each resulting table will contain fewer rows than the original table.
Use a temporary table when joining tables
When joining two voluminous tables, you can enhance the query using the following technique:
- query one of the two tables.
- store the result of this query in a temporary table.
- join this temporary table with the second table.
To choose which of the two tables to use in step one, pick the one that makes the temporary table as small as possible.
Delete records from underlying tables (possibly archive the deleted records)
If you have voluminous tables, you can increase your SQL performance by moving data that is no longer necessary for your application to separate tables.
In order to apply this tip, you need to define rules to determine which data is no longer useful.
Rebuild indexes and tables
Use a rebuild function to rebuild the index.
Another option is to drop the index and then create it again.
Drop indexes and foreign keys before loading data
Drop the indexes and foreign keys, execute the inserts, deletes, and updates, then recreate the indexes and foreign keys.
Online versus batch
Every time the user requests an update, an insert, or a delete, decide:
- if it is critical to execute the operation as soon as the user requests it.
- if it is critical to make the user wait for the completion of the operation before they can use the application further.
If offline processing is possible:
- communicate to the user that the request is going to be processed.
- execute the operations offline.
- once done with the offline operation, communicate to the user that the action is completed.

Put the most selective criteria first in the index and in the where clause
If your index includes several columns (composite index), the index should start with the column containing the fewest duplicate values (the most selective column).
This technique implies putting the columns in the right order when creating the composite index. You should also order these columns accordingly in your where clause.
Use less storage space on each column
Choose the minimum type necessary:

  • If the value is just yes or no (0 or 1), choose a bit (the size of a bit is 1 bit).
  • In the case of a number: if you are sure it will be at most 255, will not be negative, and will contain no decimals, use a tinyint (the size of a tinyint is 1 byte, which is 8 bits).
  • In the case of a number: if you are sure it will be less than 32,767 and will have no decimals, use a smallint (size: 2 bytes).
  • In the case of a number: if you are sure it will be less than 2,147,483,647 and will have no decimals, use an integer (size: 4 bytes). If not, use a bigint (size: 8 bytes).
  • In the case of a number that has decimals, find out the total number of digits really needed as well as the number of digits needed after the decimal point. The total number of digits determines the storage size: 1-9 digits: 5 bytes. 10-19 digits: 9 bytes. 20-28 digits: 13 bytes. 29-38 digits: 17 bytes. The more digits you use after the decimal point, the fewer digits you can use before it.
  • If you store dates without using hours, minutes, and seconds, use smalldatetime.
  • In the case of text: when all values in the column have the same length, use char(x). It uses 2 bytes less space for the same data length than varchar(x). Example: airport codes, whose length is always 3 letters. Fields in char(3) will always have a storage size of 3 bytes. Fields in varchar(3) will have a storage size equal to the length of the data in the field + 2 bytes. As the data length is always 3 in our example, the storage size will always be 3 + 2 = 5 bytes.
  • In the case of text with various lengths, it is best to use varchar over char. Example: a column contains strings whose biggest reasonable length is 100 characters. Let us say that on average the string length is 25 in this column. Then the storage size in the case of varchar(100) is on average 25 + 2 = 27 bytes. On the other side, the storage size for char(100) is always 100 bytes.

If you choose varchar, you could determine the biggest reasonable length of the strings contained in the column. But as the storage size in varchar(x) depends on the string size, why bother determining what to put instead of x? Example: you know that the strings stored will never go over 200 characters, but the storage size of each string will be the same whether you declare the column as varchar(200) or varchar(1000). Indeed, the storage size of each string is equal to the actual string size + 2.
You will find some experts arguing that it is irrelevant for performance whether you declare varchar(200) or varchar(1000). Others say that it can impact performance. Their argument is that varchar(200) is an indication provided to the DBMS of the estimated storage size per field in this column. As every indication can influence how the DBMS plans to execute the query, it can make sense to give an estimate as close as possible to reality to "help" the DBMS do its job well.
Choose between select count greater than 0 and exists
In some cases a query containing "where exists" performs differently than a query counting rows and checking whether the count is greater than 0. Both return the same result, but their speed may differ.
select *
from clients c
where
(select count(*)
from salesperson s
where c.salesperson_id = s.id) > 0
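The count(*) query above has an exists equivalent. A minimal comparison using Python's built-in `sqlite3` (the two-row table contents are illustrative) shows both forms returning the same rows, so the choice between them is purely about speed on your DBMS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table salesperson (id integer primary key);
    create table clients (name text, salesperson_id integer);
    insert into salesperson values (1);
    insert into clients values ('with_rep', 1), ('no_rep', 99);
""")

# Variant 1: count the matching rows and compare with 0.
count_version = [r[0] for r in conn.execute("""
    select name from clients c
    where (select count(*) from salesperson s
           where c.salesperson_id = s.id) > 0
""")]

# Variant 2: where exists - can stop at the first matching row.
exists_version = [r[0] for r in conn.execute("""
    select name from clients c
    where exists (select 1 from salesperson s
                  where c.salesperson_id = s.id)
""")]
```

On large tables the exists form is often faster because the engine can stop scanning at the first match, but as the section says, measure both on your own DBMS.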
This website is non-commercial and does not register any of your personal data (no cookies, no statistics). The site and its content are delivered on an "as-is" and "as-available" basis.