Demos
Use of caching | Implement application-level caching for your APIs: - if the same API call was already made and the cache has not expired, answer with the content of the cache. - otherwise, build the answer using the API's logic and save it in the cache. - implement an API call to flush the cache. This should be accessible to your company only. - implement an API call to get information about the cache. This should be accessible to your company only. Regarding cache expiration (how long the cache stays valid): determine the maximum time you can wait for fresh information and set the expiration accordingly. |
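A minimal sketch in Python of this pattern; the helper names (`cached_call`, `flush_cache`, `cache_info`) are hypothetical and authentication of the company-only endpoints is omitted. The per-call `ttl` argument also gives each API call its own expiration, as the next tip recommends.

```python
# Sketch of application-level caching with expiration, flush, and info.
import time

CACHE_TTL_SECONDS = 3600  # illustrative: longest delay acceptable for fresh data
_cache = {}               # key -> (expiry_timestamp, value)

def cached_call(key, compute, ttl=CACHE_TTL_SECONDS):
    """Return the cached value if present and not expired;
    otherwise compute it, store it, and return it."""
    entry = _cache.get(key)
    now = time.time()
    if entry is not None and entry[0] > now:
        return entry[1]            # cache hit, still valid
    value = compute()              # cache miss or expired: rebuild the answer
    _cache[key] = (now + ttl, value)
    return value

def flush_cache():
    """Company-only endpoint: empty the whole cache."""
    _cache.clear()

def cache_info():
    """Company-only endpoint: expose basic cache statistics."""
    now = time.time()
    return {"entries": len(_cache),
            "expired": sum(1 for exp, _ in _cache.values() if exp <= now)}
```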
Different types of caching | If two pieces of data have different refresh needs, serve them from two separate API calls so that you can: - set a different cache expiration for each of them. - flush one cache without flushing the other. Example: translations of labels expire after one month, product data expires after one day or twice a day. The aim is to cache each element as long as possible without impacting the business. |
Hosting provider bandwidth | Before committing to a service provider, test that the provider's network is well connected to the outside world: - check the guaranteed bandwidth stated in the offer. - purchase a server for the smallest period possible (or with a money-back guarantee) and install a sample application. - test the sample with webpagetest.org. - for an application, have the server near your visitors (specifically ask for this). - if visitors come from all around the world, use load balancing based on their location. |
Less information exchanged | Minimize the amount of data sent at each request between the servers. |
Server location | Check that you did not put a component of the application in the wrong location: for example, the database in Asia and the API in us-east. If your company uses VPCs, make sure all components are on the same VPC. If that is not possible, check that there is no firewall between them and check the connection speed between the two VPCs (is there a peering connection?). |
Testing | Test access to the application from different parts of the world (using webpagetest.org). |
Check latest hardware offers from your cloud provider | Be aware of new types of hardware offered by your cloud provider. For example, they may offer a new generation of CPU that is more stable, more efficient, or faster, but you need to specifically request a migration of your servers to the new type. If the cloud provider publishes the information, compare: RAM (amount, type - SRAM, static RAM, is faster - and speed - DDR, double data rate, is faster), CPU (speed, number of cores, L1 and L2 cache - L1 is the smallest and fastest cache level, L2 is larger and slower -, architecture - 64-bit is faster than 32-bit), and disk (speed and type - obviously SSD, but be careful, some providers may still offer HDD). |
Autoscaling | Find out which autoscaling functionality your cloud provider offers. Example: Elastic Beanstalk or Lambda on AWS. Check that autoscaling ramps up correctly, ramps down correctly, and does not consume too many resources. |
Application using more than one disk in parallel | Especially interesting for databases or processes that write heavily to disk: use more than one disk and parallelize the writes. |
Use multicore | If you have more than one core, make sure you actually use them all. If your server software runs on a single thread, run several server processes in parallel. |
Software as a service on the cloud | Especially interesting for databases, but can be used for APIs too. Search for "best databases as a service" on a search engine. Use software as a service instead of installing the application server on your own machine. |
Load balancing | Balance the load between several servers. Be careful if your application has a login: in that case, you need to synchronize the sessions between the servers. |
Use the correct version of software | If you use a serverless architecture or host your database in the cloud, this does not apply. But if you run the application yourself, keep the versions of your servers up to date (web server, database server, runtime environment, programming language). |
Check influence of antimalware or other services | Choose an antimalware that does not kill your application when running on the same server. Monitor how much CPU, RAM, and disk your antimalware is using. Check whether other services running on the same server can be removed. |
Check mechanisms to connect to the backend are in place and tuned | Set up timeouts to avoid blocking the server with transactions that take too long. It is tempting to set a long timeout to avoid returning errors, but by doing so you make all other calls late; in other words, sacrifice the calls that take too long to avoid slowing down all calls. Add a retry mechanism with a maximum number of retries to avoid blocking the server. Make sure the timeout on the application is synchronized with the timeout on the backend: the timeout of the calling party should be greater than the timeout of the called party (otherwise the caller will time out while the called party keeps trying). Fine-tune the max pool size when using connection pools: the maximum number of connections that can be created to satisfy client requests. |
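A sketch of these three mechanisms using the Python requests library; the URL, timeout, and pool sizes are illustrative values to tune for your backend.

```python
# Per-request timeout, bounded retries, and a tuned connection pool.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(total=3, backoff_factor=0.5,          # at most 3 retries
                status_forcelist=[502, 503, 504])     # retry only on these codes
adapter = HTTPAdapter(max_retries=retries,
                      pool_connections=10,            # pools kept open
                      pool_maxsize=20)                # max connections per pool
session.mount("https://", adapter)

# Keep this caller timeout greater than the timeout the backend applies to
# its own downstream calls, so the called party is not abandoned mid-work.
response = session.get("https://backend.example.com/api", timeout=5)
```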
Asynchronous versus synchronous | Everything that does not require an immediate, calculated answer should be done asynchronously. When a transaction is made synchronously, your application performs no further activity until it gets the answer from the server. This is necessary when you are executing payments with a debit card, but for many activities it is not. When you think about it, most things can be done asynchronously without affecting the user experience. When a user performs an action that earns them points, they can get a congratulations message immediately, while the actual update is done a few seconds later asynchronously. Some actions can even wait until the evening. |
Use of asynchronous programming | Use asynchronous calls when the processing involves a lot of waiting: - when you need to execute database queries or updates. - when you need to call external APIs. - when you are copying, downloading, or uploading data. |
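A minimal stdlib sketch of running several slow I/O calls concurrently; `asyncio.to_thread` requires Python 3.9+, and the URL is a placeholder for your real database or API call.

```python
# Run several blocking I/O calls in parallel with asyncio.
import asyncio
import urllib.request

def fetch(url):
    # Stand-in for a database query or external API call.
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read()

async def main():
    urls = ["https://example.com"] * 3  # illustrative endpoints
    # to_thread runs each blocking call in a worker thread, so the three
    # downloads wait in parallel instead of one after the other.
    results = await asyncio.gather(*(asyncio.to_thread(fetch, u) for u in urls))
    print([len(r) for r in results])

asyncio.run(main())
```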
Use of parallel processing in batch, interpreted languages | Use parallel processing in batch when the work is mainly CPU-intensive: - string operations. - graphics processing. - number-crunching algorithms. If you plan to use an interpreted language for this, do it only if it offers a compiled library (for example a C-compiled library in Python) that can execute the work fast. Another example: an awk script can process a file extremely fast, while a batch script executing actions in a loop is much slower. |
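A sketch of spreading CPU-intensive work over all cores with Python's multiprocessing module; `crunch` is a toy stand-in for your real number-crunching step.

```python
# Parallelize a CPU-bound computation across cores.
from multiprocessing import Pool

def crunch(n):
    return sum(i * i for i in range(n))  # CPU-bound toy workload

if __name__ == "__main__":
    with Pool() as pool:                 # one worker per core by default
        results = pool.map(crunch, [10**6] * 8)
    print(results)
```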
Batch processing with queues | If your process is separated into steps (for example download, then parse, then upload), use queues: each time one document, record, or chunk of information is processed by a step, the result of this step is placed in a queue (if possible in an in-memory database like Redis). As a result, instead of processing, say, the first step for all records during 10 hours and then the second step during another 10 hours, each item processed in the first step moves on to the second step instead of waiting for all items to pass the first step. |
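A sketch of this pipeline with in-process queues and threads (a Redis list would play the same role between separate processes); the download, parse, and upload bodies are stand-ins.

```python
# Step pipeline with queues: items flow to the next step as soon as they
# finish the previous one, instead of waiting for the whole batch.
import queue
import threading

downloaded, parsed = queue.Queue(), queue.Queue()
SENTINEL = None  # signals the end of the stream

def download(items):
    for item in items:
        downloaded.put(f"raw:{item}")   # stand-in for a real download
    downloaded.put(SENTINEL)

def parse():
    while (item := downloaded.get()) is not SENTINEL:
        parsed.put(item.upper())        # stand-in for real parsing
    parsed.put(SENTINEL)

def upload():
    while (item := parsed.get()) is not SENTINEL:
        print("uploaded", item)         # stand-in for a real upload

threads = [threading.Thread(target=download, args=(range(5),)),
           threading.Thread(target=parse),
           threading.Thread(target=upload)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```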
Massive database actions using bulk and chunks | When loading a significant volume of data into a database, find a command to bulk load the data instead of using multiple insert or update commands. Also execute these loading actions by chunk: for example, split your data into files of 50,000 entries and bulk load the files one by one. |
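A runnable sketch of chunked loading with stdlib sqlite3; `executemany` sends each chunk in one call instead of one INSERT per row. Real bulk-load commands (such as COPY in PostgreSQL or LOAD DATA in MySQL) are faster still, as the tip suggests.

```python
# Chunked bulk loading: one executemany and one commit per chunk.
import sqlite3

CHUNK_SIZE = 50_000
rows = [(i, f"name-{i}") for i in range(200_000)]  # illustrative data

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER, name TEXT)")
for start in range(0, len(rows), CHUNK_SIZE):
    chunk = rows[start:start + CHUNK_SIZE]
    conn.executemany("INSERT INTO clients VALUES (?, ?)", chunk)
    conn.commit()  # one commit per chunk keeps transactions small
```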
Blue-green deployment | When you need to execute massive update actions on the database used to answer read requests, have two sets of databases instead of one: database A where the reads are done and database B where the updates are done. The process looks like this: - clone database A to create database B. - make the massive updates to database B (this way you do not affect the performance of reads on database A). - test database B to validate it is ready to go, if possible by running regression tests on it. - make the application point to B instead of A. - run regression tests to check the application still functions. - delete A, and repeat the operation next time with A and B inverted. |
Use an in-memory database | Use an in-memory database like Redis instead of keeping information in large files (useful for batch processing, for storing large amounts of settings - like translations - or for caching at application level). |
Use of zip when processing files | This applies to batch processing: when keeping information for further processing, zip it when storing it and unzip it when reading it. |
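A minimal sketch with the stdlib gzip module; the file name and payload are illustrative.

```python
# Compress intermediate batch data on write, decompress on read.
import gzip

data = "some intermediate batch output\n" * 1000

with gzip.open("step1.txt.gz", "wt") as f:   # zip when storing
    f.write(data)

with gzip.open("step1.txt.gz", "rt") as f:   # unzip when reading
    restored = f.read()

assert restored == data
```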
Use of S3 | Try to serve static content from S3 buckets or equivalent. If you have a mix of static and non-static content, you have the following options: - have a reverse proxy that routes to the S3 bucket or to the application server. - serve static content on your main host name and dynamic content on a subdomain. You then have to be careful to set up your dynamic server to accept CORS requests from the main domain. |
Logging too much information | Double-check that you set the logging level on your production server to error or critical, not to a lower level, and that you log information only on the error and critical levels. |
Garbage collection | If you are using Java, try to find the most efficient garbage collector for your implementation. |
Use a CDN to cache your static files | Cache all static files using a CDN (Content Delivery Network). As a result: - static files will be cached all over the world. - you may prevent some DDoS (Distributed Denial of Service) attacks. - some CDNs will stream the videos present on your web pages (206 return code) instead of serving them as full downloads (200 return code). |
In case of CMS, make sure the web pages are pre-cached | There must be a mechanism on the CMS to pre-cache the webpages after creating the corresponding content. |
Use of HTTP/2 | Use a server that supports the HTTP/2 protocol. |
Use of AMP | Use AMP for content pages (articles, presentations). An AMP page and its normal equivalent must be linked together using meta tags. AMP pages have no custom JavaScript. |
Answers from the server are zipped | The response header should contain Content-Encoding: gzip, and the request header Accept-Encoding: gzip, deflate, br. This is set up on the application server. |
Lazy loading of images | Put in place a JavaScript mechanism that loads images only once the page has loaded (onload). |
Converting your images to progressive images | Convert your images to progressive rendering (for example progressive JPEG) or to a modern format such as WebP. |
Lazy loading of everything below the fold | Separate what is below the fold (images, but also CSS), meaning not seen by visitors without scrolling. Load what is above the fold first, and only then what is below the fold. |
Use of videos | If you choose to use videos on your website: - if they are long (over 10 seconds), use a streaming service to display the video on your page. - if they are short, the video must be streamed (it will return a 206 HTTP code and start playing before it is fully downloaded) instead of downloaded (200 HTTP code, the video is fully downloaded before it starts playing). With some CDN providers, mp4 videos cached by the CDN are streamed instead of downloaded without any effort from your side. |
Reduce the size of images | Compress your images; I use for example tinyPNG.com. |
Have smaller images, no videos on mobile | Reduce the height and width of images on mobile (load smaller images on mobile than on desktop). Try to avoid videos on mobile (except for YouTube videos). To avoid mp4 on mobile, load videos with JavaScript only if the width of the screen is greater than a certain threshold. |
Google fonts: do not use the entire set of files | When using Google fonts, you are offered a link to a file that itself links to several files. Check which files and which characters you really need, and link your page only to the necessary ones. For example, I use https://fonts.googleapis.com/css?family=Nunito&text=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.-:%3E instead of https://fonts.googleapis.com/css?family=Nunito, which loads 5 woff2 files, including files for Vietnamese and Cyrillic characters. |
Google fonts: CSS part in your CSS, woff2 files on your server | The Google font link points to a file containing CSS fragments, which in turn point to the URLs of woff2 files. To gain time: - put the CSS fragment (starting with @font-face) into your application's CSS. - open the URL(s) of the woff2 file(s) in your browser to download them, put these files on your server, and make the fragment in your CSS point to them instead of pointing to the Google font URL. Regarding this second technique, check with a site like webpagetest.org whether it is actually faster (with the Google font URL to the woff2, you lose time on a DNS lookup, since it is not on the same domain as your application, but the file may download faster); if it is not, put back the original URL in the CSS fragment. |
Minimize scripts | Minify JavaScript files, CSS files, and HTML files. In the case of JavaScript files, uglify (which among other things replaces variable and function names with single letters to gain even more space) and minify them (can be done using gulp). |
File expiration | Set up file expiration so that files can stay in the visitor's browser cache; when they use the website another time, it will load faster for them. In .htaccess for Apache: |

```apache
ExpiresActive On
# Set up 1 week caching on static files
<FilesMatch "\.(xml|txt|html|js|css)$">
ExpiresDefault A604800
Header append Cache-Control "proxy-revalidate"
</FilesMatch>
```
Fewer requests sent to the server | - Bundle JS files together. - Bundle image files together using sprites. - Use SVG instead of images for small icons. |
Web server technology | Assess which web server technology is the fastest for your use case. Example of options for static content: S3 bucket versus Apache versus Nginx. |
Testing | Test with webpagetest.org and https://yellowlab.tools/. |
Index | An index speeds up retrieval but slows down updates. It also has less effect when a great percentage of rows share the same value in the indexed column. |
Tune queries included in functions | Check the queries used inside functions and tune them as well. |
Use indexes when joining two tables | If you are joining two tables in a query, create an index on both columns used to join these two tables. |
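A small runnable sketch with sqlite3; the table and index names are illustrative (the same clients/salesperson pair as in the exists example further down).

```python
# Index both sides of the join condition.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (id INTEGER, salesperson_id INTEGER);
CREATE TABLE salesperson (id INTEGER, name TEXT);
-- one index per column used in the join condition
CREATE INDEX idx_clients_sp ON clients(salesperson_id);
CREATE INDEX idx_salesperson_id ON salesperson(id);
""")
rows = conn.execute("""
SELECT c.id, s.name FROM clients c
JOIN salesperson s ON c.salesperson_id = s.id
""").fetchall()
print(rows)
```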
Composite index: do not omit the first column | If your index covers two columns: - do not omit the first column in your order by clause. - list the columns in the right order in your order by clause. |
Clustered index for ranges | Create a clustered index on the column most used when selecting ranges. This physically orders the data in the table by this column (you can see it when executing a select without order by). You can only have one clustered index per table, and depending on the DBMS, clustered indexes may not allow duplicates. |
Use joins instead of correlated subqueries | A join often performs better than a correlated subquery, although this is not the case for all DBMSes. |
Do not use non-deterministic functions on the left-hand side of a comparison | Query before tuning: select * from client_status where datediff(day, status_date, getdate()) > 50. Because the column is wrapped in a function, this predicate must be recomputed for every row and cannot use an index on status_date. After tuning: declare @date_threshold date; select @date_threshold = dateadd(day, -50, getdate()); select * from client_status where status_date < @date_threshold |
Add a column containing reversed names when looking for a string at the end of a column | Create a new column in the same table (call it column B). Use a trigger so that each time you insert or update a row, you store in column B the reversed value of the column you want to use in the where clause (call it column A). When looking for a string at the end of column A, match the reverse of the string you are looking for against the beginning of column B. |
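A runnable sqlite3 sketch of the idea; the insert helper plays the role of the trigger, and the table and column names are illustrative. In SQLite, the LIKE prefix can only use the index with case_sensitive_like enabled.

```python
# Store the reversed string so a suffix search becomes an indexable
# prefix search on the reversed column.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA case_sensitive_like = ON")  # lets LIKE 'x%' use the index
conn.execute("CREATE TABLE files (name TEXT, name_rev TEXT)")
conn.execute("CREATE INDEX idx_files_rev ON files(name_rev)")

def insert(name):
    # Plays the role of the trigger described above: keep column B in sync.
    conn.execute("INSERT INTO files VALUES (?, ?)", (name, name[::-1]))

for n in ["report.pdf", "photo.jpg", "notes.pdf"]:
    insert(n)

suffix = ".pdf"
rows = conn.execute("SELECT name FROM files WHERE name_rev LIKE ?",
                    (suffix[::-1] + "%",)).fetchall()
print(rows)  # [('report.pdf',), ('notes.pdf',)]
```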
Do not select all columns | In your query, select only the columns that are necessary for further processing. |
Execute updates, inserts, and deletes in batches | Separate the full list of updates, inserts, or deletes into chunks (of 5,000 operations for example), and apply them chunk by chunk. |
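A runnable sqlite3 sketch of a large delete applied in chunks of 5,000 rows; the events table and its columns are illustrative. Committing between chunks keeps each transaction, and the locks it holds, small.

```python
# Delete in chunks of 5,000 rows, committing between chunks.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, processed INTEGER)")
conn.executemany("INSERT INTO events (processed) VALUES (?)",
                 [(i % 2,) for i in range(20_000)])
conn.commit()

while True:
    cur = conn.execute(
        "DELETE FROM events WHERE id IN "
        "(SELECT id FROM events WHERE processed = 1 LIMIT 5000)")
    conn.commit()            # short transaction per chunk
    if cur.rowcount == 0:    # nothing left to delete
        break
```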
Split voluminous tables into several tables | If a table stores a huge amount of information, you can split its data between several tables. Each resulting table will contain fewer rows than the original. |
Use a temporary table when joining tables | When joining two voluminous tables, you can speed up the query with the following technique: - query one of the two tables. - store the result of this query in a temporary table. - join this temporary table with the second table. To choose which of the two tables to use in the first step, pick the one that makes the temporary table as small as possible. |
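A runnable sqlite3 sketch of the technique; the orders/clients tables and the country filter are illustrative.

```python
# Filter the smaller side into a temporary table, then join it
# with the voluminous second table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, client_id INTEGER, amount REAL);
CREATE TABLE clients (id INTEGER, country TEXT);
""")
# Step 1 + 2: query one table and keep only what the join needs.
conn.execute("""
CREATE TEMP TABLE fr_clients AS
SELECT id FROM clients WHERE country = 'FR'
""")
# Step 3: join the small temporary table with the large table.
rows = conn.execute("""
SELECT o.id, o.amount FROM orders o
JOIN fr_clients f ON o.client_id = f.id
""").fetchall()
print(rows)
```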
Delete records from underlying tables (possibly archiving the deleted records) | If you have voluminous tables, you can increase SQL performance by moving data that your application no longer needs into separate tables. To apply this tip, you need to define rules that determine which data is no longer useful. |
Rebuild indexes and tables | Use a rebuild function to rebuild the index. Another option is to drop the index and then create it again. |
Drop indexes and foreign keys before loading data | Drop the indexes and foreign keys, execute the inserts, deletes, and updates, then recreate the indexes and foreign keys. |
Online versus batch | Every time the user requests an update, an insert, or a delete, decide: - whether it is critical to execute the operation as soon as the user requests it. - whether it is critical to make the user wait for its completion before they can use the application further. If offline processing is possible: - communicate to the user that the request is going to be processed. - execute the operations offline. - once done, communicate to the user that the action is completed. |
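A sketch of this acknowledge-then-process pattern with an in-process queue and a worker thread; the handler name and the points example are illustrative (a real system would use a persistent queue).

```python
# Queue the write, answer the user immediately, apply the change later.
import queue
import threading
import time

pending_updates = queue.Queue()

def handle_request(user_id, points):
    pending_updates.put((user_id, points))
    return "Your request is being processed"   # immediate answer to the user

def worker():
    while True:
        user_id, points = pending_updates.get()
        time.sleep(0.1)                         # stand-in for the real update
        print(f"user {user_id}: +{points} points applied")
        pending_updates.task_done()

threading.Thread(target=worker, daemon=True).start()
print(handle_request(42, 10))
pending_updates.join()  # wait here only so the demo exits cleanly
```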
Put the most selective criteria first in the index and in the where clause | If your index includes several columns (composite index), the index should start with the column containing the fewest duplicate values (the most selective column). This means putting the columns in the right order when creating the composite index, and ordering them consistently in your where clause. |
Use less storage space on each column | Choose the minimum type necessary. If you choose varchar, you could determine the biggest reasonable length of the strings the column will contain. But as the storage size of varchar(x) depends on the actual string size, why bother determining what to put instead of x? Example: you know that the stored strings will never exceed 200 characters, but the storage size of each string is the same whether you declare the column as varchar(200) or varchar(1000); indeed, the storage size of each string equals the actual string size + 2. You will find some experts arguing that it is irrelevant for performance whether you declare varchar(200) or varchar(1000). Others say it can impact performance: their argument is that varchar(200) is an indication given to the DBMS of the estimated storage size per field in this column, and as every indication can influence how the DBMS plans to execute the query, it makes sense to give an estimate as close as possible to reality to "help" the DBMS do its job well. |
Choose between select count greater than 0 and exists | In some cases a query containing "where exists" performs differently from a query counting the number of rows and checking whether the count is greater than 0. Both return the same result, but their speed may differ. Compare: select * from clients c where (select count(*) from salesperson s where c.salesperson_id = s.id) > 0 versus select * from clients c where exists (select 1 from salesperson s where c.salesperson_id = s.id) |