This article describes the Advanced Scan Configuration of the Crashtest Security Suite.
Crashtest Security Suite crawlers use several intelligent algorithms that automatically reduce the number of pages crawled. These algorithms are necessary and beneficial because web applications often contain many views on the same underlying data, so scanning one representative view covers the others. Consider a webshop with thousands of products, each with an individual page: scanning every product would be time-consuming and redundant, because the code for adding an item to the basket or leaving a product review is the same regardless of the product. In this case, a security scan that covers one example page is the ideal solution.
This article explains the existing behavior of our crawlers and scanners, as well as several configuration options to improve scan speed and coverage.
How do the allowed URLs work?
The allowed URLs are also considered for navigational links in your web application during crawling, for redirects, and for the check of whether a request should be scanned at all. You can specify allowed URLs in the preferences of your target under “Configuration.” By default, you can add allowed URLs that are subdomains of your target URL. However, if you need a different URL to be allowed for scanning, please contact our support to verify that you are entitled to perform scans against that domain.
Afterward, the crawler’s internal check will succeed, and the allowed URL will be crawled and scanned as well.
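To make the scope check more tangible, here is a minimal Python sketch of how such an allow check could work. The function name is_allowed and the hostname-matching logic are illustrative assumptions, not the crawler’s actual implementation:

```python
from urllib.parse import urlparse

def is_allowed(url: str, target_url: str, allowed_urls: list[str]) -> bool:
    # Illustrative assumption: a URL is in scope if its host equals the
    # target host or an allowed host, or is a subdomain of one of them.
    host = urlparse(url).hostname or ""
    scope_hosts = [urlparse(u).hostname or "" for u in [target_url, *allowed_urls]]
    return any(host == h or host.endswith("." + h) for h in scope_hosts)

# A foreign domain is out of scope until it is added to the allowed URLs:
print(is_allowed("https://shop.example.com/cart", "https://your-app.dev", []))
# -> False
print(is_allowed("https://shop.example.com/cart", "https://your-app.dev",
                 ["https://shop.example.com"]))
# -> True
```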
How can I improve the scan speed?
Scanning a web application with a large number of pages can take too long for your use case. There are two options to speed up the scan by adjusting its scope. The first is to exclude certain URLs from the scan using denied URLs. The second is to configure a grouped URL pattern, which helps the crawler detect pages that only need to be crawled and scanned once.
Improve the scan speed with denied URLs
By adding a URL to the denied URLs in the target preferences under “Configuration”, you make sure that this URL and all of its subpaths are no longer crawled or scanned. This can make sense, for example, if your application has one or more modules that should be excluded from the scan.
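Conceptually, the deny check behaves like a prefix match on the URL. The following Python snippet is a hedged sketch (including the hypothetical denied URL www.your-app.dev/blog), not the scanner’s actual code:

```python
def is_denied(url: str, denied_urls: list[str]) -> bool:
    # Illustrative assumption: a URL is skipped if it equals a denied URL
    # or lies below it as a subpath.
    return any(url == d or url.startswith(d.rstrip("/") + "/")
               for d in denied_urls)

denied = ["https://www.your-app.dev/blog"]
print(is_denied("https://www.your-app.dev/blog/2021/release-notes", denied))  # True
print(is_denied("https://www.your-app.dev/shop/shoes", denied))               # False
```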
Improve the scan speed with grouped URLs
The grouped URLs allow you to specify a pattern that groups certain URLs so they are only crawled once. This is especially useful for an online shop where the URL has a fixed structure containing a category and an item name. These names are usually human-readable text for SEO purposes, which prevents the crawler from grouping them automatically.
You can specify a grouped URL in the target preferences under “Configuration” and use the star character “*” to mark the parts that should be grouped. The following concrete example illustrates how the grouping works.
Before specifying the grouped URL, URLs with a category and an item name in the path, such as the following examples, are all treated as unique and crawled individually:

www.your-app.dev/shop/shoes/sneaker-pro
www.your-app.dev/shop/shoes/running-classic
www.your-app.dev/shop/shirts/basic-tee

After specifying the grouped URL pattern “www.your-app.dev/shop/*/*”, only the first URL from this list is crawled; the remaining URLs are detected as further URLs of the same group. These URLs are still scanned, but only once per group instead of repeatedly.
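To better understand the grouping, here is a minimal Python sketch that treats the pattern as a regular expression in which each star matches exactly one path segment. The conversion and matching semantics are illustrative assumptions, not the crawler’s actual implementation:

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    # Illustrative assumption: each "*" in the pattern matches exactly
    # one path segment (i.e. anything except "/").
    escaped = re.escape(pattern).replace(r"\*", "[^/]+")
    return re.compile(f"^{escaped}$")

group = pattern_to_regex("www.your-app.dev/shop/*/*")
urls = [
    "www.your-app.dev/shop/shoes/sneaker-pro",
    "www.your-app.dev/shop/shoes/running-classic",
    "www.your-app.dev/shop/shirts/basic-tee",
]

crawled_group = False
for url in urls:
    if group.match(url):
        if crawled_group:
            continue  # further URL of the same group: not crawled again
        crawled_group = True
    print("crawl:", url)

# Output: only "crawl: www.your-app.dev/shop/shoes/sneaker-pro" is printed,
# mirroring how one representative page per group is crawled and scanned.
```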
Feel free to contact our security experts if you have further questions on fine-tuning your scan.
When should I use the seed URLs?
Sometimes web applications have pages that are not linked anywhere and therefore cannot be discovered by the crawler. For example, the admin login interface of a web application (e.g. https://your-app.dev/admin/login) might intentionally not be linked anywhere but should still be covered by the security scan.
These pages can be added for crawling manually by specifying them as seed URLs in the preferences of your target under “Configuration”. Please keep in mind that our service also checks whether a seed URL is a subpath of your target URL to determine if it may be scanned. If it is not, but the seed URL should be scanned anyway, you need to add it to the allowed URLs as well.
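The subpath rule for seed URLs can be sketched as follows; the helper covers and the exact comparison are assumptions for illustration only, not our service’s actual implementation:

```python
from urllib.parse import urlparse

def seed_in_scope(seed_url: str, target_url: str, allowed_urls: list[str]) -> bool:
    # Illustrative assumption: a seed URL is scanned if it is a subpath of
    # the target URL or of one of the allowed URLs.
    def covers(base: str, url: str) -> bool:
        b, u = urlparse(base), urlparse(url)
        return u.hostname == b.hostname and u.path.startswith(b.path)
    return any(covers(base, seed_url) for base in [target_url, *allowed_urls])

print(seed_in_scope("https://your-app.dev/admin/login",
                    "https://your-app.dev", []))             # True (subpath)
print(seed_in_scope("https://legacy.your-app.dev/login",
                    "https://your-app.dev", []))             # False (not a subpath)
print(seed_in_scope("https://legacy.your-app.dev/login",
                    "https://your-app.dev",
                    ["https://legacy.your-app.dev"]))        # True (allowed URL)
```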