Command Line Options¶
These flags allow you to change the behavior of Crawler.
-
-d
<sec>
,
--delay
<sec>
¶ Use a delay in between page fetchs so we don’t overwhelm the remote server. Value in seconds.
Default: 1 second
-
-c
<int>
,
--concurrency
<int>
¶ - Use multiple system processes to crawl a website.
Default: 1
-
-i
<regex>
,
--ignore
<regex>
¶ - Ignore pages that match a specific pattern.
Default: None