Example #1
 /**
  * Adds a rule to the list of rules that decide which types of content should be streamed directly to a temporary file.
  *
  * If the content-type of a page or file matches one of these rules, the content will be streamed directly into a
  * temporary file without claiming local RAM.
  *
  * It's recommended to add the content-types of all files that may be large, to prevent memory overflows.
  * By default the crawler receives all content into memory!
  *
  * The content/source of pages and files that were streamed to a file is not directly accessible within the overridden method
  * {@link handleDocumentInfo()}. Instead, you get information about the file the content was stored in
  * (see properties {@link PHPCrawlerDocumentInfo::received_to_file} and {@link PHPCrawlerDocumentInfo::content_tmp_file}).
  *
  * Please note that this setting doesn't affect the link-finding results; file-streams are checked for links as well.
  *
  * A common setup may look like this example:
  * <code>
  * // Basically let the crawler receive all content (default-setting)
  * $crawler->addReceiveContentType("##");
  *
  * // Tell the crawler to stream everything but "text/html"-documents to a tmp-file
  * $crawler->addStreamToFileContentType("#^((?!text/html).)*$#");
  * </code>
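  *
  * A minimal sketch of which content-types the negative-lookahead rule above matches, assuming the rule is
  * tested against the content-type string with preg_match() (how the crawler evaluates rules internally is
  * not shown here, so treat this only as an illustration of the regex itself):
  * <code>
  * $rule = "#^((?!text/html).)*$#";
  *
  * // Contains "text/html", so the rule does not match and the content stays in memory
  * var_dump(preg_match($rule, "text/html; charset=utf-8")); // int(0)
  *
  * // Does not contain "text/html", so the rule matches and the content would be streamed to a tmp-file
  * var_dump(preg_match($rule, "application/pdf"));          // int(1)
  * </code>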
  *
  * @param string $regex The rule as a regular-expression
  * @return bool         TRUE if the rule was added to the list and the regex is valid.
  * @section 10 Other settings
  */
 public function addStreamToFileContentType($regex)
 {
     return $this->PageRequest->addStreamToFileContentType($regex);
 }