Ontolica Preview has a feature called First Page Preview, which caches a set number of first pages from each document in the Preview database.
This makes it possible to display first page thumbnails for all crawled documents and to quickly load the set number of first pages in the Full Document Preview.
The service that fetches the first pages is called the Gatherer. After the document is crawled by the Gatherer, the converted pages are stored in the database as images and text extracts.
Preview stores the images in PNG format, which typically results in an image size of 50-60 KB per page.
The size of the text representation depends on the amount of text present on the pages but is typically smaller than the image representation.
Hence, we can expect that the average size needed per cached page is not more than 60-70 KB.
You can estimate the maximum space the Preview database will take with the following formula, which allows 0.15 MB per cached page, roughly twice the expected average:
- (Number of documents in SharePoint) x (Set number of first pages to be crawled by Preview) x 0.15 MB = The total database size in MB
For example, if you have 100,000 documents and set Preview to crawl only 1 page from each of them, the total database size should be less than 15 GB (100,000 x 1 x 0.15 MB = 15,000 MB).
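The same estimate can be scripted for other document counts and page settings. The following is a minimal sketch, not part of Ontolica Preview: the function name `max_preview_db_size_mb` and its parameters are hypothetical, and the conversion assumes 1 GB = 1000 MB, as the example above does.

```python
def max_preview_db_size_mb(document_count: int,
                           first_pages: int,
                           mb_per_page: float = 0.15) -> float:
    """Upper-bound estimate of the Preview database size in MB.

    Hypothetical helper: multiplies the number of documents by the number
    of cached first pages and by the 0.15 MB per-page allowance from the
    formula above.
    """
    if document_count < 0 or first_pages < 0:
        raise ValueError("document_count and first_pages must not be negative")
    return document_count * first_pages * mb_per_page


# Example from the text: 100,000 documents, 1 cached page each.
size_mb = max_preview_db_size_mb(100_000, 1)
print(f"{size_mb:.0f} MB (about {size_mb / 1000:.1f} GB)")
```

Raising the number of cached pages scales the estimate linearly, so caching 3 pages for the same 100,000 documents gives an upper bound of about 45 GB.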
Please note that the Full Document Preview, which performs live conversion of documents, can also use resources of the database server, both RAM and storage.
This utilization changes dynamically with user activity and should be considered separately from the size calculated above.