Documents and Files
Document
A document is an individual piece of content that can be indexed and retrieved by your search engine. Documents are a representation of your content that is stored in Silverstripe Search. Only content added to documents is available to search.
A document is made up of fields, each with a key and a value. The key is the name of the field and is the same for all documents within the engine. A field’s value holds the content specific to that document. For example, your documents might have a key called title, which you use to store the title of your pages, such as “Home Page” or “About Page”. An example document could look like:
Field key | Field value |
---|---|
title | Home Page |
content | Welcome to the home page |
url | / |
date_published | 2024-07-17 10:00:00 |
Giving your documents a structure like this, combined with a schema, allows you to use the full power of Silverstripe Search. In the above example, having a date_published field allows you to search for all documents published after a certain date.
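To make the key/value structure concrete, here is a minimal sketch that models documents as plain Python dictionaries and filters them by date_published. It runs locally and does not call Silverstripe Search. The second document, the published_after helper, and the date format constant are assumptions added for illustration only.

```python
from datetime import datetime

# Assumed timestamp layout, matching the example value "2024-07-17 10:00:00".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# Every document shares the same field keys; only the values differ.
# The "About Page" entry is hypothetical and exists only for this sketch.
documents = [
    {
        "title": "Home Page",
        "content": "Welcome to the home page",
        "url": "/",
        "date_published": "2024-07-17 10:00:00",
    },
    {
        "title": "About Page",
        "content": "Learn more about us",
        "url": "/about",
        "date_published": "2023-01-05 09:30:00",
    },
]


def published_after(docs, cutoff):
    """Return the documents whose date_published is later than cutoff."""
    return [
        doc
        for doc in docs
        if datetime.strptime(doc["date_published"], DATE_FORMAT) > cutoff
    ]


recent = published_after(documents, datetime(2024, 1, 1))
print([doc["title"] for doc in recent])
```

Because both documents carry the same date_published key, a single comparison works across all of them, which is the benefit of giving documents a consistent structure.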
Documents are added to an engine programmatically. Find out more in the Developer's guide.
Files
Silverstripe Search can search files such as PDF (.pdf) and Microsoft Office (.docx) documents. To do this, it extracts the file content into a Silverstripe Search document. With the help of a developer, you can send the file in a special binary field called _attachment. The service then extracts the content it can process and puts it in the body field. There is a 15MB limit on file size.
File content extraction is only supported on Tiers with Analyst features. For more information, see Features.
Supported file types
.txt
.py
.rst
.html
.markdown
.json
.xml
.csv
.md
.ppt
.rtf
.docx
.odt
.xls
.xlsx
.rb
.paper
.sh
.pptx
.pdf
.doc
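
A developer preparing a file for the _attachment field might check it against the size limit and the supported types before sending it. The sketch below is one way to do that locally and is not the service's own API. The build_file_document helper is hypothetical, base64 encoding of the binary content is an assumption, and 15MB is read here as 15 × 1024 × 1024 bytes, which the documentation does not specify.

```python
import base64
from pathlib import Path

# Assumption: "15MB" is treated as binary megabytes; the docs do not say
# whether the limit is decimal or binary.
MAX_FILE_BYTES = 15 * 1024 * 1024

# Extensions copied from the supported file types list.
SUPPORTED_EXTENSIONS = {
    ".txt", ".py", ".rst", ".html", ".markdown", ".json", ".xml", ".csv",
    ".md", ".ppt", ".rtf", ".docx", ".odt", ".xls", ".xlsx", ".rb",
    ".paper", ".sh", ".pptx", ".pdf", ".doc",
}


def build_file_document(filename: str, data: bytes, title: str) -> dict:
    """Validate a file locally and wrap it in a document with an _attachment field."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {extension or filename}")
    if len(data) > MAX_FILE_BYTES:
        raise ValueError(
            f"File is {len(data)} bytes; the limit is {MAX_FILE_BYTES} bytes"
        )
    return {
        "title": title,
        # Assumption: binary content is sent as base64-encoded text.
        "_attachment": base64.b64encode(data).decode("ascii"),
    }


# Hypothetical file content, used only to exercise the helper.
document = build_file_document("report.pdf", b"%PDF-1.7 example bytes", "Annual Report")
print(sorted(document))
```

Rejecting unsupported or oversized files before upload avoids sending content the service cannot extract. The body field is not set here because, as described above, the service fills it in after extraction.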