Full Text Search

Introduction

ownCloud offers the ability to use full text search via the Full Text Search app connecting to an Elasticsearch Server. This allows users to search not only for file names but also for content within files stored in ownCloud.

This document describes how to set up the ownCloud part of the Full Text Search app.

Prerequisites

  1. A fully functioning Elasticsearch 7 server. Follow the Elasticsearch Installation and Upgrade Guide for your environment.

    • Version 1.0.0 of the Full Text Search app only works with Elasticsearch version 5.6.

    • With version >=2.0.0 of the app, Elasticsearch version 7 is supported and required.

  2. The Ingest Attachment Processor Plugin lets Elasticsearch extract metadata and text from over a thousand different file types such as PPT, XLS, PDF and more. To install the processor, run the following command from your Elasticsearch installation directory:

    sudo bin/elasticsearch-plugin install ingest-attachment

    After installing the plugin, restart the Elasticsearch server:

    sudo service elasticsearch restart
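
    To confirm that the plugin is present after the restart, you can list the installed plugins with elasticsearch-plugin list. The following is a minimal sketch, not part of the official procedure: the helper name has_ingest_attachment is hypothetical, and it only checks that ingest-attachment appears as a separate field in whatever listing is piped into it.

```shell
# Hypothetical helper: succeeds if the plugin listing read from stdin
# contains "ingest-attachment" as a separate, whitespace-delimited field.
has_ingest_attachment() {
    grep -Eq '(^|[[:space:]])ingest-attachment([[:space:]]|$)'
}

# Example use, run from the Elasticsearch installation directory:
#   sudo bin/elasticsearch-plugin list | has_ingest_attachment \
#       && echo "ingest-attachment is installed"
```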

Installation

To install the app, use the Marketplace app on your ownCloud server or proceed manually:

  1. Download and extract the tarball of the Full Text Search app to the apps directory (or, preferably, the custom apps directory) of your ownCloud instance.

  2. Use the App Commands to enable the search_elastic application with:

    sudo -u www-data occ app:enable search_elastic

    or enable it via the GUI: Settings > Admin > Apps > Full Text Search > Enable.

Configuration

To configure the Full Text Search app, go to Settings > Admin > Search.

Authentication Methods

Regardless of the authentication method you select below, you need to provide the URL of the Elasticsearch server. Whichever method you choose, the Elasticsearch server must already be configured for it.

For the URL, both HTTP and HTTPS can be used, including the address and port.

The app provides several authentication methods. Select the one of your choice and check out the details for the respective authentication method below:

No Authentication

When using No Authentication, just fill in the URL of the ES server.

User / Password Authentication

When using User / Password Authentication, enter the credentials set up on the ES server. Note that the password will be stored encrypted in the ownCloud database.

API Key Authentication

When using API Key Authentication, enter the API Key with which the ES server was set up.

The API Key needs to be the encoded one, not the api_key string. For details, see the Create API key API in the ES documentation.
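
As a rough illustration of the difference: according to the Elasticsearch documentation, the encoded value is the Base64 encoding of the id and api_key fields joined by a colon. Newer Elasticsearch releases return it directly in the encoded field of the Create API key response. Where that field is missing, the value can be derived as sketched below. The function name and the sample id and api_key values are made up for illustration.

```shell
# Hypothetical helper: build the encoded API key from the "id" and
# "api_key" fields of a Create API key response.
encode_api_key() {
    # printf avoids a trailing newline, which would change the result;
    # tr removes line breaks that some base64 tools insert into long output.
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Made-up sample values, not real credentials:
#   encode_api_key "myid" "mysecret"    # prints bXlpZDpteXNlY3JldA==
```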

Search External Storage

To include external storage in ES indexing, tick the Scan external Storages checkbox. Clear it to exclude external storage.

Save the Configuration

Save the configuration with the Save configuration button.

Set up the ES Index

When everything is set up, click the Setup index button. This tells the ES server to create the plain, empty index and other related internal settings.

This step is important: once it succeeds, the red dot turns green, showing that everything has been set up correctly.

Resetting the ES Index

If required, you can reset the index at any time by clicking Reset index or by running the following occ command. The index will be recreated afterwards.

sudo -u www-data occ search:index:reset

Using occ Commands

The following occ command sets are available:

  • The occ Full Text Search command set manages the app. These commands let administrators create, rebuild, reset, and update the search index. For example, the following command resets and recreates the index for all users:

    sudo -u www-data occ search:index:reset

  • The occ Config Commands command set configures the app.

    Examples:

    List app settings
    sudo -u www-data occ config:list search_elastic
    {
        "apps": {
            "search_elastic": {
                "enabled": "yes",
                "group": "content_searchers",
                "installed_version": "2.1.0",
                "nocontent": "false",
                "scanExternalStorages": "1",
                "servers": "elastic:xxxxxxx@172.17.0.3:9200",
                "types": "filesystem"
            }
        }
    }
    Set app options
    sudo -u www-data occ config:app:set \
        search_elastic scanExternalStorages --value 0

    or

    sudo -u www-data occ config:app:set \
        search_elastic scanExternalStorages --value 1
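
    After changing an option, you can read the stored value back with config:app:get from the same command set. This is a minimal sketch, assuming the www-data user from the examples above. The OCC variable and the function name are hypothetical conveniences that make a dry run possible.

```shell
# Command used to invoke occ; override it for a dry run, e.g. OCC="echo occ".
OCC=${OCC:-sudo -u www-data occ}

# Hypothetical helper: set scanExternalStorages and read the value back.
set_scan_external() {
    case "$1" in
        0|1) ;;
        *) echo "usage: set_scan_external 0|1" >&2; return 2 ;;
    esac
    # $OCC is left unquoted on purpose so it splits into command and arguments.
    $OCC config:app:set search_elastic scanExternalStorages --value "$1" &&
        $OCC config:app:get search_elastic scanExternalStorages
}

# Example use:
#   set_scan_external 0
```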

App Modes

The Full Text Search app provides two modes, which are active and passive.

Active Mode

After the app is enabled, it is in active mode by default:

  • File changes will be indexed in background jobs.
    System cron is recommended, because otherwise a lot of jobs might queue up.

  • Search results will be based on Elasticsearch.

  • Search functionality based on ownCloud core database queries will no longer be used.

    Active mode can make search temporarily unavailable when indexing starts on an already heavily used instance, because it takes a while until all files have been indexed.

Passive Mode

To do an initial full indexing without the app interfering, put it in passive mode:

  • The administrator can run occ commands that change the search configuration without users noticing.

  • The app will not index any changes by itself.

  • Search results will still be based on ownCloud core database queries.

Changing the App Mode

To put the app into passive mode, run:

sudo -u www-data occ config:app:set \
    search_elastic mode --value passive

To switch it back to active mode, run:

sudo -u www-data occ config:app:set \
    search_elastic mode --value active

Restrict Search Results

Index Metadata Only

If you only want to use the Full Text Search app as a more scalable search on filenames, you can disable content indexing by setting the option nocontent to true, which defaults to false:

sudo -u www-data occ config:app:set \
    search_elastic nocontent --value true

  • You have to reindex all files if you change this back to false. Setting it to true does not require reindexing.

  • Limiting full text search for certain groups with the option group.nocontent may be a more flexible approach. See below for details.

Limit Metadata Search for Groups

If you only want to use search for shared filenames, you can disable full text search for specific groups. Set the option group.nocontent to the groups whose users should only receive results based on filenames (not the full path), such as the group nofulltext in the example below:

sudo -u www-data occ config:app:set \
    search_elastic group.nocontent \
    --value nofulltext

You can also configure multiple groups by separating them with commas:

sudo -u www-data occ config:app:set \
    search_elastic group.nocontent \
    --value nofulltext,anothergroup,"group with blanks"

This allows a scalable search in shared files without cluttering the results with content-based hits.
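
When the list of groups for group.nocontent is generated by a script, the comma-separated value can be assembled before it is passed to occ. This is a small sketch. The function name join_groups is hypothetical, and it assumes that no group name itself contains a comma.

```shell
# Hypothetical helper: join all arguments with commas.
# The function body runs in a subshell, so the changed IFS does not leak
# into the calling shell.
join_groups() (
    IFS=,
    printf '%s\n' "$*"
)

# Example use:
#   sudo -u www-data occ config:app:set search_elastic group.nocontent \
#       --value "$(join_groups nofulltext anothergroup "group with blanks")"
```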

Create the Index

When everything has been set up and configured, you can start creating the index. This must be done with the occ command shown below. What happens next depends on the mode in use:

  • active mode: wait until the job has finished, after which search is available to users, or

  • passive mode: users continue to search with the ownCloud embedded search, and you switch over to active mode when the occ command has finished indexing.

sudo -u www-data occ search:index:create
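
The passive mode route can be scripted end to end. The following is a sketch only, built from the commands shown in this document. The OCC variable and the function name are hypothetical, and the function stops at the first failing step so that the app is not switched to active mode after an incomplete indexing run.

```shell
# Command used to invoke occ; override it for a dry run, e.g. OCC="echo occ".
OCC=${OCC:-sudo -u www-data occ}

# Hypothetical helper: index in passive mode, then switch to active mode.
initial_index_passive() {
    # $OCC is left unquoted on purpose so it splits into command and arguments.
    # 1. Passive mode: the app does not index changes by itself and search
    #    results still come from ownCloud core database queries.
    $OCC config:app:set search_elastic mode --value passive || return 1
    # 2. Create the index while users keep using the core search.
    $OCC search:index:create || return 1
    # 3. Indexing finished: serve search results from Elasticsearch.
    $OCC config:app:set search_elastic mode --value active
}

# Example use:
#   OCC="echo occ" initial_index_passive    # dry run, only prints the commands
#   initial_index_passive                   # real run
```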

Issues

When the Elasticsearch server is down or the index has not been set up, you may see one of the warnings shown below. To resolve the issue, check that the ES server is reachable and that the index was set up properly.

Warning no Index

Warning unknown Key
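
A quick way to check reachability from the ownCloud host is to request the server URL with curl. This is a sketch under assumptions: it presumes that curl is installed and that the ES server uses No Authentication, since a server that requires credentials would answer with an HTTP error. The function name is hypothetical, and the example URL reuses the address from the configuration listing above.

```shell
# Hypothetical helper: succeeds if the given URL answers with an HTTP
# success status within 5 seconds.
es_reachable() {
    # -f: treat HTTP errors as failure, -s: no progress output
    curl -fs --max-time 5 "$1" >/dev/null 2>&1
}

# Example use (replace the URL with the one configured in the app):
#   es_reachable "http://172.17.0.3:9200" && echo "ES server is reachable"
```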

User Manual

To find out more about the usage, check out the section in the User Manual: Search & Full Text Search.

Known Limitations

Currently, the app has the following known limitations:

  • If a shared file is renamed by the sharee (share receiver), the sharee cannot find the file using the new filename.

  • Search results are not updated when a text file is rolled back to an earlier version.

  • The app does not return results for received federated share files.

  • When using encryption, the app only works with the default Master Key encryption module.