Solr Bundle

For use with eZ Publish 5.4, go to the corresponding documentation page, which covers the v1.0 version of the bundle compatible with eZ Publish 5.4.

 

What is Solr Search Engine Bundle?

ezplatform-solr-search-engine, as the package is called, aims to be a transparent drop-in replacement for the SQL-based "Legacy" search engine that powers the eZ Platform Search API by default. Once you enable Solr and re-index your content, all your existing Search queries using SearchService are powered by Solr automatically. This allows you to scale up your eZ Platform installation: you can continue development locally against the SQL engine while your test infrastructure, Staging and Prod are powered by Solr, which removes considerable load from your database so it can focus on more important things, like publishing.

See the Architecture page for further information on the architecture of eZ Platform.

How to set up Solr Search engine

Step 0: Enable Solr Bundle

Not needed with eZ Platform

This step is not needed as of eZ Platform 15.09; it is kept here for reference in case you have previously disabled the bundle.

 

  1. Check in composer.json whether you have the ezsystems/ezplatform-solr-search-engine package. If not, add/update composer dependencies:

    command line
    composer require --no-update ezsystems/ezplatform-solr-search-engine:~1.0
    composer update
  2. Make sure EzSystemsEzPlatformSolrSearchEngineBundle is activated with the following line in the app/AppKernel.php file: new EzSystems\EzPlatformSolrSearchEngineBundle\EzSystemsEzPlatformSolrSearchEngineBundle()
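
The check in step 1 can be scripted. The following is a minimal sketch, assuming composer.json sits in the project root; has_solr_bundle is a hypothetical helper name, not something shipped with the bundle.

```shell
# has_solr_bundle is a hypothetical helper, not part of the bundle.
# It succeeds when composer.json in the given project directory
# (default: current directory) mentions the Solr bundle package.
has_solr_bundle() {
    project_dir="${1:-.}"
    grep -q '"ezsystems/ezplatform-solr-search-engine"' "$project_dir/composer.json"
}
```

For example, `has_solr_bundle . || echo "package missing, run composer require"` prints a reminder only when the package is absent.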

Step 1: Configuring & Starting Solr

The example here is for a single core. See the Solr documentation for configuring Solr in other ways, and the provided configuration for some examples.

First, download and extract Solr. We currently support Solr 4.10.4.

Secondly, copy the configuration files needed for the eZ Solr Search Engine bundle, here from the root of your project to the place where you extracted Solr:

Command line example
# Make sure to replace the /opt/solr/ path with where you have placed Solr
cp -R vendor/ezsystems/ezplatform-solr-search-engine/lib/Resources/config/solr/* /opt/solr/example/solr/collection1/conf/

/opt/solr/bin/solr start -f
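
If you script this step, it can help to fail early when either directory is missing instead of copying into the wrong place. The following is a sketch only: install_solr_config is a hypothetical helper name, and the two paths it builds are the ones used in the command above.

```shell
# install_solr_config is a hypothetical helper, not part of the bundle.
# Usage: install_solr_config <project root> <Solr directory>
# Copies the bundle's Solr configuration into the collection1 core,
# and returns non-zero if either directory does not exist.
install_solr_config() {
    project_root="$1"
    solr_dir="$2"
    src="$project_root/vendor/ezsystems/ezplatform-solr-search-engine/lib/Resources/config/solr"
    dest="$solr_dir/example/solr/collection1/conf"

    [ -d "$src" ] || { echo "Missing bundle config: $src" >&2; return 1; }
    [ -d "$dest" ] || { echo "Missing Solr core conf directory: $dest" >&2; return 1; }

    # "$src"/. copies the directory contents, including hidden files
    cp -R "$src"/. "$dest"/
}
```

Called as `install_solr_config . /opt/solr`, it performs the same copy as the command above.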

 

Thirdly, Solr Bundle does not commit Solr index changes directly on repository updates. It leaves it up to you to tune this in solrconfig.xml, as best practice suggests. Example config:

solrconfig.xml
<autoCommit>
  <!-- autoCommit is here left as-is like it is out of the box in Solr 4.10.4, this controls hard commits for durability/replication -->
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <!-- Soft commits mainly control when changes become visible; by default we change the value from -1 (disabled) to 100ms, to try to strike a balance between Solr performance and staleness of HttpCache generated by Solr queries -->
  <maxTime>${solr.autoSoftCommit.maxTime:100}</maxTime>
</autoSoftCommit>
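
To confirm which soft commit default ended up in the solrconfig.xml you deployed, you can extract it from the file. This is a rough sketch: soft_commit_default is a hypothetical helper name, and it assumes the value is written in the `${solr.autoSoftCommit.maxTime:N}` form shown above.

```shell
# soft_commit_default is a hypothetical helper, not part of Solr or the bundle.
# Prints the default of solr.autoSoftCommit.maxTime from the given
# solrconfig.xml, i.e. the number after the colon in
# ${solr.autoSoftCommit.maxTime:100}. Prints nothing if no such line exists.
soft_commit_default() {
    sed -n 's/.*\${solr\.autoSoftCommit\.maxTime:\(-\{0,1\}[0-9][0-9]*\)}.*/\1/p' "$1"
}
```

A printed value of -1 would mean soft commits are still disabled in that file.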

Step 2: Configuring bundle

The Solr search engine bundle can be configured in many ways. The config further below assumes you have parameters set up for the Solr DSN and the search engine (both are optional), for example:

parameters.yml
    search_engine: solr
    solr_dsn: 'http://localhost:8983/solr'


On to configuring the bundle.

Single Core example (default)

Out of the box in eZ Platform the following is enabled for simple setup:

config.yml
ez_search_engine_solr:
    endpoints:
        endpoint0:
            dsn: %solr_dsn%
            core: collection1
    connections:
        default:
            entry_endpoints:
                - endpoint0
            mapping:
                default: endpoint0

Shared Core example

In the following example we have decided to separate out one language: the installation contains several similar languages, and one very different language that should receive proper language analysis for proper stemming and sorting behavior by Solr:

config.yml
ez_search_engine_solr:
    endpoints:
        endpoint0:
            dsn: %solr_dsn%
            core: core0
        endpoint1:
            dsn: %solr_dsn%
            core: core1
    connections:
        default:
            entry_endpoints:
                - endpoint0
                - endpoint1
            mapping:
                translations:
                    jpn-JP: endpoint1
                # Other languages, for instance eng-US and other western languages are sharing core
                default: endpoint0

Multi Core example

If full language analysis features are preferred, then each language can be configured to use a separate core.

Note: Please make sure to test this setup against the single core setup, as it might perform worse than single core if your project uses a lot of language fallbacks per siteaccess, because queries will then be performed across several cores at once.

config.yml
ez_search_engine_solr:
    endpoints:
        endpoint0:
            dsn: %solr_dsn%
            core: core0
        endpoint1:
            dsn: %solr_dsn%
            core: core1
        endpoint2:
            dsn: %solr_dsn%
            core: core2
        endpoint3:
            dsn: %solr_dsn%
            core: core3
        endpoint4:
            dsn: %solr_dsn%
            core: core4
        endpoint5:
            dsn: %solr_dsn%
            core: core5
        endpoint6:
            dsn: %solr_dsn%
            core: core6
    connections:
        default:
            entry_endpoints:
                - endpoint0
                - endpoint1
                - endpoint2
                - endpoint3
                - endpoint4
                - endpoint5
                - endpoint6
            mapping:
                translations:
                    jpn-JP: endpoint1
                    eng-US: endpoint2
                    fre-FR: endpoint3
                    ger-DE: endpoint4
                    esp-ES: endpoint5
                # Not really used, but specified here for fallback if more languages are suddenly added by content admins
                default: endpoint0
                # Also use separate core for main languages (differs from content object to content object)
                # This is useful to reduce number of cores queried for always available language fallbacks
                main_translations: endpoint6

 

Step 3: Configuring repository with the specific search engine

The following is an example of configuring Solr Search Engine, where the connection name is the same as in the example above, and engine is set to solr:

ezplatform.yml
ezpublish:
    repositories:
        default:
            storage: ~
            search:
                engine: %search_engine%
                connection: default

%search_engine% is a parameter that is configured in app/config/parameters.yml, and should be changed from its default value "legacy" to "solr" to activate Solr as the Search engine.
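
Before re-indexing, a quick check that the parameter was actually switched can save a confusing debugging session. The following is a sketch under the assumption that search_engine appears on its own line in the parameters file; search_engine_is_solr is a hypothetical helper name.

```shell
# search_engine_is_solr is a hypothetical helper, not part of eZ Platform.
# Succeeds when the given parameters file sets search_engine to solr
# (with or without quotes), and fails otherwise.
search_engine_is_solr() {
    grep -Eq "^[[:space:]]*search_engine:[[:space:]]*[\"']?solr[\"']?[[:space:]]*(#.*)?\$" "$1"
}
```

For example, `search_engine_is_solr app/config/parameters.yml || echo "still using legacy"`.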

Step 4: Clear prod cache

While the Symfony dev environment keeps track of changes to yml files, prod does not, so clear the cache to make sure Symfony reads the new config:

php app/console --env=prod cache:clear

Step 5: Run CLI indexing command

Make sure to configure your setup for indexing

Some exceptions might happen on indexing if you have not configured your setup correctly. Here are the most common issues you may encounter:

  • Exception if Binary files in database have an invalid path prefix
    • Make sure the ezplatform.yml configuration var_dir is configured properly.
    • If your database is inconsistent in regards to file paths, try to update entries to be correct (but make sure to make a backup first).
  • Exception on unsupported Field Types
    • Make sure to implement all Field Types in your installation, or to configure missing ones as NullType if implementation is not needed.
  • Content not immediately available
    • Solr Bundle on purpose does not commit changes directly on Repository updates (on indexing), but lets you control this using Solr configuration. Adjust Solr autoSoftCommit (visibility of changes to the search index) and/or autoCommit (hard commit, for durability and replication) to balance performance and load on your Solr instance against your needs for "NRT" (near real-time search).
  • Running out of memory during indexing
    • In general, make sure to run indexing using the prod environment to keep debuggers and loggers from filling up memory.
    • Stash: Disable in_memory cache as recommended on the Persistence cache page for long-running scripts.
    • Flysystem: An open issue exists where you can find further info: https://jira.ez.no/browse/EZP-25325

The last step is to execute the initial indexing of data:

php app/console --env=prod --siteaccess=<name> ezplatform:solr_create_index

V1.7.0

Since v1.7.0 the ezplatform:solr_create_index command is deprecated; use php app/console ezplatform:reindex instead:

php app/console --env=prod --siteaccess=<name> ezplatform:reindex
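
Deployment scripts that have to work across versions can pick the command name from the version. This is a minimal sketch: index_command_for_version is a hypothetical helper name, and it assumes the caller already knows the installed version as a MAJOR.MINOR.PATCH string and that this is the version the v1.7.0 note above refers to.

```shell
# index_command_for_version is a hypothetical helper, not part of the bundle.
# Given a version such as "1.7.0" or "v1.2.3", it prints the indexing
# command name to use: ezplatform:reindex from 1.7 on, otherwise
# ezplatform:solr_create_index.
index_command_for_version() {
    version="${1#v}"          # drop an optional leading "v"
    major="${version%%.*}"    # text before the first dot
    rest="${version#*.}"      # text after the first dot
    minor="${rest%%.*}"       # text before the next dot
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }; then
        echo "ezplatform:reindex"
    else
        echo "ezplatform:solr_create_index"
    fi
}
```

It could then be used as `php app/console --env=prod --siteaccess=<name> "$(index_command_for_version 1.7.0)"`.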

Extending the Solr Search engine Bundle

Document Field Mappers

V1.2+, AVAILABLE IN EZ PLATFORM V1.7+

As a developer you will often need to index some additional data in the search engine. The use cases for this are wide: the data could come from an external source (for example from a recommendation system), or from an internal source.

A common use case for the latter is indexing data through the Location hierarchy, for example from the parent Location to the child Location, or in the opposite direction, indexing child data on the parent Location. The reason might be that you want to find the content with fulltext search, or that you want to simplify search for a complicated data model.

To do this effectively, you first need to understand how the data is indexed with the Solr Search engine. Documents are indexed per translation, as Content blocks. In Solr, a block is a nested document structure. In our case, the parent document represents Content, and Locations are indexed as child documents of the Content. To avoid duplication, full text data is indexed on the Content document only.

Knowing this, you have the option to index additional data on:

  • all block documents (meaning Content and its Locations, all translations)
  • all block documents per translation
  • Content documents
  • Content documents per translation
  • Location documents

Indexing additional data is done by implementing a document field mapper and registering it at one of the five extension points described above. You can create the field mapper class anywhere inside your bundle, as long as the "class" parameter in your services.yml matches the correct path when you register it as a service. There are three different field mappers. Each mapper implements two methods with the same names, but accepting different arguments:

  • ContentFieldMapper
    • ::accept(Content $content)
    • ::mapFields(Content $content)
  • ContentTranslationFieldMapper
    • ::accept(Content $content, $languageCode)
    • ::mapFields(Content $content, $languageCode)
  • LocationFieldMapper
    • ::accept(Location $location)
    • ::mapFields(Location $location)

These can be used on the extension points by registering them with the container using service tags, as follows:

  • all block documents
    • ContentFieldMapper
    • ezpublish.search.solr.field_mapper.block
  • all block documents per translation
    • ContentTranslationFieldMapper
    • ezpublish.search.solr.field_mapper.block_translation
  • Content documents
    • ContentFieldMapper
    • ezpublish.search.solr.field_mapper.content
  • Content documents per translation
    • ContentTranslationFieldMapper
    • ezpublish.search.solr.field_mapper.content_translation
  • Location documents
    • LocationFieldMapper
    • ezpublish.search.solr.field_mapper.location

The following example shows how to index data from the parent Location content, in order to make it available for full text search on the children content. A concrete use case could be indexing webinar data on the webinar events, which are children of the webinar. The field mapper could then look something like this:

<?php

namespace My\WebinarApp;

use EzSystems\EzPlatformSolrSearchEngine\FieldMapper\ContentFieldMapper;
use eZ\Publish\SPI\Persistence\Content\Handler as ContentHandler;
use eZ\Publish\SPI\Persistence\Content\Location\Handler as LocationHandler;
use eZ\Publish\SPI\Persistence\Content;
use eZ\Publish\SPI\Search;

class WebinarEventTitleFulltextFieldMapper extends ContentFieldMapper
{
    /**
     * @var \eZ\Publish\SPI\Persistence\Content\Handler
     */
    protected $contentHandler;

    /**
     * @var \eZ\Publish\SPI\Persistence\Content\Location\Handler
     */
    protected $locationHandler;

    /**
     * @param \eZ\Publish\SPI\Persistence\Content\Handler $contentHandler
     * @param \eZ\Publish\SPI\Persistence\Content\Location\Handler $locationHandler
     */
    public function __construct(
        ContentHandler $contentHandler,
        LocationHandler $locationHandler
    ) {
        $this->contentHandler = $contentHandler;
        $this->locationHandler = $locationHandler;
    }

    public function accept(Content $content)
    {
        // ContentType with ID 42 is webinar event
        return $content->versionInfo->contentInfo->contentTypeId == 42;
    }

    public function mapFields(Content $content)
    {
        // Walk from the event's main Location up to its parent (the webinar)
        $mainLocationId = $content->versionInfo->contentInfo->mainLocationId;
        $location = $this->locationHandler->load($mainLocationId);
        $parentLocation = $this->locationHandler->load($location->parentId);
        $parentContentInfo = $this->contentHandler->loadContentInfo($parentLocation->contentId);

        // Index the parent's name as additional full text data on the event
        return [
            new Search\Field(
                'fulltext',
                $parentContentInfo->name,
                new Search\FieldType\FullTextField()
            ),
        ];
    }
}

Since we index full text data only on the Content document, you would register the service like this:

my_webinar_app.webinar_event_title_fulltext_field_mapper:
    class: My\WebinarApp\WebinarEventTitleFulltextFieldMapper
    arguments:
        - '@ezpublish.spi.persistence.content_handler'
        - '@ezpublish.spi.persistence.location_handler'
    tags:
        - {name: ezpublish.search.solr.field_mapper.content}

Providing feedback

After completing the installation you are free to use your site as usual. If you get any exceptions for missing features, have feedback on performance, or want to discuss, join our community Slack channel at https://ezcommunity.slack.com/messages/ezplatform-use/
