App Config

The Searching and Indexing web applications can be configured through a wide range of properties. These can be set either as system properties:

-Drelated-item.max.number.related.item.properties=10

Or by using a yaml configuration file:

related-item:  
  max.related.item.post.data.size.in.bytes: 65536
  additional.prop.key.length: 20
  additional.prop.value.length: 20
  max.number.related.item.properties: 10
  indexing:
    size.of.incoming.request.queue:  32
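
Comparing the two forms above, the system property name appears to be the YAML keys joined with dots, from the top-level `related-item` key down to the leaf. The following Python sketch illustrates that mapping only; it is not the applications' own loading code, the `flatten` helper is hypothetical, and the nested dict stands in for what a YAML parser would produce from the example above.

```python
def flatten(node, prefix=""):
    """Flatten a nested mapping into dotted property names."""
    flat = {}
    for key, value in node.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Nested section (e.g. "indexing"): recurse with the longer prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# The YAML example above, written as the nested mapping a parser would return.
config = {
    "related-item": {
        "max.related.item.post.data.size.in.bytes": 65536,
        "additional.prop.key.length": 20,
        "additional.prop.value.length": 20,
        "max.number.related.item.properties": 10,
        "indexing": {"size.of.incoming.request.queue": 32},
    }
}

properties = flatten(config)
for name, value in sorted(properties.items()):
    print(f"-D{name}={value}")
```

Under that assumption, each printed line is the system property equivalent of one YAML entry.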

The sections below list every property that is available for configuration, along with its default value and its use. Some properties apply only to searching, some only to indexing, and some to both; the properties are grouped accordingly.

Indexing

| Property | Default | Use |
| --- | --- | --- |
| related-item.safe.to.output.index.request.data | false | Writes the index request data to the logs (when DEBUG is enabled) |
| related-item.max.number.related.item.properties | 10 | The max number of properties a related item can have. Any more than this are silently discarded; there is no guarantee of ordering |
| related-item.max.number.related.items.per.index.request | 10 | The max number of related items in a single index POST request |
| related-item.max.related.item.post.data.size.in.bytes | 10240 | The max size, in bytes, of the POST data for an index request |
| related-item.min.related.item.post.data.size.in.bytes | 4096 | The minimum size, in bytes, of the POSTed JSON data for an index request |
| related-item.indexing.size.of.incoming.request.queue | 16384 | Size of the ring buffer that accepts incoming indexing POST requests |
| related-item.indexing.size.of.batch.indexing.request.queue | -1 | Size of the ring buffer for each indexing processor that batch posts indexing requests to elasticsearch |
| related-item.indexing.batch.size | 625 | The max number of related item objects (a single index request contains many related item objects) that can be sent to elasticsearch in one indexing batch |
| related-item.indexing.number.of.indexing.request.processors | 2 | Number of processors used to perform indexing (sending batch indexing requests) to elasticsearch |
| related-item.indexing.replace.old.indexed.content | false | Whether to replace existing indexed content |
| related-item.use.separate.repository.storage.thread | false | Use a separate thread for performing indexing |
| related-item.indexing.discard.storage.requests.with.too.many.relations | false | Silently discard related items in the indexing request if there are too many: indexes up to the max and discards the rest |

Searching

| Property | Default | Use |
| --- | --- | --- |
| related-item.searching.size.of.related.content.search.request.queue | 16384 | Size of the ring buffer that accepts incoming search requests |
| related-item.searching.size.of.related.content.search.request.handler.queue | -1 | Size of the ring buffer for each search processor that submits search requests to elasticsearch |
| related-item.searching.size.of.related.content.search.request.and.response.queue | -1 | Size of the ring buffer used to store incoming request AsyncContext objects for later retrieval |
| related-item.searching.max.number.of.search.criteria.for.related.content | 10 | The number of additional properties that will be searched on |
| related-item.searching.number.of.expected.like.for.like.requests | 10 | The number of search requests that we expect to be similar |
| related-item.searching.key.for.frequency.result.id | id | The key used for the id field in the search result JSON |
| related-item.searching.key.for.frequency.result.occurrence | frequency | The key used for the frequency in the search result JSON |
| related-item.searching.key.for.storage.response.time | storage_response_time | The key in the response JSON that reports how long the elasticsearch request took |
| related-item.searching.key.for.search.processing.time | response_time | The key in the response JSON that reports how long the complete search request took |
| related-item.searching.key.for.frequency.result.overall.no.of.related.items | size | The key in the search response that reports the number of frequencies returned |
| related-item.searching.key.for.frequency.results | results | The key in the search response JSON under which the frequencies are found |
| related-item.searching.request.parameter.for.size | maxresults | Request parameter used to specify the max number of frequencies to return |
| related-item.searching.request.parameter.for.id | id | Parameter used to associate the id in a map of request parameters |
| related-item.searching.default.number.of.results | 4 | Default number of search results (frequencies) to return |
| related-item.searching.size.of.response.processing.queue | -1 | Size of the ring buffer for processing search results and sending the JSON response to the awaiting AsyncContext |
| related-item.searching.number.of.searching.request.processors | 2 | The number of ring buffers (processors) that send search requests to elasticsearch |
| related-item.storage.frequently.related.items.facet.results.facet.name | frequently-related-with | The name given to the facet in the search request to elasticsearch |
| related-item.storage.searching.facet.search.execution.hint | map | Used in the search request to elasticsearch. The setting 'map' is the default and makes requests much faster |
| related-item.searching.frequently.related.search.timeout.in.millis | 5000 | Timeout, in milliseconds, for elasticsearch requests |
| related-item.searching.timed.out.search.request.status.code | 504 | The HTTP status code returned when a timeout occurs |
| related-item.searching.failed.search.request.status.code | 502 | The HTTP status code returned when a search request fails to talk to elasticsearch |
| related-item.searching.not.found.search.request.status.code | 404 | The HTTP status code returned when no search result is found |
| related-item.searching.found.search.results.handler.status.code | 200 | The HTTP status code returned when a match is found |
| related-item.searching.missing.search.results.handler.status.code | 500 | The HTTP status code returned when the JSON search response cannot be handled |
| related-item.searching.use.shared.search.repository | false | Whether the search processors use a shared connection to elasticsearch |
| related-item.searching.response.debug.output.enabled | false | Also write the response JSON sent to the client to a log file |
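
The `related-item.searching.key.for.*` properties only rename fields in the search response. The Python sketch below shows how the default key names might fit together. The response layout (timings and size at the top level, id and frequency inside each entry of the results list) is an assumption inferred from the descriptions above, and the `build_response` helper, item ids and timings are hypothetical.

```python
import json

PREFIX = "related-item.searching.key.for."

# Default key names, copied from the searching properties above.
DEFAULT_KEYS = {
    PREFIX + "frequency.results": "results",
    PREFIX + "frequency.result.id": "id",
    PREFIX + "frequency.result.occurrence": "frequency",
    PREFIX + "frequency.result.overall.no.of.related.items": "size",
    PREFIX + "storage.response.time": "storage_response_time",
    PREFIX + "search.processing.time": "response_time",
}


def build_response(frequencies, storage_ms, total_ms, keys=DEFAULT_KEYS):
    """Assemble a response dict from (id, frequency) pairs using the configured key names."""

    def key(suffix):
        return keys[PREFIX + suffix]

    return {
        key("search.processing.time"): total_ms,
        key("storage.response.time"): storage_ms,
        key("frequency.result.overall.no.of.related.items"): len(frequencies),
        key("frequency.results"): [
            {key("frequency.result.id"): item_id, key("frequency.result.occurrence"): count}
            for item_id, count in frequencies
        ],
    }


# Hypothetical ids, counts and timings, for illustration only.
response = build_response([("item-a", 5), ("item-b", 3)], storage_ms=7, total_ms=12)
print(json.dumps(response, indent=2))
```

Overriding one of these properties would change only the corresponding field name in the output, not the values.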

Both

| Property | Default | Use |
| --- | --- | --- |
| related-item.related.item.id.length | 36 | The max number of characters that the "id" of a related item can have |
| related-item.additional.prop.key.length | 30 | The max number of characters a property name can have |
| related-item.additional.prop.value.length | 30 | The max number of characters a property value can have |
| related-item.storage.index.name.prefix | relateditems | The name prefix of the elasticsearch index used for storing related item documents (i.e. relateditems-YYYY-MM-DD) |
| related-item.storage.index.name.alias | (no alias) | The name of the index alias against which to search (http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/) |
| related-item.storage.content.type.name | related | The index type |
| related-item.storage.cluster.name | relateditems | The name of the elasticsearch cluster |
| related-item.indexing.key.for.index.request.related.with.attr | items | The key used in the indexed document for storing the related ids |
| related-item.indexing.key.for.index.request.date.attr | date | The key used in the indexed document for the date attribute |
| related-item.indexing.key.for.index.request.id.attr | id | The key against which the id is stored in the indexed document |
| related-item.indexing.key.for.index.request.item.array.attr | items | The key in the incoming user JSON indexing request that contains the list of items |
| related-item.elastic.search.client.default.transport.settings.file.name | default-transport-elasticsearch.yml | Name of the elasticsearch file containing the default transport client settings |
| related-item.elastic.search.client.default.node.settings.file.name | default-node-elasticsearch.yml | Name of the elasticsearch file containing the default node client settings |
| related-item.elastic.search.client.override.settings.file.name | elasticsearch.yml | Name of the elasticsearch file that can be distributed to override the default node/transport settings |
| related-item.storage.location.mapper | day | One of day, hour or min. Used to convert the date to the string used in the name of the index in which documents are stored |
| related-item.wait.strategy | yield | The type of ring buffer wait strategy: yield, busy, sleep or block |
| related-item.es.client.type | transport | The type of elasticsearch client to use |
| related-item.indexing.indexname.date.caching.enabled | true | Enables caching of the index date |
| related-item.indexing.number.of.indexname.to.cache | 365 | Number of index names to cache |
| related-item.elastic.search.transport.hosts | 127.0.0.1:9300 | Comma-separated list (host:port,host:port) of the unicast addresses of the elasticsearch nodes to talk to |
| related-item.elastic.search.default.port | 9300 | The default elasticsearch port, used when no port is specified |
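
The `related-item.elastic.search.transport.hosts` and `related-item.elastic.search.default.port` properties work together: the first is a comma-separated `host:port` list, and the second supplies the port when an entry omits one. The Python sketch below illustrates that reading. It is not the applications' own parsing code, the `parse_transport_hosts` helper is hypothetical, and it assumes plain hostnames or IPv4 addresses (no IPv6 literals).

```python
DEFAULT_PORT = 9300  # related-item.elastic.search.default.port


def parse_transport_hosts(hosts, default_port=DEFAULT_PORT):
    """Split 'host:port,host:port' into (host, port) tuples, filling in missing ports."""
    addresses = []
    for entry in hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        host, separator, port = entry.partition(":")
        addresses.append((host, int(port) if separator and port else default_port))
    return addresses


# Hypothetical host list: the first entry relies on the default port.
print(parse_transport_hosts("10.0.0.1,10.0.0.2:9301"))
```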

By default the Searching and Indexing web applications look for a YAML configuration file from which to load their configuration. Any settings in the configuration file override the defaults. Any system properties that are set override the settings in the YAML configuration.

By default the applications look for the YAML file related-items.yaml on the class path. A different location can be specified with the property related-items.settings.file, for example:

  • -Drelated-items.settings.file=/etc/relateditems.yml

The YAML file may look like the following:

related-item:
  searching:
    number.of.searching.request.processors: 16
    size.of.related.content.search.request.handler.queue: 1024
  indexing:
    size.of.batch.indexing.request.queue: 4096

With the above in place the following properties are overridden:

related-item.searching.number.of.searching.request.processors
related-item.searching.size.of.related.content.search.request.handler.queue
related-item.indexing.size.of.batch.indexing.request.queue

If the system property -Drelated-item.searching.number.of.searching.request.processors=2 were also set, it would override the setting in the YAML file.
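
The precedence described above (system properties over the YAML file, the YAML file over the defaults) can be pictured as a layered lookup. The Python sketch below models it with `collections.ChainMap`, which searches its maps in order and returns the first match. It illustrates the ordering only and is not the applications' own resolution code; the three dictionaries stand in for the built-in defaults, the YAML example above and the JVM system properties.

```python
from collections import ChainMap

P = "related-item."

# Built-in defaults, taken from the property listings above.
defaults = {
    P + "searching.number.of.searching.request.processors": 2,
    P + "searching.size.of.related.content.search.request.handler.queue": -1,
    P + "indexing.size.of.batch.indexing.request.queue": -1,
    P + "searching.default.number.of.results": 4,
}

# Values from the example YAML file.
yaml_settings = {
    P + "searching.number.of.searching.request.processors": 16,
    P + "searching.size.of.related.content.search.request.handler.queue": 1024,
    P + "indexing.size.of.batch.indexing.request.queue": 4096,
}

# -Drelated-item.searching.number.of.searching.request.processors=2
# (kept as a string here, since system properties are passed as text).
system_properties = {
    P + "searching.number.of.searching.request.processors": "2",
}

# Earlier maps win: system properties, then YAML, then defaults.
effective = ChainMap(system_properties, yaml_settings, defaults)

for name in sorted(defaults):
    print(f"{name} = {effective[name]}")
```

Here the processor count comes from the system property, the two queue sizes come from the YAML file, and the default number of results falls through to its default.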