Discussion:
How to Alphabetical index of pages with template and level limitations
PIRONET Benoît
2008-10-30 09:31:19 UTC
Hello,

I need to develop an alphabetical index of pages, limited to pages of a certain template type (theme) that are only at level 4.

Which method is the best?

- Rewrite EnginesRegistry.java to add my engine and develop custom code to create a new engine
- Create a new SearchViewHandler and use a Lucene query to retrieve the results with an abcindex.jsp view
- Create a normal template and use custom tags
- Create a normal template and use Container Query Tags? If this is the solution, do you have an example with level and template filtering?

Please consider the cache issues, since our Jahia implementation will need to support heavy loads.

Thank you in advance,

Benoit Pironet
Benjamin Papez
2008-11-07 12:01:23 UTC
Hi Benoit,

You did a nice investigation and ask good questions, but I'd say
none of the listed options is the best.
- Rewrite EnginesRegistry.java to add my engine and develop custom
code to create a new engine

This sounds too complicated to me.
- Create a new SearchViewHandler and use Lucene Query to retrieve the
results with a abcindex.jsp view

A new SearchViewHandler is not necessary. Since the search query is
hardcoded and does not need features such as saved search, search mode
selection, or updating search options, it can be done without a view
handler, and you could use the JahiaSearchService directly.

Are you showing the index of all themes from A to Z on one page, or are
you splitting it over several pages? In other words, can all the pages
be retrieved with one query, or do you want one query per character
(range)?
If you display the whole index at once, you can do the sorting yourself.
Otherwise the index also needs to store the whole title untokenized, so
that sorting and queries treat the entire title as one term. I believe
additional untokenized indexing of the title is a good thing to add to
the product, and I have already done it for the tests of your use case.
Alternatively, you can add fields to the container structure or metadata
fields to pages yourself, and configure in jahiaresource.cpm.xml that
such a field must be indexed un_tokenized with the keyword analyzer.
- Create a normal template and use custom tags
- Create a normal template and use Container Query Tags ? If this
solution, do you have an example with level and template filtering ?

This may be the way to go in the mid term. In the short term, however,
it will not work, as the query tags currently support only container
search, not page search. We are working this month and next on a
specification and implementation to improve the tags and provide a
unified way to search containers, pages and files. For now I suggest
using the JahiaSearchService directly. Its API is being refactored,
though, so you may have to change your implementation in future
versions, hopefully towards a simpler tag-based version.

Level filtering could be done by counting the jcpid occurrences in the
page path. If this does not perform well, you could add a metadata field
for the level.

The problem with LIKE and sorting at the moment is that the title is
tokenized: if the second term of a title begins with "a", you will also
get that page when searching for titles starting with A. And as the
query tags do not work for pages yet, you cannot search on the title
with them. We sometimes combine Lucene and DB queries ourselves in some
query tags, but for performance it is better to use just one (and I
believe Lucene is the best choice here).
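
To illustrate the tokenization problem described above, here is a minimal
plain-Java sketch. It does not use Lucene or Jahia; the class and method
names are hypothetical, and lowercasing both sides is a simplification of
what an analyzer would do. It only shows why a prefix search on a tokenized
title also returns titles whose second word starts with the letter.

```java
import java.util.Locale;

// Hypothetical illustration only, not Jahia or Lucene code.
final class TitlePrefixDemo {

    // Tokenized field: every word is its own term, so a prefix query
    // matches if ANY word of the title starts with the prefix.
    static boolean matchesTokenized(String title, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        for (String token : title.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    // Untokenized field: the whole title is one term, so a prefix query
    // matches only if the title itself starts with the prefix.
    static boolean matchesUntokenized(String title, String prefix) {
        return title.toLowerCase(Locale.ROOT)
                .startsWith(prefix.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String[] titles = {"Agriculture", "Clean Air", "Budget"};
        for (String title : titles) {
            System.out.println(title
                    + ": tokenized=" + matchesTokenized(title, "a")
                    + ", untokenized=" + matchesUntokenized(title, "a"));
        }
    }
}
```

With the tokenized check, "Clean Air" is reported as a match for "a" because
of its second word; with the untokenized check it is not.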
Please consider the cache issues since our jahia implementation will
need to support heavy loads.

I think the best approach for this use case is to use cached Lucene
filters. This way the query performs very fast; afterwards Jahia checks
the access rights of the current user. The filters are automatically
re-run in the background after each index update and before the new
IndexSearcher is exposed to the system (autowarming). A Lucene filter is
used if you set the USE_BACKEND_CACHE property in the query tags, or if
you use the JahiaSearchService API, where the query parameter is a
String array: the first item is the query, which is not filtered/cached,
but all following items in the array are. You can also pass an empty
String as the first query in the array.
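
The autowarming idea can be sketched in plain Java as follows. This is not
the Lucene or Jahia implementation: the class WarmedFilterCache and its
methods are hypothetical, and the "index" is represented only by a function
that evaluates a filter string to a set of document ids. The point is that
cached filter results are recomputed against the new index before the new
results are exposed to readers.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of an autowarmed filter cache.
final class WarmedFilterCache {

    // Filter string -> matching document ids for the currently exposed index.
    private volatile Map<String, Set<Integer>> live = new ConcurrentHashMap<>();

    // Returns the cached result, evaluating the filter only on first use.
    // The evaluator must not return null.
    Set<Integer> lookup(String filter, Function<String, Set<Integer>> currentIndex) {
        return live.computeIfAbsent(filter, currentIndex);
    }

    // Called after an index update: re-runs every known filter against the
    // new index, then swaps the warmed results in with a single assignment.
    void onIndexUpdate(Function<String, Set<Integer>> newIndex) {
        Map<String, Set<Integer>> warmed = new ConcurrentHashMap<>();
        for (String filter : live.keySet()) {
            warmed.put(filter, newIndex.apply(filter));
        }
        live = warmed; // readers see the new results only after warming
    }
}
```

After onIndexUpdate returns, the first lookup of an already known filter is
served from the warmed map and does not evaluate the filter again.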

I have tested your use case with the following code:

// Item 0 is the uncached query (empty here); item 1 is run as a cached filter.
// The page path must contain jcpid four times but not five times (level 4).
String[] queries = {"",
    "jahia.content_type:jahia.content_type.page AND "
    + "jahia.definition_name:Theme AND "
    + "(jahia.metadata_pagepath:jcpid*jcpid*jcpid*jcpid* NOT "
    + "jahia.metadata_pagepath:jcpid*jcpid*jcpid*jcpid*jcpid*)"};
String[] searchHandlers = {jParams.getSiteKey()};
JahiaSearchService searchService =
    ServicesRegistry.getInstance().getJahiaSearchService();
// Sort the results on the untokenized title field.
SearchResult searchResult = searchService.search(queries, searchHandlers, jParams,
    new PageSearchResultBuilderImpl(),
    new JahiaLuceneSort(new SortField("jahia.title_untok", jParams.getLocale())));
for (ParsedObject parsedObject : SearchTools.getParsedObjects(searchResult)) {
    String objectID = parsedObject.getValue(JahiaSearchConstant.ID);
    %><%=ContentPage.getPage(Integer.parseInt(objectID)).getTitle(jParams.getEntryLoadRequest()) + " // "%><%
}

This is already based on a local patch, which stores an untokenized
title in the jahia.title_untok field. With such a patch you could also add

AND jahia.title_untok:a*

in the query to get just pages beginning with A.
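
If you need the same filter for other levels, templates or letters, the
query string can be assembled by a small helper. The following is only a
sketch: the class ThemeIndexQuery and its methods are hypothetical, and the
field names are simply the ones used in the query above (jahia.title_untok
exists only with the patch or an equivalent field of your own).

```java
// Hypothetical helper that only builds query strings; it does not call Jahia.
final class ThemeIndexQuery {

    // Matches page paths that contain "jcpid" exactly `level` times:
    // at least `level` occurrences, but NOT `level + 1`.
    static String levelClause(int level) {
        String field = "jahia.metadata_pagepath:";
        return "(" + field + "jcpid*".repeat(level)
                + " NOT " + field + "jcpid*".repeat(level + 1) + ")";
    }

    // Builds the filter for pages of one template at one level.
    // Pass null as letter to get all pages, or a letter to add the
    // title prefix clause (lowercased, as in the "a*" example above).
    static String filter(String template, int level, Character letter) {
        StringBuilder query = new StringBuilder("jahia.content_type:jahia.content_type.page")
                .append(" AND jahia.definition_name:").append(template)
                .append(" AND ").append(levelClause(level));
        if (letter != null) {
            query.append(" AND jahia.title_untok:")
                 .append(Character.toLowerCase(letter))
                 .append('*');
        }
        return query.toString();
    }

    // Array layout expected by the search call above: empty uncached
    // query first, then the query that is run as a cached filter.
    static String[] queries(String template, int level, Character letter) {
        return new String[] {"", filter(template, level, letter)};
    }
}
```

ThemeIndexQuery.queries("Theme", 4, null) yields the same two strings as the
hardcoded array in the tested code.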

Without such a patch you could still use that code (passing null
instead of the JahiaLuceneSort), but you will only be able to retrieve
all pages on the 4th level with template Theme, and you have to do the
sorting on your own. Alternatively, you could add a title_untok metadata
field for pages yourself and configure index="un_tokenized" and
analyzer="keyword" for that field.
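
Doing the sorting on your own could look like the following sketch. It is
plain Java and assumes you have already collected the page titles from the
search results into a list; the class AlphabeticalIndex and its method are
hypothetical. It sorts with a locale-aware Collator and groups the titles by
their first letter.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: sorts titles and groups them by first letter.
final class AlphabeticalIndex {

    static Map<String, List<String>> group(List<String> titles, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY); // ignore case and accents

        // Trim the titles and drop empty ones before sorting.
        List<String> sorted = new ArrayList<>();
        for (String title : titles) {
            String trimmed = title.trim();
            if (!trimmed.isEmpty()) {
                sorted.add(trimmed);
            }
        }
        sorted.sort(collator);

        // The keys use the same collator, so "a" and "A" share one group.
        Map<String, List<String>> index = new TreeMap<>(collator);
        for (String title : sorted) {
            String key = title.substring(0, 1).toUpperCase(locale);
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(title);
        }
        return index;
    }
}
```

Each key of the returned map is one letter of the index, and each value is
the sorted list of titles under that letter.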

Alternatively, if all your Theme pages were created in a
themePageContainerList instead of the general navigationContainerList,
you could offer just the Theme template there for the creation of new
pages, and you could also add a nonTokenizedPageTitleField to that
container (hidden from the edit user and filled via an event listener).
Then you could already use the query tags for that.

Regards,
Benjamin

PS: We will also provide better documentation soon, when we finish the
refactoring of the search module. Sp4 was already a first refactoring
step, with the query tags for containers, some Lucene performance
improvements through autowarmed cached Lucene filters, and better ways
to customize per-field indexation behaviour via Compass. As you can see
from this mail, there are quite a few undocumented possibilities you
could already use with Sp4.
_______________________________________________
template_list mailing list
http://lists.jahia.org/cgi-bin/mailman/listinfo/template_list