Field Collapsing
Field collapsing allows something akin to a “group by” in SOLR, so that the number of results returned reflects a logical grouping rather than the total number of matching documents.
Faceting can be used in conjunction with it. Facet counts reflect subsets within the results, whereas collapse counts are “group by” counts.
This means that Field Collapsing could be used for certain analytics, as well as the common use-case of nesting and grouping results. To use it effectively, I found it helpful to “pre-collapse” certain fields, so that a new, unique string was created that could be used to group easily, since I believe you can only field collapse on a single field. (If I’m wrong, please let me know!)
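To make the “pre-collapse” idea concrete, here is a minimal Python sketch. It does not call SOLR at all: it only mimics, in memory, what collapsing on a combined key would do, and contrasts the resulting group counts with facet counts. The documents, the field names (author, subject, title) and the helper names collapse_key, collapse and facet_counts are all hypothetical, chosen for illustration. In practice the combined string would presumably be computed at index time and stored in its own field.

```python
from collections import Counter, OrderedDict

# Hypothetical documents; field names are illustrative assumptions.
DOCS = [
    {"id": 1, "author": "smith", "subject": "math", "title": "Fractions"},
    {"id": 2, "author": "smith", "subject": "math", "title": "Decimals"},
    {"id": 3, "author": "smith", "subject": "science", "title": "Cells"},
    {"id": 4, "author": "jones", "subject": "math", "title": "Algebra"},
]


def collapse_key(doc, fields, sep="|"):
    """Join several field values into a single string to group on."""
    return sep.join(str(doc[f]) for f in fields)


def collapse(docs, fields):
    """Keep the first document seen per key and count each group's size."""
    groups = OrderedDict()
    for doc in docs:
        key = collapse_key(doc, fields)
        if key not in groups:
            groups[key] = {"doc": doc, "count": 0}
        groups[key]["count"] += 1
    return groups


def facet_counts(docs, field):
    """Count documents per value of one field, as a facet would."""
    return Counter(doc[field] for doc in docs)


groups = collapse(DOCS, ["author", "subject"])
for key, group in groups.items():
    # One representative row per group, plus the "group by" count.
    print(key, group["doc"]["title"], group["count"])

# Facet counts still describe subsets of all four documents.
print(dict(facet_counts(DOCS, "subject")))
```

Four documents collapse to three groups here, while the subject facet still counts every document. One caveat with this sketch: the separator must not occur inside the field values, otherwise two different combinations could produce the same key.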
Special Setup
This assumes you are or will be running a development version of SOLR (trunk via SVN).
Field collapsing is not yet available in SOLR trunk, so you must apply a patch file to the SOLR source and build again.
If pulling in from trunk, download the patch found at https://issues.apache.org/jira/browse/SOLR-236 into your solr source code directory.
wget https://issues.apache.org/jira/secure/attachment/12440108/SOLR-236-trunk.patch
patch -p 1 -i SOLR-236-trunk.patch
Then rebuild using the supplied Apache Ant scripts.
Read more on these sites or in the comprehensive “Solr 1.4, Enterprise Search Server” page 191.
Why SOLR
SOLR has been a great tool for BetterLesson.org. Because our primary database is MySQL, we looked at around 8 full-text indexers – but the two finalists were Sphinx [1] and SOLR. Sphinx had very tight integration with MySQL, so the learning curve seemed gentler. SOLR required a JVM, an app server, and quite a lot of configuration.
An excellent SOLR book came out just as we were choosing. Further, the SOLR IRC channel and mailing list are friendly and quite active. We even had the option of commercial support through Massachusetts’ own Lucid Imagination. So I dove in. While the configuration is non-trivial, the configuration parameters have proven very powerful.
More background:
I had written a half-dozen or so custom faceted search interfaces – almost entirely using MySQL, and one even used Sesame (an RDF store – and it eventually worked pretty well). Skipping the stories of pain, confusion and suffering on the road to enlightenment – SOLR has been great. Used extensively at Netflix.com, Zappos.com, CitySearch.com, Reddit.com, Wego.com, Whitehouse.gov, Drupalgardens.com and others [2], supported by Apache, and based on Lucene, SOLR provides scalable, distributed search and has good data import from MySQL, including delta queries.
[1] – Sphinx is used by Craigslist and others; see http://www.sphinxsearch.com/powered.html
If you’ve downloaded the patch into the solr directory, I believe the patch command that people want is: "patch -p 1 -i SOLR-236-trunk.patch"
Thanks Eric. I’ve updated above. Any idea when the patch will be included in SOLR?
I feel like it will be the first Thursday after never. I still can’t find anyone to answer my newest question on https://issues.apache.org/jira/browse/SOLR-236. They’re probably all so busy with the Lucene merge that support is not on the radar.
Slated for release in 1.5 [1]. Busy indeed, but I think they’ll do it. :)
[1] – https://issues.apache.org/jira/browse/SOLR/fixforversion/12313566
Hi, I followed the instructions posted here and when I apply the patch, I get the following error. The source code I applied it on is the latest source code from the trunk.
# patch -p 1 -i SOLR-236-trunk.patch
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java
patching file src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file src/java/org/apache/solr/search/fieldcollapse/collector/DocumentGroupCountCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/FieldValueCountCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/DocSetHitCollector.java
Hunk #1 FAILED at 17.
Hunk #2 succeeded at 29 (offset 1 line).
1 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSetHitCollector.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AggregateFunction.java
patching file src/test/test-files/solr/conf/solrconfig.xml
Hunk #1 FAILED at 406.
1 out of 1 hunk FAILED -- saving rejects to file src/test/test-files/solr/conf/solrconfig.xml.rej
patching file src/java/org/apache/solr/handler/component/QueryComponent.java
Hunk #1 FAILED at 523.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/QueryComponent.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AverageFunction.java
patching file src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java
patching file src/test/org/apache/solr/search/fieldcollapse/AdjacentCollapserTest.java
patching file src/solrj/org/apache/solr/client/solrj/response/FieldCollapseResponse.java
patching file src/test/test-files/fieldcollapse/testResponse.xml
patching file src/test/org/apache/solr/search/fieldcollapse/DistributedFieldCollapsingIntegrationTest.java
patching file src/java/org/apache/solr/search/DocSetAwareCollector.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollector.java
patching file src/test/org/apache/solr/client/solrj/response/FieldCollapseResponseTest.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/AggregateCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/SumFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/CollapseGroup.java
patching file src/test/org/apache/solr/handler/component/CollapseComponentTest.java
patching file src/java/org/apache/solr/search/fieldcollapse/AbstractDocumentCollapser.java
patching file src/test/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapserTest.java
patching file src/common/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/search/fieldcollapse/AdjacentDocumentCollapser.java
patching file src/java/org/apache/solr/search/fieldcollapse/util/Counter.java
patching file src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java
Hunk #1 FAILED at 47.
Hunk #2 FAILED at 63.
Hunk #3 succeeded at 122 with fuzz 2 (offset -8 lines).
Hunk #4 succeeded at 328 with fuzz 2 (offset 25 lines).
2 out of 4 hunks FAILED -- saving rejects to file src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MinFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/DocumentFieldsCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/AbstractCollapseCollector.java
patching file src/java/org/apache/solr/util/DocSetScoreCollector.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #8 succeeded at 873 (offset -2 lines).
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MaxFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseContext.java
patching file src/java/org/apache/solr/search/fieldcollapse/DocumentCollapser.java
patching file src/solrj/org/apache/solr/client/solrj/SolrQuery.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 50.
Hunk #3 FAILED at 76.
Hunk #4 FAILED at 148.
Hunk #5 FAILED at 197.
Hunk #6 succeeded at 510 (offset -155 lines).
5 out of 7 hunks FAILED -- saving rejects to file src/solrj/org/apache/solr/client/solrj/SolrQuery.java.rej
patching file src/test/test-files/solr/conf/schema-fieldcollapse.xml
patching file src/java/org/apache/solr/search/fieldcollapse/DocumentCollapseResult.java
patching file src/java/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapser.java
[root@germinait06 Solr1.4]# patch -p 1 -i SOLR-236-trunk.patch
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java
patching file src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file src/java/org/apache/solr/search/fieldcollapse/collector/DocumentGroupCountCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/FieldValueCountCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/DocSetHitCollector.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 28.
2 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSetHitCollector.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AggregateFunction.java
patching file src/test/test-files/solr/conf/solrconfig.xml
Hunk #1 FAILED at 406.
1 out of 1 hunk FAILED -- saving rejects to file src/test/test-files/solr/conf/solrconfig.xml.rej
patching file src/java/org/apache/solr/handler/component/QueryComponent.java
Hunk #1 FAILED at 523.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/QueryComponent.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/AverageFunction.java
patching file src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java
patching file src/test/org/apache/solr/search/fieldcollapse/AdjacentCollapserTest.java
patching file src/solrj/org/apache/solr/client/solrj/response/FieldCollapseResponse.java
patching file src/test/test-files/fieldcollapse/testResponse.xml
patching file src/test/org/apache/solr/search/fieldcollapse/DistributedFieldCollapsingIntegrationTest.java
patching file src/java/org/apache/solr/search/DocSetAwareCollector.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseCollector.java
patching file src/test/org/apache/solr/client/solrj/response/FieldCollapseResponseTest.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/AggregateCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/SumFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/CollapseGroup.java
patching file src/test/org/apache/solr/handler/component/CollapseComponentTest.java
patching file src/java/org/apache/solr/search/fieldcollapse/AbstractDocumentCollapser.java
patching file src/test/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapserTest.java
patching file src/common/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/search/fieldcollapse/AdjacentDocumentCollapser.java
patching file src/java/org/apache/solr/search/fieldcollapse/util/Counter.java
patching file src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java
Hunk #1 FAILED at 47.
Hunk #2 FAILED at 63.
Hunk #3 FAILED at 130.
Hunk #4 succeeded at 84 with fuzz 2 (offset -219 lines).
3 out of 4 hunks FAILED -- saving rejects to file src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MinFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/DocumentFieldsCollapseCollectorFactory.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/AbstractCollapseCollector.java
patching file src/java/org/apache/solr/util/DocSetScoreCollector.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Reversed (or previously applied) patch detected! Assume -R? [n] n
Apply anyway? [n] y
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 530.
Hunk #3 FAILED at 586.
Hunk #4 FAILED at 610.
Hunk #5 FAILED at 663.
Hunk #6 succeeded at 798 with fuzz 2 (offset 93 lines).
Hunk #7 FAILED at 809.
Hunk #8 FAILED at 968.
Hunk #9 succeeded at 1260 (offset 2 lines).
7 out of 9 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/search/fieldcollapse/collector/aggregate/MaxFunction.java
patching file src/java/org/apache/solr/search/fieldcollapse/collector/CollapseContext.java
patching file src/java/org/apache/solr/search/fieldcollapse/DocumentCollapser.java
patching file src/solrj/org/apache/solr/client/solrj/SolrQuery.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 50.
Hunk #3 FAILED at 76.
Hunk #4 FAILED at 148.
Hunk #5 FAILED at 197.
Hunk #6 FAILED at 665.
Hunk #7 succeeded at 620 with fuzz 1 (offset -101 lines).
6 out of 7 hunks FAILED -- saving rejects to file src/solrj/org/apache/solr/client/solrj/SolrQuery.java.rej
patching file src/test/test-files/solr/conf/schema-fieldcollapse.xml
patching file src/java/org/apache/solr/search/fieldcollapse/DocumentCollapseResult.java
patching file src/java/org/apache/solr/search/fieldcollapse/NonAdjacentDocumentCollapser.java
I need to integrate faceted search with my Java+Sesame app. Can you give me some advice to start?