Google Search Appliance Connector Manager
Release Notes
This document contains the release notes for Google Search Appliance
Connector Manager. The following sections describe the release in
detail and provide information that supplements the main documentation.
See the Issues Tab on the Code Site for the current list of known
issues and work-arounds.
Web Site: http://code.google.com/p/google-enterprise-connector-manager/
Release 2.8.6, 5 May 2012
=========================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of previous releases are encouraged to review
the changes below to determine whether to upgrade.
Summary of Changes
------------------
* Fix Issue 6305209 - Text conversion fails on PDF files when skipping
the content. Handle PDF documents that are zero-length or too long more
gracefully. Rather than skip the document entirely, feed a stub
document with just the document's title, if available.
* Fix Issue 5599305 - Retry Connector startup if instantiation fails.
The FileSystem Connector fails Connector bean instantiation if the file
share is off-line. The other connectors that use the Diffing package
(LDAP and Database) can suffer similar failures. This fix allows a
failed Connector instantiation to be retried after a period, in hopes
that any transient errors may have been corrected.
* Fix file system connector code site issue 32 - Initial snapshot fails
in Java 7 with error "two snapshots with the same number". Note that
Java 7 is not officially supported.
* Differentiate between no password and empty-string password in the
user authentication servlet.
* Remove the google:feedid property from records in the feeds.
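The retry behaviour described for Issue 5599305 can be pictured with a small sketch. This is not the Connector Manager's own code: the class name RetryingStarter, the method startWithRetry, and the attempt and delay parameters are hypothetical, chosen only to illustrate retrying a failed instantiation after a pause.

```java
import java.util.function.Supplier;

/** Hypothetical sketch: retry a failing instantiation after a delay. */
final class RetryingStarter {
  /**
   * Calls the factory until it succeeds or maxAttempts is reached,
   * sleeping delayMillis between attempts. Rethrows the last failure.
   */
  static <T> T startWithRetry(Supplier<T> factory, int maxAttempts,
      long delayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return factory.get();
      } catch (RuntimeException e) {
        // Possibly transient, for example a file share that is off-line.
        last = e;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(delayMillis);
        } catch (InterruptedException ie) {
          // Preserve the interrupt and stop retrying.
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    throw last;
  }
}
```

A caller would pass the instantiation step as the factory, for example startWithRetry(() -> createConnector(), 5, 60000L), where createConnector is again a placeholder for whatever step can fail.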
Release 2.8.4, 23 February 2012
===============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of the previous release are encouraged to upgrade.
Summary of Changes
------------------
* Fix Issue 5973714: Exceptions thrown while prefetching the Authorization
Manager on connector startup would leave the connector instance in an
inconsistent state.
* Fix Issue 5723358 - Escape special characters in user and group names
returned in an Authentication response. User or group names that contained
certain characters that have special meaning in XML syntax would cause
failures reading the Authentication response.
* Fix Issues 5370948 and 5481676 - Better recovery from FeedExceptions.
If submitting a feed to the GSA fails for some reason, the Connector
Manager retries the feed after 15 minutes. But first, the Connector
Manager would test the GSA to verify that it is accepting feeds.
Unfortunately, that test would actually kill a functioning GSA
feedergate, disabling feeds for a short period of time while it
restarts. Effectively, the feed problem recovery strategy would
kill feeds every 15 minutes. This problem only affects GSA version
6.12. This fix avoids the problem, using a slightly different strategy
to check for GSA feed availability.
* Address Issue 5382030 - If Flexible Authorization is misconfigured
to use connector authorization with a credential group which has no
authentication rules defined, the GSA sends a null Identity to the
Connector Manager during Authorization. This was handled poorly
by most Connectors. Although Issue 5382030 is actually a problem
with the Security Manager, the Connector Manager now considers a null
Identity to be an error, and returns an error status code to the
GSA.
* Adds rudimentary GData configuration for Connectors. The new
googleFeedHost property supplied to Connectors may be used to
access the GData interface on the GSA. This should be considered,
at best, a temporary solution. This change also removes the
googleWorkDir and googleConnectorWorkDir properties from the saved
properties files, to avoid problems when moving connector instances
to a different directory. The properties still appear in the Properties
objects in the SPI.
* Adds several improvements to the Document Filters.
The ModifyPropertyFilter adds support for modifying the CONTENT
property of text documents (determined according to MIME type).
A SkipDocumentFilter can force a document to be skipped (or not)
based upon the presence/absence of a specific Property, or based
upon a match on one of the values of that property.
The JavaDoc documentation for the Document Filters has been
improved, including example configurations.
* Various improvements in diagnostic logging.
* Fix Issues 233, 5028655, 6019938 - Fix logic bug in diffing where
recovery-files' age comparison was broken. This could lead to the
connector resending the same files again after Tomcat was restarted.
* Fix Issue 232 - A small memory leak in ThreadPool would leak
QueryTraversers (and all the objects they held).
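The kind of escaping described for Issue 5723358 can be sketched as follows. This is an illustration only, not the Connector Manager's implementation; the class name XmlEscaper and its escape method are hypothetical. It replaces the five characters that have special meaning in XML with character or entity references before a name is written into a response.

```java
/** Hypothetical sketch: escape XML special characters in a name. */
final class XmlEscaper {
  /** Returns text with XML special characters replaced by references. */
  static String escape(String text) {
    StringBuilder out = new StringBuilder(text.length() + 16);
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      switch (c) {
        case '&': out.append("&amp;"); break;
        case '<': out.append("&lt;"); break;
        case '>': out.append("&gt;"); break;
        case '"': out.append("&quot;"); break;
        // Numeric reference, which is valid in both XML and HTML.
        case '\'': out.append("&#39;"); break;
        default: out.append(c);
      }
    }
    return out.toString();
  }
}
```

With this sketch, a group name such as R&D <admins> would be written as R&amp;D &lt;admins&gt;.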
Version Compatibility
---------------------
The diffing library has a change affecting diffing connectors (File System,
LDAP, and Database). The method for assigning file name extensions to
recovery files has changed. This change causes no issues migrating forward to
this release, but reverting to an earlier release after running 2.8.4 requires
diffing connectors to be reset.
Release 2.8.2, 10 October 2011
==============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of the previous release are encouraged to upgrade.
Summary of Changes
------------------
* Enable editing of Connector Advanced configuration XML from the
GSA Admin console.
* Adds support for Flexible Authorization for Connectors.
* Improves MIME type recognition for many Microsoft Office file formats
when using the third-party mime-util detector.
* Reduces the likelihood of search result authorization timeouts.
Release 2.8.0, 08 July 2011
===========================
Introduction
------------
This release has significant infrastructure changes, fixes several
problems, and adds several new utility classes to the connector SPI
for the benefit of connector developers.
Summary of Changes
------------------
* Issue 104 - Servlet to dynamically change logging levels. New servlets
getConnectorLogLevel, setConnectorLogLevel, getFeedLogLevel and
setFeedLogLevel allow the connector administrator to adjust the connector
and feed logging verbosity without shutting down the connector.
These are especially useful in conjunction with the existing
getConnectorLogs and getFeedLogs servlets.
* Issue 168 - Make Base64 encode/decode available to the connector developer.
Base64, Base64FilterInputStream, and Base64ChecksumGenerator are several
of the new utility classes made available to connector developers.
* Issue 199 - SPI enhancement to expose JDBC to the connector developer.
Enabled by the new ConnectorPersistentStore SPI interface. Connectors
that wish to be given access to the JDBC DataSource should implement the
ConnectorPersistentStoreAware interface in their Connector implementation.
Given the DataSource, the connector developer may also take advantage
of several new database access utility classes, such as JdbcDatabase,
DatabaseConnectionPool, and DatabaseResourceBundle.
Note that ConnectorPersistentStore.getLocalDocumentStore() is disabled
in this release.
* Fixed Issue 4062256 - Failure to delete snapshot files would throw
IllegalStateException. This affected the File System, LDAP, and
Database Connectors.
* Fix Issue 4524076 - Backward compatibility issue in diffing connectors
for recovery files. This affected the File System, LDAP, and
Database Connectors.
* Fix Issues 4581062, 4613042 - Add configurable diffing connector delay
interval after each scan: 'introduceDelayAfterEachScan'. This should
relieve some of the continuous file system scanning behaviour in the
File System Connector.
* The SPI AuthenticationManager and AuthenticationResponse classes have
been enhanced to allow the connector to return repository local groups
for a user.
* A new pagerank document property is now supported.
The SpiConstants.PROPNAME_PAGERANK property allows the connector to
recommend a pagerank (0-100) for the document if it matches queries.
For more information on pagerank see:
http://code.google.com/apis/searchappliance/documentation/610/feedsguide.html#defining_the_xml
* Fixed a minor problem that prevented connectors from running in the
JBoss Application Server. See the JBoss deployment wiki page:
http://code.google.com/p/google-enterprise-connector-manager/wiki/JBossCM
* Added support for these Microsoft Office 2007 and later media types:
- application/vnd.ms-outlook
- application/vnd.ms-excel.sheet.12
- application/vnd.ms-powerpoint.presentation.12
- application/vnd.ms-word.document.12
* Added support for Secure Socket Layer (SSL) feeds to the Google Search
Appliance. At the present time, SSL feeds must be manually configured.
For additional details, see the Advanced Configuration wiki page:
http://code.google.com/p/google-enterprise-connector-manager/wiki/AdvancedConfiguration
* New Document Filters utility package additions to the SPI for use by
connector developers, connector administrators, and systems integrators.
Document filters act to transform their source Document's Properties.
Document filters can add, remove, or modify a document's properties,
including the document content. Properties in which the filter has
no interest are passed through unmodified. A document filter might
even throw a SkippedDocumentException to prevent a document from being
fed to the Google Search Appliance.
Multiple document filters may be chained together, forming
a transformational document processing pipeline. Similar to a
Unix command pipeline, the filters are linked together, each using
the previous one as its source Document.
For more information see the Document Filters wiki page:
http://code.google.com/p/google-enterprise-connector-manager/wiki/DocumentFilters
* New additions to the SPI:
- ConnectorPersistentStore - Provides access to the LocalDatabase
- ConnectorPersistentStoreAware - Advertises that the Connector
wishes access to the LocalDatabase
- DatabaseResourceBundle - Vendor-specific SQL language translations
- LocalDatabase - Provides access to the configured JDBC DataSource and
DatabaseResourceBundles
- SpiConstants.PROPNAME_PAGERANK
For additional information, please refer to the JavaDoc at:
http://google-enterprise-connector-manager.googlecode.com/svn/docs/javadoc/2.8.0/index.html
* New utility package additions to the SPI for use by connector developers:
(available in package com.google.enterprise.connector.util)
- Base64 - Base64 encode/decode utility
- Base64DecoderException
- Base64FilterInputStream - InputStream filter that Base64 encodes
data read from its input
- ChecksumGenerator - Interface for checksum generators
- BasicChecksumGenerator - Generates MD2, MD5, SHA-1, SHA-256, SHA-384
and SHA-512 message digest checksums of data from an InputStream
- Base64ChecksumGenerator - Derived from BasicChecksumGenerator, but
returns Base64 encoded checksums
- Clock - an interface for getting the time; useful to replace for testing
- SystemClock - a Clock implementation using System.currentTimeMillis()
- EofFilterInputStream - InputStream filter that avoids a read at
end-of-file problem with Apache Commons IO AutoCloseInputStream
- IOExceptionHelper - creates IOExceptions with a root cause on Java 5
- UniqueIdGenerator - Interface for producing unique IDs
- UuidGenerator - UniqueIdGenerator implementation based on UUID
- XmlParseUtil - utility methods for parsing XML data
- SAXParseErrorHandler
For additional information, please refer to the JavaDoc at:
http://google-enterprise-connector-manager.googlecode.com/svn/docs/javadoc/2.8.0/index.html
* New utility database package additions to the SPI for use by connector
developers:
(available in package com.google.enterprise.connector.util.database)
- JdbcDatabase - database info, utilities for creating and maintaining
database tables for connector instances.
- DatabaseConnectionPool - a pool of connections to the JDBC DataSource
- DatabasePropertyResourceBundle - DatabaseResourceBundles implemented
as properties files
- DatabaseResourceBundleManager - loads DatabaseResourceBundles
For additional information, please refer to the JavaDoc at:
http://google-enterprise-connector-manager.googlecode.com/svn/docs/javadoc/2.8.0/index.html
* New utility diffing package addition to the SPI provides a snapshot
diffing connector framework for use by connector developers:
(available in package com.google.enterprise.connector.util.diffing)
- Change, ChangeQueue, ChangeSource
- CheckpointAndChange, CheckpointAndChangeQueue
- DeleteDocumentHandle, DeleteDocumentHandleFactory
- DocumentHandle, DocumentHandleFactory
- DiffingConnector, DiffingConnectorTraversalManager
- DiffingConnectorCheckpoint, DiffingConnectorDocumentList
- DocIdUtil
- FilterReason
- GenericDocument
- DocumentSink, LoggingDocumentSink
- DocumentSnapshot, DocumentSnapshotFactory
- DocumentSnapshotRepositoryMonitor
- DocumentSnapshotRepositoryMonitorManager
- DocumentSnapshotRepositoryMonitorManagerImpl
- MonitorCheckpoint
- SnapshotRepository, SnapshotRepositoryRuntimeException
- SnapshotStore, SnapshotStoreException
- SnapshotReader, SnapshotReaderException
- SnapshotWriter, SnapshotWriterException
- TraversalContextManager
For additional information, please refer to the JavaDoc at:
http://google-enterprise-connector-manager.googlecode.com/svn/docs/javadoc/2.8.0/index.html
* New database and diffing testing package additions to the SPI
provide test classes for use by connector developers:
(available in package com.google.enterprise.connector.util.database.testing)
- TestJdbcDatabase
- TestLocalDatabase
- TestResourceClassLoader
(available in package com.google.enterprise.connector.util.diffing.testing)
- FakeDocumentSnapshotRepositoryMonitorManager
- FakeTraversalContext
- TestDirectoryManager
* The Connector Manager now ships with several new third party JARs.
The connector developer may find these functionally useful. However,
because they are now distributed with the Connector Manager,
connectors should take care not to replace them with older or
incompatible versions.
- commons-cli.jar v1.2 http://commons.apache.org/cli
- eproperties.jar v1.1.0 http://code.google.com/p/eproperties
- h2.jar v1.2.147 http://www.h2database.com
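The Document Filters pipeline described in the 2.8.0 changes, where each filter uses the previous one as its source and may skip a document entirely, can be sketched in a few lines. This sketch deliberately does not use the Connector Manager SPI: FilterChainSketch, SkipDocument, addProperty, and skipIfPresent are hypothetical names, and a document is modelled as a plain map of property names to values. See the Document Filters wiki page and the JavaDoc for the real interfaces.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of a filter chain; not the Connector Manager SPI. */
final class FilterChainSketch {
  /** Thrown by a filter to keep a document out of the feed. */
  static final class SkipDocument extends RuntimeException {
    SkipDocument(String reason) {
      super(reason);
    }
  }

  /** Returns a filter that adds one property and passes the rest through. */
  static UnaryOperator<Map<String, String>> addProperty(
      String name, String value) {
    return doc -> {
      Map<String, String> copy = new LinkedHashMap<>(doc);
      copy.put(name, value);
      return copy;
    };
  }

  /** Returns a filter that skips any document carrying the named property. */
  static UnaryOperator<Map<String, String>> skipIfPresent(String name) {
    return doc -> {
      if (doc.containsKey(name)) {
        throw new SkipDocument("document has property " + name);
      }
      return doc;
    };
  }

  /** Applies the filters in order, each using the previous result as source. */
  static Map<String, String> apply(
      List<UnaryOperator<Map<String, String>>> chain,
      Map<String, String> source) {
    Map<String, String> current = source;
    for (UnaryOperator<Map<String, String>> filter : chain) {
      current = filter.apply(current);
    }
    return current;
  }
}
```

Properties that no filter touches reach the end of the chain unchanged, which mirrors the pass-through behaviour described above.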
Release 2.6.10, 04 February 2011
================================
Introduction
------------
This is an internal release for the Connector Manager on-board the GSA,
not for general use.
Release 2.6.6, 7 December 2010
===============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. New additions have been made to the connector SPI.
Users of the previous release are encouraged to upgrade.
Summary of Changes
------------------
* Issue 217 - Schedule intervals that include midnight were treated as
empty and ignored.
* Issue 224 - Fixes a potential loss of information about exceptions in
the connector logs.
* Issue 225 - Fixes a series of problems with the ImportExport utility.
* Issue 227 - Use &#39; instead of &apos; when escaping single quotes,
for HTML compatibility. The use of &apos; could lead to errors when
configuring connector instances using Internet Explorer.
* Improved log messages when free memory is low, when the feeds are
paused due to a backlog on the GSA, and when constructing a new
connector instance throws an exception while starting a new traversal
batch.
* New additions to the SPI:
o SpiConstants.RESERVED_PROPNAME_PREFIX
o SpiConstants.PROPNAME_FOLDER
o SpiConstants.PROPNAME_LOCK
o UrlValidator class
o UrlValidatorException class
The UrlValidator class is in a new com.google.enterprise.connector.util
package. This package will be used for utility classes that are not
part of or related to the spi package, but which connector
implementers might find useful.
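The problem behind Issue 217 is easy to reproduce in any interval check that assumes the start hour is always less than the end hour. The following sketch shows one way to treat an interval that includes midnight. It is not the Connector Manager's scheduling code: ScheduleInterval and contains are hypothetical names, hours are assumed to run from 0 to 23, and treating equal start and end hours as a full day is an assumption made here for simplicity.

```java
/** Hypothetical sketch: hour intervals that may wrap past midnight. */
final class ScheduleInterval {
  /**
   * Returns true if hour (0-23) lies in the interval that starts at
   * startHour and ends just before endHour. When endHour is not after
   * startHour, the interval is taken to wrap past midnight.
   */
  static boolean contains(int startHour, int endHour, int hour) {
    if (startHour < endHour) {
      return hour >= startHour && hour < endHour;
    }
    // Wraps past midnight: 22 to 2 covers 22, 23, 0 and 1.
    return hour >= startHour || hour < endHour;
  }
}
```

A check that only handled the first branch would treat an interval of 22 to 2 as empty, which matches the symptom described for Issue 217.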
Release 2.6.4, 16 September 2010
================================
Introduction
------------
This is an internal release for the Connector Manager on-board the GSA,
not for general use.
Summary of Changes
------------------
* Servlet access to the Connector Manager on-board the GSA has been
largely locked down. The getConfiguration and getConnectorLogs
servlets remain accessible for the benefit of connector administrators
and support personnel.
Release 2.6.2, 09 September 2010
================================
Introduction
------------
This is an internal release for the Connector Manager on-board the GSA,
not for general use.
Release 2.6.0, 14 June 2010
============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of the previous release are encouraged to upgrade.
Summary of Changes
------------------
* Issue 189 - Fixes a problem with the HostLoadManager that would
impose artificial traversal delays. The delays, although short,
would occur frequently. This fix allows the Connector Manager to
more accurately adhere to the configured documents per minute
traversal rate.
* Issue 194 - Enhances the HostLoadManager to more accurately adhere
to the configured documents per minute traversal rate, while allowing
the Connector to occasionally exceed that rate to improve efficiency.
* Issue 220 - Fixes the representation of multiple-valued metadata supplied
by the Connector as it is fed to the Google Search Appliance. This
corrects a problem with parametric navigation.
* Fixed a problem where restarting a traversal from the beginning
of a Connector's Repository or changing a Connector's traversal
schedule might not take immediate effect.
* Added a 'traversal.enabled' property that may be set in the Connector
Manager's applicationContext.properties to enable or disable
Traversals and Feeds for all Connector instances in the Connector
Manager. Disabling Traversal would be desirable if configuring a
Connector Manager deployment that only authorizes search results.
This feature is designed for turning off traversal for replica
Connector Managers in a clustered, load-balanced, or fail-over
environment. Traversals are enabled by default.
* Added an EncryptPassword command line utility that can be used by
an administrator to encrypt passwords that will be manually added to
Connector Manager or Connector properties files. For details, see
the EncryptPassword wiki page at:
http://code.google.com/p/google-enterprise-connector-manager/wiki/EncryptPassword
* Enhanced support for the Apache JULI LogManager and FileHandler.
* Corrected a minor issue with regard to the naming of log file
archives that are generated by the GetConnectorLogs servlet.
* Enhanced logging support for the testing environment by adding a
connector-manager/testdata/config/logging.properties file.
* Fixed the Connector Manager web application web.xml file to more
closely follow its DTD.
* Fixed a Daylight Saving Time bug in one of the tests. This issue
affected the test only, not the production Connector Manager.
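The default for the traversal.enabled property described above (traversals enabled unless the property says otherwise) can be illustrated with standard java.util.Properties handling. How the Connector Manager itself reads applicationContext.properties is not described in these notes, so this is only a sketch of the default semantics; TraversalFlagSketch and traversalEnabled are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical sketch: a boolean property that defaults to true. */
final class TraversalFlagSketch {
  /** Parses properties text and returns the traversal.enabled setting. */
  static boolean traversalEnabled(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new IllegalArgumentException("unreadable properties", e);
    }
    // A missing property is treated as enabled, the documented default.
    String value = props.getProperty("traversal.enabled", "true");
    return Boolean.parseBoolean(value.trim());
  }
}
```

In this sketch, a replica Connector Manager that should only authorize search results would carry the line traversal.enabled=false in its properties file.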
Release 2.4.4, 05 February 2010
===============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of the previous release are encouraged to upgrade
immediately.
Summary of Changes
------------------
* Issue 209: Changed the default feed time zone from UTC to the
local time zone. Added a new configuration property, feed.timezone,
that may be set in applicationContext.properties if the local time
zone should not be used.
The 'feed.timezone' property defines the default time zone used
for Date metadata values for Documents. A null or empty string
indicates that the local time zone of the machine running the
Connector Manager should be used. Standard TimeZone identifiers
may be specified. For example:
feed.timezone=America/Los_Angeles
If a standard TimeZone identifier is unavailable, then a custom
TimeZone identifier can be constructed as +/-hours[minutes] offset
from GMT. For example:
feed.timezone=GMT+10 # GMT + 10 hours
feed.timezone=GMT+0630 # GMT + 6 hours, 30 minutes
feed.timezone=GMT-0800 # GMT - 8 hours, 0 minutes
This modification has compatibility implications when upgrading;
refer to the Version Compatibility section, below.
* Issue 211: Moved common Connector Manager configuration properties
from applicationContext.xml to applicationContext.properties.
This makes it much easier to upgrade to a newer version of the
Connector Manager, while preserving properties that had been
customized by the administrator.
This modification has compatibility implications when upgrading;
refer to the Version Compatibility section, below.
* Issue 212: Fixes "IOException: Attempted read on closed stream."
exceptions that might be generated if a Connector uses the
Apache Commons AutoCloseInputStream to provide document content
to the Connector Manager. Currently, only the SharePoint Connector
seems to have been affected by this problem.
Version Compatibility
---------------------
The time zone change has the most dramatic impact when upgrading.
Previous versions of Connector Manager would convert all date/time
values provided by the Connector to UTC dates when supplying the date
to the GSA. This had undesired consequences for users performing
date-range searches, where expected search results may have been
discarded as their adjusted date-stamps may have pushed the document
into the previous or next day. Consider this: a document modified
at 8:00PM PST today in California will have a calendar date of
tomorrow when adjusted to UTC time, so if you search for documents
modified today, it won't be found.
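The day shift described above can be reproduced with the standard java.util date classes. The sketch below is not Connector Manager code; FeedDateSketch, calendarDate, and localTime are hypothetical names. It formats the same instant as a calendar date in different time zones, including a custom GMT offset identifier of the kind accepted by feed.timezone.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical sketch: the same instant as a date in different zones. */
final class FeedDateSketch {
  /** Formats an instant as a calendar date (yyyy-MM-dd) in the given zone. */
  static String calendarDate(Date instant, String zoneId) {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
    format.setTimeZone(TimeZone.getTimeZone(zoneId));
    return format.format(instant);
  }

  /** Builds the instant for a wall-clock hour on a date in the given zone. */
  static Date localTime(String zoneId, int year, int month, int day,
      int hour) {
    Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
    cal.clear();
    // Calendar months are zero-based, so February is 1.
    cal.set(year, month - 1, day, hour, 0, 0);
    return cal.getTime();
  }
}
```

For a document modified at 8:00 PM on 5 February 2010 in America/Los_Angeles, calendarDate returns 2010-02-05 for that zone but 2010-02-06 for UTC, so a search for documents modified on 5 February would miss it if dates were fed as UTC.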
The Issue 209 change assumes date values supplied to the GSA are
local time, unless otherwise specified. This allows date-range
queries to function as expected when the Document Repository,
the Connector Manager, and the Search Appliance are in the same
time zone (or near enough time zones that 'normal working hours'
significantly overlap).
Upgrading an existing deployment to this new Connector Manager will
result in all newly indexed content feeding date values in the
local time zone, whereas all previously indexed content will have
UTC date values. The connector administrator may choose to handle
this inconsistency in one of several ways:
1) Do nothing. This may be desirable if date-range searches
for older materials need not be accurate to the day; or if
re-indexing all content is untenable; or if the Repository,
Connector Manager, Search Appliance, and search users are
widely dispersed across time zones.
2) Set the feed.timezone property in applicationContext.properties
to GMT. If your local time zone differs significantly from GMT,
then date-range searches will continue to be unreliable.
However, all dates in the index will be consistently inaccurate.
3) Re-index all content fed by Connectors. This is the preferred
solution if date-range searches are required to be accurate to
the day, or if re-indexing the content is not an onerous task.
The mechanics of re-indexing depend upon the GSA and Connector
Manager versions. Please consult the appropriate documentation
for details.
4) Set the feed.timezone property in applicationContext.properties
to a value other than the local time zone or GMT. This would
be appropriate if the Connector Manager and/or GSA were in
significantly different time zones than the Repository and/or
search users. The options to re-index or not still apply.
--
As a result of the modifications for Issue 211, the Connector Manager
applicationContext.xml and applicationContext.properties files have
changed significantly. Consequently, simply dropping the v2.4.4
JAR files into an existing installation will not function properly.
An in-place upgrade must include the applicationContext.xml file
as well.
If the connector administrator has made no modifications to
applicationContext.xml, then a drop-in update of just the
Connector Manager v2.4.4 JAR files and the applicationContext.xml
file over an existing v2.4.x installation should proceed uneventfully.
If using the GCI Connector Installer v2.4.4 to upgrade or following the
procedures as described in the UpdatePatchRelease wiki page (see below),
then installation of the applicationContext.xml file will be automatic.
However, if the connector administrator has made modifications to
the applicationContext.properties, those modifications must be
re-applied after installation. But because of the restructuring of
applicationContext.xml, it may be difficult to merge the differences.
Before upgrading, the administrator should make back-up copies of the
existing applicationContext.xml and applicationContext.properties files.
These may be used as reference when modifying the newer versions of
these files.
Once the backup files have been made, follow the instructions for applying
an update as described in the UpdatePatchRelease wiki page (see below).
EXCEPT - do not copy the old applicationContext.properties file over
the new one (step 5 in the instructions), and don't restart Tomcat yet
(step 6), or if using the GCI Connector Installer v2.4.4 to upgrade,
shut down Tomcat after the upgrade completes, before re-applying the
modifications.
Copy the few set properties from the old applicationContext.properties
to the new one. The old properties file contains only a half-dozen or so
properties and they are clearly documented in the new properties file.
Next it is time to reapply your old applicationContext.xml modifications.
Most of the properties an administrator would wish to change will now
be set in applicationContext.properties rather than the XML file.
For instance, rather than modifying the constructor-arg for the
TraversalTimeLimitSecondsDefault bean in the applicationContext.xml file,
the administrator should set the traversal.time.limit property in the
applicationContext.properties file. This process may be tedious, but
the v2.4.4 applicationContext.properties file is well commented, so
the appropriate modifications should be clear.
Finally, once the appropriate modifications have been made to the
new applicationContext.properties file, the Tomcat server may be
restarted as described in the UpdatePatchRelease wiki page. Once
the server is restarted, the administrator is encouraged to examine
the Connector Manager logs to check for any configuration errors.
For additional details, please refer to the Connector Manager wiki
page describing how to manually install an update or patch release:
http://code.google.com/p/google-enterprise-connector-manager/wiki/UpdatePatchRelease
Release 2.4.2, 11 January 2010
==============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of the previous release are encouraged to upgrade
immediately.
Summary of Changes
------------------
* Issue 3: Changed MockJcrRepository.login() to throw LoginException as
described by the Repository Interface rather than just return null Session.
The scope of this change is limited to the quality control unit tests.
* Issue 197: Added Microsoft Office 2007 OpenXML media types to the set of
supportedMimeTypes in applicationContext.xml.
* Issue 204: Handle XML CDATA end markers "]]>" embedded in the form
snippet returned by ConnectorType.getConfigForm() and
ConnectorType.getPopulatedConfigForm().
* Issue 206: Auto-Disabled connectors could not be re-enabled. This
problem would have been encountered if the connector administrator
entered a negative value for the Retry Delay in a connector's
configuration, and then tried to re-enable traversal after it had been
automatically disabled at the end of the traversal. This is the
preferred mechanism to "catch up", indexing documents that have been
added or changed since a "run-once" traversal finished.
* Issue 207: Fixes OutOfMemoryError when submitting feeds over a slow
feed connection or to a busy Search Appliance. If the GSA is slow to
accept feeds, then too many feeds could be queued up waiting to be sent,
each with an attached 10MB buffer of data. Memory allocation would
grow with each new feed created until the heap was exhausted.
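
The CDATA handling described for Issue 204 can be illustrated with a
minimal sketch. It uses the common XML technique of splitting an embedded
"]]>" across two CDATA sections. Whether the Connector Manager uses exactly
this approach is an assumption, and the class and method names below are
hypothetical, not part of the Connector Manager API.

```java
// Hypothetical helper for illustration only.
final class CdataSketch {
  /** Wraps text in a CDATA section, splitting any embedded "]]>" markers. */
  static String wrapInCdata(String text) {
    // An embedded "]]>" would end the section early, so close the section
    // after "]]" and open a new one before ">".
    String safe = text.replace("]]>", "]]]]><![CDATA[>");
    return "<![CDATA[" + safe + "]]>";
  }

  public static void main(String[] args) {
    // A form snippet containing script text with an embedded "]]>".
    System.out.println(wrapInCdata("if (a[b[0]]>1) {}"));
  }
}
```

An XML parser reading the result concatenates the adjacent CDATA sections,
so the receiver sees the original form snippet unchanged.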
Release 2.4.0, 20 November 2009
===============================
Introduction
------------
This is an upgrade release with some enhancements and new features. Users
of previous releases are encouraged to upgrade.
Users of previous releases should check the "Version Compatibility" section
below for instructions on how to use existing data with this new release.
This release focuses primarily on improving the performance and robustness
of the document traversal and feed process. Informal measurements show
feeding documents to the GSA 3 to 10 times faster than version 2.0.0.
Within the Connector Manager, these gains were achieved by:
- Reducing the number of feeds sent to the GSA by generating larger feeds
with more documents per feed.
- Sending compressed document content to newer GSAs that support compressed
content feeds (GSA v6.2 and later).
- Traversing the repository in larger swaths.
- Reducing traversal work redundancy.
Performance analysis also identified some bottlenecks in several of the
individual Connectors that were addressed. (See each Connector's Release
Notes for details).
Summary of Changes
------------------
* Issue 106: Added support for feeding multiple documents to the GSA
in a single feed file. Previously, the Connector Manager would create
a new feed file for each Document. This was inefficient and could
result in slow feed performance. The Connector Manager will now
accumulate feed data into a single feed per connector traversal.
Once that feed exceeds a set size or when the traversal batch
completes, the feed is wrapped up and sent to the GSA. The default
maxFeedSize is 10MB, and is configurable in applicationContext.xml.
An early version of this feature was made available in the v2.0.2
release. This release provides the full implementation of the feature.
* Issue 111: A catch-all for small performance-related issues, including
a faster Base64 encoder, larger I/O buffer sizes, reduced data copying,
and processing additional records from a returned traversal DocumentList
even if doing so would exceed the supplied hint size and host load
constraints. Several other items were spun off into separate Issues.
* Issue 117: Fixes a problem where some Date meta-data fields were
incorrectly formatted for non-English locales. The RFC 822 specification
explicitly states that month and day names are specified in English.
Previous releases would translate them to the current locale.
* Issue 124: Throttle feeds to GSAs that seem to be falling behind
in feed processing. GSA revisions 5.2.0.G28 and later allow the
Connector Manager to query the backlog of unprocessed feed files.
This feature is used to throttle back the document feed if the GSA has
fallen behind processing outstanding feed items. The Connector Manager
will periodically poll the GSA, asking for the count of unprocessed
feed items (the backlog count). If the backlog count exceeds a
configured ceiling, we pause the feed. We resume the feed once the
backlog count drops down below a floor value. The floor, ceiling, and
poll interval are configurable by editing the FeedBacklogFloor,
FeedBacklogCeiling, and FeedBacklogCheckIntervalSeconds bean definitions
in applicationContext.xml.
* Issue 141: Replaces many of the home-rolled multi-threading constructs
with newer java.util.concurrent technologies available in Java 1.5.
* Issue 143: Adds an 'excluded' set to the mime type map. This allows
administrators to specify a set of document types that should be
excluded during traversals. Neither their content, nor their meta-data
should be fed to the GSA. Note that not all Connectors yet support
this feature.
* Issue 153: Adds support for compressing the document content data
in Content Feeds. This reduces the size of the feed file sent to
the GSA. Compressed Content Feeds are supported in GSA versions
6.2 and above. The Connector Manager automatically detects whether
the GSA feed host supports compressed feeds and provides either
compressed or uncompressed data accordingly.
* Issue 164: Enhanced the SimpleProperty class, adding a single-value
constructor. This should make it much easier for Connectors to
use this class.
* Issue 171: Moves the traversal schedule check into a synchronized
block, eliminating the risk of using a stale schedule.
* Issue 172: Corrects a problem when shutting down the Connector Manager
after a feed error is encountered.
* Issue 173: Properly format the HTTP feed request packets sent to
the GSA. The HTTP protocol explicitly specifies the use of MS-DOS
style CR-LF line endings.
* Issue 174: Fixes a unit test failure for non-English locales.
* Issue 175: Fixes a NullPointerException that would occur if the
Connector Manager was not an authorized feed client of the GSA
it was attempting to feed.
* Issue 177: Fixes an IOException thrown on startup if a GSA
feed host is not defined.
* Issue 178: Cleans up handling of legacy Connector traversal
schedule strings exchanged between the GSA and the Connector Manager.
* Issue 182: Submit feeds to the GSA in a separate thread. In the
case where a traversal batch generates multiple feed files, a
full feed file is submitted to the GSA in a separate thread, while
the traversal thread builds the next feed file. This overlaps
I/O, adding better concurrency. One thread is focused on I/O
between the Connector Manager and the GSA, the other thread
is focused on I/O between the Connector Manager and the document
Repository.
* Issue 187: Fix a problem that would add a redundant and unnecessary
log message once per second during connector traversals.
* Issue 188: Adds simple implementations of the SPI callback
interfaces, SimpleConnectorFactory and SimpleTraversalContext.
These make it easier for Connector Developers to create tests
that use these features.
* Issue 190: Fixes a regression in handling non-lowercase connector
names. Although fixed in release v1.3.0, this got broken again
in release v2.0.0. For details, see:
http://code.google.com/p/google-enterprise-connector-manager/wiki/LowerCaseConnectorNames
* Issue 193: Adds code to check for the presence of a "password" attribute in
the <Identity> element sent within the Authorization query. If the
attribute is present it is now read and stored in the parsed
AuthenticationIdentity object.
* Issue 196: Connections between the Connector Manager and the feed reader
used when the Connector Manager pushes feeds are now explicitly being
closed.
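
The floor and ceiling behavior described for Issue 124 amounts to a simple
hysteresis: pause above the ceiling, resume below the floor, and otherwise
keep the current state. The following is a minimal sketch of that logic
only. The class name, method names, and the example thresholds are
hypothetical; they are not the Connector Manager's implementation or its
default values.

```java
// Hypothetical sketch of floor/ceiling feed throttling.
final class BacklogThrottleSketch {
  private final int floor;
  private final int ceiling;
  private boolean paused = false;

  BacklogThrottleSketch(int floor, int ceiling) {
    if (floor > ceiling) {
      throw new IllegalArgumentException("floor must not exceed ceiling");
    }
    this.floor = floor;
    this.ceiling = ceiling;
  }

  /**
   * Updates the state from the latest backlog count polled from the GSA.
   * Returns true if feeding should be paused.
   */
  boolean update(int backlogCount) {
    if (backlogCount > ceiling) {
      paused = true;    // the GSA has fallen behind: pause the feed
    } else if (backlogCount < floor) {
      paused = false;   // the backlog has drained: resume the feed
    }
    // Between floor and ceiling the previous state is kept.
    return paused;
  }

  public static void main(String[] args) {
    // Illustrative thresholds only.
    BacklogThrottleSketch throttle = new BacklogThrottleSketch(100, 1000);
    int[] polledCounts = {500, 1001, 500, 99};
    for (int count : polledCounts) {
      System.out.println(count + " -> paused=" + throttle.update(count));
    }
  }
}
```

Using two thresholds rather than one keeps the feed from rapidly toggling
between paused and running when the backlog hovers near a single limit.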
Version Compatibility
---------------------
Connector authors should note that the changes for Issue 143 ('excluded'
mime type) subtly change the meaning of values returned by the method
TraversalContext.mimeTypeSupportLevel(String). Previously, values
less than or equal to zero were considered 'unsupported'. With this
release, values equal to zero are considered 'unsupported', while
values less than zero are considered 'excluded'.
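
A connector might interpret the returned value along the following lines.
This is a minimal sketch: the class, enum, and method names are
hypothetical, and only the sign convention is taken from the description
above.

```java
// Hypothetical helper; only the sign convention comes from the notes.
final class MimeSupportSketch {
  enum Support { SUPPORTED, UNSUPPORTED, EXCLUDED }

  /**
   * Interprets a value as returned by
   * TraversalContext.mimeTypeSupportLevel(String).
   */
  static Support classify(int supportLevel) {
    if (supportLevel > 0) {
      return Support.SUPPORTED;
    }
    if (supportLevel == 0) {
      return Support.UNSUPPORTED;
    }
    // Negative values: neither content nor meta-data should be fed.
    return Support.EXCLUDED;
  }

  public static void main(String[] args) {
    for (int level : new int[] {1, 0, -1}) {
      System.out.println(level + " -> " + classify(level));
    }
  }
}
```

Connectors written against earlier releases that test for "less than or
equal to zero" will treat excluded types as merely unsupported until they
are updated.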
Connector authors and administrators should note that the changes related
to Issue 111 change the behavior of the Connector Manager regarding
DocumentList objects that contain more Documents than were specified by the
batchHint (as provided to TraversalManager.setBatchHint(int)). Previously, the
Connector Manager would process no more than batchHint number of
Documents from the returned DocumentList. The current release will
continue to process the DocumentList until it is exhausted, or the
number of documents exceeds twice the batchHint, or the traversal
time limit is reached. This could result in the Connector Manager
processing up to twice as many documents from each batch. The current
host load management does not take this into consideration, so in
certain instances, traversal rates may exceed the configured host
load for brief periods. This load management issue will be corrected
in a subsequent release.
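
The three stopping conditions can be sketched as a single loop. This is an
illustration under stated assumptions, not the Connector Manager's code:
the class and method names are hypothetical, documents are represented as
plain strings, and the exact boundary (stopping at twice the batchHint
rather than just past it) is an assumption.

```java
import java.util.Iterator;

// Hypothetical sketch of the batch-processing limits.
final class BatchLimitSketch {
  /** Returns how many documents would be processed from one returned list. */
  static int process(Iterator<String> documents, int batchHint,
      long deadlineMillis) {
    int processed = 0;
    while (documents.hasNext()                      // list not exhausted
        && processed < 2 * batchHint                // at most twice the hint
        && System.currentTimeMillis() < deadlineMillis) {  // time remains
      documents.next();  // stand-in for feeding the document to the GSA
      processed++;
    }
    return processed;
  }
}
```

Under these assumptions, a connector that returns 50 documents for a
batchHint of 10 would have 20 of them processed in that batch, provided
the time limit is not reached first.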
Known Issues
------------
Connector administrators should note that the changes related to Issue 111
can result in traversal rates that exceed the configured host load.
For most of the current Google-supplied Connectors this has little
impact. For instance, the additional documents returned by the Livelink
and Documentum connectors are deleted documents, which pull no content
from the repository. However, the SharePoint Connector might regularly
exceed the configured load. In this case, the administrator may wish
to reduce the configured load to compensate.
Release 2.0.4, 09 October 2009
===============================
Introduction
------------
This is a patch release that fixes a few small problems discovered in the
previous release. Users of previous releases are encouraged to upgrade if they
are pushing large documents or using the ImportExport utility.
Summary of Changes
------------------
* r2259: This change fixes an error in handling big documents. The code for
supplying the alternate content (title or space) for documents that exceed
the maximum document size was broken. The alternate content was the right
size, but was read into the wrong location of the I/O buffer.
* r2263: Fixed ImportExport utility on the branch. There was a series of
failures with the current ImportExport utility.
Release 2.0.2, 02 Sept 2009
============================
Introduction
------------
This is a patch release that addresses performance issues and
fixes several small problems discovered in the previous release.
Users of previous releases are encouraged to upgrade.
Summary of Changes
------------------
* Issue 162: Fixed a NullPointerException in the logging XmlFormatter.
This fix now allows use of the XmlFormatter without requiring a
format to be specified.
* Fixed a NullPointerException in context logging on shutdown.
* Issue 106: Added support for feeding multiple documents to the GSA
in a single feed file. Previously, the Connector Manager would create
a new feed file for each Document. This was inefficient and could
result in slow feed performance. The Connector Manager will now
accumulate feed data into a single feed per connector traversal.
Once that feed exceeds a set size or when the traversal batch
completes, the feed is wrapped up and sent to the GSA. The default
maxFeedSize is 10MB, and is configurable in applicationContext.xml.
* Issue 180: Increased the default size of traversal batches from 100
documents to 500 documents. This provides improved efficiency in most
connectors. The size of traversal batches can now be configured
by setting the batchSize property of the HostLoadManager bean
in applicationContext.xml.
* Fixed a rounding error in the HostLoadManager documents-per-minute
computation.
* The Connector Manager now enforces the FileSizeLimitInfo.maxDocumentSize
configuration property. This property sets the maximum size of a Document's
content and is specified in applicationContext.xml. The default value is
30 megabytes. When constructing a document feed, if the number of bytes
read from the Document's content exceeds the maxDocumentSize, the
content will be discarded. The Document's meta-data will be supplied in
the feed, but its content will not.
It is still preferable for TraversalContextAware Connector implementations
to make use of the maxDocumentSize supplied in TraversalContext to avoid
feeding the content of large documents. If the Connector knows in advance
that a Document's content will exceed the maximum size, it can avoid the
Connector Manager pulling megabytes of data from the Repository, only to
have it discarded.
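
The size enforcement described above can be sketched as follows. This is a
minimal illustration, not the Connector Manager's implementation: the class
and method names are hypothetical, and returning null to signal discarded
content is an assumption made for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of enforcing a maximum document size while reading.
final class MaxDocumentSizeSketch {
  /**
   * Reads the document content, or returns null if more than
   * maxDocumentSize bytes are read, in which case only the document's
   * meta-data would be fed.
   */
  static byte[] readContentOrDiscard(InputStream in, long maxDocumentSize)
      throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[8192];
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      total += n;
      if (total > maxDocumentSize) {
        return null;  // too large: discard the content read so far
      }
      out.write(buffer, 0, n);
    }
    return out.toByteArray();
  }
}
```

Note that the sketch still reads up to maxDocumentSize bytes before giving
up, which is the wasted work a Connector can avoid by checking the size
itself before supplying any content.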
Release 2.0.0, 01 June 2009
============================
Introduction
------------
This is an upgrade release with some enhancements and new features. Users
of previous releases are encouraged to upgrade.
Users of previous releases should check the "Version Compatibility" section
below for instructions on how to use existing data with this new release.
Summary of Changes
------------------
* Issues 79, 127, 134, 135: Connector Traversal Schedule improvements
that allow traversals to be disabled, paused, resumed, or to be empty.
* Issue 94: Enhanced documentation with regard to the meaning of
a null return value from DocumentList.checkpoint().
* Issue 107: Connector Instance name added to log messages makes it much
easier to troubleshoot problems with multiple Connector instances.
This has deployment configuration implications for manual deployments
and for upgrades of existing deployments. See the Version Compatibility
section below.
* Issues 111, 119, 142: Infrastructure upgrades to Java 5 language features,
faster Base64 encoding in feeds, and deployment using Apache
Tomcat 6.0.18 and Spring Framework 2.5.6.
* Issue 114: Changes to connectorInstance.xml make it easier to add new
configuration parameters and to change the default values of existing
ones in the future, even if a customer has a customized configuration.
* Issue 122: The Connector's name is now passed to the connector via its
configuration properties so that connectors may know their own name.
* Issues 126, 145: Enhanced reliability and error handling in the
AuthenticationManager and AuthorizationManager.
* Issue 128: Invalid characters are now either removed or properly quoted
in the Feed XML attribute values.
* Issues 131, 148: "Domain" gets preserved when the Connector Manager creates
an AuthenticationIdentity for the Connector. This has compatibility
implications for connector writers. See Compatibility section below.
* Issues 133, 152: Fixed problems that would corrupt the Connector's
configuration or leave a partially constructed connector directory
on disk if an error occurred when creating a new Connector instance.
* Issues 146, 149: The Connector Manager servlet now has a simple main page,
which may be useful for connectivity tests, rather than returning a 404.
Version Compatibility
---------------------
Connector authors should note the Issues 131 and 148 changes to the
AuthenticationIdentity Interface, adding the Domain element and getDomain()
method. The AuthenticationIdentity is supplied to the Connector's
AuthenticationManager and AuthorizationManager. Older versions
of the Google Search Appliance do not supply the domain, so connectors often
required a configuration setting that specified the Windows domain used.
Connectors that implemented this work-around should continue to do so for the
indefinite future. However, if the connector receives a domain from the GSA,
that domain should be used in preference to any locally configured domain.
For additional details, please see:
http://code.google.com/p/google-enterprise-connector-manager/issues/detail?id=131
http://code.google.com/p/google-enterprise-connector-manager/issues/detail?id=148
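
The recommended precedence can be sketched as a small helper. The class and
method names are hypothetical. The first argument stands for the value a
connector would obtain from AuthenticationIdentity.getDomain(); treating
both null and blank values as "no domain supplied" is an assumption.

```java
// Hypothetical helper showing the recommended domain precedence.
final class DomainChoiceSketch {
  /** Prefers the domain supplied by the GSA over the configured one. */
  static String effectiveDomain(String gsaDomain, String configuredDomain) {
    if (gsaDomain != null && !gsaDomain.trim().isEmpty()) {
      return gsaDomain;
    }
    // Older GSA versions supply no domain: use the configured work-around.
    return configuredDomain;
  }
}
```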
The Issue 107 changes provide enhanced logging capabilities that make it
easier to troubleshoot installations with multiple Connector instances.
This feature requires the configuration of a custom log formatter.
The Google Connectors Installer will automatically configure this properly.
However, manual installations will require two small changes to enable this
feature:
- In $TOMCAT_HOME/webapps/connector-manager/WEB-INF/classes/logging.properties,
change the FileHandler.formatter to one of our custom formatters:
java.util.logging.FileHandler.formatter=com.google.enterprise.connector.logging.SimpleFormatter
or
java.util.logging.FileHandler.formatter=com.google.enterprise.connector.logging.XmlFormatter
- If $TOMCAT_HOME/webapps/connector-manager/WEB-INF/classes/logging.properties
specifies using the java.util.logging.FileHandler, then you must add the
new connector-logging.jar file to the system classpath that Tomcat uses at
startup. Tomcat ignores the CLASSPATH environment variable and builds a
custom classpath using the $TOMCAT_HOME/bin/setclasspath.sh or
$TOMCAT_HOME/bin/setclasspath.bat scripts. Modify these scripts, adding
connector-logging.jar to the constructed CLASSPATH. For instance:
CLASSPATH="$CLASSPATH":"$BASEDIR"/webapps/connector-manager/WEB-INF/lib/connector-logging.jar
Release 1.3.2, Apr 07, 2009
===========================
Introduction
------------
This is an upgrade release that addresses some problems discovered since
the 1.3.0 release. It also adds some enhanced security measures that
obfuscate configuration data communicated between the Google Search
Appliance and the Connector Manager, and prevent hijacking of a content feed.
These enhanced security measures impact Connector Manager administrators
and Connector Developers. For details see the Version Compatibility
section below. Users of previous releases are encouraged to upgrade.
Summary of Changes
------------------
* Fixed a concurrency issue with the HostLoadManager.
* Fix Issue 137: Obfuscate configuration data transported between the
Google Search Appliance and the Connector Manager.