Error management on random-connector-incremental #60
Hi @matteogrolla, thank you for posting your questions.
Hi @matteogrolla, For this one:
Are you saying that you'd like the job to stop immediately, due to the exception that was thrown?
Do you have another way you'd like errors to behave?
Hi @mwmitchell, In the context of a batch job errors can be partitioned in
Most errors should be thrown while communicating with the document source (a web service, a mail server...), but if I'm not wrong the connector framework is a distributed system, so even fetchContext's emits are not error-free, and I'd like to understand what happens when these errors arise.
Here are some practical scenarios that I have to deal with:
- Scenario A: the source system goes offline (retriable exception needing many retries) next morning
QUESTION: what happens if the crawl is stopped while the source is offline? And what if Fusion is restarted?
- Scenario A2 (a proposal): next morning someone (or maybe a scheduler) restarts the crawl
- Scenario B: wrong request to the source system (unretriable exception that should stop the crawl)
QUESTION: I don't understand the responsibility of fetchContext.newResult()
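To make the retriable / unretriable split in the scenarios above concrete, here is a minimal sketch in plain Java. It does not use the connector SDK at all: `FetchErrorPolicy`, `RetriableException`, `FatalCrawlException` and `callWithRetry` are hypothetical names made up for illustration, and how a fatal error should actually be reported to the framework is exactly the open question in this thread.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of any SDK: separates errors worth retrying
// (scenario A, source offline) from errors that should stop the crawl (scenario B).
class FetchErrorPolicy {

    /** Hypothetical marker: the source may recover, so retrying makes sense. */
    public static class RetriableException extends Exception {
        public RetriableException(String message) { super(message); }
    }

    /** Hypothetical marker: the request itself is wrong, so retrying is pointless. */
    public static class FatalCrawlException extends RuntimeException {
        public FatalCrawlException(String message, Throwable cause) { super(message, cause); }
    }

    /**
     * Runs the call, retrying only RetriableException, up to maxAttempts attempts,
     * doubling the delay after each failed attempt. Any other exception is
     * wrapped in FatalCrawlException and rethrown immediately.
     */
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialDelayMs)
            throws RetriableException {
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (RetriableException e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted: the caller decides what to do next
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new FatalCrawlException("interrupted while waiting to retry", ie);
                }
                delay *= 2;
            } catch (Exception e) {
                throw new FatalCrawlException("non-retriable error", e);
            }
        }
    }
}
```

In a fetch method, the call to the source system would be wrapped in `callWithRetry`; a `RetriableException` that survives all attempts corresponds to scenario A, while a `FatalCrawlException` corresponds to scenario B.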
Hi,
I'm Matteo Grolla from Sourcesense, Lucidworks' partner in Italy.
I'm developing a custom connector for a customer and have some questions about error management. Robert Lucarini suggested I post them here.
Let's use random-content-incremental for our discussion and focus on the fetch method.
What I've noticed is:
How can I terminate the crawl and mark it as failed?
The next time I restart the crawl, I'd like it to resume from the last saved checkpoint.
Will this document be recrawled? When? Can we control this?
Thanks a lot