Closed Bug 1359224 Opened 7 years ago Closed 6 years ago

Write a service file for bulk_index.pl

Categories

(bugzilla.mozilla.org :: Infrastructure, enhancement)

Production
enhancement
Not set
normal

RESOLVED WONTFIX

People

(Reporter: dylan, Unassigned)

I realize that I should have thought of this sooner, but we should provide a service file for running https://github.com/mozilla-bteam/bmo/blob/elasticsearch/scripts/bulk_index.pl. Right now I'm running it manually in screen for bugzilla-dev, but that is not sane for production. :-)
What's the best way to go about this? Should I ask you or dhouse to do it?
Flags: needinfo?(klibby)
I'll do it. Do we want to run this with cron, or are we creating a daemon?
Flags: needinfo?(klibby)
Daemon mode for prod at least.
Currently it makes no attempt at backgrounding itself.
If you want logs, pass --verbose and it will write some minimal output to stdout; otherwise it will just run forever.


Let me know if it needs to manage a pidfile.
(non-daemon mode would be running it with --once)
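
Since the script stays in the foreground and does not manage a pidfile itself, a wrapper could do both jobs. The following is a minimal sketch, not the service file this bug asks for: the function names `start_daemon` and `stop_daemon` and the example paths are hypothetical, and only the `--verbose` flag comes from the comments above.

```shell
#!/bin/sh
# Hypothetical helpers for a foreground-only program such as bulk_index.pl.

# start_daemon PIDFILE LOGFILE COMMAND [ARGS...]
# Starts COMMAND in the background, appends its output to LOGFILE,
# and records its PID in PIDFILE.
start_daemon() {
    pidfile=$1
    logfile=$2
    shift 2
    # Refuse to start a second copy if the pidfile names a live process.
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "already running (pid $(cat "$pidfile"))" >&2
        return 1
    fi
    nohup "$@" >>"$logfile" 2>&1 &
    echo $! >"$pidfile"
}

# stop_daemon PIDFILE
# Sends SIGTERM to the recorded PID (if any) and removes the pidfile.
stop_daemon() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0
    kill "$(cat "$pidfile")" 2>/dev/null
    rm -f "$pidfile"
}

# Example (paths are assumptions, not taken from this bug):
# start_daemon /var/run/bulk_index.pid /var/log/bulk_index.log \
#     ./scripts/bulk_index.pl --verbose
```

An init system that supervises foreground processes directly would make the pidfile unnecessary; the wrapper only matters where the script has to be backgrounded by hand.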
What configuration info is this looking for? Running it for prod throws an error about localhost:9200 (reasonably); I need to be able to run it to make sure it's working without BMO using it. And dev is throwing a timeout error:

bugzillaadm.private.scl3#./www/bugzilla-dev.allizom.org/scripts/bulk_index.pl --verbose --once
[Timeout] ** [http://10.22.85.50:9200]-[599] Timed out while waiting for socket to become ready for reading, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at Bugzilla/Elastic/Indexer.pm line 39. With vars: {'request' => {'body' => undef,'qs' => {},'method' => 'HEAD','serialize' => 'std','path' => '/bugzilla-dev','ignore' => []},'status_code' => 599}
Flags: needinfo?(dylan)
Configured for es-prod:9200 now. I should have put this in localconfig, but it's in data/params for the moment.
Flags: needinfo?(dylan)
Blocks: 1373433
Depends on: 1376869
dev is currently configured to connect to es1.stage.bugs (and timing out):

write(3, "HEAD http://10.22.85.50:9200/bugzilla-dev HTTP/1.1\r\nHost: 10.22.85.50:9200\r\nUser-Agent: HTTP-Tiny/0.058\r\n\r\n", 107) = 107

but the /etc/hosts entry we set up for es-dev (10.22.82.19) points at a zeus endpoint (which answers). Prod looks to be set up correctly, though.
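
The HEAD request in the trace above can be reproduced by hand to compare endpoints. This is a rough sketch assuming curl is installed on the admin host; the function names `es_index_url` and `es_head_status` are made up for illustration. curl reads proxy environment variables too, so running it from the same shell as the indexer shows roughly what the indexer sees.

```shell
#!/bin/sh
# Build the URL the indexer probes: http://<node>/<index>.
es_index_url() {
    printf 'http://%s/%s\n' "$1" "$2"
}

# Send a HEAD request and print only the HTTP status code.
# curl prints 000 when no HTTP response arrives at all
# (for example on a timeout or a refused connection).
es_head_status() {
    curl -s -o /dev/null -I --max-time 5 -w '%{http_code}\n' "$1"
}

# Example, using the host and index names from this bug:
# es_head_status "$(es_index_url es-dev:9200 bugzilla-dev)"
# es_head_status "$(es_index_url 10.22.85.50:9200 bugzilla-dev)"
```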



Changing data/params on dev to use es-dev:9200, or running it as-is on prod, results in the following error, though:

[NoNodes] ** No nodes are available: [es-dev:9200], called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at Bugzilla/Elastic/Indexer.pm line 39.

There are no error logs on the ES side, so I'm not sure what the issue is offhand. Prod has a bunch of the following errors, but I don't think they are related to the above:

[2017-06-29 22:19:51,085][DEBUG][action.search.type       ] [es3_bugzilla_prod] [bugzilla][0], node[K_tGO_pRS-exfeasfMSSiQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@2e326048]
org.elasticsearch.transport.RemoteTransportException: [es2_bugzilla_prod][inet[/10.22.82.142:9300]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.SearchParseException: [bugzilla][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"_source":true,"sort":[{"bug_severity.eq":""},"resolution.eq","assigned_to.eq",{"priority.eq":""},{"bug_status.eq":""},{"short_desc.eq":""},"_id"],"query":{"bool":{"must":[{"match":{"component.eq":"he / Hebrew"}},{"match":{"product.eq":"Mozilla Localizations"}},{"match":{"resolution.eq":""}}]}},"size":"500"}]]
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:687)
        at org.elasticsearch.search.SearchService.createContext(SearchService.java:543)
        at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:515)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:277)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
        at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: sort direction [bug_severity.eq] not supported
        at org.elasticsearch.search.sort.SortParseElement.addCompoundSortField(SortParseElement.java:139)
        at org.elasticsearch.search.sort.SortParseElement.parse(SortParseElement.java:86)
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:671)
        ... 9 more
[2017-06-29 22:19:51,085][DEBUG][action.search.type       ] [es3_bugzilla_prod] All shards failed for phase: [query]


The prod cluster had been yellow, but restarting a couple of nodes cleared it to green; the above still happens (both the backtrace and the NoNodes error).
(In reply to Kendall Libby [:fubar] (PTO july 3-14) from comment #7)
> Changing data/params on dev to use es-dev:9200, or running it as-is on prod,
> results in the following error, though:
> 
> [NoNodes] ** No nodes are available: [es-dev:9200], called from sub
> Search::Elasticsearch::Role::Client::Direct::__ANON__ at
> Bugzilla/Elastic/Indexer.pm line 39.

Sorry, I should have realized -- you need to unset all the proxy vars because elasticsearch will try to use them.
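
One way to do that without touching the login shell's environment is to clear the variables in a subshell only. A minimal sketch: the helper name `without_proxy` is hypothetical, and the upper-case variable names are an assumption added for completeness.

```shell
#!/bin/sh
# Run a command with the proxy variables removed from its environment.
# The function body is a subshell, so the caller's variables are untouched.
without_proxy() (
    unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
    "$@"
)

# Example:
# without_proxy ./scripts/bulk_index.pl --verbose --once
```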
boo for ES obeying http(s)_proxy env vars but NOT also obeying no_proxy env vars.

configuration for dev needs changing to 'es-dev:9200':
           'elasticsearch_nodes' => 'es1.stage.bugs.scl3.mozilla.com:9200',

configuration for stage is missing, should be 'es-stage:9200' (I can never remember if it's safe to change data/params by hand, so I default to not. if that's something I can do, let me know)

prod is good, but I don't want to test/debug on prod :-D

dev is also throwing a new error:

bugzillaadm.private.scl3# cd -
/data/bugzilla-dev/www/bugzilla-dev.allizom.org
bugzillaadm.private.scl3# ./scripts/bulk_index.pl --once
Running after 0 seconds
[Request] ** [http://10.22.85.50:9200]-[400] ElasticsearchIllegalArgumentException[Can't find default or mapped analyzer with name [autocomplete]], called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at Bugzilla/Elastic/Indexer.pm line 159. With vars: {'body' => {'status' => 400,'error' => 'ElasticsearchIllegalArgumentException[Can\'t find default or mapped analyzer with name [autocomplete]]'},'request' => {'body' => {'properties' => {'is_enabled' => {'type' => 'boolean'},'es_mtime' => {'type' => 'long'},'userid' => {'type' => 'long','analyzer' => 'keyword'},'suggest_user' => {'search_analyzer' => 'folding','payloads' => \1,'type' => 'completion','analyzer' => 'folding'},'name' => {'type' => 'string'},'suggest_nick' => {'payloads' => \1,'type' => 'completion','analyzer' => 'autocomplete'},'login' => {'type' => 'string'}}},'qs' => {},'method' => 'PUT','serialize' => 'std','path' => '/bugzilla-dev/_mapping/user','ignore' => [],'mime_type' => 'application/json'},'status_code' => 400}
Flags: needinfo?(dylan)
(In reply to Kendall Libby [:fubar] from comment #9)
> boo for ES obeying http(s)_proxy env vars but NOT also obeying no_proxy env
> vars.
> 
> configuration for dev needs changing to 'es-dev:9200':
>            'elasticsearch_nodes' => 'es1.stage.bugs.scl3.mozilla.com:9200',
> 
> configuration for stage is missing, should be 'es-stage:9200' (I can never
> remember if it's safe to change data/params by hand, so I default to not. if
> that's something I can do, let me know)

Currently it would be safe, but it's not a good habit to be in. I think I'll eventually want to give (some) operations people a specific permission to edit the params on the admin page. For the moment I'll VPN up and make that change.

> prod is good, but I don't want to test/debug on prod :-D


Elasticsearch is off on prod at the moment. I realized I might want to make the indexer a no-op when this is the case, so I'm going to file another bug about that. It shouldn't impact anything else, though.

> dev is also throwing a new error:
> 
> bugzillaadm.private.scl3# cd -
> /data/bugzilla-dev/www/bugzilla-dev.allizom.org
> bugzillaadm.private.scl3# ./scripts/bulk_index.pl --once
> Running after 0 seconds
> [Request] ** [http://10.22.85.50:9200]-[400]
> ElasticsearchIllegalArgumentException[Can't find default or mapped analyzer
> with name [autocomplete]], called from sub
> Search::Elasticsearch::Role::Client::Direct::__ANON__ at
> Bugzilla/Elastic/Indexer.pm line 159. With vars: {'body' => {'status' =>
> 400,'error' => 'ElasticsearchIllegalArgumentException[Can\'t find default or
> mapped analyzer with name [autocomplete]]'},'request' => {'body' =>
> {'properties' => {'is_enabled' => {'type' => 'boolean'},'es_mtime' =>
> {'type' => 'long'},'userid' => {'type' => 'long','analyzer' =>
> 'keyword'},'suggest_user' => {'search_analyzer' => 'folding','payloads' =>
> \1,'type' => 'completion','analyzer' => 'folding'},'name' => {'type' =>
> 'string'},'suggest_nick' => {'payloads' => \1,'type' =>
> 'completion','analyzer' => 'autocomplete'},'login' => {'type' =>
> 'string'}}},'qs' => {},'method' => 'PUT','serialize' => 'std','path' =>
> '/bugzilla-dev/_mapping/user','ignore' => [],'mime_type' =>
> 'application/json'},'status_code' => 400}

I can fix that; working on it now.
Flags: needinfo?(dylan)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX