Closed Bug 733418 Opened 14 years ago Closed 13 years ago

ES not returning consistent results for some queries

Categories

(Mozilla Metrics :: Metrics Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
Unreviewed

People

(Reporter: jgriffin, Unassigned)

Details

(Whiteboard: handing over to IT)

ES is sometimes failing to return results for a query that should in fact return some. For example:

http://elasticsearch1.metrics.sjc1.mozilla.com:9200/logs/builds/_search?q=starttime:1330704266

This sometimes returns no results, and sometimes returns 1 (1 being correct). If I further shorten the query to omit the doc_type:

http://elasticsearch1.metrics.sjc1.mozilla.com:9200/logs/_search?q=starttime:1330704266

I always get 0 hits, even though I should get 3.

Strangely, this only seems to occur with this specific field (starttime), which holds an integer, not a string. If I query the same documents using another (string) field, the query always succeeds:

http://elasticsearch1.metrics.sjc1.mozilla.com:9200/logs/builds/_search?q=testgroup_id:1330704266-f80af377a148f1205f4f6ed9441244501b3c46a0

Why is ES failing to (sometimes) return the correct results for the integer field? I should note that the mapping (http://elasticsearch1.metrics.sjc1.mozilla.com:9200/logs/builds/_mapping) indicates starttime as a string; is this the source of the problem? Even so, I don't understand the inconsistency in the returned hits.
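One way to put a number on the "sometimes" above is to repeat the same search and tally the hit totals that come back. The following is only a minimal sketch: the helper names fetch_total_hits and tally_hit_counts are hypothetical, and it assumes the response JSON carries an integer at hits.total, as ES versions of this era returned.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_total_hits(url):
    """Run one search and return hits.total from the JSON response.

    Assumption: hits.total is a plain integer in the response body.
    """
    with urlopen(url) as resp:
        return json.load(resp)["hits"]["total"]


def tally_hit_counts(url, attempts=20, fetch=fetch_total_hits):
    """Repeat the same search and count how often each hit total appears.

    `fetch` is injectable so the tally can be exercised without a cluster.
    """
    return Counter(fetch(url) for _ in range(attempts))
```

Calling tally_hit_counts with the first URL above would show whether the 0-hit and 1-hit responses are evenly mixed or whether one dominates, which may help when describing the problem on the ES mailing list.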
This is breaking some of the functionality of the OrangeFactor website; any idea when we can get this fixed?
aphadke, could you please take this to the ES forums or IRC and see if we can get to the bottom of it? Might be a mapping problem or it could be a bug with the version this cluster is running.
Group: metrics-private
I posted the question to the ES mailing list at https://groups.google.com/group/elasticsearch/browse_thread/thread/f9d2565a97a96cd1 and will keep everyone posted.
Assignee: cliang → nobody
Any updates?
We decided this depends on an ES upgrade (bug 742465). But I'm not sure whether IT or metrics will do this upgrade. I'm also not sure if we're intended to move to the IT ES cluster, or stay on the metrics one. Anurag, do you have any insight?
:jgriffin IT is best equipped to handle ES instances, and buildbot should definitely move to the IT ES cluster. I'll coordinate with cyliang on how we roll out this process. Regarding data migration: as far as I know, we can simply copy the data over and it *should* work, though I am not sure whether that will resolve the "consistency bug". A more prudent way, assuming _source is stored as a field in ES, would be to write export/import scripts and move the data over to the new version (http://elasticsearch-users.115913.n3.nabble.com/How-to-export-a-cluster-index-td3440858.html)
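The export/import idea above could look roughly like the sketch below, which turns search hits into a request body for the ES bulk API (an action line followed by the document source, one JSON object per line). This is an illustration rather than the script anyone on this bug actually used: hits_to_bulk_lines and the target index name are hypothetical, and it assumes every hit carries _type, _id and a stored _source.

```python
import json


def hits_to_bulk_lines(hits, target_index):
    """Convert search hits into a newline-delimited bulk API body.

    Each hit becomes two lines: an "index" action naming the target
    index, type and id, followed by the original document source.
    """
    lines = []
    for hit in hits:
        if "_source" not in hit:
            # Without a stored _source the document cannot be re-indexed.
            raise ValueError("hit %r has no _source" % hit.get("_id"))
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)
```

Re-indexing through a body like this, instead of copying index files, means the documents are analyzed again under whatever mapping the new index defines, which is also what fixing the text analyzer settings would require.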
Thanks anurag. I'd definitely be in favor of exporting/importing, as we'd need to do this anyway to fix the problem with text analyzers being enabled for some of our data.
Checking in with Corey Shields, it sounds like the IT cluster will need additional capacity before taking on more load. :jgriffin -- How badly is the functionality of the OrangeFactor website impaired? Is the impairment great enough that it would be better to do an upgrade to existing elasticsearch[1-3].metrics.scl3 systems?
(In reply to C. Liang [:cyliang] from comment #8)
> Checking in with Corey Shields, it sounds like the IT cluster will need
> additional capacity before taking on more load.
>
> :jgriffin -- How badly is the functionality of the OrangeFactor website
> impaired? Is the impairment great enough that it would be better to do an
> upgrade to existing elasticsearch[1-3].metrics.scl3 systems?

There are some less-used features which are completely broken, and others that we'd like to implement which won't work until this is resolved. None of these are super high priority, but I'd rather not have them languishing for months either. Do you have a rough guesstimate about how long it would take to add more capacity to the IT ES cluster?
C., Could you file an IT bug specifically for getting bugbot moved over to IT (if there isn't already one) and then we can dupe this forward to it?
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #10)
> C.,
>
> Could you file an IT bug specifically for getting bugbot moved over to IT
> (if there isn't already one) and then we can dupe this forward to it?

By 'bugbot', do you mean the ES indices used by OrangeFactor?
Yup, I meant the bugbot that's used by OrangeFactor.
As per DE, this is being handed off to IT
Status: NEW → ASSIGNED
Whiteboard: handing over to IT
Superseded by bug 772503.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX