Closed Bug 912111 Opened 11 years ago Closed 11 years ago

Firefox displays same product announcement repeatedly

Categories

(Cloud Services :: Server: Product Announcements Campaign Manager, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: edwong, Assigned: edwong)

Details

A test product announcement was posted and saw it on my (galaxy s4 with nightly) phone every 24 hours.  The same announcement should not nag me over and over, it will result in a user uninstalling firefox.

1. sign in here: https://campaigns.services.mozilla.com/author/1/ and deploy an announcement to obscure locale/geo.
2. on android - set your phone to that locale/geo
3. wait (or install an add-on that will fetch it)

result: the same announcement will appear every 24 hrs

expected: the same announcement should only appear one time for the user.  There should be logic to see if this announcement has displayed before.  Otherwise we nag the hell out of the user.
Summary: Android 4.x displays same product announcement repeatedly → Firefox displays same product announcement repeatedly
This is normally controlled by only displaying notices with a creation date plus idle days later than the If-Modified-Since header. This might be an issue with the current caching strategy. I'm happy to work with bwong to see if there's a way to provide data back in a more cacheable manner.

(Otherwise, we may have to push on getting the more efficient server in place.)
I suspect this is a cache issue, because this code was thoroughly tested and hasn't changed. We pass the last timestamp every time we fetch.
assigning to benson
Assignee: nobody → bwong
when I curl my test - I see this as id: 
137781954200 which is: GMT: Tue, 14 May 1974 16:45:54 GMT


curl -v https://campaigns.services.mozilla.com/announce/1/qa/all/all/all | python -mjson.tool
{
    "announcements": [
        {
            "id": 137781954200,
            "text": "quick brown fox jumped ov",
            "title": "Testing 1...2...3...",
            "url": "https://campaigns.services.mozilla.com/redirect/1/1vfbNPfPO"
        }
    ]
}
curl -H "If-Modified-Since: Wed, 4 Sep 2013 23:00:00 GMT" -v https://campaigns.services.mozilla.com/announce/1/qa/all/all/all
=>
{"announcements": [{"url": "https://campaigns.services.mozilla.com/redirect/1/1vfbNPfPO", "text": "quick brown fox jumped ov", "id": 137781954200, "title": "Testing 1...2...3..."}]}
Component: Product Announcements → Server: Product Announcements Campaign Manager
Product: Android Background Services → Mozilla Services
Version: Firefox 26 → unspecified
This bug blocks any new product announcements from going out.
Whiteboard: qa+
edwong, sorry I must have missed this. Apologize for the late response.

nginx *does not* cache HTTP 200 responses from the python app server. An easy way to test if this is true is if the metrics for a campaign go up everytime you curl it. You can see the metrics if you log into the admin tool.

The only thing it caches is HTTP 204 responses, which is "no content", and make up the bulk of requests from clients. 

Here is the nginx config. Note the proxy_cache_valid field.


proxy_cache_path /tmp/nginx-cache levels=1:2 keys_zone=campaignCache:100m;

log_format clog '$time_iso8601 $request $status '
                '$upstream_cache_status $upstream_response_time';
server {
    listen 80 default_server;
    server_name _;

    error_log /media/ephemeral0/logs/nginx/campaign.error.log;

    # Proxy all requests to the the python app
    location / {
        proxy_pass http://localhost:8081;
        proxy_set_header X-Real-IP $remote_addr;

        # important for Pyramid app to know what to set audience value to
        # for Persona login to work
        proxy_set_header Host $http_host;

        set $no_cache "";
        proxy_no_cache $no_cache;
        proxy_cache_bypass $no_cache;
    }

    location /announce {
        proxy_pass http://localhost:8081;
        proxy_set_header X-Real-IP $remote_addr;

        # caching
        proxy_cache campaignCache;
        proxy_cache_key "$request_method$host$request_uri";

        # note we don't cache 200's responses since we want the python app to handle
        # them and be able to record the metrics
        proxy_cache_valid 204 302 404 3600s;

        proxy_cache_use_stale updating http_502;

        access_log /media/ephemeral0/logs/nginx/campaign.announce.access.log clog;
    }
}
benson, i think the issue is that 'If-Modified-Since' is returning even if it hasn't been modified.  I don't know how this header is set.  For example, this shouldn't return since the announcement hasn't been modified.

curl -H "If-Modified-Since: Wed, 13 Sep 2013 23:00:00 GMT" -v https://campaign.stage.mozaws.net/announce/1/android/all/all/all
I think we need :jrconlin to comment on the use of If-Modified-Since, since there is also a get param, idle=#days (I think it is idle) that is also sent from clients
If-Modified-Since is the last time that the client has requested a campaign.

idle is the number of days since the firefox application has been actively used.

These numbers are not the same, nor do they reflect the same value.
I did some research into this and it is a caching bug w/ the nginx configuration. Basically, we should compare what comes out of the python app directly w/ what comes out of nginx cache. This was tested directly on one of the app servers. 

= TO THE PYTHON APP =
[ec2-user@i-3910285b ~]$ curl -H "If-Modified-Since: Wed, 16 Sep 2013 23:00:00 GMT" -v http://localhost:8081/announce/1/qa/all/all/all; echo
* About to connect() to localhost port 8081 (#0)
*   Trying ::1... Connection refused
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /announce/1/qa/all/all/all HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: localhost:8081
> Accept: */*
> If-Modified-Since: Wed, 16 Sep 2013 23:00:00 GMT
>
< HTTP/1.1 304 Not Modified
< Content-Length: 0
< Date: Mon, 16 Sep 2013 18:07:00 GMT
< Server: waitress
<
* Connection #0 to host localhost left intact
* Closing connection #0

= TO NGINX CACHING =
[ec2-user@i-3910285b ~]$ curl -H "If-Modified-Since: Wed, 16 Sep 2013 23:00:00 GMT" -v http://localhost/announce/1/qa/all/all/all; echo
* About to connect() to localhost port 80 (#0)
*   Trying ::1... Connection refused
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /announce/1/qa/all/all/all HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: localhost
> Accept: */*
> If-Modified-Since: Wed, 16 Sep 2013 23:00:00 GMT
>
< HTTP/1.1 200 OK
< Server: nginx/1.4.2
< Date: Mon, 16 Sep 2013 18:07:03 GMT
< Content-Type: application/json; charset=UTF-8
< Content-Length: 163
< Connection: keep-alive
<
* Connection #0 to host localhost left intact
* Closing connection #0
{"announcements": [{"url": "https://campaigns.services.mozilla.com/redirect/1/1vlCr3daR", "text": "Test test test", "id": 137935188600, "title": "Test Campaign"}]}

The nginx results should return a 304 Not Modified. 

The fix: 

- app should return a Last-Modified with the creation time of the campaign. 
- nginx config needs a if_modified_since set to "before", ref: http://wiki.nginx.org/HttpCoreModule#if_modified_since
Assign to :jrconlin to add the "Last-Modified" header. Then he can assign back to me to update the nginx configuration.
Assignee: bwong → jrconlin
submitted PR https://github.com/mozilla-services/campaign_manager/pull/5
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Reopening bug as i'm doing some testing on the server configs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: qa+ → [qa+]
So this is fixed now. We'll be rolling out a new version of campaign manager to the staging environment. It will include: 

- new user login functionality (https://bugzilla.mozilla.org/show_bug.cgi?id=920591#c6)
- bug fix for duplicate messages (nginx config issue, not passing If-Modified-Since to CampaignManager)
Assignee: jrconlin → ckolos
arn:aws:cloudformation:us-east-1:142069644989:stack/cammgr-qa-48aa0/99fc5310-27ae-11e3-a82b-50e241629418

cammgr-qa-48aa0 is running. Please test.
Assignee: ckolos → edwong
verified this issue as fixed on staging.  Removing qa+
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Whiteboard: [qa+]
You need to log in before you can comment on or make changes to this bug.