Closed
Bug 1353161
Opened 8 years ago
Closed 7 years ago
Send FxA data from basket to SFMC
Categories
(Websites :: Basket, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: kirby, Assigned: pmac)
Details
Attachments
(1 file)
Cloud Services is sending a data feed to AWS with data about who is logging in to their Firefox Account. For background, see https://bugzilla.mozilla.org/show_bug.cgi?id=1338939
We need a daily feed from basket into Salesforce Marketing Cloud (ExactTarget). pmac has been provided credentials for AWS. The External Key for the target data extension was provided previously.
Assignee
Comment 2•8 years ago
bniolet and I discussed storage and we decided that we only need to store the FxA_ID and the most recent login date. In this way we can keep the data extension much smaller and make updates easier to perform.
Flags: needinfo?(pmac)
Comment 3•8 years ago
Indeed we did.
And I already made FXA_ID the primary key in the data extension :pmac.
Assignee
Comment 4•8 years ago
I've gotten into this, and wow, are these files large:
* Each file is a day's worth of login activity
* Each contains around 20 Million records
* After processing, that's around 7 Million unique FxA_IDs
I've also processed a full week's worth of files (all that are in the S3 bucket), which resulted in around 10.5 Million records from around 6.5 GB of text files. That's a whole lot of data transfer, processing, and API calls, and what we'd be storing is not actually what we want. We want the users who aren't active now but were recently. Would it be better if, instead of recording active users, we asked FxA to send us a weekly file containing users who last logged in more than a week ago and less than 2 months ago (for example; the numbers should obviously be tweaked)? This would be exactly what we want, should be WAY less data to process and store, and would be much quicker. If we needed an initial dump of inactives, perhaps we could provide them a list of the FxA IDs we have, and they could tell us which of them are "inactive".
I'm just trying to think of alternatives because I'm not sure basket can process nearly 50 Million rows in a Data Extension per week (that's assuming around 7 Million per day for 7 days). I'm willing to try, but it feels like we'll run into limitations of their API.
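To make the proposed "inactive window" concrete, here is a minimal sketch of the selection described above. It assumes a plain dict mapping FxA IDs to last-login datetimes; the function name and the 7-day and 60-day thresholds are illustrative placeholders for "more than a week ago and less than 2 months ago", not anything FxA or basket actually provides.

```python
from datetime import datetime, timedelta


def select_recently_inactive(last_logins, now,
                             min_idle=timedelta(days=7),
                             max_idle=timedelta(days=60)):
    """Return the FxA IDs whose last login falls inside the idle window.

    last_logins: dict of fxa_id -> datetime of most recent login (assumed shape).
    min_idle / max_idle: hypothetical thresholds, meant to be tweaked.
    """
    selected = []
    for fxa_id, last_login in last_logins.items():
        idle = now - last_login
        # Keep users idle for longer than min_idle but shorter than max_idle.
        if min_idle < idle < max_idle:
            selected.append(fxa_id)
    return sorted(selected)
```

With thresholds like these, a user who logged in yesterday and one who has been gone for four months would both be left out, so only the "recently lapsed" group remains.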
Assignee
Comment 5•8 years ago
This is my initial attempt. As soon as code review is complete we can test and see if my assumptions hold true.
Assignee
Updated•8 years ago
Assignee: nobody → pmac
Status: NEW → ASSIGNED
Comment 6•7 years ago
Commits pushed to master at https://github.com/mozmar/basket
https://github.com/mozmar/basket/commit/0f3ae7cbc9cc41e2dad856cfd67a2a22d72a4b54
Fix bug 1353161: Import FxA Activity data into SFMC
* Download csv files from s3
* Parse csv files and get the most recent timestamps per fxa_id
* Update said timestamps in a Data Extension in SFMC
* Cache the timestamps in Redis to avoid so many SFMC API calls
* Cache which files we've successfully processed
These files are FxA login timestamps per day. Each one contains
around 20M rows. After processing all 8 (max in the bucket at a time)
there are around 10M records to update. This will take quite a while
per run.
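The parsing and caching steps in the commit message above can be sketched roughly as follows. This is not the code from the commit: it assumes each CSV row is `fxa_id,timestamp` with an integer timestamp, and it uses a plain dict as a stand-in for the Redis cache. All function names here are hypothetical.

```python
import csv
import io


def latest_logins(csv_file, latest=None):
    """Reduce login rows to the most recent timestamp per fxa_id.

    Assumes rows look like `fxa_id,timestamp` with an integer timestamp;
    rows that do not parse (e.g. a header line) are skipped.
    """
    latest = {} if latest is None else latest
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue
        fxa_id, raw_ts = row[0], row[1]
        try:
            ts = int(raw_ts)
        except ValueError:
            continue  # header or malformed row
        if ts > latest.get(fxa_id, -1):
            latest[fxa_id] = ts
    return latest


def changed_since_cache(latest, cache):
    """Keep only entries newer than what the cache (a dict here) already holds."""
    return {k: v for k, v in latest.items() if v > cache.get(k, -1)}


# Usage with an in-memory file standing in for a downloaded S3 object.
sample = io.StringIO("fxa_id,timestamp\nabc,100\ndef,200\nabc,300\n")
print(latest_logins(sample))
```

Passing the same `latest` dict into successive calls would merge several daily files, and filtering through the cache is what keeps unchanged IDs from generating further SFMC API calls.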
https://github.com/mozmar/basket/commit/cb425cf56c1f10151923831cd3c62d4005270413
Merge pull request #17 from pmac/fxa-s3-info-to-sfmc-1353161
Fix bug 1353161: Import FxA Activity data into SFMC
Updated•7 years ago
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee
Comment 7•7 years ago
I've just greatly improved this in https://github.com/mozmar/basket/commit/8944603aa36c5d49fb25ca812b2af97184583858. It switches this from using the SOAP API we use for smaller updates to the REST API which supports updating multiple Data Extension rows in a single request. This is allowing basket to update 1000 records per call and has reduced the time to update the data from 10 days to under 4 hours. This should be far more reliably up to date now.
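As an illustration of the batching idea, here is a minimal sketch that groups records into chunks of 1000 and hands each chunk to a sender. The `send_batch` callable is a hypothetical placeholder for whatever performs the actual REST request; no real SFMC endpoint or client call is shown, and the record field names in the usage note are assumptions.

```python
def chunked(items, size=1000):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def update_in_batches(records, send_batch, size=1000):
    """Send records in batches and return the number of calls made.

    send_batch: hypothetical callable taking one list of records per call.
    """
    calls = 0
    for batch in chunked(records, size):
        send_batch(batch)
        calls += 1
    return calls
```

For example, `update_in_batches(records, sender)` with 2,500 records would call `sender` three times. At 1000 records per call, the roughly 10M records mentioned earlier would need on the order of 10,000 requests rather than one per row.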
Comment 8•7 years ago
OMG. You're awesome! Thanks, pmac!!!