[Compat Data] Improve MDN importer, Round 2

RESOLVED FIXED

Status

--
enhancement
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: jwhitlock, Assigned: jwhitlock)

Tracking

Details

(Whiteboard: [specification][type:feature])

(Assignee)

Description

4 years ago
What problems would this solve?
===============================
Importing data from MDN into the API is an iterative process:

1) A parser extracts data from an MDN page and reports any detected data issues,
2) A human fixes data issues and badly scraped data by changing the MDN page, rerunning the parser as needed,
3) The page is imported into the API,
4) A human adds additional information to the API,
5) The MDN page's tables are replaced with versions generated from the API,
6) Further data additions are made in the API rather than on the MDN page.

One part of making this process effective is improving the parser.
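
Steps 1 to 3 can be sketched roughly as below. This is only an illustration of the parse, report, fix, and re-run loop, not the importer's actual code: `Issue`, `scrape_page`, `ready_to_import`, and the three-cell row layout are hypothetical names and assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Issue:
    """A data problem found while scraping (hypothetical structure)."""
    slug: str    # short machine-readable name
    detail: str  # human-readable explanation


def scrape_page(html):
    """Extract (feature, browser, version) rows and collect issues.

    Toy stand-in for the real parser. Rows are assumed to look like
    <tr><td>feature</td><td>browser</td><td>version</td></tr>.
    """
    rows, issues = [], []
    for match in re.finditer(r"<tr>(.*?)</tr>", html, re.S):
        cells = re.findall(r"<td>(.*?)</td>", match.group(1), re.S)
        if len(cells) != 3:
            issues.append(Issue("bad_row", f"expected 3 cells, got {len(cells)}"))
            continue
        feature, browser, version = (cell.strip() for cell in cells)
        if not re.fullmatch(r"\d+(\.\d+)*", version):
            # Free-form text such as "Yes?" is left for a human to fix on MDN.
            issues.append(Issue("bad_version", f"{browser}: {version!r}"))
            continue
        rows.append((feature, browser, version))
    return rows, issues


def ready_to_import(issues):
    """A page moves on to step 3 only once the parser reports no issues."""
    return not issues
```

A human would edit the MDN page and call `scrape_page` again until `ready_to_import` returns true.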

Who would use this?
===================
MDN staff and volunteers who are converting MDN pages to use API-backed compatibility data.

What would users see?
=====================
Importer issues would be limited to data quality issues and issues that are best fixed manually.

What would users do? What would happen as a result?
===================================================
MDN staff and volunteers will quickly convert the 1000+ pages with compatibility data to use the API in Q2 2015.

Is there anything else we should know?
======================================
This is a tracking bug for desired MDN importer improvements. The goal is not a perfect importer. The importer is temporary code, and will be discarded after the import task is done. Instead, the goal is for MDN staff to determine which importer improvements are worth doing, and which problems are better left as manual tasks.

Propose improvements as bugs blocking this bug, and use +1 or CC to signal votes for that improvement. The top-voted improvements will be estimated and bundled up as Q2 2015 deliverables as budget allows.
(Assignee)

Updated

4 years ago
Blocks: 996570
Severity: normal → enhancement

Updated

4 years ago
Blocks: 1132781

Updated

4 years ago
No longer blocks: 1132781
Depends on: 1132781
Regarding the bugs filed by :fscholz above, we may need to talk with the writing team about dedicating some time to testing/improving this. :groovecoder, can you reach out to them?
Flags: needinfo?(lcrouch)
After a quick chat with Ali, she told me that Jeremie is still acting as a stakeholder, representing us as the initial customers for this project. This should be enough; we will go through him.
(Assignee)

Updated

4 years ago
Depends on: 1134373

Updated

4 years ago
Depends on: 1134426

Updated

4 years ago
Depends on: 1134450

Updated

4 years ago
Depends on: 1134474

Updated

4 years ago
Depends on: 1134586

Updated

4 years ago
Depends on: 1134587

Updated

4 years ago
Depends on: 1134624

Updated

4 years ago
Depends on: 1135000

Updated

4 years ago
Depends on: 1135060
(Assignee)

Updated

4 years ago
Depends on: 1138455
(Assignee)

Updated

4 years ago
Depends on: 1138458
(Assignee)

Updated

4 years ago
Depends on: 1139433
(Assignee)

Updated

4 years ago
Depends on: 1139619
(Assignee)

Updated

4 years ago
Depends on: 1140009
Clearing my needinfo. The writing team is sending feedback through Jeremie, and Trevor Hobson is even sending pull requests to the code itself. [1]

[1] https://github.com/jwhitlock/web-platform-compat/pulls
Flags: needinfo?(lcrouch)
Commits pushed to master at https://github.com/mozilla/web-platform-compat

https://github.com/mozilla/web-platform-compat/commit/e8cd01947348efdc0c0f212ba79db7d1ca7f8226
bug 1132269 - Refactor attribute handling

Capture attribute details at leaf node, and consume attributes higher in
the parse tree, where validation can be customized.  Add more tests for
the Specification <h2>, for code coverage.
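
The idea in this commit message can be illustrated with a small sketch. It is not the code from the commit: `AttributeCapture`, `EXPECTED`, and `validate` are hypothetical names, and the standard-library `html.parser` stands in for the importer's own parser. The leaf step records every attribute without judging it, and a later step higher up decides which attributes each element accepts.

```python
from html.parser import HTMLParser


class AttributeCapture(HTMLParser):
    """Leaf step: record each start tag with all of its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; keep them all for later.
        self.tags.append((tag, dict(attrs)))


# Higher-level step: per-element rules (hypothetical, for illustration only).
EXPECTED = {"td": {"rowspan", "colspan"}, "a": {"href"}}


def validate(tags, expected=EXPECTED):
    """Return (tag, attribute) pairs that the element's rules do not allow."""
    issues = []
    for tag, attrs in tags:
        allowed = expected.get(tag, set())
        for name in sorted(set(attrs) - allowed):
            issues.append((tag, name))
    return issues
```

Because the rules live outside the capture step, a section with different needs can pass its own `expected` mapping without changing how attributes are collected.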

https://github.com/mozilla/web-platform-compat/commit/36ee8a99f11b6f0ac658b8f2360c272c7bafd4db
Merge pull request #23 from jwhitlock/1132269_more_importer

bug 1132269 - Various importer fixes
(Assignee)

Updated

4 years ago
Depends on: 1153260
Depends on: 1154349
(Assignee)

Updated

4 years ago
Depends on: 1164311
(Assignee)

Updated

4 years ago
Depends on: 1170196
(Assignee)

Updated

4 years ago
Depends on: 1170199
(Assignee)

Updated

4 years ago
Depends on: 1170206
Depends on: 1174808
Depends on: 1175177
(Assignee)

Updated

4 years ago
No longer depends on: 1170709
(Assignee)

Updated

4 years ago
No longer depends on: 1175177
(Assignee)

Updated

4 years ago
No longer depends on: 1174808
(Assignee)

Updated

4 years ago
No longer depends on: 1180573
(Assignee)

Updated

4 years ago
No longer depends on: 1183593
(Assignee)

Updated

3 years ago
Assignee: nobody → jwhitlock
(Assignee)

Updated

3 years ago
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Commit pushed to master at https://github.com/mozilla/web-platform-compat

https://github.com/mozilla/web-platform-compat/commit/3e91e3671a6622943059c6560b09eaf0f4cde90e
bug 1132269 - Handle canonical feature names

In the sample JS display and the browse app, handle features with
canonical names, which are encoded as strings rather than objects.
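
A minimal sketch of the distinction described above, written in Python rather than the JS of the sample display. `display_name` is a hypothetical helper, and the assumption that a non-canonical name is an object mapping language codes to text is mine, not stated in the commit message.

```python
def display_name(name, lang="en"):
    """Return the text to show for a feature name.

    A canonical name arrives as a plain string and is shown as-is.
    Otherwise the name is assumed to be a mapping of language code
    to translated text, with English as the fallback.
    """
    if isinstance(name, str):
        return name
    return name.get(lang) or name.get("en") or next(iter(name.values()), "")
```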
(Assignee)

Comment 6

3 years ago
All round 2 PRs are merged.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED