Closed Bug 983146 Opened 10 years ago Closed 6 years ago

[GSoC2014] [Week 6] All-At-Once terminology replacement method

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: gueroJeff, Unassigned)

References

(
URL
)

Details

Jeff Beatty [:gueroJeff]

Reporter

Description

•

10 years ago

Tracking bug.

All-At-Once Replacement Method: Regenerate the DOM with matched target terminology, output content into a new webpage, and render it.

Jeff Beatty [:gueroJeff]

Reporter

Comment 1

•

10 years ago

Terminology matching takes place over the entirety of the text extracted from the DOM, rather than node by node (or segment by segment). Regenerate the DOM with matched target terminology, output content into a new webpage, and render it.
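A minimal sketch of the all-at-once idea, not the project's actual code: the extracted text nodes are joined into one string, a single terminology pass runs over the whole string, and the result is split back into nodes so the DOM can be regenerated. The function name, the `SEP` sentinel, and the glossary contents are assumptions made for illustration; the glossary stands in for term pairs loaded from a TBX file.

```python
import re

# Sentinel used to join text nodes; assumed never to occur in page text.
SEP = "\x00"

def replace_all_at_once(text_nodes, glossary):
    """Run one terminology pass over all extracted text, then split it back.

    `glossary` maps lower-case source terms to target terms (hypothetical data).
    """
    joined = SEP.join(text_nodes)
    # Try longer terms first so a short term cannot pre-empt a longer one.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    replaced = pattern.sub(lambda m: glossary[m.group(0).lower()], joined)
    # One output string per original text node, in the same order.
    return replaced.split(SEP)

nodes = ["Open the bookmark", "menu and click Add-ons"]
glossary = {"bookmark": "marcador", "add-ons": "complementos"}
print(replace_all_at_once(nodes, glossary))
```

Because the output list has the same length and order as the input, each string can be written back to the node it came from before the new page is rendered.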

Summary: [meta] Post-processes website regeneration → [meta] All-At-Once terminology replacement method

Jeff Beatty [:gueroJeff]

Reporter

Updated

•

10 years ago

Depends on: 983149

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Keywords: meta

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Summary: [meta] All-At-Once terminology replacement method → [GSoC2014] [Week 6] All-At-Once terminology replacement method

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Keywords: meta

Jeff Beatty [:gueroJeff]

Reporter

Updated

•

10 years ago

No longer blocks: 983138

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Blocks: 983250

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

No longer depends on: 983149

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

No longer blocks: 983144

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Depends on: 983144

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

No longer blocks: 983143

Gordon P. Hemsley [:GPHemsley]

Updated

•

10 years ago

Blocks: 983138

URL: https://wiki.mozilla.org/Intellego/GS...

Tharshan

Comment 2

•

10 years ago

Last week I made progress on the project by building a working prototype of the web page translator. My mentor pointed out a few issues to fix to improve the system. The TBX file we used had a few errors: in some segment pairs, the translated words were separated by commas. The string replacement also did not take pluralisation or capitalisation into account when replacing the source string. I also found that the text contents were sent all at once, so I needed to try a different method to check for any difference. A segment-by-segment method seemed like a good approach.
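The three issues above (comma-separated translations, capitalisation, pluralisation) could be handled roughly as follows. This is only a sketch under stated assumptions: all function names are hypothetical, and plurals are formed by naively appending "s" unless an explicit plural is passed in, which will not be correct for every language or term.

```python
import re

def split_variants(entry):
    """Split a target entry such as 'marcador, favorito' into separate terms."""
    return [v.strip() for v in entry.split(",") if v.strip()]

def match_case(source, target):
    """Copy the capitalisation pattern of the matched source text onto the target."""
    if source.isupper():
        return target.upper()
    if source[:1].isupper():
        return target[:1].upper() + target[1:]
    return target

def replace_term(text, source_term, target_term, target_plural=None):
    """Replace a term and its plural, preserving capitalisation.

    Assumption: plurals add a trailing 's' in both languages unless
    `target_plural` is supplied.
    """
    plural = target_plural or target_term + "s"
    pattern = re.compile(r"\b" + re.escape(source_term) + r"(s?)\b", re.IGNORECASE)

    def substitute(match):
        target = plural if match.group(1) else target_term
        return match_case(match.group(0), target)

    return pattern.sub(substitute, text)

print(split_variants("marcador, favorito"))
print(replace_term("Tabs are great. Open a tab.", "tab", "pestaña"))
```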

The current approach uses JavaScript to replace the text found in the DOM. Through research I came across NLTK, a Python library with many utilities that we could reuse for our project, such as tokenising segments of text. Moving the translation to the server side means that the utilities provided by NLTK can be used to translate the DOM, which is then sent to the browser already translated into the target language.
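As a rough illustration of server-side tokenising with NLTK, the sketch below uses `wordpunct_tokenize`, which needs no downloaded corpora. The regex fallback (used only when NLTK is not installed), the `translate_segment` function, and the glossary are assumptions for this example. It looks up single tokens only, so multi-word terms would need extra handling.

```python
import re

try:
    # Regex-based tokenizer from NLTK; works without downloading any data.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    def wordpunct_tokenize(text):
        # Fallback that mirrors the word/punctuation split when NLTK is absent.
        return re.findall(r"\w+|[^\w\s]+", text)

def translate_segment(segment, glossary):
    """Tokenise one segment and swap in target terms for known tokens.

    `glossary` maps lower-case source tokens to target terms (hypothetical data).
    Returns the token list; tokens without a glossary entry pass through unchanged.
    """
    tokens = wordpunct_tokenize(segment)
    return [glossary.get(token.lower(), token) for token in tokens]

print(translate_segment("Open the bookmark.", {"bookmark": "marcador"}))
```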

All hyperlinks on the website have to be changed so that, when clicked, they load within the iframe and go through our proxy. Many sites, including Mozilla Support, send the X-Frame-Options: DENY header, meaning that the site cannot be browsed within an iframe. To get around this issue, we load each link through our translation engine: it fetches the raw HTML, translates the DOM, and sends it to the client for the iframe to load.
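The link rewriting described above might look something like this. It is a sketch, not the prototype's code: the `/translate?url=` endpoint and the function name are hypothetical, and a regular expression is used for brevity where a real HTML parser would be more robust.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical proxy endpoint; the real route of the translation engine is not given.
PROXY_ENDPOINT = "/translate?url="

# Matches the quoted href attribute of an <a> tag (simplified).
HREF_RE = re.compile(
    r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)

def rewrite_links(html, base_url, proxy=PROXY_ENDPOINT):
    """Point every <a href> at the proxy so clicks stay inside the iframe."""
    def substitute(match):
        prefix, quote_char, href = match.groups()
        # Resolve relative links against the page URL before proxying them.
        absolute = urljoin(base_url, href)
        return prefix + quote_char + proxy + quote(absolute, safe="") + quote_char

    return HREF_RE.sub(substitute, html)

page = '<a href="/kb/tabs">Tabs</a>'
print(rewrite_links(page, "https://support.mozilla.org/en-US/"))
```

Fetching each page on the server and serving the translated copy from the proxy's own origin means the target site's framing restriction no longer applies to what the iframe loads.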

Jeff Beatty [:gueroJeff]

Reporter

Updated

•

6 years ago

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → FIXED

Categories

(Intellego Graveyard :: General, defect)
