Closed Bug 852936 Opened 11 years ago Closed 10 years ago

release runner died because of unprintable unicode characters in revision field

Categories

(Release Engineering :: Release Automation: Other, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: bhearsum, Unassigned)

Details

(Whiteboard: [shipit])

The Fennec 20.0b6 release died when release runner tried to write out its configs. The error we got was:
Traceback (most recent call last):
  File "release-runner.py", line 345, in <module>
    ssh_username=hg_username, ssh_key=hg_ssh_key)
  File "/home/cltbld/release-runner/tools/lib/python/util/hg.py", line 469, in apply_and_push
    changer(localrepo, 1)
  File "release-runner.py", line 328, in process_configs
    productionBranch=buildbot_configs_branch)
  File "release-runner.py", line 149, in bump_configs
    f.write(releaseConfig)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1704-1705: ordinal not in range(128)

Also in the output was:
'mozillaRevision': u'\ufeff\ufeff63f07a9ed826'

\ufeff is unicode's "ZERO WIDTH NO-BREAK SPACE", which means we had two unprintable characters at the start of this field. Interestingly, the companion Firefox release didn't have the same.

Lukas, do you recall if you did anything different for the Fennec release or where you copy/pasted the revision from?

Seems like the web interface should either be stripping these characters or putting up a fuss about them.
Flags: needinfo?(lsblakk)
I'd say strip and validate. Assuming that we always want an explicit changeset (not tag nor branch name) the following snippet should help:

try:
    _ = int(revision, 16)
    return True
except (TypeError, ValueError):
    return False
(In reply to Rail Aliiev [:rail] from comment #1)
> I'd say strip and validate. Assuming that we always want an explicit
> changeset (not tag nor branch name) the following snippet should help:
> 
> try:
>     _ = int(revision, 16)
>     return True
> except (TypeError, ValueError):
>     return False

I'm not sure how I feel about making it impossible to use branch/tag names. It's certainly not recommended, but it wouldn't break the automation to do so. If we take this approach we definitely need to allow for full length revisions to account for the extremely unlikely case that the short revision has a collision.
Thinking about this a bit more I realized that we could have these sorts of characters in any field, not just revision. I'm leaning towards doing nothing about this as it seems like a freak incident. If it becomes a problem we can do something like subclass StringField to strip out unicode data or something...
Priority: -- → P3
It's possible I copied from TBPL, but it's been a while.
Flags: needinfo?(lsblakk)
Product: mozilla.org → Release Engineering
Haven't seen this again...
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.