Closed Bug 228444 Opened 21 years ago Closed 8 years ago

Checking for unquoted non regex variable interpolated into regex.

Categories

(Bugzilla :: Testing Suite, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 6.0

People

(Reporter: CodeMachine, Assigned: mail)

References

Details

Attachments

(1 file, 2 obsolete files)

Having an unquoted non regex variable interpolated into regex is a potential source of bugs. It caused the major data loss issue with bugs activity in the past. I have written a script which searches for these. It finds about 10 instances in the current codebase. At least some of these are actually interpolations of regex variables and therefore not a bug, but there is no consistent naming convention to declare this. Some vars end with "exp" and some with "pattern" and so on. I think the best option is "regex" as it is the most accurate and obvious to the reader. It's difficult to get all instances of this without a Perl parser but I think I've got most of the possibilities.
Attached file Check for unsafe interpolations. (obsolete) —
*** Bug 123198 has been marked as a duplicate of this bug. ***
Comment on attachment 137399 [details] Check for unsafe interpolations. Dave can you please take a look at this or play 'hot potato' please. I'll attach a patch fixing the failures before approval if it gets reviewed +.
Attachment #137399 - Flags: review?(justdave)
Attached file Saner version of the patch. (obsolete) —
Attachment #137399 - Attachment is obsolete: true
Attachment #167081 - Flags: review?(justdave)
Attachment #137399 - Flags: review?(justdave)
Attachment #167081 - Attachment is patch: false
Comment on attachment 167081 [details] Saner version of the patch. # The empty parentheses are to ensure all the patterns have corresponding extraction positions. This is simply not enough of a comment in order to give an insight of what's following. The comment should be more detailed, in order to include: -> the API (what the extraction position of each paranthese means) -> an explanation regarding the fact that the first part is the beginning of what we're searching, and the second part is the end of our search. Also, I don't like the fact that we have special scenarios for s and m. Even more, the test fails to detect things like: $var =~ s#a#aa#; despite detecting: if ($var =~ m#a#) ... So I don't like that / is hard-coded for the "s" option, but / can be replaced with anything else for the "m" option (that is, the difference between /() ... ()/ for 's' and (.) ... (.) for 'm'). As far as I know both are valid. Otherwise this looks good to go. If someone posts a second version, it would be worthy to post also a fix of the current trunk failures. They are 8 failures at this moment, and they look worthy to be fixed.
Attachment #167081 - Flags: review?(justdave) → review-
QA Contact: mattyt-bugzilla → default-qa
(In reply to comment #5) > This is simply not enough of a comment in order to give an insight of what's > following. The comment should be more detailed, in order to include: The extraction positions are arbitrary, all that is important is that each pattern has pre, id and post in the same $num position to simplify what's going on below, so each can be assumed to be at a specific position. This has led to a few patterns being separate that might be merged, such as =~ versus !~, to avoid extra brackets breaking this. Originally I didn't do this, but the code was longer and I felt it was better that way. If there's a way to "name" brackets, or having non-numbered brackets that'd be better but I'm not aware of a way. Alternatives that are probably even more complicated are having a number array with each expression to indicate the positions into some sort of $num array. Or counting the number of brackets in the expressions and accounting for them, but you'd have to deal with \( and such. > -> an explanation regarding the fact that the first part is the beginning of > what we're searching, and the second part is the end of our search. It's the text before/after the regex. > Also, I don't like the fact that we have special scenarios for s and m. Not sure what this means. > Even more, the test fails to detect things like: It's not intended to be complete, nor does it need to be. Feel free to improve it, otherwise there's no reason for false negatives to block checkin. > So I don't like that / is hard-coded for the "s" option, but / can be > replaced with anything else for the "m" option I probably wasn't aware of that usage of s.
Assignee: zach → testing
Fixes the following error (the only one mentioned by test in 5.22 / Ubuntu 16.04): Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/\${ <-- HERE [^}]+}/ at t/010dependencies.t line 72.
Assignee: testing → mail
Attachment #167081 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8798794 - Flags: review?(dkl)
Comment on attachment 8798794 [details] [diff] [review] bug228444-v1.patch Review of attachment 8798794 [details] [diff] [review]: ----------------------------------------------------------------- r=dkl
Attachment #8798794 - Flags: review?(dkl) → review+
To github.com:bugzilla/bugzilla.git 0677903..3c60fba master -> master
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Bugzilla 6.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: