Closed Bug 1915352 Opened 1 month ago Closed 1 month ago

Blocked dialog displayed when pasting random text from Google Docs

Categories

(Firefox :: Data Loss Prevention, defect)

Firefox 131
Desktop
Windows
defect

Tracking

()

RESOLVED INVALID
Tracking Status
firefox131 --- affected

People

(Reporter: bhidecuti, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

Attached video blocked dialog

Found in

  • 131.0a1 (2024-08-27)

Affected versions

  • 131.0a1

Preconditions

  • Download the DLP test assets from https://drive.google.com/file/d/1yjqVRuxdKV3WnO7D2wzMgDXBuYBxUgVw/view
  • Create a distribution folder inside the Firefox folder, paste policies-1.json into it, and rename the file to policies.json
  • Run the DLP agent in CMD using: .\content_analysis_sdk_agent.exe --user --toblock=.\d{3}-?\d{2}-?\d{4}. --towarn=.warn. --delays=10

Tested platforms

  • Affected platforms: Windows 10/11
  • Unaffected platforms: Ubuntu, macOS

Steps to reproduce

  1. Go to https://drive.google.com/drive/home and sign in
  2. Open a Google Doc, type “ok text”, “sample text”, or any other random word that should not be blocked by DLP, and copy it
  3. Open a second Google Doc or navigate to a random website (e.g. Wikipedia)
  4. Paste the copied text from step 2
  5. Observe the behavior

Expected result

  • Blocked dialog is not shown after the content is scanned

Actual result

  • Blocked dialog is shown after the content is scanned (it gets pasted anyway, as it should)

Regression range

  • Potentially regressed by 1912384, but I am not 100% sure, since I cannot determine the regression range using Mozregression

Additional notes

  • See the attached video
  • Does not reproduce when pasting from Notepad or from another website

In the video I see the top of the console that is running the DLP agent says "Block" - can you copy and paste what was being analyzed here?

I'm pretty sure this is a limitation of our demo DLP agent. When you copy data from Google Docs, Google Docs stores it in a custom format that includes a large string of letters and numbers. (I suspected this was base64-encoded, but I played around with it a little and that doesn't seem to be right.) So I bet that large string is hitting the 9-digit regex that the demo DLP agent uses to decide whether to block something. I expect that real DLP agents have figured this out and can inspect this format.

For Google Docs I'd recommend getting rid of the ? characters in the --toblock string - this will make it match only strings that have the dashes in it, like "123-45-6789", and that should be enough to make it not trigger in this case.
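
To illustrate the difference, here is a minimal Python sketch. It is not the demo agent's code, and it makes a few assumptions: it uses only the digit core of the --toblock pattern (without the surrounding `.` characters from the command line), it assumes the agent looks for the pattern anywhere in the content, and `payload` is a made-up stand-in for the Google Docs custom-format string, not the real data from this bug.

```python
import re

# Digit core of the --toblock pattern from this bug: dashes are optional,
# so any run of 9 consecutive digits matches.
LOOSE = re.compile(r"\d{3}-?\d{2}-?\d{4}")

# Same pattern with the ? characters removed: dashes are required.
STRICT = re.compile(r"\d{3}-\d{2}-\d{4}")

# Hypothetical stand-in for the opaque letters-and-numbers string that
# accompanies text copied from Google Docs (not the real payload).
payload = "ok text|k3Jx0a918273645102ZqPf7"


def is_blocked(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern occurs anywhere in the text (assumed semantics)."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print("loose, payload:", is_blocked(LOOSE, payload))
    print("strict, payload:", is_blocked(STRICT, payload))
    print("strict, 123-45-6789:", is_blocked(STRICT, "123-45-6789"))
```

Under those assumptions, the loose pattern fires on the long digit run inside the payload even though the visible text is harmless, while the strict pattern only fires on dashed strings such as "123-45-6789".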

(leaving this open just to be sure that the string being analyzed above is as I described)

Flags: needinfo?(bhidecuti)

(oh, and if I'm right about this, this probably started with bug 1913760)

Attached file 1915352.txt

> In the video I see the top of the console that is running the DLP agent says "Block" - can you copy and paste what was being analyzed here?

Of course, attached you can find the analyzed content.

> For Google Docs I'd recommend getting rid of the ? characters in the --toblock string - this will make it match only strings that have the dashes in it, like "123-45-6789", and that should be enough to make it not trigger in this case.

Indeed, if I remove the ? chars from the --toblock string, the Blocked dialog no longer appears. Instead, the Warn dialog is now displayed. I have also noticed that the scan takes much longer to complete.
Please let me know if I can provide more details! Hope this helps.

Flags: needinfo?(bhidecuti)

Thanks! Yeah, that's what I would expect. I'm going to close this bug then.

Status: NEW → RESOLVED
Closed: 1 month ago
Resolution: --- → INVALID