Status

defect
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: adriank, Assigned: gandalf)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 7 obsolete attachments)

Reporter

Description

10 years ago
Because of bug 479150 I found out, that in fact, write_to_file() writes files using the default system encoding, which is for OS X Mac-Roman, for some systems ASCII and for some others UTF8.
Because of that, I'm not sure if that method makes sense at all... At least it does not guarantee proper writing ob Blobs in any way - and that is why it was written in the beginning.

Maybe hardcoding the ASCII encoding for Blobs would make more sense?

The second way to solve this could be something like this (for Blobs):
1. Try ASCII
2. If it fails: try UTF8

What do you think about this? Axel? Gandalf?
Assignee

Comment 1

10 years ago
I'm still working on the patch, but I took this case to clean up the IOClient API.

Basically, current proposal is (more or less) like this:

class IOClient(object):
    def get_blob (cls, path, source=True, blob=None):
        pass

    def get_entitylist (cls, path, source=False, code='default', parser=None):
        pass

    def get_l10nobject (cls, path, source=False, code='default', parser=None):
        pass

    def get_l10npackage (cls, path):
        pass

    def write_blob(cls, blob, path):
        pass

    def write_entitylist(cls, elist, path):
        pass
    
    def write_l10nobject(cls, l10nobject, path):
        pass

    def write_l10npackage(cls, l10npackage, path):
        pass

    def get_source(cls, path, encoding=None, fallback=None):
        """
        reads source from the path
        """
        pass

    def get_source_with_encoding(cls, path, encoding):
        """
        reads source with encoding fallback
        """
        pass

    def get_source_without_encoding(cls, path):
        """
        reads source ignoring encoding (in binary compatible mode)
        """
        pass
Status: NEW → ASSIGNED
Assignee

Comment 2

10 years ago
Posted patch ioclients rewrite POC 1 (obsolete) — Splinter Review
I'm attaching initial rewrite POC.

I limited the patch to what happens in silme.io.clients and silme.io.file to make it simpler to read.

I cleaned whole IOClient to make it more abstract (at least for now) and reorganized API. FileClient should support whole IOClient now.

I moved some methods to the private zone in order to keep the API clean.

One thing I did not touch but consider to, is follow Java and split IOClient into IOClientReadable and IOClientWriteable interfaces so that each client can support one or both. My only concern here is potential performance drop due to interfaces use in scriptable language.

Adrian, could you take a look at it? It's not finished so no need of testing, but it would be helpful for me to get your opinion on those changes.

One thing to notice for you is that in IOClient.get_l10npackage, argument ignore may be a function now which implements your long awaited filter function :)

More patches will come soon. I'll try to implement the API on other clients now to see how it matches the needs of DBClients and RCSClients.
Assignee

Updated

10 years ago
Summary: [silme] silme.io.file.write_to_file() and the encoding used to write files → [silme] clean silme.io
Assignee

Comment 3

10 years ago
Posted patch WIP1 (obsolete) — Splinter Review
this is WIP1.

* clean API
* silme.io.file clean
* silme.io.jar almost clean
Assignee

Comment 4

10 years ago
Posted patch WIP3 (obsolete) — Splinter Review
This is WIP3:
* whole silme.io.file
* whole silme.io.jar
* partial silme.io.mysql
* a lot of unittests

The last thing before commit will be silme.io.hg|svn
Attachment #364502 - Attachment is obsolete: true
Attachment #366724 - Attachment is obsolete: true
Assignee

Comment 5

10 years ago
Posted patch WIP4 (obsolete) — Splinter Review
WIP4:

* silme.io.clients
* silme.io.file
* silme.io.jar
* silme.io.mysql
* silme.io.sqlite[1]
* silme.io.svn
* silme.io.hg[2]

1) SQLite support is only partial which is not a problem at the moment
2) HG support is extremely early in the game. I just made sure that proposed API supports Mercurial.

This patch is feature ready. I'm cleaning the code and reducing redundancy by moving overlapping pieces of code to abstract classes in silme.io.clients.

Adrian, it would be super helpful if you could take a look at the approach and let me know how it fits your needs.

The ultimate goal is to be able to give path: ./path/to/package or jar:/path/to/file.jar or hg:http://hg.mozilla.org/l10n-central/pl as an argument and it should work without any change in the code. We may not be there yet, but for 0.5 I want to make sure we can achieve it.
Assignee: nobody → gandalf
Attachment #368094 - Attachment is obsolete: true
Assignee

Comment 6

10 years ago
Posted patch patch v1 (obsolete) — Splinter Review
WIP5
Attachment #368978 - Attachment is obsolete: true
Reporter

Comment 7

10 years ago
Gandalf: could you please clean up the patch of obviously obsolete files, e.g.: pl.jar, test.jpg ? It's already big enough without them...
Reporter

Comment 8

10 years ago
Gandalf: or, if you want to have them in the repo: making a separate patch with the test data would help
Assignee

Comment 9

10 years ago
Posted patch patch v2 (obsolete) — Splinter Review
patch v2. minor updates and remove ./tests part to make it cleaner for review
Attachment #369098 - Attachment is obsolete: true
Assignee

Updated

10 years ago
Attachment #369511 - Attachment is patch: true
Attachment #369511 - Attachment mime type: application/octet-stream → text/plain
Assignee

Updated

10 years ago
Blocks: 485731
Reporter

Comment 10

10 years ago
Comment on attachment 369511 [details] [diff] [review]
patch v2

diff --git a/lib/silme/io/clients.py b/lib/silme/io/clients.py
--- a/lib/silme/io/clients.py
+++ b/lib/silme/io/clients.py

+class FileFormatClient(IOClient):

+    @classmethod
+    def get_l10npackage(cls, path,
+                        code='default',
+                        object_type='l10nobject',
+                        source=None,
+                        ignore=['CVS','.svn','.DS_Store', '.hg']):
+        l10npackage = L10nPackage()
+        l10npackage.id = os.path.basename(path)
+        l10npackage.uri = path

Looks like this method is not returning anything?


@@ -44,17 +174,17 @@ class IOClient(object):
             for coding in fallback:
                 try:
                     return cls.get_source_with_encoding(path, coding)
                 except IOError, e:
                     raise
                 except UnicodeDecodeError, e:
                     continue # TODO: logging
             # if we still did not succeed, try to check if the BOM is specified
-            text = cls.read_without_encoding(path)
+            text = cls._read_without_encoding(path)

Shoudn't either IOClient or FileFormatClient have a method returning "NotImplementedError()" or something else?

Second problem: silme.io.file.FileClient._read_without_encoding(path) returns data in binary mode, so you cannot use it for testing text.


     @classmethod
     def get_source_with_encoding(cls, path, encoding):
         try:
-            text = cls.read_with_encoding(path, encoding)
+            text = cls._read_with_encoding(path, encoding)

Shoudn't either IOClient or FileFormatClient have such a method returning "NotImplementedError()"?


+    @staticmethod
+    def _get_source_policy(source):

please add a comment what this method is doing

    
+    @staticmethod
+    def _should_ignore(ignore, path, elems):

please add a comment what this method is doing (ignore what?)



diff --git a/lib/silme/io/file.py b/lib/silme/io/file.py
--- a/lib/silme/io/file.py
+++ b/lib/silme/io/file.py
-class FileClient (IOClient):
+class FileClient (FileFormatClient):

+    @classmethod
+    def write_object(cls, object, path, encoding=None):
+        if isinstance(object, L10nObject):
+            cls.write_l10nobject(object, path, encoding=encoding)
+        elif isinstance(object, EntityList):
+            cls.write_entitylist(object, path, encoding=encoding)
+        elif isinstance(object, Blob):
+            cls.write_blob(object, path)

Error handling in an "else" case?


+    @classmethod
+    def write_l10npackage(cls, l10npack, path):
+        if not os.path.isdir(path):
+            os.mkdir(path)
+        for i in l10npack.packages:
+            if not os.path.isdir(os.path.join(path, i)):
+                os.mkdir(os.path.join(path, i))
+            cls.write_l10npackage(l10npack.packages[i], os.path.join(path, i))
+        for i in l10npack.objects:
+            cls.write_object(l10npack.objects[i], path)
+        return True

I'm not sure if we should better use here mktree instead of mkdir


+    @classmethod
+    def _read_without_encoding(cls, path):
+        file = open(path, 'rb')
+        text = file.read()
+        file.close()
+        return text

you are reading here in binary mode - so probably we should rename this method to avoid confusion?


diff --git a/lib/silme/io/hg.py b/lib/silme/io/hg.py
new file mode 100644
--- /dev/null
+++ b/lib/silme/io/hg.py

+    @classmethod
+    def _read_with_encoding(cls, path, encoding):
+        client = pysvn.Client()
+        text = client.cat(path, revision=pysvn.Revision( pysvn.opt_revision_kind.head ))
+        return text

Don't know how it works, but do we really use svn for hg...?


+    @classmethod
+    def _read_without_encoding(cls, path):
+        repo = hg.repository(ui.ui(), 'http://hg.mozilla.org/webtools/mcs/')
+        print repo
+        text = commands.cat(ui, repo, 'theme/html/index.html')
+        #text = client.cat(path, revision=pysvn.Revision( pysvn.opt_revision_kind.head ))
+        return text

like above


diff --git a/lib/silme/io/jar.py b/lib/silme/io/jar.py
--- a/lib/silme/io/jar.py
+++ b/lib/silme/io/jar.py
     def matches_path(cls, path):
         """
         tests if the ioclient should be used for this type of path
+        example: jar:browser.jar!/content/branding/
         """
-        return path.endswith('.jar')
+        return path.startswith('jar:')

As talked on IRC, we should return true also in the case when a path simply ends with ".jar" (we could test such a path with os.path.isdir, probably)


+    @classmethod
+    def _explode_path(cls, path):
+        if path.startswith('jar:'):
+            protocol = path[0:4]
+            jarpath = path[4:path.find('!')]
+            internalpath = path[path.find('!')+2:]

what happens here if we get such a path:
jar:/Users/bla.jar
and what if:
jar:/Users/bla.jar!/
?
Posted patch patch3 (obsolete) — Splinter Review
(In reply to comment #10)

Thanks Adrian! I applied most of your comments in the following patch.

> Second problem: silme.io.file.FileClient._read_without_encoding(path) returns
> data in binary mode, so you cannot use it for testing text.

My tests indicate I can. Can you provide an example?
 
> I'm not sure if we should better use here mktree instead of mkdir

I assume you mean os.makedirs?
 
> you are reading here in binary mode - so probably we should rename this method
> to avoid confusion?

Same as above. I enabled binary mode, but I can read text. It's just that I ignore encoding and read it "raw".
> Don't know how it works, but do we really use svn for hg...?

I removed silme.io.hg from this patch. It's experimental code that is able to read blobs for now ;)

Sorry for stealing your time with it.

patch v3. addresses known issues.
Attachment #369511 - Attachment is obsolete: true
Attachment #370163 - Flags: review?(akalla)
Reporter

Comment 12

10 years ago
Comment on attachment 370163 [details] [diff] [review]
patch3

>+    @staticmethod
>+    def _get_source_policy(source):
>+        # returns two variables that manages in which
>+        # case source of the file is beeing attached to the object

Maybe it's just my bad English, but that sentence seems to be not correct (and: s/beeing/being/)...


>+    @classmethod
>+    def write_object(cls, object, path, encoding=None):
>+        if isinstance(object, L10nObject):
>+            cls.write_l10nobject(object, path, encoding=encoding)
>+        elif isinstance(object, EntityList):
>+            cls.write_entitylist(object, path, encoding=encoding)
>+        elif isinstance(object, Blob):
>+            cls.write_blob(object, path)
>         else:
>+            raise TypeError('Cannot write object of suh type (%s)' % object.__class__.__name__)

s/suh/such/


>+def zipfile_delete(self, file):
>+    zfile = zipfile.ZipFile('/tmp/random.zip', mode='w')
>+    for item in self.infolist():
>+        buf = self.read(item.filename)
>+        if not item.filename==file:
>+            zfile.writestr(item, buf)
>+    zfile.close()
>+    shutil.move(zfile.filename, self.filename)


'/tmp/random.zip' ?
Does this method work with Windows?
Attachment #370163 - Flags: review?(akalla) → review-
Posted patch Patch, v4Splinter Review
Great feedback! Thanks!

Attaching the patch with fixes.
Attachment #370163 - Attachment is obsolete: true
Attachment #372651 - Flags: review?(akalla)
Reporter

Comment 14

10 years ago
Comment on attachment 372651 [details] [diff] [review]
Patch, v4

the code looks good now :) Good work.

Unfortunately, I didn't have time to test it to see if it really works (e.g. with compare-locales2)...
Attachment #372651 - Flags: review?(akalla) → review+
commited!

http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/a0c4f0569abd

I'm not going to close the bug just yet, I'd like to give you a chance with compare-locales2. Do you think you'll be able to give it a try over the next few days?
timeout! :)
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.