[MERGE] An EmailMessage class for bzrlib (v3)

Adeodato Simó dato at net.com.org.es
Mon Jul 9 20:49:21 BST 2007


(Aaron: CCing you in case you want to review the changes to
test_merge_directive below. I had to change those that dealt with email.
Instead of making the expected email completely match the resulting MIME
structure, I changed it to check each header and body hunk individually.
This seemed less fragile and more encapsulated (?) to me.)

* Adeodato Simó [Mon, 09 Jul 2007 15:52:41 +0200]:

> * Martin Pool [Mon, 09 Jul 2007 00:29:59 -0400]:

> > It seems like we would still be able to send non-mime messages by 
> > constructing a new message object and sending it?  We could add parameters 
> > to the static function to control this.

> * John Arbash Meinel [Fri, 06 Jul 2007 14:36:53 -0400]:

> > It would be nice if we could have a way to send an email as a single chunk 
> > (non mime) because that is what I believe PQM needs. (So it means that both 
> > bzr-email and bzr-pqm can share the email code).
> > I don't think it is easy to do, though, because if you start with a single 
> > body, and then add an attachment, you have to start over (at least that is 
> > what I understood about e-mail, which is why bzr-email doesn't do it that 
> > way).

> So it seems that non-MIME messages are a wanted feature. :-) I think
> that shoehorning this into EmailMessage itself is not a good option,
> because you cannot inherit from MIMEMultipart anymore, and have to use a
> hidden object, proxying attribute access, etc. (If there's a better way
> to achive that, by all means please tell me.)

> However, along the lines of what Martin proposes, I think it'd be
> feasible to have the staticmethod to act different whenever attachment
> is None, creatting a plain email.Message instead of a bzrlib.EmailMessage.
> With this, it's only necessary to refactor the code in
> EmailMessage.__init__ to a separate method, and do that. How does that
> sound?

Okay, I did this and it works nicely. Prior to that, I snapped in a
commit to not force the body to be sent always as utf-8 or 8-bit,
because in that case Python's email module always encodes it to base64;
and my guess is that if the receiving end (eg. PQM) does not understand
MIME multipart messages, it won't be happy about receiving base64
either.

So this should finally cover all use cases. :-)

Partial diff:

=== modified file 'bzrlib/email_message.py'
--- bzrlib/email_message.py	2007-07-09 13:34:17 +0000
+++ bzrlib/email_message.py	2007-07-09 18:26:44 +0000
@@ -18,6 +18,7 @@
 
 from email import (
     Header,
+    Message,
     MIMEMultipart,
     MIMEText,
     Utils,
@@ -42,31 +43,46 @@
 
         All four parameters can be unicode strings or byte strings, but for the
         addresses and subject byte strings must be encoded in UTF-8. For the
-        body any byte string will be accepted, but no guessing of the encoding
-        will be done (the charset will be set to '8-bit').
+        body any byte string will be accepted; if it's not ASCII or UTF-8,
+        it'll be sent with charset=8-bit.
         """
         MIMEMultipart.MIMEMultipart.__init__(self)
-        self['From'] = self.address_to_encoded_header(from_address)
-        self['User-Agent'] = 'Bazaar (%s)' % _bzrlib_version
+        self._init_message(self, from_address, to_address, subject, body)
+
+    @staticmethod
+    def _init_message(msg, from_address, to_address, subject, body=None):
+        """Initialize a message with addresses, subject, and body.
+
+        If :param msg: is an EmailMessage instance, the body will be added with
+        the add_inline_attachment() method; if not, a Message.Message instance
+        is assumed, and the set_payload() method will be used.
+        """
+        msg['From'] = EmailMessage.address_to_encoded_header(from_address)
+        msg['User-Agent'] = 'Bazaar (%s)' % _bzrlib_version
 
         if isinstance(to_address, basestring):
             to_address = [to_address]
 
         to_addresses = []
         for addr in to_address:
-            to_addresses.append(self.address_to_encoded_header(addr))
-        self['To'] = ', '.join(to_addresses)
+            to_addresses.append(EmailMessage.address_to_encoded_header(addr))
+        msg['To'] = ', '.join(to_addresses)
 
-        self['Subject'] = Header.Header(safe_unicode(subject))
+        msg['Subject'] = Header.Header(safe_unicode(subject))
 
         if body is not None:
-            self.add_inline_attachment(body)
+            if isinstance(msg, EmailMessage):
+                msg.add_inline_attachment(body)
+            else:
+                string, encoding = EmailMessage.string_with_encoding(body)
+                msg.set_payload(string, encoding)
 
     def add_inline_attachment(self, body, filename=None, mime_subtype='plain'):
         """Add an inline attachment to the message.
 
-        :param body: A text to attach. Can be an unicode string, and it'll be
-            sent as utf-8, or a byte string, and it'll be sent as 8-bit.
+        :param body: A text to attach. Can be an unicode string or a byte
+            string, and it'll be sent as ascii, utf-8, or 8-bit, in that
+            preferred order.
         :param filename: The name for the attachment. This will give a default
             name for email programs to save the attachment.
         :param mime_subtype: MIME subtype of the attachment (eg. 'plain' for
@@ -75,11 +91,8 @@
         The attachment body will be displayed inline, so do not use this
         function to attach binary attachments.
         """
-        if isinstance(body, unicode):
-            payload = MIMEText.MIMEText(body.encode('utf-8'), mime_subtype,
-                                        'utf-8')
-        else:
-            payload = MIMEText.MIMEText(body, mime_subtype, '8-bit')
+        string, encoding = self.string_with_encoding(body)
+        payload = MIMEText.MIMEText(string, mime_subtype, encoding)
 
         if filename is not None:
             content_type = payload['Content-Type']
@@ -99,8 +112,13 @@
         See EmailMessage.__init__() and EmailMessage.add_inline_attachment()
         for an explanation of the rest of parameters.
         """
-        msg = EmailMessage(from_address, to_address, subject, body)
-        if attachment is not None:
+        if attachment is None:
+            # MIMEMultipart not needed
+            msg = Message.Message()
+            EmailMessage._init_message(msg, from_address, to_address, subject,
+                    body)
+        else:
+            msg = EmailMessage(from_address, to_address, subject, body)
             msg.add_inline_attachment(attachment, attachment_filename,
                     attachment_mime_subtype)
         SMTPConnection(config).send_email(msg)
@@ -120,3 +138,32 @@
         else:
             return Utils.formataddr((str(Header.Header(safe_unicode(user))),
                 email))
+
+    @staticmethod
+    def string_with_encoding(string):
+        """Return a str object together with an encoding.
+
+        :param string: A str or unicode object.
+        :return: A tuple (str, encoding), where encoding is one of 'ascii',
+            'utf-8', or '8-bit', in that preferred order.
+        """
+        # Python's email module base64-encodes the body whenever the charset is
+        # not explicitly set to ascii. Because of this, and because we want to
+        # avoid base64 when it's not necessary in order to be most compatible
+        # with the capabilities of the receiving side, we check with encode()
+        # and decode() whether the body is actually ascii-only.
+        if isinstance(string, unicode):
+            try:
+                return (string.encode('ascii'), 'ascii')
+            except UnicodeEncodeError:
+                return (string.encode('utf-8'), 'utf-8')
+        else:
+            try:
+                string.decode('ascii')
+                return (string, 'ascii')
+            except UnicodeDecodeError:
+                try:
+                    string.decode('utf-8')
+                    return (string, 'utf-8')
+                except UnicodeDecodeError:
+                    return (string, '8-bit')

=== modified file 'bzrlib/tests/test_email_message.py'
--- bzrlib/tests/test_email_message.py	2007-07-09 13:34:17 +0000
+++ bzrlib/tests/test_email_message.py	2007-07-09 19:17:13 +0000
@@ -49,15 +49,30 @@
 
 COMPOUND_MESSAGE = _HEAD + '''\
 --%(boundary)s
-Content-Type: text/%%s; charset="utf-8"; name="lines.txt"
 MIME-Version: 1.0
-Content-Transfer-Encoding: base64
+Content-Type: text/%%s; charset="us-ascii"; name="lines.txt"
+Content-Transfer-Encoding: 7bit
 Content-Disposition: inline
 
-YQpiCmMKZAplCg==
+a
+b
+c
+d
+e
 
 --%(boundary)s--''' % _DICT
 
+NON_MULTIPART_MESSAGE = '''\
+From: from at from.com
+User-Agent: Bazaar (0.18.0dev0)
+To: to at to.com
+Subject: subject
+MIME-Version: 1.0
+Content-Type: text/plain; charset="us-ascii"
+Content-Transfer-Encoding: 7bit
+
+body'''
+
 
 class TestEmailMessage(TestCase):
 
@@ -76,7 +91,7 @@
         msg = EmailMessage('from at from.com', 'to at to.com', 'subject',
                 'b\xc3\xb3dy')
         msg.set_boundary(BOUNDARY)
-        self.assertEqualDiff(SIMPLE_MESSAGE % '8-bit', msg.as_string())
+        self.assertEqualDiff(SIMPLE_MESSAGE % 'utf-8', msg.as_string())
 
     def test_utf8_accepted(self):
         utf8 = 'Pepe P\xc3\xa9red <pperez at ejemplo.com>'
@@ -105,23 +120,31 @@
             def get_user_option(self, option):
                 return None
 
-        def send_email(mime_subtype='plain'):
+        def verify_message_with_mime_type(mime_subtype='plain'):
             def wrapper(_self, msg):
                 msg.set_boundary(BOUNDARY)
                 self.assertEqualDiff(COMPOUND_MESSAGE % ('utf-8',
                         mime_subtype), msg.as_string())
             return wrapper
 
+        def verify_message_without_attachment(_self, msg):
+            self.assertEqualDiff(NON_MULTIPART_MESSAGE , msg.as_string())
+
         old_send_email = SMTPConnection.send_email
         try:
-            SMTPConnection.send_email = send_email()
+            SMTPConnection.send_email = verify_message_with_mime_type()
             EmailMessage.send(FakeConfig(), 'from at from.com', 'to at to.com',
                     'subject', u'b\xf3dy', u'a\nb\nc\nd\ne\n', 'lines.txt')
 
-            SMTPConnection.send_email = send_email('x-patch')
+            SMTPConnection.send_email = verify_message_with_mime_type(
+                    'x-patch')
             EmailMessage.send(FakeConfig(), 'from at from.com', 'to at to.com',
                     'subject', u'b\xf3dy', u'a\nb\nc\nd\ne\n', 'lines.txt',
                     'x-patch')
+
+            SMTPConnection.send_email = verify_message_without_attachment
+            EmailMessage.send(FakeConfig(), 'from at from.com', 'to at to.com',
+                    'subject', 'body')
         finally:
             SMTPConnection.send_email = old_send_email
 
@@ -156,3 +179,14 @@
         address = 'Pepe P\xe9rez <pperez at ejemplo.com>' # ISO-8859-1 not ok
         self.assertRaises(BzrBadParameterNotUnicode,
                 EmailMessage.address_to_encoded_header, address)
+
+    def test_string_with_encoding(self):
+        pairs = {
+                u'Pepe':        ('Pepe', 'ascii'),
+                u'P\xe9rez':    ('P\xc3\xa9rez', 'utf-8'),
+                'Perez':         ('Perez', 'ascii'), # u'Pepe' == 'Pepe'
+                'P\xc3\xa9rez': ('P\xc3\xa9rez', 'utf-8'),
+                'P\xe9rez':     ('P\xe9rez', '8-bit'),
+        }
+        for string, pair in pairs.items():
+            self.assertEqual(pair, EmailMessage.string_with_encoding(string))

=== modified file 'bzrlib/tests/blackbox/test_merge_directive.py'
--- bzrlib/tests/blackbox/test_merge_directive.py	2007-06-25 13:17:32 +0000
+++ bzrlib/tests/blackbox/test_merge_directive.py	2007-07-09 19:36:53 +0000
@@ -24,17 +24,17 @@
     )
 
 
-EMAIL1 = """To: pqm at example.com
-From: J. Random Hacker <jrandom at example.com>
-Subject: bar
-
-# Bazaar merge directive format 1
+EMAIL1 = (
+        'To: pqm at example.com',
+        'From: "J. Random Hacker" <jrandom at example.com>',
+        'Subject: bar',
+        '''# Bazaar merge directive format 1
 # revision_id: bar-id
 # target_branch: ../tree2
 # testament_sha1: .*
 # timestamp: .*
 # source_branch: .
-#"""
+#''')
 
 
 class TestMergeDirective(tests.TestCaseWithTransport):
@@ -139,7 +139,8 @@
         call = sendmail_calls[0]
         self.assertEqual(('jrandom at example.com', ['pqm at example.com']),
                 call[1:3])
-        self.assertContainsRe(call[3], EMAIL1)
+        for regex in EMAIL1:
+            self.assertContainsRe(call[3], regex)
 
     def test_pull_raw(self):
         self.prepare_merge_directive()

=== modified file 'bzrlib/tests/test_merge_directive.py'
--- bzrlib/tests/test_merge_directive.py	2007-06-19 21:22:56 +0000
+++ bzrlib/tests/test_merge_directive.py	2007-07-09 19:33:01 +0000
@@ -159,31 +159,31 @@
         self.assertIs(None, md4.patch_type)
 
 
-EMAIL1 = """To: pqm at example.com
-From: J. Random Hacker <jrandom at example.com>
-Subject: Commit of rev2a
-
-# Bazaar merge directive format 1
+EMAIL1 = (
+        'To: pqm at example.com',
+        'From: "J. Random Hacker" <jrandom at example.com>',
+        'Subject: Commit of rev2a',
+        '''# Bazaar merge directive format 1
 # revision_id: rev2a
 # target_branch: (.|\n)*
 # testament_sha1: .*
 # timestamp: 1970-01-01 00:08:56 \\+0001
 # source_branch: (.|\n)*
-"""
-
-
-EMAIL2 = """To: pqm at example.com
-From: J. Random Hacker <jrandom at example.com>
-Subject: Commit of rev2a with special message
-
-# Bazaar merge directive format 1
+''')
+
+
+EMAIL2 = (
+        'To: pqm at example.com',
+        'From: "J. Random Hacker" <jrandom at example.com>',
+        'Subject: Commit of rev2a with special message',
+        '''# Bazaar merge directive format 1
 # revision_id: rev2a
 # target_branch: (.|\n)*
 # testament_sha1: .*
 # timestamp: 1970-01-01 00:08:56 \\+0001
 # source_branch: (.|\n)*
 # message: Commit of rev2a with special message
-"""
+''')
 
 
 class TestMergeDirectiveBranch(tests.TestCaseWithTransport):
@@ -309,10 +309,12 @@
             tree_a.branch.repository, 'rev2a', 476, 60, tree_b.branch.base,
             patch_type=None, public_branch=tree_a.branch.base)
         message = md.to_email('pqm at example.com', tree_a.branch)
-        self.assertContainsRe(message.as_string(), EMAIL1)
+        for regex in EMAIL1:
+            self.assertContainsRe(message.as_string(), regex)
         md.message = 'Commit of rev2a with special message'
         message = md.to_email('pqm at example.com', tree_a.branch)
-        self.assertContainsRe(message.as_string(), EMAIL2)
+        for regex in EMAIL2:
+            self.assertContainsRe(message.as_string(), regex)
 
     def test_install_revisions_branch(self):
         tree_a, tree_b, branch_c = self.make_trees()

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
Proper treatment will cure a cold in seven days, but left to itself, a
cold will hang on for a week.
                -- Darrell Huff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bzr.email_message.diff
Type: text/x-diff
Size: 103135 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070709/65d646a8/attachment-0001.bin 


More information about the bazaar mailing list