[MERGE][bzr-email] Implement mail sending using smtplib

Fri Jan 26 14:24:51 GMT 2007

On Thu, Jan 25, 2007 at 05:38:15PM -0600, John Arbash Meinel wrote:
> Then I implemented an SMTPConnection class, which has a little bit of
> knowledge about what configuration parameters bzrlib should use to
> define an SMTP connection (smtp_server, smtp_username, smtp_password).
> 
> That part was easy. The tricky part was to make everything support
> Unicode. And support it in a way that wouldn't confuse my SMTP server,
> and would look nice in Thunderbird.

Ah, yes.  Been there, done that
(http://mg.pov.lt/blog/unicode-emails-in-python.html).

> # Bazaar revision bundle v0.8
> #
> # message:
> #   Update the README
> # committer: John Arbash Meinel <john at arbash-meinel.com>
> # date: Thu 2007-01-25 17:04:53.723999977 -0600
> 
> === added file emailer.py // file-id:emailer.py-20070123220937-ec5y2n2oeoa0p4ue
> ... -1 // last-changed:john at arbash-meinel.com-20070125220812-yceplg1ck4qt85la
> --- /dev/null
> +++ emailer.py
...
> +class EmailSender(object):
...
> +    def body(self):
...
> +        # We must use StringIO.StringIO because we want a Unicode string
> +        from StringIO import StringIO
> +        outf = StringIO()
...
> +
> +    def get_diff(self):
...
> +
> +        # We want a cStringIO because we want an 8-bit string

StringIO.StringIO can handle both 8-bit strings and Unicode strings.
(It cannot handle mixed content, though).  Perhaps rephrase the comment
to

           # We can use cStringIO because we have an 8-bit string

?

Or is there some type-based dispatching in bzrlib that chooses Unicode
versus 8-bit from the file object?

> +        from cStringIO import StringIO
> +        diff_content = StringIO()
> +        show_diff_trees(tree_old, tree_new, diff_content)
> +        lines = diff_content.getvalue().split("\n")
> +        numlines = len(lines)

It would be more efficient to do

           numlines = diff_content.getvalue().count("\n") + 1

since you don't use ``lines`` for anything else.

> +        if numlines <= difflimit:
> +            return diff_content.getvalue()
> +        else:
> +            return ("\nDiff too large for email"
> +                    " (%d lines, the limit is %d).\n"
> +                    % (numlines, difflimit))
> +
> +    def difflimit(self):
> +        """maximum number of lines of diff to show."""

Capitalize "Maximum"?

...
> +    def _send_using_process(self):
> +        """Spawn a 'mail' subprocess to send the email."""
> +        # TODO think up a good test for this, but I think it needs
> +        # a custom binary shipped with. RBC 20051021

Why a custom binary?  You could set post_commit_mailer to the name of a
test-helper Python script that stores its arguments and stdin somewhere
on the disk.  Perhaps copy the test script into a temporary directory,
and make it write its result to os.path.join(os.path.dirname(sys.argv[0]),
'out.txt').

> +        try:
> +            process = subprocess.Popen(self._command_line(),
> +                                       stdin=subprocess.PIPE)
> +            try:
> +                message = self.body().encode('utf8') + self.get_diff()
> +                result = process.communicate(message)[0]
...

> === added file smtp_connection.py // file-id:smtp_connection.py-20070125220755-
> ... k6ueimjqwn16wvr9-1 // last-changed:john at arbash-meinel.com-20070125225439-5b
> ... trv82m7dhekt8z
> --- /dev/null
> +++ smtp_connection.py
...
> +class SMTPConnection(object):
...
> +    def _basic_message(self, from_address, to_addresses, subject):
> +        """Create the basic Message using the right Header info.
> +
> +        This creates an email Message with no payload.
> +        :param from_address: The Unicode from address.
> +        :param to_addresses: A list of Unicode destination addresses.
> +        :param subject: A Unicode subject for the email.
> +        """
> +        # It would be nice to use a single part if we only had one, but we
> +        # would have to know ahead of time how many parts we needed.
> +        # So instead, just default to multipart.
> +        msg = MIMEMultipart()
> +
> +        # Header() does a good job of doing the proper encoding. However it
> +        # confuses my SMTP server because it doesn't decode the strings. So it
> +        # is better to send the addresses as:
> +        #   =?utf-8?q?username?= <email at addr.com>
> +        # Which is how Thunderbird does it

I think encoding the <email at addr.com> part is actually a violation of RFC-2822.
The stdlib email package is quirky.

> +        from_user, from_email = self._split_address(from_address)
> +        msg['From'] = '%s <%s>' % (Header(from_user, 'utf8'), from_email)

If you use Header(some_str, 'utf8'), then it will *always* use the
=?utf-8?..?= quoting (at least it acts this way in Python 2.4).  This is
redundant for names that can be expressed in plain text.
Header(some_unicode) does the quoting only when necessary.  The type of
the string is important, the email package treats 'foo' and u'foo'
differently.

> +        msg['User-Agent'] = 'bzr/%s' % _bzrlib_version
> +
> +        to_emails = []
> +        to_header = []
> +        for addr in to_addresses:
> +            to_user, to_email = self._split_address(addr)
> +            to_emails.append(to_email)
> +            to_header.append('%s <%s>' % (Header(to_user, 'utf8'), to_email))
> +
> +        msg['To'] = ', '.join(to_header)
> +        msg['Subject'] = Header(subject)
> +        return msg, from_email, to_emails
...

> === added file tests/test_smtp_connection.py // file-id:test_smtp_connection-20
> ... 070125220755-k6ueimjqwn16wvr9-2 // last-changed:john at arbash-meinel.com-2007
> ... 0125225439-5btrv82m7dhekt8z
> --- /dev/null
> +++ tests/test_smtp_connection.py
...
> +class InstrumentedSMTPConnection(SMTPConnection):
> +    """Instrument SMTPConnection.
> +
> +    We don't want to actually connect or send messages, so this just
> +    fakes it.
> +    """
> +    

What does bzr's coding style say about trailing whitespace?

...
> +class TestSMTPConnection(TestCase):
> +
> +    def get_connection(self, text):
> +        my_config = config.GlobalConfig()
> +        config_file = StringIO(text)
> +        (my_config._get_parser(config_file))

Why the parentheses around it?

> +        return InstrumentedSMTPConnection(my_config)
...

> === modified file README
> --- README
> +++ README
...
> +Also, if using 'smtplib', the messages will be sent as a utf-8 text message,

Some pedantic people insist that UTF-8 be spelled "UTF-8", with all the
capital letters, as it is an abbreviation for UCS (or Unicode)
Transformation Format.

> +with a 8-bit text diff attached (rather than all-as-one). Work has also been
> +done to make sure usernames do not have to be ascii.
> +
...

Marius Gedminas
-- 
CBQ is merely the oldest kid on the block - yet it is by far the least useful
qdisc and also the most complex one. I advise *against* using it. This may come
as something of a shock to many who fell for the 'sendmail effect', which
learns us that any complex technology which doesn't come with documentation
must be the best available.
		-- Linux Advanced Routing HOWTO
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070126/b8b2a14b/attachment.pgp