logrotate configuration seems wrong

John Meinel john at arbash-meinel.com
Mon Sep 15 05:38:27 UTC 2014


Going further on this, as I'm discovering new oddities:

1) ll /var/log/juju
-rw------- 1 syslog adm    512000006 Sep 15 05:16 all-machines.log

It seems surprising that it is almost exactly 512,000,000 bytes long.

2) tail -f /var/log/juju/all-machines.log
nothing happening

3) I see more data coming into /var/log/juju/machine-0.log, so why isn't it
ending up in all-machines.log?

4) Given that we are using 2 different rotation mechanisms, did we test
that messages that were just written to machine-0.log still get copied to
all-machines.log when machine-0.log ends up getting rotated?

5) I'm pretty sure we didn't, because clearly we didn't test that
all-machines.log actually gets any more data once it 'fills' up.

6) Note that I do still see rsyslog running and it is using a fair amount
of CPU, so it is still getting messages from other units, etc. machine-0
agent and mongod are both quite active as well.

7) "copytruncate" seems the wrong setting for interaction with rsyslog. I
believe rsyslog is already aware that the file needs to be rotated, and
thus it shouldn't be trying to write to the same file handle (and thus we
don't need to truncate in place). I'm not 100% sure on the interactions
here, but "copytruncate" seems to have an inherent likelihood of dropping
data (while you are copying, if any data gets written then you'll miss
those last few bytes when you go to truncate, right?)
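
To make the race in 7) concrete, here is a minimal sketch in Python. It is
not how logrotate is implemented; it only reproduces the same
copy-then-truncate sequence, with a write injected between the two steps.
The function names, the ".1" backup name, and the temp directory are all
made up for illustration.

```python
import os
import shutil
import tempfile


def copytruncate_once(log_path, backup_path, write_during_gap=b""):
    """Copy log_path to backup_path, then truncate log_path in place.

    write_during_gap is appended by a simulated writer after the copy
    has finished but before the truncate happens.
    """
    shutil.copyfile(log_path, backup_path)
    if write_during_gap:
        # The writer appends through its own handle, as a daemon would.
        with open(log_path, "ab") as writer:
            writer.write(write_during_gap)
    # Truncate in place: everything currently in the file is discarded,
    # including bytes that arrived after the copy was taken.
    with open(log_path, "r+b") as f:
        f.truncate(0)


def demo():
    """Run one simulated rotation and return (backup, live, late_line)."""
    workdir = tempfile.mkdtemp()
    try:
        log_path = os.path.join(workdir, "all-machines.log")
        backup_path = log_path + ".1"  # hypothetical backup name
        with open(log_path, "wb") as f:
            f.write(b"line 1\nline 2\n")
        late = b"line 3 (written mid-rotation)\n"
        copytruncate_once(log_path, backup_path, write_during_gap=late)
        with open(backup_path, "rb") as f:
            backup = f.read()
        with open(log_path, "rb") as f:
            live = f.read()
        return backup, live, late
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    backup, live, late = demo()
    print("late line in backup:  ", late in backup)
    print("late line in live log:", late in live)
```

In this sketch the late line ends up in neither the backup nor the live
file. How often a real writer hits that window depends on how long the
copy takes, which for a file of several hundred MB is not trivial.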

I'm happy to see log rotation being added, but it seems quite half-baked in
our current trunk.

Am I just doing something wrong? Did someone actually test all of this out
and it was working for them?

John
=:->

On Mon, Sep 15, 2014 at 9:13 AM, John Meinel <john at arbash-meinel.com> wrote:

> So I was testing scaling today which generally generates huge volumes of
> logging (I actually wanted to keep it because I used it for seeing how
> everything went, but I understand why we are rotating the logs.)
>
> However, I found this as it was running:
> # ls -sh /var/log/juju
> total 909M
> 323M all-machines.log
> 4.0K ca-cert.pem
> 4.0K logrotate.conf
> 4.0K logrotate.run
> 301M machine-0-2014-09-15T05-02-27.486.log
> 301M machine-0-2014-09-15T05-06-53.779.log
>  80M machine-0.log
> 4.0K rsyslog-cert.pem
> 4.0K rsyslog-key.pem
>
> Notice that there is only 1 all-machines.log that is 300MB in size, while
> there are 2 machine-0 logs.
>
> And when I track down the various configuration files, I find
> # cat logrotate.conf
>
> /var/log/juju/all-machines.log {
>     size 512M
>     # don't move, but copy-and-truncate so the application won't have to be
>     # told that the file has moved.
>     copytruncate
>     # maximum of one old file
>     rotate 1
>     # counting old files starts at 1 rather than 0
>     start 1
>     # use compression
>     compress
> }
>
>
> I have the feeling that someone didn't realize "rotate 1" means only keep
> the original log file. As in, there are *no* backup files.
>
> Did the person who implemented this actually test it?
>
> Did we ever fix things so that "juju debug-log" doesn't become immediately
> useless once you reach the rotate threshold (that it can look in the backup
> log files)?
>
> I can understand not fixing debug-log, but I'm a bit surprised that our
> idea of "all-machines.log needs to be rotated" became "all-machines.log
> needs to be truncated".
>
> John
> =:->
>
>


More information about the Juju-dev mailing list