OT: help needed to debug Perl script

Joel Rees joel.rees at gmail.com
Wed Oct 17 09:36:36 UTC 2018


Can you post just the two lines where the behavior occurs, maybe
obfuscating hash names and whatnot?

Are you sure you are not using objects with hidden, and perhaps conflicting
methods? That's a common cause of inconsistent behavior.
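
Here is a minimal sketch of the kind of check I mean. It also covers Ken's
point below about characters that are hard to see. The %all name is taken
from the quoted code, but the sample keys and the @suspect array are made up
for illustration. Drop the three checks into the real script right before
the second foreach loop and compare against the ALLCHECK count.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the real %all hash.
# The second key hides a trailing carriage return.
my %all = (
    "http://example.com/a"   => 1,
    "http://example.com/b\r" => 1,
);

# 1. A tied hash runs hidden FETCH/FIRSTKEY/NEXTKEY methods on every access,
#    so two "identical" loops over it need not see the same keys.
print "tied: ", (tied(%all) ? "yes" : "no"), "\n";

# 2. Count the keys directly, independent of any print loop.
print "keys: ", scalar(keys %all), "\n";

# 3. List keys holding anything outside printable ASCII, with escapes visible.
$Data::Dumper::Useqq  = 1;   # show control characters as \r, \t, \x{..}
$Data::Dumper::Terse  = 1;   # no "$VAR1 =" prefix
$Data::Dumper::Indent = 0;   # keep each key on one line
my @suspect = grep { /[^\x20-\x7E]/ } sort keys %all;
print "suspect: ", Dumper($_), "\n" for @suspect;
```

If the key count here matches ALLCHECK but the second loop still prints
fewer lines, the hash itself is probably fine and the place to look is how
the output is written or counted.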

On Wed, Oct 17, 2018 at 14:50, M. Fioretti <mfioretti at nexaima.net> wrote:

> On 2018-10-17 07:18, Ken D'Ambrosio wrote:
> > I'm sorry, but lacking both source data and the full script, this is
> > much akin to finding the black cat in the coal cellar at midnight that
> > isn't there.
>
> Hello Ken, and thanks for the prompt answer.
>
> I am painfully aware that without the full source data things are much
> harder than they should be, but cannot do anything about it.
>
> As far as the Perl code goes... I cannot release that either, for the
> same reason, but I partially disagree with you there. It is probably my
> fault, meaning that I may not have described the problem in the best
> way. Let me try like this:
>
> a) Consider a Perl hash with a lot of keys (25k+ in this case)
>
> b) What on Earth may make two **consecutive** and practically
> identical statements ("sort and print all the keys of this specific
> hash") print 25k+ lines the first time, and ~15% of that the second
> time?
>
> Yes, it is very likely that the last batch of input data contains weird
> characters. But why don't they create any problem, until after the first
> foreach loop? The first problem here is the *difference* in the outputs
> of those two consecutive statements. Isn't what happens before or after
> (=the full script code that I cannot share) irrelevant?
>
> Again, thanks for any comment,
>
> Marco
>
>
>
> I have two consecutive
> > to see what's going wrong, where.  Perl may not be cool, but I promise
> > you, it's not suddenly changing its mind on how to handle stuff.
> > Something is inconsistent between the datasets.  Pay special attention
> > for possible unicode intrusion, which can be tricky to detect.
> >
> > -Ken
> >
> >
> > On 2018-10-17 01:05, M. Fioretti wrote:
> >> Greetings,
> >>
> >> A few weeks ago I quickly put together a Perl script to parse big CSV
> >> files, for a project I am working on (I need to do this several times
> >> a day, always with new data). All was fine until yesterday, when the
> >> script started behaving in a consistent, but totally wrong way.
> >>
> >> The script runs with "use strict" and -w switch, but I only get a few
> >> warnings for using uninitialized values in certain statements.
> >>
> >> The relevant part of the code is this:
> >>
> >>    147       my $keycounter = 1;
> >>    148
> >>    149       foreach my $qtq (sort keys %all) {
> >>    150
> >>    151           printf "\nALLCHECK: %6.6s >> %s;\n", $keycounter, $qtq;
> >>    152           $keycounter++;
> >>    153       }
> >>    154
> >>    155        foreach my $qq (sort keys %all) {
> >>    156           $url = $qq;
> >>    157           print "\nADDINGURX: $url;\n";
> >>    158           print "\nADDINGURQ: $qq;\n";
> >>
> >> Lines 157 and 158, and lines 147 to 153, were added only for diagnostics.
> >> What happens is that, when I dump the script output to a file, i.e.:
> >>
> >> ./myscript.pl > logfile
> >>
> >> then:
> >>
> >> a) logfile contains 26k+ lines starting with "ALLCHECK" = the %all
> >> hash contains 26k+ keys
> >>
> >> b) the *same* logfile contains:
> >>
> >>    ~4700 lines starting with ADDINGURX
> >>    ZERO lines starting with ADDINGURQ
> >>
> >> in other words:
> >>
> >> the script worked perfectly for weeks. Starting yesterday, the same
> >> script says in line 151 that
> >> the hash has 26k keys, and 5 lines later, that the keys of the same
> >> hash are only 4700???
> >>
> >> I honestly have no idea of what is happening, or of why it only
> >> started happening now. The input CSV files (which I cannot share,
> >> sorry, not my data...) are different every time, so I initially
> >> thought that the last ones contained some weird character that
> >> confuses my code. But if that were the case, even the first printing
> >> statement would only print ~4700 lines.
> >>
> >> So, any help is appreciated,
> >>
> >> Thanks,
> >> Marco
>
> --
> http://mfioretti.com