Looking for raw IRC chat logs.

Etienne Papegnies etienne.papegnies at univ-avignon.fr
Tue Sep 13 12:43:58 UTC 2016


I'm a PhD student at the "Laboratoire Informatique d'Avignon" (Avignon
Computer Lab), part of the University of Avignon
(http://www.univ-avignon.fr/).

The title of my thesis is "User Models For The Automatic Supervision Of 
Social Networks", and I'm looking into automating the detection of 
abusive users online.

One of the networks I'm experimenting on contains chatrooms in which the
language is French.

I'm trying to get an English corpus of chat logs so I can test methods 
developed for the French corpus against the English one.

I would very much like to get my hands on raw logs for the #ubuntu channel.
The published logs on irclogs.ubuntu.com have apparently been stripped
of join/part status messages and bans.

Those bans would be of great value to me because they constitute a
ground truth for collecting the natural-language messages of abusive users.
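
To illustrate what I mean, here is a minimal sketch of how ban events could
be pulled out of a raw log. It assumes irssi-style mode lines such as
"12:43 -!- mode/#ubuntu [+b *!*@example.org] by some_op"; other clients log
mode changes differently, so the pattern would need adapting. The names
extract_bans, MODE_RE and PARAM_MODES are hypothetical, and the list of
modes that take an argument is a simplification.

```python
import re
from typing import Iterable, List, Tuple

# Assumed irssi-style mode line (other clients use other layouts), e.g.:
#   12:43 -!- mode/#ubuntu [+b *!*@example.org] by some_op
MODE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2})\s+-!-\s+mode/(?P<channel>#\S+)\s+"
    r"\[(?P<modes>[+-][^\s\]]+)(?P<args>[^\]]*)\]\s+by\s+(?P<op>\S+)"
)

# Simplification: channel modes assumed to always consume one argument.
PARAM_MODES = set("bovkeIqh")


def extract_bans(lines: Iterable[str], channel: str = "#ubuntu") -> List[Tuple[str, str, str]]:
    """Return (time, ban mask, operator) for every +b set on `channel`."""
    bans = []
    for line in lines:
        m = MODE_RE.match(line)
        if not m or m.group("channel") != channel:
            continue
        args = m.group("args").split()
        adding = True
        arg_index = 0
        for ch in m.group("modes"):
            if ch == "+":
                adding = True
            elif ch == "-":
                adding = False
            elif ch in PARAM_MODES:
                if arg_index >= len(args):
                    break  # malformed line: fewer arguments than modes
                mask = args[arg_index]
                arg_index += 1
                if ch == "b" and adding:
                    bans.append((m.group("time"), mask, m.group("op")))
    return bans


if __name__ == "__main__":
    sample = [
        "12:40 < alice> hello",
        "12:43 -!- mode/#ubuntu [+b *!*@example.org] by some_op",
        "12:44 -!- mode/#ubuntu [+o bob] by ChanServ",
    ]
    for ban in extract_bans(sample):
        print(ban)
```

Each extracted ban mask could then be matched against the join lines to find
the nickname it targets, which is why the join/part messages matter as well.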

If someone happens to have raw IRC logs for #ubuntu going back a few
years, I'd be very grateful for a copy (I currently have only 8 months,
containing only 511 bans).

Note: I'm not interested in private messages. If you feel pre-processing
is needed, for instance because your own messages are logged differently,
I can do it, but I don't mind if you would rather take care of it yourself.

My research activity falls under the Open Access directives, and credit 
to the source in publications will be given unless instructed otherwise.

I can provide shell access to a server for the transfer of the logs.

Best regards,

Etienne Papegnies


