logging functionality

From: Ralf Gommers
Date: 2011-08-19 @ 14:19

Hi Gael, all,

During the sprint next week (and after that) I'd like to work on improving
the logging functionality in joblib. Here are some questions and notes on
which I hope to get some feedback before then.

Questions & notes about logging
--------------------------------
- what should the default logging behavior be?
    --> right now no logging seems to happen by default
    --> if logging to file is enabled by default, where should the log be
        located?
    --> rotating logs?  if so, how many and default file size?

- why are Memory, MemorizedFunc and Parallel subclasses of Logger?
    --> it would be cleaner if each of them just had a log attribute which
        is a logger instance, imho.

- the format should be customizable (i.e. add time/date, module name, etc.).
  what is a good default?
    --> I like something like:
            logging.basicConfig(filename='???',
                            format='%(asctime)s %(levelname)s: %(message)s',
                            datefmt='%d/%m/%Y %H:%M:%S',
                            level=logging.DEBUG)
        Not sure where to keep a logfile by default, maybe ~/.joblib/ ?

- need examples for users that use joblib directly, and for other libraries
  (such as scikit-learn) that use joblib.

- logging for multiprocessing processes to a single file (which is desirable)
  requires a custom handler.  The suggestion in the logging cookbook is to use
  locking of the logfile with `multiprocessing.Lock`.

- a custom handler should probably only be added conditionally (or be easily
  disabled); the Python logging docs strongly advise not to add any handlers
  other than a NullHandler because adding handlers may interfere with the
  needs of applications using joblib.

- need documentation of what the `verbose` keyword does.  For example,
  verbose>1 prints "'[Memory]% 16s: Loading %s.." etc. for each result loaded
  from cache.  The behavior should be defined for each verbosity level, but
  preferably also configurable in a more fine-grained way.
  Example: I would like detailed logging, but no printing to the console.

- there's a manual file rotating mechanism in PrintTime that should be
  replaced by a RotatingFileHandler.

- it would be nice if there was a Memory.name attribute to be used for
  logging.  The current repr is way too long.  Name should be settable during
  instantiation.
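
To make a few of the points above concrete, here is a minimal sketch of how
they could fit together: a NullHandler as the only default handler, a `log`
attribute instead of subclassing Logger, a short settable name, and an opt-in
RotatingFileHandler that writes detailed logs without printing to the
console.  The logger name `joblib_sketch`, the helper `enable_file_logging`,
its size and backup defaults, and this stand-in `Memory` class are all
assumptions for illustration, not joblib's actual API.

```python
import logging
import logging.handlers

# Hypothetical package-level logger name; joblib itself may choose another.
LOGGER_NAME = "joblib_sketch"

# Library default: attach only a NullHandler, so nothing is written or
# printed unless the application opts in.
logging.getLogger(LOGGER_NAME).addHandler(logging.NullHandler())


class Memory(object):
    """Illustrative stand-in, not joblib's actual Memory class."""

    def __init__(self, cachedir, name="memory"):
        self.cachedir = cachedir
        self.name = name  # short name used in log records instead of the repr
        # Composition: hold a logger instead of subclassing Logger.
        self.log = logging.getLogger(LOGGER_NAME + "." + name)

    def load(self, key):
        self.log.debug("[%s] Loading %s", self.name, key)


def enable_file_logging(path, max_bytes=1000000, backup_count=3):
    """Opt-in helper (hypothetical): rotating log file, no console output."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s: %(message)s",
        datefmt="%d/%m/%Y %H:%M:%S"))
    package_logger = logging.getLogger(LOGGER_NAME)
    package_logger.addHandler(handler)
    package_logger.setLevel(logging.DEBUG)
    return handler
```

An application that wants a logfile would call `enable_file_logging` once with
a path of its choosing; a library such as scikit-learn would simply not call
it and leave handler configuration to the end user.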


Other notes
-----------
- Memory.__init__ shouldn't accept a cachedir argument with '~' in it; right
  now this creates an actual dir ~/ instead of expanding to $HOME.

- If you change code that is used by the function decorated with
  @memory.cache, you have to manually clear the cache.  This should probably
  be mentioned prominently in the docs, because it will be a common source of
  errors.
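
For the '~' issue, one possible fix is to normalize the path before it is
used.  A minimal sketch, assuming a hypothetical helper named
`normalize_cachedir` that Memory.__init__ could call (the helper is not part
of joblib):

```python
import os


def normalize_cachedir(cachedir):
    """Expand a leading '~' and return an absolute path (hypothetical helper)."""
    return os.path.abspath(os.path.expanduser(cachedir))
```

With this, a cachedir of '~/cache' would resolve to a directory under the
user's home instead of a literal '~' directory in the current working
directory.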

Cheers,
Ralf