logging functionality
From: Ralf Gommers
Date: 2011-08-19 @ 14:19
Hi Gael, all,
During the sprint next week (and after that) I'd like to work on improving
the logging functionality in joblib. Here are some questions and notes on
which I hope to get some feedback before then.
Questions & notes about logging
--------------------------------
- what should the default logging behavior be?
--> right now no logging seems to happen by default
--> if logging to file is enabled by default, where should the log be located?
--> rotating logs? if so, how many backups and what default file size?
- why are Memory, MemorizedFunc and Parallel subclasses of Logger?
--> it would be cleaner if each of them just had a `log` attribute holding a
logger instance, imho.
- the format should be customizable (i.e. add time/date, module name, etc.).
what is a good default?
--> I like something like:
    logging.basicConfig(filename='???',
                        format='%(asctime)s %(levelname)s: %(message)s',
                        datefmt='%d/%m/%Y %H:%M:%S',
                        level=logging.DEBUG)
Not sure where to keep a logfile by default, maybe ~/.joblib/ ?
- need examples for users that use joblib directly, and for other libraries
(such as scikit-learn) that use joblib.
- logging for multiprocessing processes to a single file (which is desirable)
requires a custom handler. The suggestion in the logging cookbook is to use
locking of the logfile with `multiprocessing.Lock`.
- a custom handler should probably only be added conditionally (or be easily
disabled); the Python logging docs strongly advise not to add any handlers
other than a NullHandler because adding handlers may interfere with the
needs of applications using joblib.
- need documentation of what the `verbose` keyword does. For example,
verbose>1 prints "[Memory]% 16s: Loading %s.." etc. for each result loaded
from cache. The behavior should be defined for each verbosity level, but
preferably also be configurable in a more fine-grained way.
Example: I would like detailed logging, but no printing to the console.
- there's a manual file rotating mechanism in PrintTime that should be
replaced by a RotatingFileHandler.
- it would be nice if there was a Memory.name attribute to be used for
logging. The current repr is way too long. The name should be settable at
instantiation.
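
Several of the points above (a `log` attribute instead of subclassing, only a
NullHandler by default, an opt-in RotatingFileHandler, a short name for log
records) could fit together roughly as in the sketch below. This is only an
illustration under assumptions: the class `CachedThing`, the logger name
`joblib_sketch`, the helper `enable_file_logging` and the size/backup values
are all hypothetical and do not exist in joblib.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical logger name. By default the library attaches only a
# NullHandler, so applications stay in control of where records go.
LOGGER_NAME = "joblib_sketch"
logging.getLogger(LOGGER_NAME).addHandler(logging.NullHandler())


class CachedThing:
    """Hypothetical stand-in for Memory: it holds a logger, it is not one."""

    def __init__(self, name="memory"):
        self.name = name  # short name used in log records instead of repr
        self.log = logging.getLogger(LOGGER_NAME + "." + name)

    def load(self, key):
        # Logged at DEBUG level instead of printed to the console.
        self.log.debug("Loading %s", key)
        return key


def enable_file_logging(path, max_bytes=1_000_000, backup_count=3):
    """Hypothetical opt-in helper: attach a rotating file handler.

    max_bytes and backup_count are placeholder values, not proposed defaults.
    """
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s: %(message)s",
        datefmt="%d/%m/%Y %H:%M:%S"))
    logger = logging.getLogger(LOGGER_NAME)
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return handler


if __name__ == "__main__":
    # Detailed logging to a file, nothing printed to the console.
    log_path = os.path.join(tempfile.mkdtemp(), "joblib.log")
    enable_file_logging(log_path)
    CachedThing(name="demo").load("some_result")
```

Because the file handler is added only when the user asks for it, a library
such as scikit-learn that uses joblib would see no handlers other than the
NullHandler unless it (or the end user) opts in.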
Other notes
-----------
- Memory.__init__ shouldn't accept a cachedir argument with '~' in it; right
now this creates an actual dir ~/ instead of expanding to $HOME.
- If you change code that is used by the function decorated with
@memory.cache, you have to manually clear the cache. This should probably be
mentioned prominently in the docs, because it will be a common source of
errors.
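
For the '~' issue in the first note, one possible fix is to expand the path
before using it. A minimal sketch, assuming a hypothetical helper
`normalize_cachedir` (not an existing joblib function) that would be called on
the cachedir argument:

```python
import os


def normalize_cachedir(cachedir):
    """Hypothetical helper: expand a leading '~' and return an absolute path."""
    return os.path.abspath(os.path.expanduser(cachedir))


if __name__ == "__main__":
    # Prints a path under the user's home directory rather than a literal '~'.
    print(normalize_cachedir("~/joblib_cache"))
```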
Cheers,
Ralf