Line number -1

From:
Miroslav Batchkarov
Date:
2014-10-14 @ 09:50
Hi,

I recently got an interesting warning from joblib 0.8.2 (Python 3.3)

JobLibCollisionWarning: Possible name collisions between functions 
'get_tokenized_data' (utils.py:-1) and 'get_tokenized_data' (utils.py:64)
I am confident the function is defined only once. The code that generates 
the warning looks like this:

from joblib import Memory, Parallel, delayed

def cache_single_labelled_corpus(conf):
    memory = Memory(cachedir='.', verbose=0)
    get_cached_tokenized_data = memory.cache(get_tokenized_data,
                                             ignore=['*', '**'])
    get_cached_tokenized_data(conf['data'])

def cache_all_labelled_corpora():
    Parallel(n_jobs=10)(delayed(cache_single_labelled_corpus)(conf)
                        for conf in get_confs())

Is this an issue with joblib failing to determine the line number 
correctly, or with my creating multiple Memory objects in the same 
location?

Best,
Miroslav

---
Miroslav Batchkarov
PhD Student,
Text Analysis Group,
Department of Informatics,
University of Sussex


Re: [joblib] Line number -1

From:
Olivier Grisel
Date:
2014-10-14 @ 10:09
Interesting, I have never seen such an error message.

Could you please try to write a short standalone script (without any
dependency on your code or data) that reproduces the issue?

Out of curiosity, why do you have `ignore=['*', '**']`?

-- 
Olivier

Re: [joblib] Line number -1

From:
Miroslav Batchkarov
Date:
2014-10-14 @ 11:29
<sorry if this was sent twice>
Hi Olivier,

the code is :

import joblib, sys

print(joblib.__version__)
print(sys.version_info)


def do_some_work(a, *args, **kwargs):
    print(a)
    return [a] * 10000


def create_memory(a):
    memory = joblib.Memory(cachedir='.', verbose=999)
    cached_work = memory.cache(do_some_work, ignore=['*', '**'])
    return cached_work(a)


joblib.Parallel(n_jobs=10)(joblib.delayed(create_memory)(i) for i in range(10))

The * and ** args are there because I thought they might be related, and 
they are needed in my code.

The (truncated) output is here (sorry, I can't include it in the email, as 
the message gets rejected). I am unable to reproduce the warning on my Mac 
or my Ubuntu desktop. Here is some possibly relevant information about the 
machine where this happens.

Thanks,
Miroslav


Re: [joblib] Line number -1

From:
Olivier Grisel
Date:
2014-10-14 @ 21:49
I cannot reproduce the problem on my Mac either. Which operating system
is the affected machine running? Can you reproduce the problem with the
latest version of joblib?

-- 
Olivier

Re: [joblib] Line number -1

From:
Miroslav Batchkarov
Date:
2014-10-15 @ 13:26
It's a Scientific Linux 6.5 machine, writing to a Lustre drive. I get the 
same warning with the latest joblib (0.8.3-r1). The actual cached files 
appear to be intact and can be read just fine subsequently.

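
A note on the question raised at the start of the thread, namely whether the line number was determined incorrectly. The sketch below is not joblib's code. It only illustrates one way a line number of -1 could arise, under the assumption (not confirmed anywhere in this thread) that a library reports -1 as a fallback when it cannot read a function's source file. The helpers `first_line_or_minus_one` and `load_function`, and the file name `utils_demo.py`, are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def first_line_or_minus_one(func):
    """Return (source_file, first_line); first_line is -1 if the file is unreadable.

    Hypothetical helper: the -1 fallback is an assumption for illustration,
    not a documented joblib behaviour.
    """
    source_file = None
    try:
        code = func.__code__
        source_file = code.co_filename
        # Reading the file is the step that can fail (missing file, I/O error).
        with open(source_file) as fh:
            fh.readline()
        return source_file, code.co_firstlineno
    except Exception:
        return source_file, -1


def load_function(path, name):
    """Compile the file at `path` and return the function called `name`."""
    with open(path) as fh:
        source = fh.read()
    namespace = {}
    exec(compile(source, path, "exec"), namespace)
    return namespace[name]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "utils_demo.py")
    with open(path, "w") as fh:
        # Two blank lines, so the function starts on line 3.
        fh.write("\n\ndef get_tokenized_data(x):\n    return x\n")

    func = load_function(path, "get_tokenized_data")
    readable = first_line_or_minus_one(func)    # source file can be opened

    os.remove(path)
    unreadable = first_line_or_minus_one(func)  # same function, file now gone

print(readable[1], unreadable[1])
```

In this sketch the same function yields two different line numbers depending only on whether its source file could be read at that moment. If something similar happened with several workers on a shared filesystem, it would produce a pair like `utils.py:-1` and `utils.py:64`, but that remains a guess rather than a diagnosis.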