Re: [sidekiq] Batch Redis Inserts
From: Jake Mack
Date: 2012-08-23 @ 18:56

Another issue I noticed is that each job doesn't get a unique jid
(because the PR was created before jids were implemented). That's
relatively easy to fix, at least.

The middleware API issue is bigger, though. I haven't been able to come
up with a clean way to resolve it yet; I'm not sure if you have any
ideas there. Batch inserts and a one-job-at-a-time middleware API seem
to be inherently incompatible.

One solution I thought up involves changing the middleware API to always
receive an array of jobs. Most of the time, the array would have just
one job in it, but for batches it could receive the full list. This
seems like it would complicate writing the middleware a bit (and I
understand changing the middleware API signature is probably not
desirable either).
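To make the cost of that first option concrete, here is a rough sketch of a chain whose middleware always receive an array of jobs. Everything in it (BatchChain, StampEnqueuedAt, the call signature) is hypothetical and is not Sidekiq's actual middleware API; it only illustrates how even a trivial middleware would have to loop over the jobs itself.

```ruby
# Hypothetical sketch only: none of these classes are part of Sidekiq.
# A chain whose middleware always receive an array of job hashes.
class BatchChain
  def initialize
    @entries = []
  end

  def add(middleware)
    @entries << middleware
    self
  end

  # Runs each middleware in order around the final block, handing every
  # one of them the whole array of jobs rather than a single job.
  def invoke(jobs, queue, &final)
    remaining = @entries.dup
    step = lambda do
      if remaining.empty?
        final.call(jobs, queue)
      else
        remaining.shift.call(jobs, queue, &step)
      end
    end
    step.call
  end
end

# Even a trivial middleware now has to iterate over the jobs itself.
class StampEnqueuedAt
  def call(jobs, _queue)
    now = Time.now.to_f
    jobs.each { |job| job['enqueued_at'] ||= now }
    yield
  end
end

chain = BatchChain.new.add(StampEnqueuedAt.new)
pushed = []

# A single push is just an array with one job in it.
chain.invoke([{ 'class' => 'HardWorker', 'args' => [1] }], 'default') do |jobs, _queue|
  pushed.concat(jobs)
end

# A batch passes the full list through the same chain in one call.
batch = [[2], [3]].map { |args| { 'class' => 'HardWorker', 'args' => args } }
chain.invoke(batch, 'default') do |jobs, _queue|
  pushed.concat(jobs)
end
```

Every existing middleware would need the same kind of rewrite, which is the main reason this option looks unattractive.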
The other solution that came to mind was having the batch insert skirt
around the middleware. It would be similar to Rails' update_all method,
which skips all of the ActiveRecord callbacks and validations and goes
straight to the DB to do a batch update. It would skip the client side
middleware and simply push all jobs in one large chunk. The server side
middleware would be unaffected. This slight change in semantics could be
made clear in the documentation.
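A minimal sketch of that second option, under several assumptions: push_batch is a hypothetical helper (not an existing Sidekiq method), the "queues" set and "queue:<name>" list keys are my guess at the key layout, the jid is assumed to be a 12-byte hex string, and the chunk size is an arbitrary number I picked. The redis argument is anything that responds to sadd and lpush the way a redis-rb client does; the RecordingRedis stand-in is only there so the sketch runs without a server.

```ruby
require 'json'
require 'securerandom'

# Hypothetical helper, not part of Sidekiq. Builds the payloads directly,
# bypassing client-side middleware, and pushes them in a few large LPUSH
# calls instead of one round trip per job.
def push_batch(redis, queue, worker_class, args_list, chunk_size = 10_000)
  redis.sadd('queues', queue) # assumed key for the set of known queues
  args_list.each_slice(chunk_size) do |chunk|
    payloads = chunk.map do |args|
      JSON.generate(
        'class' => worker_class.to_s,
        'args'  => args,
        'jid'   => SecureRandom.hex(12) # each job gets its own jid
      )
    end
    redis.lpush("queue:#{queue}", payloads) # one round trip per chunk
  end
  args_list.size
end

# Stand-in that records calls so the sketch can run without a Redis server.
class RecordingRedis
  attr_reader :calls

  def initialize
    @calls = []
  end

  def sadd(key, member)
    @calls << [:sadd, key, member]
  end

  def lpush(key, values)
    @calls << [:lpush, key, values]
  end
end

redis = RecordingRedis.new
count = push_batch(redis, 'default', 'HardWorker', [[1], [2], [3]], 2)
```

Chunking with each_slice keeps any single request bounded in case there is a practical limit on request size. Server-side middleware is untouched because the payloads look like ordinary jobs once they are in the queue.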
Thoughts?

Mike Perham wrote:
>
> I took another look at that PR and saw one major issue: it sends all N
> jobs through the middleware pipeline at once. The middleware API is
> one distinct job at a time.
>
> On Tue, Aug 21, 2012 at 1:55 PM, Jake Mack <jakemack@gmail.com> wrote:
>>
>> I wanted to bring this up again because I think it's a very useful
>> addition, especially considering the amount of work sidekiq allows to be
>> processed because of its speed. There has already been a pull request
>> for it that I think could be revived:
>>
>> https://github.com/mperham/sidekiq/pull/264
>>
>> Basically, I'd like to be able to call a function to insert our ~800k
>> jobs into sidekiq in one shot (or a few, if there is some sort of max
>> request size in redis that I don't see documented in the redis command)
>> rather than having 800k round trip requests to the redis instance.
>> Splitting up the insertions into multiple workers inserting chunks of
>> jobs really only papers over the issue and is neither efficient nor
>> scalable.
>>
>> Thoughts?
>>
>> Jake
Re: [sidekiq] Batch Redis Inserts
From: Mike Perham
Date: 2012-08-23 @ 19:23

On Thu, Aug 23, 2012 at 11:56 AM, Jake Mack <jakemack@gmail.com> wrote:
> Another issue I noticed is that each job doesn't get a unique jid (because
> the PR was created before jids were implemented). That's relatively easy to
> fix, at least.
>
> The middleware API issue is bigger though. I haven't been able to come up
> with a nice, elegant way to resolve that yet, not sure if you have any ideas
> there. They seem to be inherently incompatible.
>
> One solution I thought up involves changing the middleware API to always
> receive an array of jobs. Most of the time, the array would have just one
> job in it, but for batches it could receive the full list. This seems like
> it would complicate writing the middleware a bit (and I understand changing
> the middleware API signature is probably not desirable either).

Correct, the signature can't change for a feature this edge-casey.

> The other solution that came to mind was having the batch insert skirt
> around the middleware. It would be similar to Rails' update_all function
> which skips all of the ActiveRecord callbacks and validations and jumps
> right to the DB to do a batch update. It would skip the client side
> middleware and simply push all jobs in one large chunk. The server side
> middleware would be unaffected. This slight change in semantics could be
> made clear in the documentation.

Yeah, that's definitely a reasonable trade-off.

mike