GWC in GeoNode

From:
Chris Holmes
Date:
2010-06-22 @ 22:18
I've been thinking a bit about how we can bring GeoWebCache into 
GeoNode, to get at some of the great performance enhancements it can 
bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both 
local and remote, even when those change.   There are twists with each, 
and both revolve around stale caches.

With local caches we need a way to truncate the cache if the style 
changes.  Ideally when one is in style edit mode we don't use GWC at 
all, only when someone does a final 'save' does it start caching the 
change.

With remote caches we need a way for a user to manage the cache, to 
invalidate it when the remote server changes, either data or style. 
Ideally it would have a GeoRSS feed of changes that GWC automatically 
truncates based on.  Less ideally there's a manual way to restart the 
caching.

A rough roadmap of how we might achieve the end goal:

* Start with just caching remote layers.  So when anyone puts in a 
remote WMS it automatically gets added as a GWC layer.  Gabriel is about 
to commit a Least Recently Used cache to GWC, which will allow an admin 
to set a total max for the cache.  So we could let people add any layer, 
but the admin of GeoNode can configure it to just cache the most used 
tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this 
first step the caches may just get invalid, but the admin would have the 
ability to truncate them in the GWC admin.

* Cache local layers, coordinating with Style changes.  I think Arne may 
have coded this up, at least for the embedded GWC.  We could perhaps 
start with just doing the cache on the embedded maps, since those won't 
have people switching to 'style mode'.  Maybe that intermediary step 
isn't necessary, but when we're in the map composer view we want to be 
sure that when people are styling they're not seeing GWC tiles.  When 
they finish styling we should then truncate the existing cache and start 
over.  Another simplifying assumption we could also consider making is 
only cache on the default style.  Not sure how much that actually helps.

* Remote layer management.  This is sort of more general, I think in the 
future we should figure out some more full representation in each 
GeoNode of a remote layer.  Right now remote layers can be added, but no 
metadata can be found out about it.  This is another whole topic, but 
the implication for here is that such a page should/could have a way to 
manage the cache of the local GeoNode.  So you could truncate the cache 
there (maybe just the person who added?  Maybe you can set permissions 
of who can truncate?).  And then possibly also add a GeoRSS location to 
automatically truncate from.


The cool thing this set of things should lead to is to give a benefit to 
people adding remote servers.  They get increased speed and reliability 
if they just add it to a map on a GeoNode.  So we can come in with a 
GeoNode to an existing nice SDI implementation that already has a bunch 
of WMS services, and then people can start creating maps on top of it, 
and those maps perform even faster than the straight WMS.

Thoughts?  I think this could be a nice performance win, as almost all our 
maps are tiled.  Should obviously be complemented by other 
optimizations, like on the javascript side, but the two together should 
make things quite zippy.

Re: [geonode] GWC in GeoNode

From:
David Winslow
Date:
2010-06-23 @ 15:08
shooting from the hip with some feedback on these ideas

On 06/22/2010 06:18 PM, Chris Holmes wrote:
> I've been thinking a bit about how we can bring GeoWebCache in to
> GeoNode, to get at some of the great performance enhancements it can
> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
> local and remote, even when those change.   There are twists with each,
> and both revolve around stale caches.
>
> With local caches we need a way to truncate the cache if the style
> changes.  Ideally when one is in style edit mode we don't use GWC at
> all, only when someone does a final 'save' does it start caching the
> change.
>
> With remote caches we need a way for a user to manage the cache, to
> invalidate it when the remote server changes, either data or style.
> Ideally it would have a GeoRSS feed of changes that GWC automatically
> truncates based on.  Less ideally there's a manual way to restart the
> caching.
>
> A rough roadmap of how we might achieve the end goal:
>
> * Start with just caching remote layers.  So when anyone puts in a
> remote WMS it automatically gets added as a GWC layer.  Gabriel is about
> to commit a Least Recently Used cache to GWC, which will allow an admin
> to set a total max for the cache.  So we could let people add any layer,
> but the admin of GeoNode can configure it to just cache the most used
> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
> first step the caches may just get invalid, but the admin would have the
> ability to truncate them in the GWC admin.
>    
LRU worries me a bit; if we set the disk limit too low we may just end 
up with a lot of cache churn for little/no performance benefit.  And the 
disk requirements can grow with minimal warning, since anyone can add a 
layer. There's also an easy DOS attack - anyone can fetch 18 zoom levels 
of some layer nobody uses and trash the cache (not a huge deal, how long 
would it take an attacker to do that anyway?).  I'm not saying an LRU is 
a bad idea.  I think caching will be a great improvement.  It's just 
that there is a lot of room for refinement here (probably once we have 
better usage tracking we can use that to prioritize tilesets, for example.)
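The churn concern can be illustrated with a toy model. This is only a sketch, assuming an in-memory stand-in for the disk cache with a total byte budget; `LruTileCache` and the tile keys are hypothetical names, not GWC code:

```python
from collections import OrderedDict


class LruTileCache:
    """Toy in-memory stand-in for a disk cache with a total byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._tiles = OrderedDict()  # key -> tile bytes, least recently used first

    def get(self, key):
        tile = self._tiles.get(key)
        if tile is not None:
            self._tiles.move_to_end(key)  # mark as most recently used
        return tile

    def put(self, key, tile):
        if key in self._tiles:
            self.used_bytes -= len(self._tiles.pop(key))
        self._tiles[key] = tile
        self.used_bytes += len(tile)
        # Evict least recently used tiles until we are back under budget.
        while self.used_bytes > self.max_bytes and self._tiles:
            _, evicted = self._tiles.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = LruTileCache(max_bytes=30)
cache.put(("popular", 3, 1, 1), b"x" * 10)
cache.put(("popular", 3, 1, 2), b"x" * 10)
# A crawl over a layer nobody uses pushes the popular tiles out.
for col in range(3):
    cache.put(("unused", 18, 0, col), b"y" * 10)
popular_survived = cache.get(("popular", 3, 1, 1)) is not None
```

With a budget this small relative to the crawl, `popular_survived` ends up false: the tiles that mattered are gone, which is the churn scenario described above.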

> * Cache local layers, coordinating with Style changes.  I think Arne may
> have coded this up, at least for the embedded GWC.  We could perhaps
> start with just doing the cache on the embedded maps, since those won't
> have people switching to 'style mode'.  Maybe that intermediary step
> isn't necessary, but when we're in the map composer view we want to be
> sure that when people are styling they're not seeing GWC tiles.  When
> they finish styling we should then truncate the existing cache and start
> over.  Another simplifying assumption we could also consider making is
> only cache on the default style.  Not sure how much that actually helps.
>    
I don't think we need to avoid caching alternative styles.

I do think we need to skip the cache while editing styles.

It would be nice if we could use cached layers everywhere, and have only 
the layer being styled switch to "straight" WMS when styling is active.
> * Remote layer management.  This is sort of more general, I think in the
> future we should figure out some more full representation in each
> GeoNode of a remote layer.  Right now remote layers can be added, but no
> metadata can be found out about it.  This is another whole topic, but
> the implication for here is that such a page should/could have a way to
> manage the cache of the local GeoNode.  So you could truncate the cache
> there (maybe just the person who added?  Maybe you can set permissions
> of who can truncate?).  And then possibly also add a GeoRSS location to
> automatically truncate from.
>    
Yeah, it would be awesome if adding a WMS to the composer application 
got that service added to the GeoNode's GeoNetwork index, complete with 
metadata pages in the Django web app.  And GeoNode can periodically scan 
the capabilities for added/removed layers, updated descriptions, new 
styles.  These would be reflected in GeoNetwork and GeoWebCache as well 
as the Django database.
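The periodic scan boils down to a diff between what GeoNode has recorded and what the remote WMS currently advertises. A minimal sketch, assuming each layer is summarized as a (title, styles) tuple keyed by layer name; `diff_layers` and that record shape are hypothetical, not GeoNode code:

```python
def diff_layers(known, advertised):
    """Compare recorded layers with the ones a capabilities scan returned.

    Both arguments map layer name -> (title, tuple_of_style_names).
    """
    known_names = set(known)
    advertised_names = set(advertised)
    return {
        "added": sorted(advertised_names - known_names),
        "removed": sorted(known_names - advertised_names),
        # Same name on both sides, but description or styles differ.
        "changed": sorted(
            name
            for name in known_names & advertised_names
            if known[name] != advertised[name]
        ),
    }


known = {
    "roads": ("Roads", ("default",)),
    "parcels": ("Parcels", ("default",)),
}
advertised = {
    "roads": ("Roads", ("default", "night")),  # new style
    "landcover": ("Land cover", ("default",)),  # new layer
}
changes = diff_layers(known, advertised)
```

Each bucket would then drive the matching update: register added layers, drop removed ones, and truncate the cache for changed ones.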

It might be nice to also provide a listing of indexed services so users 
can track down the originating WMS services if they want.
> The cool thing this set of things should lead to is to give a benefit to
> people adding remote servers.  They get increased speed and reliability
> if they just add it to a map on a geoNode.  So we can come in with a
> GeoNode to an existing nice SDI implementation that already has a bunch
> of WMS services, and then people can start creating maps on top of it,
> and those maps perform even faster than the straight WMS.
>
> Thoughts?  I think this could be a nice performance win, as most all our
> maps are tiled.  Should obviously be complemented by other
> optimizations, like on the javascript side, but the two together should
> make things quite zippy.
>    
Having the WMS capabilities handled on the server side (and cached 
there) would probably be a nice win for loading services.  We could do 
away with reading capabilities entirely until the user pulls up the add 
layers dialog (which is not available in the embedded viewers).

We don't do GFI requests now but it might be worth thinking about how 
they interact with the cache.  I also don't see this map caching doing 
much for offline/distributed data management, which seems like caching 
of another sort.  It would be good to work out some answers related to that.

--
David Winslow
OpenGeo - http://opengeo.org/

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-06-23 @ 18:41
This is gonna be awesome. Some comments inline.

On 6/23/10 12:08 PM, David Winslow wrote:
> shooting from the hip with some feedback on these ideas
>
> On 06/22/2010 06:18 PM, Chris Holmes wrote:
>> I've been thinking a bit about how we can bring GeoWebCache in to
>> GeoNode, to get at some of the great performance enhancements it can
>> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
>> local and remote, even when those change.   There are twists with each,
>> and both revolve around stale caches.
>>
>> With local caches we need a way to truncate the cache if the style
>> changes.  Ideally when one is in style edit mode we don't use GWC at
>> all, only when someone does a final 'save' does it start caching the
>> change.
>>
>> With remote caches we need a way for a user to manage the cache, to
>> invalidate it when the remote server changes, either data or style.
>> Ideally it would have a GeoRSS feed of changes that GWC automatically
>> truncates based on.  Less ideally there's a manual way to restart the
>> caching.
>>
>> A rough roadmap of how we might achieve the end goal:
>>
>> * Start with just caching remote layers.  So when anyone puts in a
>> remote WMS it automatically gets added as a GWC layer.
The GWC REST API is definitely clunky at the moment and would benefit 
greatly if we do this.

>>  Gabriel is about
>> to commit a Least Recently Used cache to GWC, which will allow an admin
>> to set a total max for the cache.
Right now the diskquota is opt-in, meaning there's no global cache size 
cap; you need to set the limit on a layer-by-layer basis. 
I think it would be easy to add a global limit so that any layer not 
explicitly configured gets evenly capped to stay within the global limit. 
How does that sound?
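One way to read "evenly capped" is sketched below, assuming explicitly configured layers keep their own limit and every other layer gets an equal share of what is left. The function name and the sample layer names are hypothetical:

```python
def effective_quotas(global_limit, explicit, layers):
    """Return a byte quota for every layer.

    explicit: layer name -> quota set by the admin.
    layers: all layer names known to the cache.
    """
    unconfigured = [name for name in layers if name not in explicit]
    # Whatever the explicit quotas leave over is split evenly.
    remaining = max(global_limit - sum(explicit.values()), 0)
    share = remaining // len(unconfigured) if unconfigured else 0
    quotas = {name: explicit[name] for name in layers if name in explicit}
    quotas.update({name: share for name in unconfigured})
    return quotas


GIB = 1024 ** 3
quotas = effective_quotas(
    global_limit=50 * GIB,
    explicit={"roads": 10 * GIB},
    layers=["roads", "parcels", "remote:landcover"],
)
```

Here the two unconfigured layers each end up with 20 GiB, regardless of how often either is actually requested.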

>>   So we could let people add any layer,
>> but the admin of GeoNode can configure it to just cache the most used
>> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
>> first step the caches may just get invalid, but the admin would have the
>> ability to truncate them in the GWC admin.
>>
> LRU worries me a bit; if we set the disk limit too low we may just end
> up with a lot of cache churn for little/no performance benefit.
In my mind, GeoWebcache is "incomplete" as a product until we add the 
following enhancements:
  - configuration option to cache layers only up to a certain zoom 
level, and from that level on, defer to pure proxy mode
  - diskquota, which is kind of in beta testing now
  - Identify and avoid seeding empty tiles. This can be easily done with 
the JAI Extrema operation (or even Histogram) or the user might 
configure a no-data color for the layer?
  - Definition of an area of interest, so that a geometry defines the 
allowed seeding area for a layer
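Two of the items above (the zoom-level cap and the area of interest) amount to a per-tile decision. A rough sketch, assuming the area of interest is simplified to a bounding box rather than an arbitrary geometry; `LayerCachePolicy` and `should_cache` are hypothetical names, not GWC configuration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerCachePolicy:
    max_cached_zoom: int     # deeper zoom levels are proxied, not stored
    area_of_interest: tuple  # (minx, miny, maxx, maxy), a bbox simplification


def bboxes_intersect(a, b):
    """True when two (minx, miny, maxx, maxy) boxes overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def should_cache(policy, zoom, tile_bbox):
    """Decide whether a tile is stored or served in pure proxy mode."""
    if zoom > policy.max_cached_zoom:
        return False
    return bboxes_intersect(tile_bbox, policy.area_of_interest)


policy = LayerCachePolicy(max_cached_zoom=12, area_of_interest=(-10, 35, 5, 45))
inside = should_cache(policy, 8, (-5.0, 40.0, -4.0, 41.0))
too_deep = should_cache(policy, 15, (-5.0, 40.0, -4.99, 40.01))
outside = should_cache(policy, 8, (100.0, 0.0, 101.0, 1.0))
```

Only the first request would be stored; the other two would pass straight through to the backing WMS.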

>   And the
> disk requirements can grow with minimal warning, since anyone can add a
> layer. There's also an easy DOS attack - anyone can fetch 18 zoom levels
> of some layer nobody uses and trash the cache (not a huge deal, how long
> would it take an attacker to do that anyway?).
That would put the LRU diskquota enforcement job to work and hence wipe 
out those tiles that are least used. This plus the ability to set a 
limit on the number of zoom levels to actually cache would bring us 
closer to the safe zone?

>   I'm not saying an LRU is
> a bad idea.  I think caching will be a great improvement.  It's just
> that there is a lot of room for refinement here (probably once we have
> better usage tracking we can use that to prioritize tilesets, for example.)
Wouldn't the LRU stats be enough for that? Note we also have an LFU 
(Least Frequently Used) expiration policy for diskquota enforcement, 
which looks closer to the kind of usage tracking you mention?
>
>> * Cache local layers, coordinating with Style changes.  I think Arne may
>> have coded this up, at least for the embedded GWC.
Yes. The problem with the embedded GWC is that it completely wipes out 
the entire layer cache upon _any_ modification, including WFS 
transactions, resulting in too heavily truncated caches. You 
add/remove/edit a single feature, and the whole layer cache is discarded.
There's room to improve that based on bounding box/bounding polygon with 
some stuff created for the GeoRSS module though.
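To give a sense of how much a bounding-box truncation would save, the sketch below lists the tiles touched by an edit at one zoom level. It assumes a global geodetic grid (EPSG:4326, two tiles wide and one high at zoom 0, origin at the lower left); `tiles_touching` is a hypothetical helper, not the GeoRSS module's code:

```python
import math


def tiles_touching(bbox, zoom):
    """Return (column, row) pairs of tiles that overlap bbox at this zoom."""
    minx, miny, maxx, maxy = bbox
    size = 180.0 / (2 ** zoom)        # tile width and height in degrees
    cols, rows = 2 ** (zoom + 1), 2 ** zoom
    x0 = max(int(math.floor((minx + 180.0) / size)), 0)
    y0 = max(int(math.floor((miny + 90.0) / size)), 0)
    x1 = min(int(math.ceil((maxx + 180.0) / size)) - 1, cols - 1)
    y1 = min(int(math.ceil((maxy + 90.0) / size)) - 1, rows - 1)
    # A degenerate bbox (a single point) still touches one tile.
    x1, y1 = max(x1, x0), max(y1, y0)
    return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]


# A small edit around (10, 10) at zoom 10.
stale = tiles_touching((10.0, 10.0, 10.1, 10.1), zoom=10)
total_at_zoom = 2 ** 11 * 2 ** 10
```

For that edit only a handful of the roughly two million tiles at zoom 10 would need to be discarded, instead of the whole layer.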
>>   We could perhaps
>> start with just doing the cache on the embedded maps, since those won't
>> have people switching to 'style mode'.  Maybe that intermediary step
>> isn't necessary, but when we're in the map composer view we want to be
>> sure that when people are styling they're not seeing GWC tiles.
Related: I've been wondering for some time now whether it wouldn't make 
sense to also integrate the service endpoints for WMS and GWC, as 
in GWC being a front barrier for /geoserver/wms instead of having to 
explicitly go through /geoserver/gwc?service=WMS...

Back to topic: couldn't the styles just use a CGI flag to indicate when 
to ignore the cache and go straight to the WMS? AFAIK tiled=false would 
do the trick.
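On the client side the flag idea could look roughly like this. It is a sketch only: whether the server actually honors `tiled=false` as a cache bypass is the assumption made in this thread, and `getmap_url`, the base URL and the layer name are hypothetical:

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, style="", styling_active=False):
    """Build a WMS GetMap URL, skipping the tile cache while styling."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": style,
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": 256,
        "height": 256,
        "format": "image/png",
        # Assumed switch: only request cached tiles when nobody is styling.
        "tiled": "false" if styling_active else "true",
    }
    return base_url + "?" + urlencode(params)


base = "http://localhost:8080/geoserver/wms"  # placeholder endpoint
cached = getmap_url(base, "topp:states", (-180, -90, 0, 90))
live = getmap_url(base, "topp:states", (-180, -90, 0, 90), styling_active=True)
```

Only the layer being styled would need `styling_active=True`; every other layer in the map could keep using cached tiles.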

>>   When
>> they finish styling we should then truncate the existing cache and start
>> over.  Another simplifying assumption we could also consider making is
>> only cache on the default style.  Not sure how much that actually helps.
>>
> I don't think we need to avoid caching alternative styles.
I think right now GWC only seeds on the default style, and lazily caches 
non-default styles. Are we talking about preseeding here or just lazy 
caching?

>
> I do think we need to skip the cache while editing styles.
>
> It would be nice if we could use cached layers everywhere, and have only
> the layer being styled switch to "straight" WMS when styling is active.
>> * Remote layer management.  This is sort of more general, I think in the
>> future we should figure out some more full representation in each
>> GeoNode of a remote layer.  Right now remote layers can be added, but no
>> metadata can be found out about it.  This is another whole topic, but
>> the implication for here is that such a page should/could have a way to
>> manage the cache of the local GeoNode.  So you could truncate the cache
>> there (maybe just the person who added?  Maybe you can set permissions
>> of who can truncate?).  And then possibly also add a GeoRSS location to
>> automatically truncate from.
>>
> Yeah, it would be awesome if adding a WMS to the composer application
> got that service added to the GeoNode's GeoNetwork index, complete with
> metadata pages in the Django web app.  And GeoNode can periodically scan
> the capabilities for added/removed layers, updated descriptions, new
> styles.  These would be reflected in GeoNetwork and GeoWebCache as well
> as the Django database.
>
> It might be nice to also provide a listing of indexed services so users
> can track down the originating WMS services if they want.
>> The cool thing this set of things should lead to is to give a benefit to
>> people adding remote servers.  They get increased speed and reliability
>> if they just add it to a map on a geoNode.  So we can come in with a
>> GeoNode to an existing nice SDI implementation that already has a bunch
>> of WMS services, and then people can start creating maps on top of it,
>> and those maps perform even faster than the straight WMS.
>>
>> Thoughts?  I think this could be a nice performance win, as most all our
>> maps are tiled.  Should obviously be complemented by other
>> optimizations, like on the javascript side, but the two together should
>> make things quite zippy.
>>
> Having the WMS capabilities handled on the server side (and cached
> there) would probably be a nice win for loading services.  We could do
> away with reading capabilities entirely until the user pulls up the add
> layers dialog (which is not available in the embedded viewers).
>
> We don't do GFI requests now but it might be worth thinking about how
> they interact with the cache.  I also don't see this map caching doing
> much for offline/distributed data management, which seems like caching
> of another sort.  It would be good to work out some answers related to that.
I don't get it. Could you elaborate?

Cheers,
Gabriel
>
> --
> David Winslow
> OpenGeo - http://opengeo.org/


-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Re: [geonode] GWC in GeoNode

From:
Chris Holmes
Date:
2010-06-30 @ 13:14
>>>   Gabriel is about
>>> to commit a Least Recently Used cache to GWC, which will allow an admin
>>> to set a total max for the cache.
> Right now the diskquota is an opt-in process meaning there's no global
> cache size cap, but you need to set the limit on a layer by layer basis.
> I think it would be easy to add a global limit so any non explicitly
> configured layer gets evenly capped to cope up with the global limit.
> How does that sound?
>

That's an ok measure, if the truly global limit is hard.  But I think 
it's a lot more likely that we'll have a few layers that get a ton of 
access, and a number that just get viewed a couple times.  It'd be much 
better if the few that get accessed a ton have bigger caches at the 
expense of those that get rarely accessed.

But if that's hard then go ahead and build it the easy way, and we can 
see the actual nature of use.

>     So we could let people add any layer,
>>> but the admin of GeoNode can configure it to just cache the most used
>>> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
>>> first step the caches may just get invalid, but the admin would have the
>>> ability to truncate them in the GWC admin.
>>>
>> LRU worries me a bit; if we set the disk limit too low we may just end
>> up with a lot of cache churn for little/no performance benefit.

I think we should just default it to be quite high.  And recommend 
people turn it off if they really don't have much disk space.  I just 
think we should put some limit (50 gigs?) so everyone running geonodes 
doesn't always run out of space.  I can see an argument that we should 
just wait till that happens, but I'm pretty sure if we start caching 
every remote and local layer we'll get there pretty fast.

> In my mind, GeoWebcache is "incomplete" as a product until we add the
> following enhancements:
>    - configuration option to cache layers only up to a certain zoom
> level, and from that level on, defer to pure proxy mode
>    - diskquota, which is kind of in beta testing now
>    - Identify and avoid seeding empty tiles. This can be easily done with
> the JAI Extrema operation (or even Histogram) or the user might
> configure a no-data color for the layer?
>    - Definition of an area of interest, so that a geometry defines the
> allowed seeding area for a layer
>

I agree on all this, though we should definitely integrate before we're 
complete.  Indeed for fully complete I'd add gui configuration, ideally 
through the GeoNode.


>>> * Cache local layers, coordinating with Style changes.  I think Arne may
>>> have coded this up, at least for the embedded GWC.
> Yes. The problem with the embedded GWC is that is completely wipes out
> the entire layer cache upon _any_ modification, including WFS
> transactions, resulting too heavily truncated caches. You
> add/remote/edit a single feature, the whole layer cache is discarded.
> There's room to improve that based on bounding box/bounding polygon with
> some stuff created for the GeoRSS module though.

+1.  We're not doing any GeoNode editing now though, so it's fairly moot.

>     We could perhaps
>>> start with just doing the cache on the embedded maps, since those won't
>>> have people switching to 'style mode'.  Maybe that intermediary step
>>> isn't necessary, but when we're in the map composer view we want to be
>>> sure that when people are styling they're not seeing GWC tiles.
> Related: I've been wondering since some time now if it wouldn't make
> sense to also integrate the WMS service endpoints for WMS and GWC, like
> in GWC being a front barrier for /geosever/wms instead of having to
> explicitly go through /geoserver/gwc?service=WMS...
>

At some point yes.  Ideally for me a WMS request to GS would 
transparently check if it was a tile that GWC is configured to 
understand, cache it if it's not there, retrieve it if it is.
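The transparent behavior described here is essentially lazy, get-or-render caching in front of the renderer. A toy sketch, assuming an in-memory store and a `render` callable standing in for the WMS; all names are hypothetical:

```python
class TransparentTileCache:
    """Front for a renderer: cache configured layers, pass others through."""

    def __init__(self, render, cached_layers):
        self.render = render
        self.cached_layers = set(cached_layers)
        self.store = {}

    def get_tile(self, layer, zoom, x, y):
        if layer not in self.cached_layers:
            # Not a layer the cache is configured for: plain WMS behavior.
            return self.render(layer, zoom, x, y)
        key = (layer, zoom, x, y)
        if key not in self.store:
            # Lazy caching: render on the first request only.
            self.store[key] = self.render(layer, zoom, x, y)
        return self.store[key]


render_calls = []


def fake_render(layer, zoom, x, y):
    render_calls.append((layer, zoom, x, y))
    return f"{layer}/{zoom}/{x}/{y}".encode()


front = TransparentTileCache(fake_render, cached_layers=["roads"])
front.get_tile("roads", 3, 1, 2)
front.get_tile("roads", 3, 1, 2)    # served from the store
front.get_tile("scratch", 3, 1, 2)  # rendered every time
front.get_tile("scratch", 3, 1, 2)
```

The renderer is called once for the cached layer and twice for the uncached one, so clients never need to know which path they hit.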

> Back to topic: couldn't the styles just use a CGI flag to indicate when
> to ignore the cache and go straight to the WMS? AFAIK tiled=false would
> make the trick.
>

Yeah, I was thinking something along those lines.

>     When
>>> they finish styling we should then truncate the existing cache and start
>>> over.  Another simplifying assumption we could also consider making is
>>> only cache on the default style.  Not sure how much that actually helps.
>>>
>> I don't think we need to avoid caching alternative styles.
> I think right now GWC only seeds on the default style, and lazily caches
> non default styles. Are we talking about preseeding here or just lazy
> cacheing?
>

Lazy caching - all lazy caching, even for default style.

>>
>> I do think we need to skip the cache while editing styles.
>>
>> It would be nice if we could use cached layers everywhere, and have only
>> the layer being styled switch to "straight" WMS when styling is active.
>>> * Remote layer management.  This is sort of more general, I think in the
>>> future we should figure out some more full representation in each
>>> GeoNode of a remote layer.  Right now remote layers can be added, but no
>>> metadata can be found out about it.  This is another whole topic, but
>>> the implication for here is that such a page should/could have a way to
>>> manage the cache of the local GeoNode.  So you could truncate the cache
>>> there (maybe just the person who added?  Maybe you can set permissions
>>> of who can truncate?).  And then possibly also add a GeoRSS location to
>>> automatically truncate from.
>>>
>> Yeah, it would be awesome if adding a WMS to the composer application
>> got that service added to the GeoNode's GeoNetwork index, complete with
>> metadata pages in the Django web app.  And GeoNode can periodically scan
>> the capabilities for added/removed layers, updated descriptions, new
>> styles.  These would be reflected in GeoNetwork and GeoWebCache as well
>> as the Django database.
>>

+1  Ideally it would also follow the WMS's links to metadata documents 
if they are there, and use that to populate.  But that's a pretty minor 
use case I think, at least to start.

>> It might be nice to also provide a listing of indexed services so users
>> can track down the originating WMS services if they want.
>>> The cool thing this set of things should lead to is to give a benefit to
>>> people adding remote servers.  They get increased speed and reliability
>>> if they just add it to a map on a geoNode.  So we can come in with a
>>> GeoNode to an existing nice SDI implementation that already has a bunch
>>> of WMS services, and then people can start creating maps on top of it,
>>> and those maps perform even faster than the straight WMS.
>>>
>>> Thoughts?  I think this could be a nice performance win, as most all our
>>> maps are tiled.  Should obviously be complemented by other
>>> optimizations, like on the javascript side, but the two together should
>>> make things quite zippy.
>>>
>> Having the WMS capabilities handled on the server side (and cached
>> there) would probably be a nice win for loading services.  We could do
>> away with reading capabilities entirely until the user pulls up the add
>> layers dialog (which is not available in the embedded viewers).
>>

Makes sense.

C

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-06-30 @ 13:34
On 6/30/10 10:14 AM, Chris Holmes wrote:
>
>>>>    Gabriel is about
>>>> to commit a Least Recently Used cache to GWC, which will allow an admin
>>>> to set a total max for the cache.
>> Right now the diskquota is an opt-in process meaning there's no global
>> cache size cap, but you need to set the limit on a layer by layer basis.
>> I think it would be easy to add a global limit so any non explicitly
>> configured layer gets evenly capped to cope up with the global limit.
>> How does that sound?
>>
>
> That's an ok measure, if the truly global limit is hard.  But I think
> it's a lot more likely that we'll have a few layers that get a ton of
> access, and a number that just get viewed a couple times.  It'd be much
> better if the few that get accessed a ton have bigger caches at the
> expense of those that get rarely accessed.
>
> But if that's hard then go ahead and build it the easy way, and we can
> see the actual nature of use.

Giving priority to the most used layers makes a lot of sense.
What we could do is configure the global limit to use the LFU (Least 
Frequently Used) expiration policy instead of LRU, and aggregate the 
stats from all layers when wiping out. I think that'd do the trick.
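A rough sketch of what aggregating the stats across layers could mean: all tiles compete in one pool and the least frequently requested ones go first, whichever layer they belong to. `LfuGlobalQuota` is a hypothetical toy, not the diskquota module:

```python
from collections import Counter


class LfuGlobalQuota:
    """One byte budget shared by all layers, enforced by request count."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.sizes = {}        # (layer, z, x, y) -> bytes on disk
        self.hits = Counter()  # same key -> number of requests

    def record_hit(self, key):
        self.hits[key] += 1

    def store(self, key, size):
        self.sizes[key] = size
        self.hits[key] += 1
        return self.enforce()

    def enforce(self):
        """Evict least frequently used tiles until under budget."""
        evicted = []
        total = sum(self.sizes.values())
        for key in sorted(self.sizes, key=lambda k: self.hits[k]):
            if total <= self.max_bytes:
                break
            total -= self.sizes.pop(key)
            del self.hits[key]
            evicted.append(key)
        return evicted


quota = LfuGlobalQuota(max_bytes=30)
quota.store(("popular", 3, 1, 1), 10)
for _ in range(5):
    quota.record_hit(("popular", 3, 1, 1))
quota.store(("rare", 0, 0, 0), 10)
quota.store(("rare", 0, 0, 1), 10)
evicted = quota.store(("rare", 0, 0, 2), 10)
```

The heavily used tile keeps its place and a rarely used tile from another layer is evicted, so popular layers effectively get bigger caches at the expense of unpopular ones.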

Agreed on all the other comments too.

Cheers,
Gabriel

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-07-15 @ 15:53
Hello all,

Having to prioritize other tasks I'm giving the GWC Integration proposal 
the status of "finished under protest" :)
Meaning it'd be great for any/all of you to give it a read and comment 
on any issue/concern, at your earliest convenience, no rush.
We'll be back when time permits.

<http://projects.opengeo.org/CAPRA/wiki/GWC_Integration_Proposal>

Cheers,
Gabriel

-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldán
Date:
2010-09-27 @ 15:49
Hey all,

Trying to give the GWC Integration proposal work items a second spin, I
found Andreas' entirely justified complaint about
<http://projects.opengeo.org/CAPRA/ticket/610> and just commented on it.
I know everyone is busy to death, but if you have a chance to go through
the tickets that will affect you, please try to adapt the estimates and
content as appropriate.

Gabriel.

-- 
Gabriel Roldan
groldan@opengeo.org
Expert service straight from the developers

Re: [geonode] GWC in GeoNode

From:
Sebastian Benthall
Date:
2010-06-25 @ 16:51
Chris, correct me if I'm wrong, but we got the go-ahead to work on this, yeah?

I guess that means we should have a new git branch for this?

On Wed, Jun 23, 2010 at 2:41 PM, Gabriel Roldan <groldan@opengeo.org> wrote:

> This is gonna be awesome. Some comments inline.
>
> On 6/23/10 12:08 PM, David Winslow wrote:
> > shooting from the hip with some feedback on these ideas
> >
> > On 06/22/2010 06:18 PM, Chris Holmes wrote:
> >> I've been thinking a bit about how we can bring GeoWebCache in to
> >> GeoNode, to get at some of the great performance enhancements it can
> >> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
> >> local and remote, even when those change.   There are twists with each,
> >> and both revolve around stale caches.
> >>
> >> With local caches we need a way to truncate the cache if the style
> >> changes.  Ideally when one is in style edit mode we don't use GWC at
> >> all, only when someone does a final 'save' does it start caching the
> >> change.
> >>
> >> With remote caches we need a way for a user to manage the cache, to
> >> invalidate it when the remote server changes, either data or style.
> >> Ideally it would have a GeoRSS feed of changes that GWC automatically
> >> truncates based on.  Less ideally there's a manual way to restart the
> >> caching.
> >>
> >> A rough roadmap of how we might achieve the end goal:
> >>
> >> * Start with just caching remote layers.  So when anyone puts in a
> >> remote WMS it automatically gets added as a GWC layer.
> The GWC REST API is definitely clunky at the moment and would benefit
> greatly if we do this.
>
> >>  Gabriel is about
> >> to commit a Least Recently Used cache to GWC, which will allow an admin
> >> to set a total max for the cache.
> Right now the diskquota is opt-in, meaning there's no global cache size
> cap; you need to set the limit on a layer-by-layer basis.
> I think it would be easy to add a global limit, so that any layer not
> explicitly configured gets an even share of the cap and the total stays
> within the global limit.
> How does that sound?
>
> >> So we could let people add any layer,
> >> but the admin of GeoNode can configure it to just cache the most used
> >> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
> >> first step the caches may just get invalid, but the admin would have the
> >> ability to truncate them in the GWC admin.
> >>
> > LRU worries me a bit; if we set the disk limit too low we may just end
> > up with a lot of cache churn for little/no performance benefit.
> In my mind, GeoWebCache is "incomplete" as a product until we add the
> following enhancements:
>  - configuration option to cache layers only up to a certain zoom
> level, and from that level on, defer to pure proxy mode
>  - diskquota, which is kind of in beta testing now
>  - Identify and avoid seeding empty tiles. This can be easily done with
> the JAI Extrema operation (or even Histogram) or the user might
> configure a no-data color for the layer?
>  - Definition of an area of interest, so that a geometry defines the
> allowed seeding area for a layer
>
> > And the
> > disk requirements can grow with minimal warning, since anyone can add a
> > layer. There's also an easy DOS attack - anyone can fetch 18 zoom levels
> > of some layer nobody uses and trash the cache (not a huge deal, how long
> > would it take an attacker to do that anyway?).
> That would put the LRU diskquota enforcement job to work and hence wipe
> out those tiles that are least used. This plus the ability to set a
> limit on the number of zoom levels to actually cache would bring us
> closer to the safe zone?
>
> > I'm not saying an LRU is
> > a bad idea.  I think caching will be a great improvement.  It's just
> > that there is a lot of room for refinement here (probably once we have
> > better usage tracking we can use that to prioritize tilesets, for
> > example.)
> Wouldn't the LRU stats be enough for that? Note we also have an LFU
> (Least Frequently Used) expiration policy for diskquota enforcement,
> which looks closer to the kind of usage tracking you mention?
> >
> >> * Cache local layers, coordinating with Style changes.  I think Arne may
> >> have coded this up, at least for the embedded GWC.
> Yes. The problem with the embedded GWC is that it completely wipes out
> the entire layer cache upon _any_ modification, including WFS
> transactions, resulting in too heavily truncated caches. You
> add/remove/edit a single feature, and the whole layer cache is discarded.
> There's room to improve that based on bounding box/bounding polygon with
> some stuff created for the GeoRSS module though.
> >> We could perhaps
> >> start with just doing the cache on the embedded maps, since those won't
> >> have people switching to 'style mode'.  Maybe that intermediary step
> >> isn't necessary, but when we're in the map composer view we want to be
> >> sure that when people are styling they're not seeing GWC tiles.
> Related: I've been wondering for some time now whether it would make
> sense to also integrate the service endpoints for WMS and GWC, as
> in GWC being a front barrier for /geoserver/wms instead of having to
> explicitly go through /geoserver/gwc?service=WMS...
>
> Back to topic: couldn't the styles just use a CGI flag to indicate when
> to ignore the cache and go straight to the WMS? AFAIK tiled=false would
> do the trick.
>
> >> When
> >> they finish styling we should then truncate the existing cache and start
> >> over.  Another simplifying assumption we could also consider making is
> >> only cache on the default style.  Not sure how much that actually helps.
> >>
> > I don't think we need to avoid caching alternative styles.
> I think right now GWC only seeds on the default style, and lazily caches
> non-default styles. Are we talking about preseeding here or just lazy
> caching?
>
> >
> > I do think we need to skip the cache while editing styles.
> >
> > It would be nice if we could use cached layers everywhere, and have only
> > the layer being styled switch to "straight" WMS when styling is active.
> >> * Remote layer management.  This is sort of more general, I think in the
> >> future we should figure out some more full representation in each
> >> GeoNode of a remote layer.  Right now remote layers can be added, but no
> >> metadata can be found out about it.  This is another whole topic, but
> >> the implication for here is that such a page should/could have a way to
> >> manage the cache of the local GeoNode.  So you could truncate the cache
> >> there (maybe just the person who added?  Maybe you can set permissions
> >> of who can truncate?).  And then possibly also add a GeoRSS location to
> >> automatically truncate from.
> >>
> > Yeah, it would be awesome if adding a WMS to the composer application
> > got that service added to the GeoNode's GeoNetwork index, complete with
> > metadata pages in the Django web app.  And GeoNode can periodically scan
> > the capabilities for added/removed layers, updated descriptions, new
> > styles.  These would be reflected in GeoNetwork and GeoWebCache as well
> > as the Django database.
> >
> > It might be nice to also provide a listing of indexed services so users
> > can track down the originating WMS services if they want.
> >> The cool thing this set of things should lead to is to give a benefit to
> >> people adding remote servers.  They get increased speed and reliability
> >> if they just add it to a map on a geoNode.  So we can come in with a
> >> GeoNode to an existing nice SDI implementation that already has a bunch
> >> of WMS services, and then people can start creating maps on top of it,
> >> and those maps perform even faster than the straight WMS.
> >>
> >> Thoughts?  I think this could be a nice performance win, as most all our
> >> maps are tiled.  Should obviously be complemented by other
> >> optimizations, like on the javascript side, but the two together should
> >> make things quite zippy.
> >>
> > Having the WMS capabilities handled on the server side (and cached
> > there) would probably be a nice win for loading services.  We could do
> > away with reading capabilities entirely until the user pulls up the add
> > layers dialog (which is not available in the embedded viewers).
> >
> > We don't do GFI requests now but it might be worth thinking about how
> > they interact with the cache.  I also don't see this map caching doing
> > much for offline/distributed data management, which seems like caching
> > of another sort.  It would be good to work out some answers related to
> > that.
> I don't get it. Could you elaborate?
>
> Cheers,
> Gabriel
> >
> > --
> > David Winslow
> > OpenGeo - http://opengeo.org/
>
>
> --
> Gabriel Roldan
> OpenGeo - http://opengeo.org
> Expert service straight from the developers.
>
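
One way to read the "evenly capped" global limit proposed in the quoted
message, as a rough sketch. It assumes that explicitly configured layers
keep their own quota and that the rest of the global limit is split evenly
among all other layers. The function name and its parameters are
hypothetical and do not reflect the actual GWC diskquota configuration.

```python
def effective_layer_quotas(global_limit_bytes, explicit_quotas, layer_names):
    """Return a per-layer quota (in bytes) for every known layer.

    explicit_quotas maps layer name -> quota for layers configured by hand.
    Every other layer gets an equal share of what is left of the global limit.
    """
    unconfigured = [n for n in layer_names if n not in explicit_quotas]
    reserved = sum(explicit_quotas[n] for n in layer_names if n in explicit_quotas)
    # Never hand out a negative share if explicit quotas exceed the limit.
    remaining = max(global_limit_bytes - reserved, 0)
    share = remaining // len(unconfigured) if unconfigured else 0
    return {n: explicit_quotas.get(n, share) for n in layer_names}
```

An even split is simple but ignores how much each layer is actually used;
weighting the shares by usage statistics would be a possible refinement.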



-- 
Sebastian Benthall
OpenGeo - http://opengeo.org

Re: [geonode] GWC in GeoNode

From:
David Winslow
Date:
2010-06-25 @ 16:56
We don't have a build of GeoWebCache in the GeoNode source tree right 
now so if we are going to start customizing it we will need more than a 
branch.  Gabriel, let me know if you think we need to do something about 
this.

--
David Winslow
OpenGeo - http://opengeo.org/

On 06/25/2010 12:51 PM, Sebastian Benthall wrote:
> Chris, correct me if I'm wrong, but we got the go-ahead to work on this,
> yeah?
>
> I guess that means we should have a new git branch for this?

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-06-25 @ 18:05
We can figure out the workflow details later.
What I'd want is for us to lay down a clear scope, something we can 
easily track progress against.
How should we proceed? Use cases -> feature specs -> tech spec?

Gabriel
On 6/25/10 1:56 PM, David Winslow wrote:
> We don't have a build of GeoWebCache in the GeoNode source tree right
> now so if we are going to start customizing it we will need more than a
> branch. Gabriel, let me know if you think we need to do something about
> this.
>
> --
> David Winslow
> OpenGeo - http://opengeo.org/


-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Re: [geonode] GWC in GeoNode

From:
Sebastian Benthall
Date:
2010-06-25 @ 18:13
Good point.

I'll try to turn this thread into a feature spec draft, if that's OK.

That will also help us integrate with our (upcoming) implementation roadmap
and also look for client funding to help with these improvements.

On Fri, Jun 25, 2010 at 2:05 PM, Gabriel Roldan <groldan@opengeo.org> wrote:

> We can figure out the workflow details later.
> What I'd want is for us to lay down a clear scope, something we can
> easily track progress against.
> How should we proceed? Use cases -> feature specs -> tech spec?
>
> Gabriel
> On 6/25/10 1:56 PM, David Winslow wrote:
> > We don't have a build of GeoWebCache in the GeoNode source tree right
> > now so if we are going to start customizing it we will need more than a
> > branch. Gabriel, let me know if you think we need to do something about
> > this.
> >
> > --
> > David Winslow
> > OpenGeo - http://opengeo.org/
> >
> > On 06/25/2010 12:51 PM, Sebastian Benthall wrote:
> >> Chris, correct me if I'm wrong, but we got the go-ahead to work on this,
> >> yeah?
> >>
> >> I guess that means we should have a new git branch for this?
> >>
> >> On Wed, Jun 23, 2010 at 2:41 PM, Gabriel Roldan <groldan@opengeo.org
> >> <mailto:groldan@opengeo.org>> wrote:
> >>
> >> This is gonna be awesome. Some comments inline.
> >>
> >> On 6/23/10 12:08 PM, David Winslow wrote:
> >> > shooting from the hip with some feedback on these ideas
> >> >
> >> > On 06/22/2010 06:18 PM, Chris Holmes wrote:
> >> >> I've been thinking a bit about how we can bring GeoWebCache in to
> >> >> GeoNode, to get at some of the great performance enhancements
> >> it can
> >> >> bring. Ideally we seamlessly cache all layers viewed in
> >> GeoNode, both
> >> >> local and remote, even when those change. There are twists
> >> with each,
> >> >> and both revolve around stale caches.
> >> >>
> >> >> With local caches we need a way to truncate the cache if the style
> >> >> changes. Ideally when one is in style edit mode we don't use
> >> GWC at
> >> >> all, only when someone does a final 'save' does it start
> >> caching the
> >> >> change.
> >> >>
> >> >> With remote caches we need a way for a user to manage the cache, to
> >> >> invalidate it when the remote server changes, either data or style.
> >> >> Ideally it would have a GeoRSS feed of changes that GWC
> >> automatically
> >> >> truncates based on. Less ideally there's a manual way to
> >> restart the
> >> >> caching.
> >> >>
> >> >> A rough roadmap of how we might achieve the end goal:
> >> >>
> >> >> * Start with just caching remote layers. So when anyone puts in a
> >> >> remote WMS it automatically gets added as a GWC layer.
> >> The GWC REST API is definitely clunky at the moment and would benefit
> >> greatly if we do this.
> >>
> >> >> Gabriel is about
> >> >> to commit a Least Recently Used cache to GWC, which will allow
> >> an admin
> >> >> to set a total max for the cache.
> >> Right now the diskquota is an opt-in process meaning there's no global
> >> cache size cap, but you need to set the limit on a layer by layer
> >> basis.
> >> I think it would be easy to add a global limit, so that any layer not
> >> explicitly configured gets an even share of the cap and the total stays
> >> within the global limit.
> >> How does that sound?
> >>
> >> So we could let people add any layer,
> >> >> but the admin of GeoNode can configure it to just cache the
> >> most used
> >> >> tiles, up to a limit they set, be it 100 megs or 2 terabytes.
> >> For this
> >> >> first step the caches may just get invalid, but the admin would
> >> have the
> >> >> ability to truncate them in the GWC admin.
> >> >>
> >> > LRU worries me a bit; if we set the disk limit too low we may
> >> just end
> >> > up with a lot of cache churn for little/no performance benefit.
> >> In my mind, GeoWebCache is "incomplete" as a product until we add the
> >> following enhancements:
> >> - configuration option to cache layers only up to a certain zoom
> >> level, and from that level on, defer to pure proxy mode
> >> - diskquota, which is kind of in beta testing now
> >> - Identify and avoid seeding empty tiles. This can be easily done
> >> with
> >> the JAI Extrema operation (or even Histogram) or the user might
> >> configure a no-data color for the layer?
> >> - Definition of an area of interest, so that a geometry defines the
> >> allowed seeding area for a layer
> >>
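Three of the enhancements listed above (zoom cap, empty-tile detection, area of interest) can be sketched as a single "should this tile be stored?" check. This is only an illustration under simplifying assumptions: the area of interest is a bounding box rather than a geometry, a tile is a plain list of (r, g, b) tuples, and "empty" means every pixel is identical, which mirrors the min == max test an extrema operation would give. All names are hypothetical.

```python
def is_uniform(pixels):
    """True when every pixel is identical, i.e. min == max in each band."""
    pixels = list(pixels)
    if not pixels:
        return True
    bands = zip(*pixels)  # regroup (r, g, b) tuples into per-band sequences
    return all(min(band) == max(band) for band in bands)


def bbox_intersects(a, b):
    """Overlap test for two (minx, miny, maxx, maxy) boxes."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def should_store(zoom, tile_bbox, pixels, max_cached_zoom, area_of_interest):
    """Decide whether a rendered tile is worth keeping in the cache."""
    if zoom > max_cached_zoom:
        return False  # beyond the cap: act as a pure proxy
    if not bbox_intersects(tile_bbox, area_of_interest):
        return False  # outside the configured area of interest
    if is_uniform(pixels):
        return False  # single-color tile, treated as empty
    return True


blank = [(255, 255, 255)] * 4
mixed = [(255, 255, 255), (0, 0, 0), (10, 20, 30), (255, 255, 255)]
aoi = (5, 5, 20, 20)
```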
> >> > And the disk requirements can grow with minimal warning, since anyone
> >> > can add a layer. There's also an easy DOS attack - anyone can fetch 18
> >> > zoom levels of some layer nobody uses and trash the cache (not a huge
> >> > deal, how long would it take an attacker to do that anyway?).
> >> That would put the LRU diskquota enforcement job to work and hence
> >> wipe
> >> out those tiles that are least used. This plus the ability to set a
> >> limit on the number of zoom levels to actually cache would bring us
> >> closer to the safe zone?
> >>
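To make the LRU enforcement described above concrete, here is a small in-memory sketch. It assumes tiles are byte strings keyed by an arbitrary string and that the quota is a byte count; GWC's diskquota works on disk and is more involved, so the class and its methods are hypothetical.

```python
from collections import OrderedDict


class LruTileStore:
    """Toy tile store that evicts least recently used tiles over quota."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.used_bytes = 0
        self._tiles = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        tile = self._tiles.get(key)
        if tile is not None:
            self._tiles.move_to_end(key)  # mark as most recently used
        return tile

    def put(self, key, data):
        if key in self._tiles:
            self.used_bytes -= len(self._tiles.pop(key))
        self._tiles[key] = data
        self.used_bytes += len(data)
        self._enforce_quota()

    def _enforce_quota(self):
        # Drop the oldest tiles until the store fits the quota again.
        while self.used_bytes > self.quota_bytes and self._tiles:
            _, evicted = self._tiles.popitem(last=False)
            self.used_bytes -= len(evicted)


store = LruTileStore(quota_bytes=10)
store.put("a", b"aaaa")
store.put("b", b"bbbb")
store.get("a")           # "a" is now more recently used than "b"
store.put("c", b"cccc")  # 12 bytes > quota, so "b" is evicted
```

In this model a burst of requests for an unused layer only displaces the oldest tiles, and the zoom cap limits how much such a burst can write in the first place.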
> >> > I'm not saying an LRU is a bad idea. I think caching will be a great
> >> > improvement. It's just that there is a lot of room for refinement here
> >> > (probably once we have better usage tracking we can use that to
> >> > prioritize tilesets, for example.)
> >> Wouldn't the LRU stats be enough for that? Note we also have an LFU
> >> (Least Frequently Used) expiration policy for diskquota enforcement,
> >> which looks closer to the kind of usage tracking you mention?
> >> >
> >> >> * Cache local layers, coordinating with Style changes. I think Arne
> >> >> may have coded this up, at least for the embedded GWC.
> >> Yes. The problem with the embedded GWC is that it completely wipes out
> >> the entire layer cache upon _any_ modification, including WFS
> >> transactions, resulting in too heavily truncated caches. You
> >> add/remove/edit a single feature, and the whole layer cache is discarded.
> >> There's room to improve that based on bounding box/bounding polygon
> >> with some stuff created for the GeoRSS module though.
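As an illustration of bounding-box-based truncation, the sketch below maps the bounding box of an edited feature to the range of tiles it touches at each zoom level, so only those tiles need discarding. It assumes a global lon/lat grid with 2x1 tiles at zoom 0 and rows counted from the south; that layout and the function names are assumptions for the example, not the GeoRSS module's actual code.

```python
import math


def tiles_for_bbox(bbox, zoom):
    """Return (min_col, min_row, max_col, max_row) of tiles touched by bbox.

    bbox is (minx, miny, maxx, maxy) in degrees. The grid is assumed to
    have 2 x 1 tiles at zoom 0, doubling in each direction per level.
    """
    minx, miny, maxx, maxy = bbox
    tile_deg = 180.0 / (2 ** zoom)
    cols = 2 * 2 ** zoom
    rows = 2 ** zoom

    def index(offset, count):
        # Clamp so that a bbox touching the grid edge stays in range.
        return max(0, min(int(math.floor(offset / tile_deg)), count - 1))

    return (index(minx + 180.0, cols), index(miny + 90.0, rows),
            index(maxx + 180.0, cols), index(maxy + 90.0, rows))


def tiles_to_truncate(bbox, min_zoom, max_zoom):
    """Map each zoom level to the tile range affected by a modification."""
    return {z: tiles_for_bbox(bbox, z) for z in range(min_zoom, max_zoom + 1)}


# A feature edited near 10E 10N only affects one tile at each low zoom level.
affected = tiles_to_truncate((10, 10, 11, 11), 0, 2)
```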
> >> >> We could perhaps start with just doing the cache on the embedded
> >> >> maps, since those won't have people switching to 'style mode'. Maybe
> >> >> that intermediary step isn't necessary, but when we're in the map
> >> >> composer view we want to be sure that when people are styling
> >> >> they're not seeing GWC tiles.
> >> Related: I've been wondering for some time now if it wouldn't make
> >> sense to also integrate the service endpoints for WMS and GWC, as in
> >> GWC being a front barrier for /geoserver/wms instead of having to
> >> explicitly go through /geoserver/gwc?service=WMS...
> >>
> >> Back to topic: couldn't the styles just use a CGI flag to indicate
> >> when to ignore the cache and go straight to the WMS? AFAIK tiled=false
> >> would do the trick.
> >>
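A sketch of the flag idea above: the client builds the same GetMap request in both modes and only flips the `tiled` parameter while a layer is being styled. Whether the server bypasses the cache for `tiled=false` is taken from the "AFAIK" in the thread and is an assumption here; the base URL, layer name and helper function are hypothetical.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, use_cache, width=256, height=256):
    """Build a WMS 1.1.1 GetMap URL, toggling the tiled flag."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
        # Assumed switch: cached tiles normally, straight WMS while styling.
        "tiled": "true" if use_cache else "false",
    }
    return base_url + "?" + urlencode(params)


base = "http://localhost:8080/geoserver/wms"  # hypothetical endpoint
viewing_url = getmap_url(base, "topp:states", (-180, -90, 0, 90), use_cache=True)
styling_url = getmap_url(base, "topp:states", (-180, -90, 0, 90), use_cache=False)
```

Because the flag is per request, only the layer being styled needs to switch; the other layers in the map can keep using cached tiles.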
> >> >> When they finish styling we should then truncate the existing cache
> >> >> and start over. Another simplifying assumption we could also consider
> >> >> making is only cache on the default style. Not sure how much that
> >> >> actually helps.
> >> >>
> >> > I don't think we need to avoid caching alternative styles.
> >> I think right now GWC only seeds on the default style, and lazily
> >> caches non-default styles. Are we talking about preseeding here or
> >> just lazy caching?
> >>
> >> >
> >> > I do think we need to skip the cache while editing styles.
> >> >
> >> > It would be nice if we could use cached layers everywhere, and have
> >> > only the layer being styled switch to "straight" WMS when styling is
> >> > active.
> >> >> * Remote layer management. This is sort of more general, I think in
> >> >> the future we should figure out some more full representation in
> >> >> each GeoNode of a remote layer. Right now remote layers can be
> >> >> added, but no metadata can be found out about it. This is another
> >> >> whole topic, but the implication for here is that such a page
> >> >> should/could have a way to manage the cache of the local GeoNode. So
> >> >> you could truncate the cache there (maybe just the person who added?
> >> >> Maybe you can set permissions of who can truncate?). And then
> >> >> possibly also add a GeoRSS location to automatically truncate from.
> >> >>
> >> > Yeah, it would be awesome if adding a WMS to the composer application
> >> > got that service added to the GeoNode's GeoNetwork index, complete
> >> > with metadata pages in the Django web app. And GeoNode can
> >> > periodically scan the capabilities for added/removed layers, updated
> >> > descriptions, new styles. These would be reflected in GeoNetwork and
> >> > GeoWebCache as well as the Django database.
> >> >
> >> > It might be nice to also provide a listing of indexed services so
> >> > users can track down the originating WMS services if they want.
> >> >> The cool thing this set of things should lead to is to give a
> >> >> benefit to people adding remote servers. They get increased speed
> >> >> and reliability if they just add it to a map on a geoNode. So we can
> >> >> come in with a GeoNode to an existing nice SDI implementation that
> >> >> already has a bunch of WMS services, and then people can start
> >> >> creating maps on top of it, and those maps perform even faster than
> >> >> the straight WMS.
> >> >>
> >> >> Thoughts? I think this could be a nice performance win, as most all
> >> >> our maps are tiled. Should obviously be complemented by other
> >> >> optimizations, like on the javascript side, but the two together
> >> >> should make things quite zippy.
> >> >>
> >> > Having the WMS capabilities handled on the server side (and cached
> >> > there) would probably be a nice win for loading services. We could do
> >> > away with reading capabilities entirely until the user pulls up the
> >> > add layers dialog (which is not available in the embedded viewers).
> >> >
> >> > We don't do GFI requests now but it might be worth thinking about how
> >> > they interact with the cache. I also don't see this map caching doing
> >> > much for offline/distributed data management, which seems like
> >> > caching of another sort. It would be good to work out some answers
> >> > related to that.
> >> I don't get it. Could you elaborate?
> >>
> >> Cheers,
> >> Gabriel
> >> >
> >> > --
> >> > David Winslow
> >> > OpenGeo - http://opengeo.org/
> >>
> >>
> >> --
> >> Gabriel Roldan
> >> OpenGeo - http://opengeo.org
> >> Expert service straight from the developers.
> >>
> >>
> >>
> >>
> >> --
> >> Sebastian Benthall
> >> OpenGeo - http://opengeo.org
> >>
> >
> >
>
>
> --
> Gabriel Roldan
> OpenGeo - http://opengeo.org
> Expert service straight from the developers.
>



-- 
Sebastian Benthall
OpenGeo - http://opengeo.org

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-06-25 @ 18:20
On 6/25/10 3:13 PM, Sebastian Benthall wrote:
> Good point.
>
> I'll try to turn this thread into a feature spec draft, if that's OK.
Would you be so kind? :)
I'd like to help; not sure which tool you prefer: sphinx, wiki, etherpad?


>
> That will also help us integrate with our (upcoming) implementation roadmap
> and also look for client funding to help with these improvements.
>
> On Fri, Jun 25, 2010 at 2:05 PM, Gabriel Roldan<groldan@opengeo.org>  wrote:
>
>> we can figure out the workflow details later.
>> What I'd want is for us to lay down a clear scope, something we can
>> easily track progress against.
>> How should we proceed? use cases ->  feature specs ->  tech spec?

Re: [geonode] GWC in GeoNode

From:
Sebastian Benthall
Date:
2010-06-25 @ 19:33
How about etherpad

http://etherplans.org/hlHTOxKlBP

On Fri, Jun 25, 2010 at 2:20 PM, Gabriel Roldan <groldan@opengeo.org> wrote:

> On 6/25/10 3:13 PM, Sebastian Benthall wrote:
> > Good point.
> >
> > I'll try to turn this thread into a feature spec draft, if that's OK.
> Would you be so kind? :)
> I'd like to help; not sure which tool you prefer: sphinx, wiki, etherpad?
>
>
> >
> > That will also help us integrate with our (upcoming) implementation
> roadmap
> > and also look for client funding to help with these improvements.
> >
> > On Fri, Jun 25, 2010 at 2:05 PM, Gabriel Roldan<groldan@opengeo.org>
>  wrote:
> >
> >> we can figure out the workflow details later.
> >> What I'd want is for us to lay down a clear scope, something we can
> >> easily track progress against.
> >> How should we proceed? use cases ->  feature specs ->  tech spec?
>



-- 
Sebastian Benthall
OpenGeo - http://opengeo.org

Re: [geonode] GWC in GeoNode

From:
David Winslow
Date:
2010-06-23 @ 19:37
>> We don't do GFI requests now but it might be worth thinking about how
>> they interact with the cache.  I also don't see this map caching doing
>> much for offline/distributed data management, which seems like caching
>> of another sort.  It would be good to work out some answers related to that.
>>      
> I don't get it. Could you elaborate?
>
> Cheers,
> Gabriel
>
I was basically talking about caching (and editing, and merging) feature 
data instead of map tiles.  As far as I know, these are totally 
unrelated to GWC, so probably don't belong on this thread.

-d

Re: [geonode] GWC in GeoNode

From:
Gabriel Roldan
Date:
2010-06-23 @ 19:39
On 6/23/10 4:37 PM, David Winslow wrote:
>
>>> We don't do GFI requests now but it might be worth thinking about how
>>> they interact with the cache.  I also don't see this map caching doing
>>> much for offline/distributed data management, which seems like caching
>>> of another sort.  It would be good to work out some answers related to that.
>>>
>> I don't get it. Could you elaborate?
>>
>> Cheers,
>> Gabriel
>>
> I was basically talking about caching (and editing, and merging) feature
> data instead of map tiles.  As far as I know, these are totally
> unrelated to GWC, so probably don't belong on this thread.

ok... I'm still intrigued so if it's worth discussing would you mind 
opening a new thread?

thanks
Gabriel
>
> -d


-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Re: [geonode] GWC in GeoNode

From:
David Winslow
Date:
2010-06-23 @ 14:34
yeah awesome!!

+1

On 06/22/2010 06:18 PM, Chris Holmes wrote:
> I've been thinking a bit about how we can bring GeoWebCache in to
> GeoNode, to get at some of the great performance enhancements it can
> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
> local and remote, even when those change.   There are twists with each,
> and both revolve around stale caches.
>
> With local caches we need a way to truncate the cache if the style
> changes.  Ideally when one is in style edit mode we don't use GWC at
> all, only when someone does a final 'save' does it start caching the
> change.
>
> With remote caches we need a way for a user to manage the cache, to
> invalidate it when the remote server changes, either data or style.
> Ideally it would have a GeoRSS feed of changes that GWC automatically
> truncates based on.  Less ideally there's a manual way to restart the
> caching.
>
> A rough roadmap of how we might achieve the end goal:
>
> * Start with just caching remote layers.  So when anyone puts in a
> remote WMS it automatically gets added as a GWC layer.  Gabriel is about
> to commit a Least Recently Used cache to GWC, which will allow an admin
> to set a total max for the cache.  So we could let people add any layer,
> but the admin of GeoNode can configure it to just cache the most used
> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
> first step the caches may just get invalid, but the admin would have the
> ability to truncate them in the GWC admin.
>
> * Cache local layers, coordinating with Style changes.  I think Arne may
> have coded this up, at least for the embedded GWC.  We could perhaps
> start with just doing the cache on the embedded maps, since those won't
> have people switching to 'style mode'.  Maybe that intermediary step
> isn't necessary, but when we're in the map composer view we want to be
> sure that when people are styling they're not seeing GWC tiles.  When
> they finish styling we should then truncate the existing cache and start
> over.  Another simplifying assumption we could also consider making is
> only cache on the default style.  Not sure how much that actually helps.
>
> * Remote layer management.  This is sort of more general, I think in the
> future we should figure out some more full representation in each
> GeoNode of a remote layer.  Right now remote layers can be added, but no
> metadata can be found out about it.  This is another whole topic, but
> the implication for here is that such a page should/could have a way to
> manage the cache of the local GeoNode.  So you could truncate the cache
> there (maybe just the person who added?  Maybe you can set permissions
> of who can truncate?).  And then possibly also add a GeoRSS location to
> automatically truncate from.
>
>
> The cool thing this set of things should lead to is to give a benefit to
> people adding remote servers.  They get increased speed and reliability
> if they just add it to a map on a geoNode.  So we can come in with a
> GeoNode to an existing nice SDI implementation that already has a bunch
> of WMS services, and then people can start creating maps on top of it,
> and those maps perform even faster than the straight WMS.
>
> Thoughts?  I think this could be a nice performance win, as most all our
> maps are tiled.  Should obviously be complemented by other
> optimizations, like on the javascript side, but the two together should
> make things quite zippy.
>
>