Developer's Corner: Introducing Database Level Security in GeoServer

In our work we support many GeoServer Enterprise installations that pull data from a spatial database of some sort, normally via a connection pool, a tool that keeps database connections open so that we don't have to open and close them at every request (something that can be very expensive).
The pool accesses the database via a shared user that all GeoServer requests end up using. Some requests only read data (WMS GetMap), others modify data (WFS Transaction), and some even create new tables (RESTConfig data uploading, for example).
The pool user must be able to perform each and every operation that GeoServer needs, meaning that more often than not it has very broad privileges on the database.

GeoServer's built-in security, as well as extensions such as GeoRepository, allows administrators to control what specific users can do and shields the database from security issues.
However, in some environments the preferred security management policy is to have security restrictions operate at the database level instead, with the pool user being given minimal rights (normally, to list and describe the tables, but without any actual access to them). This has some advantages:
  • security is set up just once for the variety of applications that might access the database
  • each user can actually perform only the operations that he/she was allowed to, regardless of any bugs or security holes in the application-level software
  • it leverages the DBA's expertise
GeoSolutions recently implemented the ability to use DBMS session startup and teardown scripts that can be used to alter the user accessing the database for the duration of the current request, turning back to the pool user when the request is complete.
These commands can be specified in the configuration User Interface while setting up the data store. For example, if we wanted to have each and every PostgreSQL session use the credentials of the current GeoServer user we'd use the following setup:

Different databases will of course use different commands, or custom in-house package calls, to set up the current session user. See the GeoServer documentation for more details on how this new functionality can be used.
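To make the idea more concrete, here is a minimal Python sketch of the mechanism, not GeoServer's actual code. It assumes a `${NAME,default}` placeholder syntax and a `GSUSER` variable holding the current user (both should be checked against the GeoServer documentation), and uses the standard PostgreSQL statements `SET SESSION AUTHORIZATION` and `RESET SESSION AUTHORIZATION`. The `expand` and `run_request` helpers are hypothetical names introduced for illustration.

```python
import re

# Assumed placeholder syntax: ${NAME,default}. Check the GeoServer
# documentation for the syntax it really accepts.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\s*,\s*([^}]*)\}")


def expand(script, env):
    """Replace each ${NAME,default} with env[NAME], or the default if unset."""
    def repl(match):
        value = env.get(match.group(1))
        return value if value else match.group(2)
    return _PLACEHOLDER.sub(repl, script)


# Standard PostgreSQL statements for switching and restoring the session user.
STARTUP_SQL = "SET SESSION AUTHORIZATION ${GSUSER,geoserver}"
TEARDOWN_SQL = "RESET SESSION AUTHORIZATION"


def run_request(execute, env, work):
    """Run `work` between the startup and teardown scripts.

    `execute` is any callable that sends one SQL string to the pooled
    connection. The teardown always runs, so the connection goes back
    to the pool as the pool user even if the request fails.
    """
    execute(expand(STARTUP_SQL, env))
    try:
        return work()
    finally:
        execute(TEARDOWN_SQL)
```

A real implementation would also have to validate or quote the user name before placing it in the SQL text; the sketch skips that for brevity.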

We'd like to thank Astrium GEO-Information Services for sponsoring this improvement and sharing it with the GeoServer and GeoTools communities.

Application security is certainly one of the topics we like to deal with. There is of course a lot more to explore and improve, as this topic is both rich and interesting. Do you want, for example, CAS or Shibboleth security in your GeoServer installation? Maybe integration with Active Directory? Talk to us first!

The GeoSolutions team,

Improving GeoServer SQL Server support

Dear All,
we were recently hired to improve GeoServer's SQL Server support.

The SQL Server store was created and maintained during spare time by Justin DeOliveira. However, due to the lack of production usage and of working time to devote to it, it failed to reach the same level of robustness and speed as the best supported stores, such as Oracle and PostGIS.

Our work this week tried to close this gap with a number of little and big improvements that make the code run faster and in a more reliable way:
  • add support for connection validation (very important for SQL Azure, which is very keen on closing pooled connections in your face)
  • use binary encoding, instead of text, to transfer geometries from the database
  • support for data paging at the database level
  • make sure the rich database test suite we have in GeoTools is fully implemented for SQL Server, ensuring good support for use cases such as dynamic SQL views, proper date/time encoding in filters, and the like, both on the development series and on the stable series
Our development focused on testing the code against both SQL Server 2008 and SQL Azure. SQL Azure is the SQL database one can use in the Microsoft Azure cloud system: while it does look a lot like SQL Server 2008, it does not quite behave the same way in all cases, and requires a specific JDBC driver to work properly.
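As a rough illustration of the first item, connection validation, the Python sketch below checks each pooled connection with a cheap query before handing it out and silently replaces the ones that were closed. It is not the GeoTools implementation: `sqlite3` stands in for SQL Server, and the `ValidatingPool` class and the `SELECT 1` validation query are assumptions made for the example.

```python
import sqlite3


class ValidatingPool:
    """Tiny connection pool that checks connections before handing them out."""

    def __init__(self, connect, validation_query="SELECT 1"):
        self._connect = connect              # factory creating a new connection
        self._validation_query = validation_query
        self._idle = []                      # connections waiting to be reused

    def _is_alive(self, conn):
        # A closed or broken connection raises when asked to run the query.
        try:
            conn.execute(self._validation_query).fetchone()
            return True
        except sqlite3.Error:
            return False

    def acquire(self):
        # Discard pooled connections that were closed behind our back.
        while self._idle:
            conn = self._idle.pop()
            if self._is_alive(conn):
                return conn
        return self._connect()

    def release(self, conn):
        self._idle.append(conn)
```

The extra round trip per request is the price paid for never handing a dead connection to the caller, which matters most when the server closes idle connections aggressively.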

There are still some improvements left on the table, such as support for geography columns, but we're sure you'll now be able to get more out of using GeoServer with SQL Server in production.

Interested in sponsoring further improvements? Looking for professional support services that deliver for your group? Let us know!

The GeoSolutions team,

GWT-OpenLayers release 0.7

Dear All,
we would like to inform you that the new 0.7 release of the GWT-OpenLayers project is available.

The release is ready for download and can be found here.

Notable improvements with this release are as follows:
  • Upgrade to GWT 2.4.0
  • Support for Google Maps V3
  • Improved source code formatting
  • Added and fixed several base methods and bindings, such as WMS Params and Layer methods
Alessio Fabiani has taken care of the 0.7 release as an active committer and administrator of the project.
We would like to thank all the other committers for their dedication and hard work!

Regards,
the GeoSolutions Team.

BRISEIDE Project: GeoServer, GeoNetwork and GeoBatch for the management of dynamic MetOc data

Dear All,
In this post I want to talk about the work we are doing for the BRISEIDE European project.

The ambitious aim of the project, as stated on its website, is the delivery of:
  • time-aware extension of MetOc data models developed in the context of previous/ongoing EU INSPIRE related projects (e.g. in the context of GMES, eContentPlus)
  • application (e.g. Civil Protection) based on the integration of existing, user operational information
  • value added services for spatio-temporal data management, authoring, processing, analysis and interactive visualisation
Within the context of the project, GeoSolutions will work under the leadership of SinerGIS to provide near real-time ingestion, cataloging and publishing of the meteorological data provided by the stakeholders. This data is used as input for running processes such as, for instance, fire propagation models in emergency situations.

The infrastructure we are setting up is depicted in the deployment diagram below. The basic building blocks are as follows:
  • GeoServer, for providing WMS, WCS and WFS services with support for the TIME dimension. It is worth pointing out that it will also provide WPS capabilities.
  • GeoNetwork, for publishing metadata for all data, with specific customizations for managing the TIME dimensions in the dataset (we are going to briefly describe them later on)
  • GeoBatch, for performing preprocessing and ingestion in near real time of data and related metadata with minimal human intervention.
[Figure: deployment diagram of the BRISEIDE infrastructure]

For this project we have customized the metadata indexing (thanks Lucene!) in GeoNetwork in order to be able to index meteorological model runs in terms of their run time as well as in terms of their forecast times. Generally speaking, the data we are dealing with is driven by a meteorological model which produces, every day, a certain number of geophysical parameters with forecasts valid for around 5 days. Moreover, some additional fire risk indexes are produced by processing these results in near real time. As days go by, forecasts from different runs of the model become available, as indicated in the picture below.

[Figure: forecasts available from successive model runs]

To achieve our goals we have slightly customized the indexing configuration as well as the user interface, so that searches on run times and forecast times are fast. If you are interested in having a look at one of the ISO metadata XML documents that we are publishing, you can find an example here.

Here below you can find a diagram depicting the automatic ingestion flow we have created for the BRISEIDE project (actually, one of the few we have created) using the GeoBatch framework (we will soon release version 1.0 on which we are working in our own repository).



This flow makes extensive use of an orchestration Groovy script that implements specific business logic for the use case. Internally it also makes use of various other atomic actions for performing tasks like publishing an ImageMosaic in GeoServer or its metadata in GeoNetwork.


Incoming files consist of a compressed set of .asc (ASCII image) files, which are:
  1. Converted into re-tiled GeoTiff images
  2. Enriched with embedded overviews, which are added to each image
The Groovy script produces an ImageMosaicCommand, essentially an XML command, which is sent to the ImageMosaicAction, which in turn:
  1. Checks for the existence of the layer on the target GeoServer
  2. On success, copies all the files to the target directory
  3. Creates the store and the layer that contain the ImageMosaic
  4. Configures the layer on GeoServer using the desired parameters
  5. Produces an XML file with the ImageMosaic properties
The Groovy script reads the produced ImageMosaic output, enriches it with some other useful information, and then passes that object to the FreeMarkerAction, which uses a template and the supplied data model to produce the XML metadata file (as described above).

The GroovyScriptAction ends by sending this file to the next action, the GeoNetworkAction, which sends the metadata to the target GeoNetwork server using the options specified in the configuration.
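The chain of actions described above can be pictured with the following toy Python sketch. It is not GeoBatch code (the real flow is orchestrated by a Groovy script and FreeMarker templates): every function name, the template text and the `runTime` field are hypothetical stand-ins, and the conversion and publishing steps are reduced to placeholders.

```python
from string import Template


def image_mosaic_action(command):
    """Stand-in for the ImageMosaicAction: returns the mosaic properties."""
    return {"layer": command["layer"], "files": len(command["files"])}


def freemarker_action(template, model):
    """Stand-in for the FreeMarkerAction: fill a metadata template."""
    return Template(template).substitute(model)


def geonetwork_action(metadata, sink):
    """Stand-in for the GeoNetworkAction: `sink` plays the remote catalog."""
    sink.append(metadata)


def run_flow(asc_files, layer, run_time, sink):
    # Placeholder for the .asc to GeoTiff conversion and overview steps.
    tiffs = [name.rsplit(".", 1)[0] + ".tif" for name in asc_files]
    # The script builds a command, gets the mosaic properties back ...
    props = image_mosaic_action({"layer": layer, "files": tiffs})
    # ... enriches them with extra information ...
    props["runTime"] = run_time
    # ... and turns them into a metadata document that is then published.
    metadata = freemarker_action(
        "<metadata layer='$layer' runTime='$runTime' files='$files'/>", props
    )
    geonetwork_action(metadata, sink)
    return metadata
```

The point of the sketch is only the shape of the flow: each action consumes the output of the previous one, and the script in the middle is free to enrich the data model between steps.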

If you have questions about the work described in this post, or if you want to know more about how our services could help your organization reach its goals, do not hesitate to contact us.


Regards,
the GeoSolutions Team.

GeoNetwork 2.6.4 with Italian translation available!

Hello everyone,
following the ticket that released a preview version of the Italian localization for the GeoNetwork 2.6.5 development version (see here for more information), we have backported it to the stable 2.6.4 version. The war can be downloaded from this address. This war installs GeoNetwork 2.6.4 with the addition of the Italian localization; the main page will automatically lead to the Italian version.



The migration of existing installations will be performed correctly only if the starting version is one of those currently supported by standard GeoNetwork, namely 2.4.3, 2.6.0 or 2.6.1. For new installations, the Italian data will be loaded as usual during the setup of the GeoNetwork database.

The GeoSolutions team,

WFS for the masses: adding support for paging and sorting in GeoServer

Today we are going to introduce our latest contribution to GeoServer: WFS paging and sorting for retrieving features.

First off, let's take a step back and see what sorting and paging support is available in the official OGC protocols:
  • Neither WMS 1.1 nor WMS 1.3 (or SLD/SE for that matter) has any ability to order the results so that features are painted in a certain order. If features can be organized in categories, filters and FeatureTypeStyle elements can do the trick, but that won't work over continuous fields
  • The same goes for WPS 1.0, which can return significant amounts of vector data that might be useful to page over
  • WFS 1.0 supports neither, WFS 1.1 supports sorting, and WFS 2.0 supports both sorting and paging, via the sortBy and startIndex/count parameters (count replaces the maxFeatures parameter of the earlier versions)
The last stable GeoServer release does not support WFS 2.0, and allows for sorting only on DBMS based stores. In our latest contribution to the stable series we removed all limitations concerning WFS paging and sorting support:
  • WFS 1.0 and 1.1 can now support sorting on top of each and every store kind, using the sortBy parameter as a vendor extension
  • WFS 1.0 and 1.1 support paging on top of each and every store, using the startIndex/maxFeatures parameters
Protocol and paging wise we back-ported the work added on trunk along with WFS 2.0 support, and then we merged in some previous work on generic sorting we did for the aggregating store.

Technically speaking, in case the store does not support sorting natively (e.g., shapefile) we gather the features into an optimized merge/sort algorithm that never keeps more than 1000 features in memory, and uses secondary storage to scale up to larger result sets.
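The general technique can be sketched in Python as a classic external merge sort. This is an illustration under stated assumptions, not the GeoServer/GeoTools code: the `external_sort` function, the use of `pickle` and temporary files, and the decision to always spill to disk are choices made for the example.

```python
import heapq
import pickle
import tempfile
from itertools import islice


def _spill(sorted_chunk):
    """Write one sorted chunk to a temporary file and return the open file."""
    f = tempfile.TemporaryFile()
    for item in sorted_chunk:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _read(f):
    """Lazily read pickled items back from a spilled chunk."""
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        f.close()


def external_sort(features, key, max_in_memory=1000):
    """Yield features sorted by `key`.

    At most `max_in_memory` features are held while building a chunk;
    during the merge only one feature per chunk is in memory.
    """
    it = iter(features)
    files = []
    while True:
        chunk = list(islice(it, max_in_memory))
        if not chunk:
            break
        chunk.sort(key=key)
        files.append(_spill(chunk))
    # heapq.merge pulls from the sorted chunks lazily, smallest item first.
    yield from heapq.merge(*(_read(f) for f in files), key=key)
```

For simplicity the sketch spills every chunk to disk; a production version would skip secondary storage when the whole result fits in a single chunk.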

But, enough talking, let's see some examples against the states demo layer, using the CSV as the output format for brevity.

First ten features:
http://localhost:8080/geoserver/topp/ows?service=WFS&version=1.0.0&request=GetFeature
&typeName=topp:states&outputFormat=csv&propertyName=STATE_NAME,PERSONS
&maxFeatures=10

FID,STATE_NAME,PERSONS
states.1,Illinois,11430602
states.2,District of Columbia,606900
states.3,Delaware,666168
states.4,West Virginia,1793477
states.5,Maryland,4781468
states.6,Colorado,3294394
states.7,Kentucky,4551524
states.8,Kansas,2477574
states.9,Virginia,6180651
states.10,Missouri,5117073

The next 10 features:
http://localhost:8080/geoserver/topp/ows?service=WFS&version=1.0.0&request=GetFeature
&typeName=topp:states&outputFormat=csv&propertyName=STATE_NAME,PERSONS
&maxFeatures=10&startIndex=10

FID,STATE_NAME,PERSONS
states.11,Arizona,3665228
states.12,Oklahoma,3145585
states.13,North Carolina,6628629
states.14,Tennessee,4829958
states.15,Texas,17122020
states.16,New Mexico,1379559
states.17,Alabama,4040587
states.18,Mississippi,2573216
states.19,Georgia,6457339
states.20,South Carolina,3486703

The first ten states with most people (sort on PERSONS, descending):
http://localhost:8080/geoserver/topp/ows?service=WFS&version=1.0.0&request=GetFeature
&typeName=topp:states&outputFormat=csv&propertyName=STATE_NAME,PERSONS
&maxFeatures=10&sortBy=PERSONS%20D

FID,STATE_NAME,PERSONS
states.47,California,29760021
states.39,New York,18235907
states.15,Texas,17122020
states.23,Florida,12937926
states.40,Pennsylvania,11881643
states.1,Illinois,11430602
states.48,Ohio,9980887
states.24,Michigan,9295297
states.43,New Jersey,7484736
states.13,North Carolina,6628629

The second page of the above result set:
http://localhost:8080/geoserver/topp/ows?service=WFS&version=1.0.0&request=GetFeature
&typeName=topp:states&outputFormat=csv&propertyName=STATE_NAME,PERSONS
&maxFeatures=10&startIndex=10&sortBy=PERSONS%20D

FID,STATE_NAME,PERSONS
states.19,Georgia,6457339
states.9,Virginia,6180651
states.37,Massachusetts,6016425
states.44,Indiana,5544159
states.10,Missouri,5117073
states.49,Washington,4866692
states.14,Tennessee,4829958
states.30,Wisconsin,4796441
states.5,Maryland,4781468
states.7,Kentucky,4551524
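Requests like the ones above are easy to generate from a client. The Python sketch below builds the URL for a given page; the `wfs_page_url` helper and its arguments are hypothetical, while the query parameters are the ones used in the examples (a `startIndex` of 0 is sent explicitly for the first page).

```python
from urllib.parse import quote, urlencode


def wfs_page_url(base, type_name, page, page_size,
                 sort_by=None, descending=False, properties=None):
    """Build a WFS 1.0.0 GetFeature URL for one page of CSV results."""
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": type_name,
        "outputFormat": "csv",
        "maxFeatures": page_size,
        "startIndex": page * page_size,   # pages are numbered from 0
    }
    if properties:
        params["propertyName"] = ",".join(properties)
    if sort_by:
        # " D" requests descending order, as in the examples above.
        params["sortBy"] = f"{sort_by} D" if descending else sort_by
    # quote() encodes the space as %20; ':' and ',' are left readable.
    return f"{base}?{urlencode(params, safe=':,', quote_via=quote)}"


url = wfs_page_url(
    "http://localhost:8080/geoserver/topp/ows", "topp:states",
    page=1, page_size=10, sort_by="PERSONS", descending=True,
    properties=["STATE_NAME", "PERSONS"],
)
print(url)
```

Calling the helper with `page=1` asks for the second page of the sorted result set, the same request as the last example.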

These modifications are already available in the GeoServer 2.1.3 release.

There is more work to be done in this area: ordering features before painting them in WMS and paging WPS results would both be nice additions. Interested? Let us know!

The GeoSolutions team,

Robust Clustering Solution for GeoServer

Dear All,
in this post we'd like to introduce some work that we have performed in order to provide robust support for clustered GeoServer deployments, with an emphasis on publishing new layers in real time.

As you might know, there are various approaches that can be used to implement a clustered GeoServer deployment, based on different mixes of data directory sharing and configuration reload. However, these techniques have intrinsic limitations, so we decided to create a specific GeoServer Clustering Extension in order to overcome them. It is worth pointing out that what we are going to describe is designed to work with the GeoServer 2.1 stable series.

Our approach is shown in the picture below. We propose a robust Master/Slave approach which leverages a Message Oriented Middleware (MOM), where:
  1. The Masters (yes, we can have more than one, read on...) receive changes to the internal configuration, persist them in their own data directory and also forward them to the Slaves via the MOM
  2. The Slaves do not accept changes to their configuration from either REST or the User Interface, but are configured to inject configuration changes disseminated by the Master(s) via the MOM
  3. The MOM is used to make the Master and the Slave exchange messages in a durable fashion
  4. Each Slave has its own data directory, which it is responsible for keeping aligned with the Master's one. If a Slave goes down, when it comes up again it might receive a batch of messages to align its configuration with the Master's one.
  5. A Node can be both Master and Slave at the same time, which means that we don't have a single point of failure in the Master itself
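The five points above can be modelled with a small in-memory Python simulation. It is only a sketch of the message flow, not the extension's code: the `Topic` and `Node` classes are hypothetical, a dictionary stands in for each node's data directory, and a per-subscriber backlog stands in for the durable delivery a real MOM broker provides.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a durable topic: one backlog per subscriber."""

    def __init__(self):
        self._backlog = {}

    def subscribe(self, name):
        self._backlog.setdefault(name, deque())

    def publish(self, sender, change):
        # Every subscriber except the sender gets a copy, online or not.
        for name, queue in self._backlog.items():
            if name != sender:
                queue.append(change)

    def drain(self, name):
        queue = self._backlog[name]
        while queue:
            yield queue.popleft()


class Node:
    def __init__(self, name, topic, master=False, slave=False):
        self.name, self.topic = name, topic
        self.master, self.slave = master, slave
        self.config = {}          # stands in for the node's own data directory
        if slave:
            topic.subscribe(name)

    def change(self, key, value):
        """Configuration change made via REST or the User Interface."""
        if not self.master:
            raise PermissionError(f"{self.name} does not accept direct changes")
        self.config[key] = value
        self.topic.publish(self.name, (key, value))

    def sync(self):
        """Apply every change queued while this node was busy or down."""
        for key, value in self.topic.drain(self.name):
            self.config[key] = value
```

Because the backlog is kept per subscriber, a Slave that was down simply finds its pending changes waiting when `sync()` runs again, which is the behaviour described in point 4.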



Improved Clustering in Action

In the following we provide a few additional technical details on our solution, describing a deployment which we use for our tests (as such, it is designed to show every possible combination rather than to be a best practice for deployment). We will refer to the following picture.






This deployment is composed of:
  • A pure Master GeoServer instance, which can only send events to the topic. It cannot act as a Slave
  • A set of GeoServer instances which can work as both Master and Slave. These instances can send and receive messages to/from the topic. They can work as Masters (sending messages to other subscribers) as well as Slaves (these instances are also subscribers of the topic).
  • A set of pure Slave GeoServer instances, which can only receive messages from the topic.
  • A set of MOM brokers, so that each GeoServer instance is configured with a set of available brokers (failover). Each broker uses the shared DB for persistence, so that if a broker fails for some reason, messages can still be written to and read from the shared database.
We are now going to illustrate, step by step, how to publish a layer from a GeoTiff file using the GeoServer User Interface of the Pure Master instance. The resulting layer will be published on all the active GeoServers.

Manually create the GeoTiff store using the User Interface:




Publish the layer





Click save and check results on the clients




Now check the result using the LayerPreview:



As expected, when the pure Master is used to publish the GeoTiff file, the resulting layer is published automatically on the Slave instances with no intervention. There is one thing to notice: we are not moving data around, only its configuration (styles included), since we assume that all instances see the same resources at the same absolute paths. This is common in distributed and clustered setups where resources are shared among multiple servers, for example via network storage.


Conclusions
If you are responsible for administering a series of GeoServer instances and/or you are publishing lots of data in real time, then this extension is a good fit for your organization.

In case you are interested in test-driving this extension in your own setup, you might want to know that we are going to provide it free of charge to clients who subscribe to our GeoServer Professional Services for 2012, as well as to our partners. Contact us if you are interested!

Regards,
the GeoSolutions Team.