GeoServer on steroids at FOSS4G Europe 2014
Post on 14-Sep-2014
GeoServer on steroids All you wanted to know about how to make GeoServer faster
but you never asked (or you did and no one answered)
Ing. Andrea Aime, GeoSolutions
Ing. Simone Giannecchini, GeoSolutions
FOSS4G Europe 2014, Bremen 14th-17th July 2014
GeoSolutions
Founded in Italy in late 2006
Expertise • Image Processing, GeoSpatial Data Fusion
• Java, Java Enterprise, C++, Python
• JPEG2000, JPIP, Advanced 2D visualization
Supporting/Developing FOSS4G
MapStore, GeoServer
GeoNetwork, GeoBatch
Clients
Public Agencies
Private Companies
http://www.geo-solutions.it
FOSS4G Europe 2014, Bremen 14th-17th July 2014
Summary
Contents
Preparing raster data and Preparing vector data
Optimizing styling
Output tuning
Tiling and caching
Resource control
Deploy configurations
Two slide decks in a row
A “short” one, fitting the presentation time
A detailed one following, 90+ slides worth of details
Preparing raster inputs
Raster Data CheckList
Objectives
Fast extraction of a subset of the data
Fast extraction of the desired resolution
Check-list
Avoid having to open a large number of files per request
Avoid parsing of complex structures and slow compressions
Get to know your bottlenecks
CPU vs Disk Access Time vs Memory
Experiment with
Format, compression, different color models, tile size, overviews, GeoServer configuration
Peculiar Formats
PNG/JPEG direct serving
Bad formats (especially in Java)
No tiling (or rarely supported)
Chew a lot of memory and CPU for decompression
Mitigate with external overviews
Any input ASCII format (GML grid, ASCII grid)
JPEG2000
Extensible and rich, not (always) fast, can be difficult to tune for performance (might require specific encoding options)
MrSID (can work, needs tuning)
ECW (licensing issues)
GeoTiff for the win
To remember: GeoTiff is a swiss knife
But you don’t want to cut a tree with it!
Tremendously flexible, good fit for most (not all) use cases
BigTiff pushes the GeoTiff limits farther
Use GeoTiff when
Overviews and Tiling stay within 4GB
No additional dimensions
Consider BigTiff for very large files (> 4 GB)
Support for tiling
Support for Overviews
Can be inefficient with very large files + small tiling
GeoTiff preparation
(Optional) Use gdalwarp to transform the data into the most used output reference system (mind, any reprojection ruins the data a bit)
Use gdal_translate to add inner tiling and fix any issues with the coordinate reference system
Add compression options if your disks are small/slow/not local (consider JPEG compression with YCBCR color interpretation for photos, LZW/Deflate for scientific data)
Use gdaladdo to add internal overviews (remember to replicate the compression settings here)
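The preparation steps above can be sketched as a small Python helper that only builds the GDAL command lines, without running them. The file names, the target SRS, the 512 block size and the overview levels are illustrative assumptions; the creation options (TILED, BLOCKXSIZE, BLOCKYSIZE, COMPRESS, PHOTOMETRIC) and the COMPRESS_OVERVIEW / PHOTOMETRIC_OVERVIEW config options are standard GDAL GeoTiff options.

```python
import shlex


def geotiff_prep_commands(src, dst, target_srs=None, photo=True,
                          block=512, levels=(2, 4, 8, 16)):
    """Build (not run) the GDAL command lines for the preparation steps.

    All file names and default values here are illustrative assumptions.
    """
    cmds = []
    current = src
    if target_srs:
        # Optional step: reproject once, up front, to the most used output SRS
        warped = "warped.tif"  # hypothetical intermediate file
        cmds.append(["gdalwarp", "-t_srs", target_srs, current, warped])
        current = warped

    # JPEG + YCbCr for photos, Deflate for scientific data
    compress = "JPEG" if photo else "DEFLATE"
    translate = ["gdal_translate",
                 "-co", "TILED=YES",
                 "-co", f"BLOCKXSIZE={block}",
                 "-co", f"BLOCKYSIZE={block}",
                 "-co", f"COMPRESS={compress}"]
    if photo:
        translate += ["-co", "PHOTOMETRIC=YCBCR"]
    translate += [current, dst]
    cmds.append(translate)

    # Internal overviews, replicating the compression settings
    addo = ["gdaladdo", "-r", "average",
            "--config", "COMPRESS_OVERVIEW", compress]
    if photo:
        addo += ["--config", "PHOTOMETRIC_OVERVIEW", "YCBCR"]
    addo += [dst] + [str(level) for level in levels]
    cmds.append(addo)
    return cmds


for cmd in geotiff_prep_commands("in.tif", "out.tif", target_srs="EPSG:3857"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to review them before running them on a large dataset.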
Possible structures
Single GeoTiff with internal tiling and overviews (GeoTiff < 2GB, BigTiff < 20-50GB)
Mosaic of GeoTiff, each one with internal tiling and overviews (< 500GB, not too many files)
Image pyramid (> 500GB): each level has a lower resolution than the previous one, and each level is made of tiles with inner tiling but no overviews (it is also possible to mix the two approaches, using fewer levels together with internal overviews)
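The size thresholds above can be read as a simple decision rule. The sketch below encodes them in Python; the function name is hypothetical, and the 50 GB BigTiff bound is an assumption that takes the upper end of the 20-50GB range quoted above.

```python
def suggest_raster_layout(size_gb, bigtiff_limit_gb=50):
    """Suggest a raster layout from the dataset size in GB.

    Thresholds follow the list above; bigtiff_limit_gb is an assumption
    (the slide gives a 20-50GB range for a single BigTiff).
    """
    if size_gb < 2:
        return "single GeoTiff with internal tiling and overviews"
    if size_gb < bigtiff_limit_gb:
        return "single BigTiff with internal tiling and overviews"
    if size_gb < 500:
        return "mosaic of GeoTiff files, each with internal tiling and overviews"
    return "image pyramid"


for size in (1, 40, 400, 4000):
    print(size, "GB ->", suggest_raster_layout(size))
```

These are rules of thumb, so the boundaries deserve testing against the actual data and hardware.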
Choosing Formats and Layouts
Use ImageMosaic when:
A single file gets too big (inefficient seeks, too much metadata to read, etc..)
Multiple Dimensions (time, elevation, others..)
Avoid mosaics made of many very small files
Single granules can be large
Use Tiling + Overviews + Compression on granules
Use ImagePyramid when:
Tremendously large dataset
Too many files / too large files
Need to serve at all scales
Especially low resolution
For single granules (< 2 GB) GeoTiff is generally a good fit
Proper Mosaic Preparation
Optimize files as if you were serving them individually
Keep a balance between number and dimensions of granules
If memory is scarce: USE_JAI_IMAGEREAD to true, ALLOW_MULTITHREADING to false*
Otherwise: USE_JAI_IMAGEREAD to false, ALLOW_MULTITHREADING to true
(Load data from different granules in parallel)
Multidimensional mosaics
Use Cases:
MetOc data (support for time, elevation)
Data with additional independent dimensions
Suggestions
Use ImageMosaic
Use a DBMS for indexing granules
Use File Name based property collectors to turn properties into DB rows attributes
Filter by time, elevation and other attributes via OGC and CQL filters
Check detailed deck for details!
Proper Pyramid Preparation
Use gdal_retile for creating the pyramid
Prepare the list of tiles to be retiled
Create the pyramid with GDAL retile (grab a coffee!)
Chunks should not be too small (here 2048x2048), if you go larger, use also inner tiling
If the input dataset is huge use the useDirForEachRow option
Too many files in a dir is bad practice
Make sure the number of levels is consistent with usage
Too few levels: bad performance at high scale denominators (zoomed out)
GeoServer config: same as for the mosaic
Proper GeoServer Coverage Options Configuration
Make sure native JAI is installed
Install the TurboJPEG extension
Enable JAI Mosaicking native acceleration
Give JAI enough memory
Don’t raise JAI memory Threshold too high
Rule of thumb: use 2 X #Core Tile Threads (check next slide)
Play with tile Recycling against your workflows (might help, might not)
Proper GeoServer Coverage Options Configuration
Multithreaded Granule Loading
Allows fine tuning of multithreading for ImageMosaic
Orthogonal to JAI Tile Threads
Rule of Thumb: use 2 X #Core Tile Threads
Perform testing to fine tune depending on layer configuration as well as on typical requests
ImageIO Cache threshold
decide when we switch to disk cache (very large WCS requests)
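The "2 X #Core" rule of thumb mentioned above is simple arithmetic. The sketch below computes a starting value in Python; the function name is hypothetical and the result is only a starting point for the testing the slide recommends.

```python
import os


def suggested_thread_count(cores=None, factor=2):
    """Starting value for tile or granule loading threads: factor x cores.

    When cores is not given, use what the OS reports (falls back to 1).
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return factor * cores


print("Suggested threads on this machine:", suggested_thread_count())
```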
Preparing vector inputs
Vector data checklist
What do we want from vector data:
Binary data
No complex parsing of data structures
Fast extraction of a geographic subset
Fast filtering on the most commonly used attributes
Choosing a format
Slow formats
WFS
GML
DXF
Good formats, local and indexable
Shapefile
Directory of shapefiles
SDE
Spatial databases: PostGIS, Oracle Spatial, DB2, MySQL*, SQL server*
DBMS Checklist
Rich support for complex native filters
Use connection pooling
Validate connections (with proper pooling)
Table Clustering
Spatial Indexing
Spatial Indexing
Spatial Indexing
Alphanumeric Indexing
Alphanumeric Indexing
Alphanumeric Indexing
Did we mention indexes?
Shapefile preparation
Remove .qix file if present
If there are large DBF attributes that are not in use, get rid of them using ogr2ogr, e.g.: ogr2ogr -select FULLNAME,MTFCC arealm.shp tl_2010_08013_arealm.shp
If on Linux, enable memory mapping: it is faster and more scalable (but will kill Windows)
Shapefile filtering
Stuck with shapefiles and have scale dependent rules like the following?
Show highways first
Show all streets when zoomed in
Use ogr2ogr to build two shapefiles, one with just the highways, one with everything, and build two layers, e.g.:
ogr2ogr -sql "SELECT * FROM tl_2010_08013_roads WHERE MTFCC in ('S1100', 'S1200')" primaryRoads.shp tl_2010_08013_roads.shp
Or hire us to develop non-spatial indexing for shapefile!
PostGIS specific hints
PostgreSQL out of the box configured for very small hardware: http://wiki.postgresql.org/wiki/Performance_Optimization
Make sure to run ANALYZE after data imports (updates optimizer stats)
As usual, avoid large joins in SQL views, consider materialized views
If the dataset is massive, CLUSTER on the spatial index:
http://postgis.refractions.net/documentation/manual-1.3/ch05.html
PostGIS specific hints
Careful with prepared statements (bad performances)
USE CASE: the layer’s style displays the whole layer in a single shot (no scale dependencies); prepared statements will slow down execution
EXPLANATION: PostGIS will choose to use the spatial index in all cases, which makes retrieving the full data set 2-4 times slower than a sequential scan
COUNTERMEASURE: Not using prepared statements allows PostGIS to figure out a suitable plan based on the request bbox instead (assuming someone ran "vacuum analyze" on the database to update the index statistics and, of course, provided there is a spatial index to start with)
Connection Pooling Tricks
Connection pool size should be proportional to the number of concurrent requests you want to serve (obvious no?)
Activate connection validation
Mind networking tools that might cut connections sitting idle (yes, your server is not always busy): they might cut the connection in “bad” ways, leaving a 10 minute timeout before the pool realizes that the TCP connection attempt has given up
Read more at http://geoserver.geo-solutions.it/edu/en/adv_gsconfig/db_pooling.html
Optimize styling
Use scale dependencies
Never show too much data
the map should be readable, not a graphic blob. Rule of thumb: 1000 features max in the display, have labels show up when zoomed in
Show details as you zoom in
Eagerly add MinScaleDenominator to your SLD rules
Add more expensive rendering when there are fewer features
Key to get both a good looking and fast map
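To pick values for MinScaleDenominator and MaxScaleDenominator it helps to know the scale denominator of each zoom level. The sketch below assumes a Web Mercator tile grid with 256 pixel tiles (an assumption, the slides do not name a gridset) and the OGC standardized rendering pixel size of 0.28 mm used by SLD.

```python
# OGC standardized rendering pixel size used by SLD scale denominators (0.28 mm)
STANDARD_PIXEL_SIZE_M = 0.00028
# Metres per pixel at zoom 0 for a Web Mercator grid with 256 px tiles (assumed grid)
ZOOM0_RESOLUTION_M = 156543.03392804097


def scale_denominator(zoom):
    """Scale denominator of a Web Mercator zoom level (at the equator)."""
    resolution = ZOOM0_RESOLUTION_M / (2 ** zoom)
    return resolution / STANDARD_PIXEL_SIZE_M


for zoom in range(0, 19, 3):
    print(f"zoom {zoom:2d}: 1:{scale_denominator(zoom):,.0f}")
```

A rule meant to appear only from a given zoom level inwards would then use a MaxScaleDenominator slightly above that level's value.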
Labeling
Labeling conflict resolution is expensive, limit it to the innermost zoom levels
Halo is important for readability, but adds significant overhead
Careful with maxDisplacement, makes for various label location attempts
FeatureTypeStyle
GeoServer uses SLD FeatureTypeStyle objects as Z layers for painting
Each one allocates its own rendering surface (which can use a lot of memory), use as few as possible
Use translucency sparingly
Translucent display is expensive, use it sparingly
e.g. translucent fill <CssParameter name="fill-opacity">0.5</CssParameter>
Tiling & Caching
Tile caching with GeoWebCache
Tile oriented maps, fixed zoom levels and fixed grid
Useful for stable layers, backgrounds
Protocols: WMTS, TMS, WMS-C, Google Maps/Earth, VE
Speedup compared to dynamic WMS: 10 to 100 times, assuming tiles are already cached (whole layer pre-seeded)
Suitable for:
Mostly static layer
No/few dynamic parameters (CQL filters, SLD params, SQL query params, time/elevation, format options)
Embedded GWC advantage
No double encoding when using meta-tiling, faster seeding
Space considerations
Seeding Colorado, assuming 8 cores, one layer, 0.1 sec 756x756 metatile, 15KB for each tile
Do yours: http://tinyurl.com/3apkpss
Not enough disk space? Set a disk quota
Zoom level    Tile count     Size (GB)    Time to seed (hours)    Time to seed (days)
13                58,377             1                       0                      0
14               232,870             4                       0                      0
15               929,475            14                       0                      0
16             3,713,893            57                       1                      0
17            14,855,572           227                       6                      0
18            59,396,070           906                      23                      1
19           237,584,280         3,625                      92                      4
20           950,273,037        14,500                     367                     15
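The seeding figures can be reproduced with a few lines of arithmetic. The Python sketch below assumes that a metatile holds 3x3 = 9 tiles and that all 8 cores seed in parallel; both assumptions are inferred from the numbers, not stated on the slide. The sizes match the table when computed with 16 KB per tile and binary gigabytes, so the default of 15 KB quoted in the text gives slightly smaller values.

```python
def seeding_estimate(tile_count, tile_kb=15, tiles_per_metatile=9,
                     secs_per_metatile=0.1, cores=8):
    """Return (size in GB, hours to seed, days to seed) for one zoom level.

    tiles_per_metatile=9 (3x3) and fully parallel seeding on all cores
    are assumptions inferred from the table values.
    """
    size_gb = tile_count * tile_kb / 1024 ** 2
    metatiles = tile_count / tiles_per_metatile
    hours = metatiles * secs_per_metatile / cores / 3600
    return size_gb, hours, hours / 24


# Tile counts taken from the table for the last three zoom levels
for zoom, tiles in [(18, 59_396_070), (19, 237_584_280), (20, 950_273_037)]:
    size_gb, hours, days = seeding_estimate(tiles, tile_kb=16)
    print(f"zoom {zoom}: {size_gb:,.0f} GB, {hours:.0f} hours, {days:.0f} days")
```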
More Tweaks
Client-side caching of tiles
Does not work with browsers in private mode
<expireClientsList>
  <expirationRule minZoom="0" expiration="7200" />
  <expirationRule minZoom="10" expiration="600" />
</expireClientsList>
More Tweaks
Use the right formats
JPEG for background data (e.g. orthophotos)
PNG8 + precomputed palette for background data (e.g. orthophotos)
PNG8 full for overlays with transparency
The format also impacts the disk space needed (as well as the generation time)!
Check this blog post
Resource control
Resource Limits
Limit the amount of resources dedicated to an individual request
Improve fairness between requests, by preventing individual requests from hijacking the server and/or running for a very long time
EXTREMELY IMPORTANT in production environment
WHEN TO TWEAK THEM?
Frequent OOM Errors despite plenty of RAM
Requests that keep running for a long time (e.g. CPU usage peaks even if no requests are being sent)
DB connections being killed by the DBMS while in use (ok, you might also need to talk to the DBA..)
WMS request limits
Max memory per request: avoids overly large requests and makes it possible to size the server memory (max concurrent requests * max memory per request)
Max time per request: avoids requests taking too much time (e.g. when using a custom style provided with dynamic SLD in the request)
Max errors: best effort renderer, but handling errors takes time
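The memory sizing rule above (max concurrent requests * max memory per request) can be sketched in Python as follows. The function names are hypothetical, and the 4 bytes per pixel figure is an assumption for an RGBA rendering buffer, used only to give a feel for what a per-request limit allows.

```python
def worst_case_render_memory_mb(max_concurrent_requests, max_memory_per_request_mb):
    """Upper bound on rendering memory: concurrent requests x per-request limit."""
    return max_concurrent_requests * max_memory_per_request_mb


def image_buffer_mb(width, height, bytes_per_pixel=4):
    """Memory for one rendering surface; 4 bytes per pixel (RGBA) is an assumption."""
    return width * height * bytes_per_pixel / 1024 ** 2


# Example values: 16 parallel GetMap requests, each capped at 64 MB
print(worst_case_render_memory_mb(16, 64), "MB worst case")
print(image_buffer_mb(2048, 2048), "MB for a single 2048x2048 surface")
```

Styles with several FeatureTypeStyle elements allocate more than one surface, so the per-request limit needs some headroom.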
WFS request limits
Max features returned, configured as a global limit
Return feature bbox: reduce amount of generated GML
Per layer max feature count
WCS request limits
Control flow
Control how many requests are executed in parallel, queue others:
Increase throughput
Control memory usage
Enforce fairness
More info here
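The idea of running a fixed number of requests in parallel and queuing the others can be illustrated with a semaphore. The Python sketch below is only an illustration of the concept, not the implementation used by the GeoServer control flow module; all names in it are hypothetical.

```python
import threading
import time


def run_with_limit(n_requests, max_parallel, work_seconds=0.01):
    """Run n_requests fake requests, at most max_parallel at a time.

    Returns (peak concurrency observed, number of completed requests).
    """
    gate = threading.BoundedSemaphore(max_parallel)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0, "done": 0}

    def handle_request():
        with gate:  # waits (queues) while max_parallel requests are running
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(work_seconds)  # stands in for the actual rendering work
            with lock:
                state["running"] -= 1
                state["done"] += 1

    threads = [threading.Thread(target=handle_request) for _ in range(n_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"], state["done"]


peak, done = run_with_limit(n_requests=20, max_parallel=4)
print(f"completed {done} requests, never more than {peak} at once")
```

Every request still completes, but memory use is bounded by the number of requests allowed through the gate at once.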
$GEOSERVER_DATA_DIR/controlflow.properties
# don't allow more than 16 GetMap requests in parallel
ows.wms.getmap=16
Control flow
17%
Resource Limits
They need to be tweaked together with Control Flow
Limiting individual requests: Resource Limits
Limiting the number of parallel requests: Control Flow
When time is involved, make sure you take into account all the pieces of the response chain
E.g. with a limited rendering time for WMS on data coming from a DBMS, the items to take into account are:
LifeTime of DB Connection (usually long)
WaitTime for a new connection (we don’t want to queue requests at the connection pool when they are already eating memory!)
JVM and deploy configuration
Premise
The options discussed here are not going to help visibly if you did not prepare the data and the styles
They are finishing touches that can get performance up once the major data bottlenecks have been dealt with
Check “Running in production” instructions here
JVM settings
-server: enables the server JIT compiler
-Xms2048m -Xmx2048m: makes the JVM use two gigabytes of heap memory
-XX:+UseParallelOldGC -XX:+UseParallelGC: enables multi-threaded garbage collection, useful if you have more than two cores
-XX:NewRatio=2: informs the JVM there will be a high number of short lived objects
-XX:+AggressiveOpts: enables experimental optimizations that will be defaults in future versions of the JVM
Setup a local cluster
Oracle Java2D locks when drawing antialiased vectors
Limits scalability severely
Two options
Use OpenJDK, it’s slower at rendering but scales up well
Use Apache mod_proxy_balancer and set up one GeoServer instance every 2 to 4 cores
[Diagram: mod_proxy_balancer in front of three GeoServer instances]
Clustering advantage
66%
FOSS4G 2010 vector benchmarks (roads/buildings/isolines and so on, over the entire Spain)
GeoServer 2.2.x was benchmarked using Oracle JDK without local clustering
Marlin renderer
The OpenJDK Java2D renderer scales up, but it’s not super-fast when the load is small (1 request at a time)
Marlin-renderer to the rescue: https://github.com/bourgesl/marlin-renderer
Complex map, 10 parallel requests, different zoom levels have different details showing up
Upgrade!
Performance tends to go up version by version
Please do use a recent GeoServer version
FOSS4G 2010 vector benchmark with different versions of GeoServer
The End
Questions? andrea.aime@geo-solutions.it
simone.giannecchini@geo-solutions.it
GeoServer on steroids, the detailed version All you wanted to know about how to make GeoServer faster
but you never asked (or you did and no one answered)
Ing. Andrea Aime, GeoSolutions
Ing. Simone Giannecchini, GeoSolutions
Preparing raster inputs
Raster Data CheckList
Objectives
Fast extraction of a subset of the data
Fast extraction of overviews
Check-list
Avoid having to open a large number of files per request
Avoid parsing of complex structures
Avoid on-the-fly reprojection (if possible)
Get to know your bottlenecks
CPU vs Disk Access Time vs Memory
Experiment with
Format, compression, different color models, tile size, overviews, configuration (in GeoServer of course)
Problematic Formats
PNG/JPEG direct serving
Bad formats (especially in Java)
No tiling (or rarely supported)
Chew a lot of memory and CPU for decompression
Mitigate with external overviews
NetCDF/grib1 for large images (large width and height)
Complex formats (often with many subdatasets)
Often contains un-calibrated data
Must usually use multiple dimensions
Use ImageMosaic
Must usually massage the data before serving
e.g. transpose X,Y,
Problematic Formats
Ascii Grid, GTOPO30, IDRISI and similar formats are bad
ASCII formats are bad
No internal tiling, no compression, no internal overviews
JPEG2000 (with Kakadu)
Extensible and rich, not (always) fast
Can be difficult to tune for performance (might require specific encoding options)
ECW and MrSID
Very fast on some types of data
Needs to be tuned to be performant
Choosing Formats and Layouts
To remember: GeoTiff is a swiss knife
But you don’t want to cut a tree with it!
Tremendously flexible, good fit for most (not all) use cases
BigTiff pushes the GeoTiff limits farther
Single File VS Mosaic VS Pyramids
Use single GeoTiff when
Overviews and Tiling stay within 4GB
No additional dimensions
Consider BigTiff for very large files (> 4 GB)
Support for tiling
Support for Overviews
Can be inefficient with very large files + small tiling
Choosing Formats and Layouts
Use ImageMosaic when:
A single file gets too big (inefficient seeks, too much metadata to read, etc..)
Multiple Dimensions (time, elevation, others..)
Avoid mosaics made of many very small files
Single granules can be large
Use Tiling + Overviews + Compression on granules
Use ImagePyramid when:
Tremendously large dataset
Too many files / too large files
Need to serve at all scales
Especially low resolution
For single granules (< 2 GB) GeoTiff is generally a good fit
Choosing Formats and Layouts
Examples:
Small dataset: single 2GB GeoTiff file
Medium dataset: single 40GB BigTiff
Large dataset: 400GB mosaic made of 10GB BigTiff files
Extra large: 4TB of imagery, built as pyramid of mosaics of BigTiff/GeoTiff files to keep the file count low
GeoTiff preparation
STEP 0: get to know your data
gdalinfo is your friend
CheckList
Missing CRS: add a .prj file, or fix with gdal_translate
Missing georeferencing: add a World File, or fix with gdal_translate
Bad tiling: fix with gdal_translate
Missing overviews: use gdaladdo
Compression: use gdal_translate
GeoTiff preparation
STEP 1: fix and optimize with gdal_translate
CRS and GeoReferencing
gdal_translate -a_srs "EPSG:4326" -a_ullr -180 0 -90 90 in.tif out.tif
Inner Tiling
gdal_translate -co "TILED=YES" -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" in.tif out.tif
Check also GeoTiff driver creation options here
STEP 2: add overviews with gdaladdo
Leverages TIFF support for multipage files and reduced resolution pages
gdaladdo -r cubic output.tif 2 4 8 16 32 64 128
Choose the resampling algorithm wisely
Choose the tile size and compression wisely (use GDAL_TIFF_OVR_BLOCKSIZE)
Consider external overviews
GeoTiff preparation
Compression
Consider when disk speed/space is an issue
Control it with gdal_translate and creation options
GeoTiff tiles can be compressed
LZW/Deflate are good for lossless compression
JPEG is good for visually lossless compression
From experience
Use LZW/Deflate on geophysical data (DEM, acquisitions)
Use JPEG (visually lossless) with the photometric interpretation set to YCbCr for RGB imagery
GeoTiff preparation
Test on a GeoTIFF image with (and without) the following features:
• Tiling
• Overview
• Compression
Results:
• Overviews increase performance by more than 6 times compared to an image without them.
• An uncompressed image increases GeoServer performance by 70%.
NOTE: All the tests in this section were performed on a 4 core PC with 16 GB RAM and GeoServer 2.4.
GeoTiff preparation
Test on a JPEG compressed GeoTIFF image with (and without) TurboJPEG acceleration:
Results:
• TurboJPEG gives a 20% better response when overviews are present, and 6% in the other cases.
Time, Elevation and other dimensions
Use Cases:
MetOc data (support for time, elevation)
Data with additional independent dimensions
WorkFlow
Split in multiple GeoTiff files
Optimize the files individually
Use ImageMosaic
Use a DBMS for indexing granules
Use File Name based property collectors to turn properties into DB rows attributes
Filter by time, elevation and other attributes via OGC and CQL filters
Check back up slides for more info!
Time, Elevation and other dimensions
Indexing multiple dimensions with DB support (video here)
datastore.properties
stringregex.properties
timeregex.properties
indexer.properties
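What a file name based property collector does can be sketched in a few lines of Python: regular expressions, like those kept in timeregex.properties and stringregex.properties, pull the dimension values out of each granule name so they can be stored as attributes of the index row. The file naming scheme and both regular expressions below are hypothetical examples, not taken from the slides.

```python
import re
from datetime import datetime

# Hypothetical naming scheme: <variable>_<elevation>_<timestamp>.tif
TIME_REGEX = re.compile(r"[0-9]{8}T[0-9]{6}Z")
ELEVATION_REGEX = re.compile(r"(?<=_)[0-9]{4}\.[0-9]{3}(?=_)")


def collect_properties(filename):
    """Extract time and elevation from a granule file name (None when missing)."""
    time_match = TIME_REGEX.search(filename)
    elevation_match = ELEVATION_REGEX.search(filename)
    return {
        "location": filename,
        "time": (datetime.strptime(time_match.group(0), "%Y%m%dT%H%M%SZ")
                 if time_match else None),
        "elevation": float(elevation_match.group(0)) if elevation_match else None,
    }


row = collect_properties("temp_0010.000_20140714T120000Z.tif")
print(row)
```

Each dictionary corresponds to one row of the granule index, which is what makes filtering by time and elevation a plain attribute query.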
Proper Mosaic Preparation
ImageMosaic stitches single granules together with basic processing
Filtered selection
Overviews/Decimation on read
Over/DownSampling in memory
ColorMask (optional)
Mosaic/Stitch
ColorMask again (optional)
Optimize files as if you were serving them individually
Keep a balance between number and dimensions of granules
Proper Mosaic Configuration
STEP 0: Configure Coverage Access (see slide 34)
STEP 1: Configure Mosaic Parameters
ALLOW_MULTITHREADING
Load data from different granules in parallel
Needs USE_JAI_IMAGEREAD set to false (Immediate Mode)
Use a proper Tile Size
In-memory processing, must not be too large
Disk tiling should be larger
If memory is scarce: USE_JAI_IMAGEREAD to true, ALLOW_MULTITHREADING to false*
Otherwise: USE_JAI_IMAGEREAD to false, ALLOW_MULTITHREADING to true
Advanced Mosaic Configuration
Optional (Advanced): Configure Mosaic Parameters Directly
Caching
Load the index in memory (using JTS STRtree)
Super fast granule lookup, good for shapefiles
Bad if you have additional dimensions to filter on
Based on Soft References, controlled via the Java switch SoftRefLRUPolicyMSPerMB
ExpandToRGB
Expand colormapped imagery to RGB in memory
Trade performance for quality
SuggestedSPI
Default ImageIO Decoder class to use
Don’t touch unless expert
Proper Mosaic Configuration
Test on a Mosaic Image
USE_JAI_IMAGEREAD (IR) set to true and ALLOW_MULTITHREADING (MT) set to false.
ALLOW_MULTITHREADING set to true and USE_JAI_IMAGEREAD set to false.
Results:
• The use of MULTITHREADING gives a 30% better performance.
Proper Pyramid Preparation
Use gdal_retile for creating the pyramid
Prepare the list of tiles to be retiled
Create the pyramid with GDAL retile (grab a coffee!)
Chunks should not be too small (here 2048x2048)
Too many files is bad anyway
Use internal Tiling for Larger chunks size
If the input dataset is huge use the useDirForEachRow option
Too many files in a dir is bad practice
Make sure the number of levels is consistent with usage
Too few levels: bad performance at high scale denominators (zoomed out)
Proper Pyramid Configuration
STEP 0: Configure Coverage Access (see slide 34)
STEP 1: Configure Pyramid Parameters
ImagePyramid relies on ImageMosaic
ALLOW_MULTITHREADING
Load data from different granules in parallel
Needs USE_JAI_IMAGEREAD set to false (Immediate Mode)
Use a proper Tile Size
In-memory processing, must not be too large
Disk tiling should be larger
If memory is scarce: USE_JAI_IMAGEREAD to true, ALLOW_MULTITHREADING to false*
Otherwise: USE_JAI_IMAGEREAD to false, ALLOW_MULTITHREADING to true
Proper Pyramid Configuration
Optional (Advanced): Configure Mosaic Parameters Directly
Caching
Load the index in memory (using JTS STRtree)
Super fast granule lookup, good for shapefiles
Bad if you have additional dimensions to filter on
Based on Soft References, controlled via the Java switch SoftRefLRUPolicyMSPerMB
ExpandToRGB
Expand colormapped imagery to RGB in memory
Trade performance for quality
SuggestedSPI
Default ImageIO Decoder class to use
Don’t touch unless expert
Proper GDAL Formats Configuration
Fix Missing/Improper CRS with PRJ or coverage config
Fix Missing GeoReferencing with World File
Make sure GDAL_DATA is properly configured
Use a proper Tile Size
In-memory processing, must not be too large
Fundamental for striped data! JNI overhead
Disk tiling should be larger
If memory is scarce: USE_JAI_IMAGEREAD to true, USE_MULTITHREADING to true*
Otherwise: USE_JAI_IMAGEREAD to false (USE_MULTITHREADING is ignored)
Proper GDAL Formats Configuration
Test on an ECW image with and without enabling ImageRead:
• ECW is a GDAL supported format.
Results:
• If ImageRead is not used, performance increases by more than 1.5 times.
Proper JPEG2000 Kakadu Configuration
Fix Missing/Improper CRS with PRJ or coverage config
Fix Missing GeoReferencing with World File
Make sure Kakadu dll/so is properly loaded
Use a proper Tile Size
In-memory processing
Must not be too large
Disk tiling should be larger
If memory is scarce:
USE_JAI_IMAGEREAD to true
USE_MULTITHREADING to true*
Otherwise
USE_JAI_IMAGEREAD to false
USE_MULTITHREADING is ignored
Proper GeoServer Coverage Options Configuration
Make sure native JAI and ImageIO are installed
Enable ImageIO native acceleration
Enable JAI Mosaicking native acceleration
Give JAI enough memory
Don’t raise JAI memory Threshold too high
Rule of thumb: use 2 X #Core Tile Threads (check next slide)
Enable Tile Recycling only on trunk
Enable Tile Recycling if memory is not a problem
Proper GeoServer Coverage Options Configuration
Multithreaded Granule Loading
Allows fine tuning of multithreading for ImageMosaic
Orthogonal to JAI Tile Threads
Rule of Thumb: use 2 X #Core Tile Threads
Perform testing to fine tune depending on layer configuration as well as on typical requests
ImageIO Cache threshold
decide when we switch to disk cache (very large WCS requests)
Reprojection Performance Vs Quality
GeoServer (since 2.1.x) reprojects raster data using a piecewise-linear algorithm
The area is divided in rectangular blocks, each having its own affine transform
The transformation between the full trigonometric expressions and the linear ones is driven by a tolerance, default value is 0.333
Larger value will make reprojection faster, but lower the quality
-Dorg.geotools.referencing.resampleTolerance=0.5
Preparing vector inputs
Vector data checklist
What do we want from vector data:
Binary data
No complex parsing of data structures
Fast extraction of a geographic subset
Fast filtering on the most commonly used attributes
Choosing a format
Slow formats
WFS
GML
DXF
Good formats, local and indexable
Shapefile
Directory of shapefiles
SDE
Spatial databases: PostGIS, Oracle Spatial, DB2, MySQL*, SQL server*
Shapefiles vs DBMS
Speed comparison vs spatial extent depicted:
Shapefile: very fast when rendering the full dataset
Database: faster when extracting a small subset of a very large data set
Shapefile
No attribute indexing, avoid if filtering on attributes is important (filtering == reading less data, not applying symbols)
Database
Rich support for complex native filters
Use connection pooling (preferably via JNDI)
Validate connections (with proper pooling)
DBMS Checklist
Rich support for complex native filters
Use connection pooling (preferably via JNDI)
Validate connections (with proper pooling)
Spatial Indexing
Spatial Indexing
Spatial Indexing
Alphanumeric Indexing
Alphanumeric Indexing
Alphanumeric Indexing
Table Clustering
Use views to remove unused attributes
Shapefile preparation
Remove .qix file if present, let GeoServer 2.1.x rebuild it (more efficient)
If there are large DBF attributes that are not in use, get rid of them using ogr2ogr, e.g.: ogr2ogr -select FULLNAME,MTFCC arealm.shp tl_2010_08013_arealm.shp
If on Linux, enable memory mapping: it is faster and more scalable (but will kill Windows)
Shapefile filtering
Stuck with shapefiles and have scale dependent rules like the following?
Show highways first
Show all streets when zoomed in
Use ogr2ogr to build two shapefiles, one with just the highways, one with everything, and build two layers, e.g.:
ogr2ogr -sql "SELECT * FROM tl_2010_08013_roads WHERE MTFCC in ('S1100', 'S1200')" primaryRoads.shp tl_2010_08013_roads.shp
Or hire us to develop non-spatial indexing for shapefile!
PostGIS specific hints
PostgreSQL out of the box configured for very small hardware: http://wiki.postgresql.org/wiki/Performance_Optimization
Make sure to run ANALYZE after data imports (updates optimizer stats)
As usual, avoid large joins in SQL views, consider materialized views
If the dataset is massive, CLUSTER on the spatial index:
http://postgis.refractions.net/documentation/manual-1.3/ch05.html
PostGIS specific hints
Careful with prepared statements (they can hurt performance)
USE CASE: when the layer's style displays the whole layer in a single shot (no scale dependencies), prepared statements will slow down execution
EXPLANATION: with prepared statements PostGIS will choose to use the spatial index in all cases, which makes retrieving the full data set 2-4 times slower than a sequential scan
COUNTERMEASURE: not using prepared statements allows PostGIS to pick a suitable plan based on the request bbox instead (assuming someone ran "vacuum analyze" on the database to update the index statistics and, of course, provided there is a spatial index to start with)
Optimize styling
Use scale dependencies
Never show too much data
The map should be readable, not a graphic blob. Rule of thumb: 1000 features max in the display
Labeling
Labeling conflict resolution is expensive, limit it to the innermost zoom levels
Halo is important for readability, but adds significant overhead
Careful with maxDisplacement, it triggers multiple label placement attempts
FeatureTypeStyle
GeoServer uses SLD FeatureTypeStyle objects as Z layers for painting
Each one allocates its own rendering surface (which can use a lot of memory), use as few as possible
Use translucency sparingly
Translucent display is expensive, use it sparingly
e.g. translucent fill <CssParameter name="fill-opacity">0.5</CssParameter>
Scale dependent rules
Too often forgotten or little used, yet very important:
Hide layers when too zoomed in (raster/vector example)
Progressively show details
Add more expensive rendering when there are fewer features
Key to any high performance / good looking map
Example
Hide as you zoom in
Add a MinScaleDenominator to the rule
This will make the layer disappear when zooming in past 1:75000 (towards 1:1)
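As a rough illustration of how a scale-dependent rule gets switched on and off, the sketch below mimics the SLD convention that a rule is active when the map scale denominator is at least MinScaleDenominator and strictly below MaxScaleDenominator. The function name and the sample scales are illustrative assumptions, not GeoServer code.

```python
def rule_applies(scale_denominator, min_scale=None, max_scale=None):
    """Return True if an SLD-style rule is active at the given map scale.

    Follows the SLD convention: the minimum is inclusive, the maximum is
    exclusive. A missing bound means "no limit" on that side.
    """
    if min_scale is not None and scale_denominator < min_scale:
        return False
    if max_scale is not None and scale_denominator >= max_scale:
        return False
    return True


# Hypothetical rule carrying MinScaleDenominator = 75000:
# it is active when zoomed out and switched off when zoomed in.
for scale in (500_000, 75_000, 10_000):
    print(scale, rule_applies(scale, min_scale=75_000))
```

Two rules with complementary bounds (one with a maximum, one with the same value as minimum) never overlap, which is what makes the "simple rendering zoomed out, complex rendering zoomed in" pattern work.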
Alternative rendering
Simple rendering at low scale (up to 1:2000)
More complex rendering when zoomed in (1:1999 and above)
Point symbols
• 600 lines of code for 6 different point types
• Painful…
Prepare data
-- add a column holding the icon file name for each point type
alter table pointlm add column image varchar;
update pointlm set image = 'shop_supermarket.p.16.png' where MTFCC = 'C3081' and (FULLNAME like '%Shopping%' or FULLNAME like '%Mall%');
update pointlm set image = 'peak.png' where MTFCC = 'C3022';
update pointlm set image = 'amenity_prison.p.20.png' where MTFCC = 'K1236';
update pointlm set image = 'museum.p.16.png' where MTFCC = 'K2165';
update pointlm set image = 'airport.p.16.png' where MTFCC = 'K2451';
update pointlm set image = 'school.png' where MTFCC = 'K2543';
update pointlm set image = 'christian3.p.14.png' where MTFCC = 'K2582';
update pointlm set image = 'gate2.png' where MTFCC = 'K3066';
Dynamic symbolizers
Output tuning
WMS output formats
JPEG: compression artifacts
PNG 8bit: color reduction
PNG 24bit: large size
Example output sizes: 23.8KB, 169.4KB, 66KB and 64KB, 27KB, 27KB
LibJPEG-Turbo WMS Output Format
GeoServer Extension
Leverages LibJPEG-Turbo to accelerate JPEG encoding
40% to 80% increase in throughput
Up to 40% decrease in average response times
Check our blog post here
Available Color Quantizers
Paletted images are lighter to move around!
Options: precompute vs. compute on-the-fly
Precomputed palettes are fast but ugly
ON/OFF transparency, no antialiasing
On-the-fly palette computation options
Octree: fast, supports ON/OFF transparency. Default for opaque images
Mediancut: slower, supports full transparency. Default for translucent images
Check this page and this one as well in the GeoServer doc
WFS output formats
[Chart: WFS output formats, dimension in MB]
HTTP GZip compression is transparent in GeoServer, make sure proxies keep it (or pay 10x price)
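To get a feel for why losing GZip along the way is so costly, the sketch below compresses a synthetic GML-like payload with Python's standard gzip module and reports the size ratio. The element names and coordinates are made up for illustration, and the ratio for real WFS responses will depend on the data.

```python
import gzip

# Synthetic, GML-like feature markup (element names are illustrative only)
FEATURE_TEMPLATE = (
    '<gml:featureMember><topp:roads fid="roads.{i}">'
    "<topp:name>Road {i}</topp:name>"
    "<topp:the_geom><gml:LineString><gml:coordinates>"
    "{x:.5f},{y:.5f} {x2:.5f},{y2:.5f}"
    "</gml:coordinates></gml:LineString></topp:the_geom>"
    "</topp:roads></gml:featureMember>"
)


def sample_gml(feature_count):
    """Build a fake feature collection as UTF-8 bytes."""
    members = []
    for i in range(feature_count):
        x = -105.0 + i * 0.001
        y = 40.0 + i * 0.0007
        members.append(
            FEATURE_TEMPLATE.format(i=i, x=x, y=y, x2=x + 0.01, y2=y + 0.01)
        )
    body = "<wfs:FeatureCollection>" + "".join(members) + "</wfs:FeatureCollection>"
    return body.encode("utf-8")


def gzip_ratio(payload):
    """Uncompressed size divided by gzip-compressed size."""
    return len(payload) / len(gzip.compress(payload))


payload = sample_gml(1000)
print(f"raw: {len(payload)} bytes, ratio: {gzip_ratio(payload):.1f}x")
```

XML formats are verbose and repetitive, so they compress well; a proxy that strips the compression makes the client download the full uncompressed size.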
Tiling & Caching
Tile caching with GeoWebCache
Tile oriented maps, fixed zoom levels and fixed grid
Useful for stable layers, backgrounds
Protocols: WMTS, TMS, WMS-C, Google Maps/Earth, VE
Speedup compared to dynamic WMS: 10 to 100 times, assuming tiles are already cached (whole layer pre-seeded)
Suitable for:
Mostly static layer
No (or few) dynamic parameters (CQL filters, SLD params, SQL query params, time/elevation, format options)
Embedded GWC advantage
No double encoding when using meta-tiling, faster seeding
Space considerations
Seeding Colorado, assuming 8 cores, one layer, 0.1 sec per 768x768 metatile, 15KB for each tile
Do yours: http://tinyurl.com/3apkpss
Not enough disk space? Set a disk quota
Zoom level   Tile count    Size (GB)   Time to seed (hours)   Time to seed (days)
13           58,377        1           0                      0
14           232,870       4           0                      0
15           929,475       14          0                      0
16           3,713,893     57          1                      0
17           14,855,572    227         6                      0
18           59,396,070    906         23                     1
19           237,584,280   3,625       92                     4
20           950,273,037   14,500      367                    15
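The figures in the table can be approximated with a few lines of arithmetic. The sketch below is a back-of-the-envelope estimator, not the spreadsheet linked above: it assumes 9 tiles per metatile (3x3), 0.1 seconds per metatile and 8 cores working in parallel. The default of 16KB per tile is an assumption picked because it appears to reproduce the size column when that column is read in GB (the slide text quotes 15KB). The function and parameter names are hypothetical.

```python
def seeding_estimate(tile_count, tile_kb=16, tiles_per_metatile=9,
                     metatile_seconds=0.1, cores=8):
    """Estimate disk space and seeding time for one zoom level.

    Returns a tuple (size_gb, hours, days).
    """
    # Disk space: every tile is stored individually
    size_gb = tile_count * tile_kb / 1024 / 1024
    # Time: tiles are rendered one metatile at a time, spread over all cores
    metatiles = tile_count / tiles_per_metatile
    hours = metatiles * metatile_seconds / cores / 3600
    return size_gb, hours, hours / 24


# Tile counts for a few zoom levels, taken from the table above
for zoom, tiles in ((16, 3_713_893), (18, 59_396_070), (20, 950_273_037)):
    size_gb, hours, days = seeding_estimate(tiles)
    print(f"zoom {zoom}: {size_gb:,.0f} GB, {hours:,.0f} h, {days:,.0f} days")
```

Each extra zoom level multiplies the tile count by roughly four, so the last two or three levels dominate both disk space and seeding time.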
More Tweaks
Client-side caching of tiles
Does not work with browsers in private mode
<expireClientsList>
  <expirationRule minZoom="0" expiration="7200" />
  <expirationRule minZoom="10" expiration="600" />
</expireClientsList>
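One way to read the rules above is sketched below, under the assumption that the rule with the highest minZoom not exceeding the requested zoom level wins. With that reading, zoom levels 0 to 9 are cached by clients for 7200 seconds and levels 10 and above for 600 seconds. The function name and the rule representation are hypothetical and only meant to illustrate the lookup.

```python
def client_expiration(zoom, rules):
    """Pick the client cache expiration (seconds) for a zoom level.

    rules is a list of (min_zoom, expiration_seconds) pairs. The rule with
    the highest min_zoom that is still <= zoom is selected (assumption).
    Returns None when no rule matches.
    """
    selected = None
    for min_zoom, expiration in sorted(rules):
        if zoom >= min_zoom:
            selected = expiration
    return selected


# Same values as the configuration fragment above
RULES = [(0, 7200), (10, 600)]

for zoom in (3, 9, 10, 18):
    print(zoom, client_expiration(zoom, RULES))
```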
More Tweaks
Use the right formats
JPEG for background data (e.g. orthophotos)
PNG8 + precomputed palette for background data (e.g. orthophotos)
PNG full for overlays with transparency
PNG8 full for overlays with transparency
Don’t compress things twice!
The format impacts also the disk space needed! (as well as the generation time)
Check this blog post
Resource control
Resource Limits
Improve reliability and stability by limiting the amount of resources dedicated to an individual request
Improve fairness between requests, by preventing individual requests from hijacking the server and/or running for a very long time
EXTREMELY IMPORTANT in production environment
WHEN TO TWEAK THEM?
Frequent OOM Errors despite plenty of RAM
Requests that keep running for a long time (e.g. CPU usage peaks even if no requests are being sent)
DB connections being killed by the DBMS while in use (ok, you might also need to talk to the DBA...)
Resource Limits
They need to be tweaked together with Control Flow
Limiting individual requests: Resource Limits
Limiting the number of parallel requests: Control Flow
When time is involved, make sure you take into account all pieces of the response chain
E.g., for limited rendering time in WMS on data coming from a DBMS, the items to take into account are
Lifetime of the DB connection (usually long)
Wait time for a new connection (we don't want to queue requests at the connection pool when they are already eating memory!)
WMS request limits
Max memory per request: avoids oversized requests and lets you size the server memory (max concurrent requests * max memory per request)
Max time per request: avoid requests taking too much time (e.g., using a custom style provided with dynamic SLD in the request)
Max errors: best effort renderer, but handling errors takes time
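The sizing rule behind the memory limit can be sketched as follows. The estimate assumes 4 bytes per pixel (full color plus transparency) and one rendering surface per FeatureTypeStyle, as noted in the styling slides; the function names and the example numbers are illustrative assumptions rather than GeoServer settings.

```python
def request_memory_bytes(width, height, surfaces=1, bytes_per_pixel=4):
    """Rough rendering memory needed by one GetMap request."""
    return width * height * bytes_per_pixel * surfaces


def worst_case_rendering_memory(max_request_bytes, max_concurrent_requests):
    """Memory to budget when every slot runs a request at the limit."""
    return max_request_bytes * max_concurrent_requests


# Hypothetical setup: 2048x2048 images, one FeatureTypeStyle,
# at most 16 GetMap requests rendering in parallel
per_request = request_memory_bytes(2048, 2048)
total = worst_case_rendering_memory(per_request, 16)
print(per_request // 2**20, "MiB per request,", total // 2**20, "MiB in total")
```

The total only holds if the number of parallel requests is actually capped, which is why the request limits need to go hand in hand with the control flow settings.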
WFS request limits
Max feature returned, configured as a global limit
Return feature bbox: reduce amount of generated GML
Per layer max feature count
WCS request limits
Control flow
Control how many requests are executed in parallel, queue others:
Increase throughput
Control memory usage
Enforce fairness
More info here
$GEOSERVER_DATA_DIR/controlflow.properties
# don't allow more than 16 GetMap requests in parallel
ows.wms.getmap=16
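The idea of running a fixed number of requests in parallel and queueing the rest can be illustrated with a plain semaphore. The sketch below is not how the control-flow module is implemented; it is a minimal simulation, with hypothetical class and function names, showing that the number of requests executing at the same time never exceeds the configured limit.

```python
import threading
import time


class ConcurrencyLimiter:
    """Runs at most `limit` jobs at once; extra callers wait in line."""

    def __init__(self, limit):
        self._slots = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0  # highest number of jobs seen running together

    def run(self, work):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return work()
            finally:
                with self._lock:
                    self.running -= 1


def fake_getmap():
    """Stand-in for a rendering request."""
    time.sleep(0.01)


def simulate(total_requests, limit):
    """Fire all requests at once and report the peak parallelism."""
    limiter = ConcurrencyLimiter(limit)
    threads = [
        threading.Thread(target=limiter.run, args=(fake_getmap,))
        for _ in range(total_requests)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return limiter.peak


print("peak parallel requests:", simulate(total_requests=40, limit=16))
```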
Control flow
17%
Auditing
Log each and every request
Log contents driven by customizable template
Summarize and analyze requests with offline tools
More info here
JVM and deploy configuration
Premise
The options discussed here are not going to help visibly if you did not prepare the data and the styles
They are finishing touches that can get performance up once the major data bottlenecks have been dealt with
Check “Running in production” instructions here
JVM settings
-server: enables the server JIT compiler
-Xms2048m -Xmx2048m: sets the JVM to use two gigabytes of heap memory
-XX:+UseParallelOldGC -XX:+UseParallelGC: enables multi-threaded garbage collection, useful if you have more than two cores
-XX:NewRatio=2: informs the JVM there will be a high number of short lived objects
-XX:+AggressiveOpts: enables experimental optimizations that will be defaults in future versions of the JVM
Native JAI and JDK
Install native JAI and use a recent Sun JDK!
Benchmark over a small data set (the effect is not as visible on larger ones)
Setup a local cluster
Oracle Java2D locks when drawing antialiased vectors
Limits scalability severely
Two options
Use OpenJDK, it’s slower at rendering but scales up well
Use Apache mod_proxy_balancer and set up one GeoServer instance every 2 to 4 cores
[Diagram: mod_proxy_balancer distributing requests to three GeoServer instances]
Clustering advantage
66%
FOSS4G 2010 vector benchmarks (roads/buildings/isolines and so on, over the entire Spain)
GeoServer 2.2.x was benchmarked using Oracle JDK without local clustering
Marlin renderer
The OpenJDK Java2D renderer scales up, but it’s not super-fast when the load is small (1 request at a time)
Marlin-renderer to the rescue: https://github.com/bourgesl/marlin-renderer
Complex map, 10 parallel requests, different zoom levels have different details showing up
Upgrade!
Performance tends to go up version by version
Please do use a recent GeoServer version
FOSS4G 2010 vector benchmark with different versions of GeoServer
Benchmarking
Using JMeter
Good benchmarking tool
Allows setting up multiple thread groups, with different parallelism and request counts, to ramp up the load
Can use CSV files to generate semi-randomized requests
Reports results in a simple table http://jakarta.apache.org/jmeter/
Using JMeter
Thread group: how many threads
Loop: how many requests
HTTP sampler: the request
CSV: read request params from CSV
Summary table
Generating the CSV
Simple randomized generation tool built during WMS shootouts, wms_request.py
Generate csv with the bbox and width/height to be used in JMeter scripts:
./wms_request.py -count 1200 -region -180 -90 180 90 -minres 0.002 -maxres 0.1 -minsize 256 256 -maxsize 1024 1024
Get it here along with a corresponding JMeter script: http://demo1.geo-solutions.it/share/jmeter_2011.zip
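For readers who cannot get hold of the original script, the sketch below shows one way such a generator could work. It is not wms_request.py: the function names, the uniform sampling of resolutions and the semicolon-separated column layout (width;height;bbox) are all assumptions made for illustration. The parameters mirror the command line options shown above.

```python
import csv
import io
import random


def generate_requests(count, region, min_res, max_res, min_size, max_size,
                      seed=None):
    """Return `count` random (width, height, bbox_string) rows.

    region is (minx, miny, maxx, maxy); resolutions are in map units per
    pixel; sizes are (width, height) pairs in pixels.
    """
    rng = random.Random(seed)
    minx, miny, maxx, maxy = region
    rows = []
    for _ in range(count):
        width = rng.randint(min_size[0], max_size[0])
        height = rng.randint(min_size[1], max_size[1])
        res = rng.uniform(min_res, max_res)
        # Shrink the resolution if the bbox would not fit inside the region
        res = min(res, (maxx - minx) / width, (maxy - miny) / height)
        x0 = rng.uniform(minx, maxx - width * res)
        y0 = rng.uniform(miny, maxy - height * res)
        bbox = (x0, y0, x0 + width * res, y0 + height * res)
        rows.append((width, height, ",".join(f"{v:.6f}" for v in bbox)))
    return rows


def to_csv(rows):
    """Serialize the rows as semicolon-separated text (one request per line)."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=";").writerows(rows)
    return buffer.getvalue()


rows = generate_requests(1200, (-180, -90, 180, 90), 0.002, 0.1,
                         (256, 256), (1024, 1024), seed=42)
print(to_csv(rows[:3]))
```

A fixed seed keeps the request set reproducible, so that runs before and after a configuration change hit exactly the same areas.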
Checking results
Results table
Run the benchmarks 2-3 times, let the results stabilize
Save the results, check other optimizations, compare the results
Real world deploy
Deploy configuration
Raster data
Whole Italy at 50cm per pixel
Over 4TB, updated fully every 3 years (old data still available for historical access)
Custom pyramid
100 m per pixel: one image
20m per pixel: mosaic of 20 tiles
4m per pixel: mosaic of few hundred tiles
0.5m per pixel: 9000 tiles
Each tile is 10000x10000, with overviews
Vector data
Cadastral data for the whole Italy, with full history (interval of validity for each parcel)
100 million polygons
A query extracts a subset relative to a certain time interval and area the user is allowed to see
No data from this table is ever shown below 1:50000 (SLD scale dependencies)
Physical table-level partitioning (Oracle style) based on geographic area to parallelize and cluster data loading, plus spatial indexing and indexes on commonly filtered attributes
The End
Questions? andrea.aime@geo-solutions.it
simone.giannecchini@geo-solutions.it