Optimize a Pentaho Data Service

Pentaho Data Integration

Version
9.3.x
Part Number
MK-95PDIA003-09

As you test a Pentaho Data Service, you might notice bottlenecks or parts of the transformation that could run more efficiently. To improve the performance of your data service, apply an optimization technique. Some techniques are designed specifically for Pentaho Data Services. See the Administer Pentaho Data Integration and Analytics document to learn about other general design and optimization techniques that can improve the performance of your transformation.

Optimization Technique When to Use
Service Cache

For a regular data service only, adjust how long results are cached. Consider using this technique if either of the following situations applies:

  • Your result set contains a modest amount of data.
  • You query Big Data sources. Increasing the cache duration can help follow-on queries run more quickly.
Note: This optimization technique is not available for a streaming data service. Its optimization tab does not appear if Data Service Type is set to Streaming.
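The effect of the cache duration can be pictured with a small, purely conceptual sketch. It is not how Pentaho implements the service cache. The `ResultCache` class, its `fetch` method, and the `run_query` callback are hypothetical names introduced for illustration. The sketch only shows why a longer duration lets more follow-on queries reuse an earlier result instead of running the transformation again.

```python
import time


class ResultCache:
    """Toy time-to-live cache keyed by query text (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # stands in for the cache duration
        self.clock = clock              # injectable so the sketch can be tested
        self._entries = {}              # query text -> (stored_at, rows)

    def fetch(self, query, run_query):
        now = self.clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]             # still fresh: reuse the cached rows
        rows = run_query(query)         # missing or expired: run the query
        self._entries[query] = (now, rows)
        return rows
```

In this sketch, a query repeated within `ttl_seconds` never reaches `run_query`, which is where the saving comes from when the source is slow to query.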
Query Pushdown

Handle input step queries at the source. Consider using this technique if both of the following situations apply:

Parameter Pushdown

Handle step queries at the source. Consider using this technique if both of the following situations apply:

  • Your transformation contains any step that should be optimized, including input steps such as REST, where a parameter in the URL could limit the results returned by a web service.
  • Your query does not use complex WHERE clauses that contain IN or OR keywords, such as WHERE REGION = "South" OR Code = "Yellow". Limits on WHERE clause construction appear in Pentaho Data Service SQL support reference and other development considerations.
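The idea behind the two conditions above can be sketched as follows. This is a minimal illustration, not Pentaho's implementation: the function `where_to_url_params`, the column-to-parameter mapping, and the parameter names are all assumptions made for the example. It turns simple equality conditions joined by AND into URL parameters, and it gives up when the clause contains anything more complex, such as OR or IN.

```python
import re
from urllib.parse import urlencode

# Matches one simple equality condition, for example: REGION = 'South'
_CONDITION = re.compile(r"""^\s*(\w+)\s*=\s*["']([^"']*)["']\s*$""")


def where_to_url_params(where_clause, column_to_param):
    """Return URL parameters for a simple WHERE clause, or None if it is too complex."""
    params = {}
    # Simplification: assumes values do not themselves contain " AND ".
    for condition in re.split(r"\s+AND\s+", where_clause, flags=re.IGNORECASE):
        match = _CONDITION.match(condition)
        if match is None:
            return None  # OR, IN, or any other construct: nothing is pushed down
        column = match.group(1).upper()
        if column not in column_to_param:
            return None  # no parameter is mapped to this column
        params[column_to_param[column]] = match.group(2)
    return params


# Hypothetical mapping from query columns to web service URL parameters.
mapping = {"REGION": "region", "CODE": "code"}

simple = where_to_url_params("REGION = 'South' AND Code = 'Yellow'", mapping)
print(urlencode(simple))  # region=South&code=Yellow

complex_clause = where_to_url_params("REGION = 'South' OR Code = 'Yellow'", mapping)
print(complex_clause)     # None
```

With the simple clause, the web service can filter the rows itself. With the OR clause, this sketch pushes nothing down, so the filtering would have to happen after the rows are returned.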
Streaming Optimization

For a streaming data service only, adjust the maximum number of rows and elapsed time to produce a new streaming window for processing. Consider using this technique if you are creating a data service from one of the following streaming data steps:

Note: This optimization technique is not available for a regular data service. Its optimization tab does not appear if Data Service Type is set to Regular.
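How a row limit and a time limit interact to produce windows can be sketched with a short generator. This is a conceptual illustration under stated assumptions, not Pentaho's windowing logic. The function `windows` and its parameters `max_rows` and `max_seconds` are hypothetical names. Time is taken from a timestamp carried with each row, so a window that has timed out is only closed when the next row arrives, whereas a real engine would typically use a timer.

```python
def windows(rows, max_rows, max_seconds):
    """Group (timestamp, row) pairs into windows.

    A window closes when it holds max_rows rows, or when the next row
    arrives max_seconds or more after the window's first row.
    """
    current, started_at = [], None
    for timestamp, row in rows:
        # Time limit: close the open window before adding a late row.
        if current and timestamp - started_at >= max_seconds:
            yield current
            current, started_at = [], None
        if started_at is None:
            started_at = timestamp
        current.append(row)
        # Row limit: close the window as soon as it is full.
        if len(current) == max_rows:
            yield current
            current, started_at = [], None
    if current:
        yield current  # flush whatever remains when the stream ends


events = [(0, "a"), (1, "b"), (2, "c"), (10, "d"), (11, "e")]
print(list(windows(events, max_rows=2, max_seconds=5)))
# [['a', 'b'], ['c'], ['d', 'e']]
```

In the sample run, the first and last windows close on the row limit, while the middle window closes on the time limit. Smaller limits yield more frequent, smaller windows, and larger limits yield fewer, larger ones.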