Directorate B
Corporate statistical and IT services / Directorate General Statistics

Doc. SASG/2012-1/2.2.1

Original: EN
Available: EN

Joint Eurostat – ECB

Steering Group on Seasonal Adjustment

25 April 2012, 09:30 to 17:00

ECB

Room CB05, Commerzbank building

Neue Mainzer Strasse 32-36

60311 Frankfurt

Item 2.2.1 of the Agenda
Support for the implementation of ESS guidelines for seasonal adjustment
(Interim technical report) by Jean Palate

Support for the implementation of ESS guidelines for seasonal adjustment

Grant agreement N°61001.2011.004.2014.348

Interim implementation technical report

0. Introduction

The main goal of the action is the finalization of the Java implementation of Demetra+, including the core engines (Tramo-Seats and X13). A complete set of Java libraries should be available for the production of seasonally adjusted (SA) series in a way similar to the current .NET application. Moreover, several additional modules were planned for a better implementation of the ESS guidelines.

At the end of the action, other teams/institutions should be able to progressively take over the software or, at least, to contribute to its evolution.

The report focuses on the completion state of the tool (what has been done and what remains to be done), first from a functional and then from a more technical point of view. It also considers the actions that have been undertaken or are planned to make the software easier to maintain.

1. Functional issues

This section reviews the state of the developments from a functional point of view. First, the implementation of the core engines is briefly discussed. Second, we present the implementation of the SA diagnostics defined by the Users' group (and not considered in the initial description of the grant). We then consider the additional modules, planned in the grant or requested by the Users' group, and finally mention the peripheral features that are still missing.

To make these points more concrete, Annex 1 provides a table that maps the different topics to the current Java packages.

1.1. Finalization of the core engines

To a large extent, the Java implementation of the core engines can be considered finished. We list below the tasks that must still be completed, with a rough estimate of the resources needed.

Method / Current status / Additional resources (in working days)
Tramo-Seats / The current implementation does not contain all the recent improvements of the original programs. The missing features are:
  • Tramo
      • Tests for over/under differencing
  • Seats
      • Stochastic trading days
      • Modification of non-decomposable models
/ 15
Tramo-Seats / The tests must be done (Demetra+ .NET has been extended with the latest version of the core engines to simplify the task) / 10
X13 / The implementation is finished. The tests must be done / 10

The options that are available in the new implementation of the two methods are listed in Annexes 2 and 3.

In comparison to the initial Java implementation, the most important improvements are the following:

  • Extension of the specifications supported by the Java implementation
  • Improvements of some important numerical routines
      • Optimization procedure (better implementation of Levenberg-Marquardt)
      • Better matrix computation (QR, Cholesky...)
      • Improved factorization of the covariance generating function of an MA process (based on an algorithm proposed by Wilson; an important routine in the canonical decomposition)

1.2. SA Diagnostics

The SA Users' group has defined the set of diagnostics that must be provided by JDemetra+ (Vilnius, 8/2/2012). We give below a revised version of that list, with the current development status of each diagnostic.

We distinguish diagnostics that are integrated in the "quality report" (status = OK), diagnostics that have a graphical interface (status = UI) and diagnostics that are not implemented yet (status = TODO).

The choice of the diagnostics to be included in a final quality report should be based on actual tests, validated by domain experts. The tool should provide facilities to get information on all (pertinent) tests, either by integrating them in the current quality report (OK status) or by providing export facilities (not yet implemented).

Category / Name / Description / Current status
Decomposition, descriptive statistics / Definition / Inspection of the definition constraints (basic relationships between different components of the time series) / OK
Annual totals / The test compares the annual totals of the original series and those of the seasonally adjusted series. / OK
Average of the absolute value of the differences between the annual averages of the original series, SA series and TC / TODO
RegArima residuals / Normality of the residuals / Skewness and Kurtosis tests; joint Doornik-Hansen normality test / OK
Independence of the residuals / Ljung-Box, Box-Pierce tests on residuals (at all lags, at seasonal lags) / OK
Linearity of the residuals / Ljung-Box, Box-Pierce tests on squared residuals / UI
Randomness of the residuals / Tests of up and down runs / UI
Spectral td peaks in the residuals / Test based on the periodogram of the residuals / OK
Spectral seas peaks in the residuals / Test based on the periodogram of the residuals / OK
RegArima modelling / Outliers / "Excessive" number of outliers / OK
RegArima forecasting / In-sample forecasting test / Tramo-like / OK (v0.0.10)
Out-of-sample forecasting test / Tramo-like / OK (v0.0.10)
Seasonality tests / Spectral seas peaks / Visual significance of seasonal peaks / OK
Spectral td peaks / Visual significance of trading day peaks / OK
Friedman test / Non-parametric seasonality test / UI
Kruskal-Wallis test / Non-parametric seasonality test / UI
Evolutive seasonality test / X13-like / UI
Combined seasonality test / X13-like / UI
Residual seasonality / Residual seasonality on sa / F-test on stable seasonality / OK
Residual seasonality on sa (last 3 years) / F-test on stable seasonality / OK
Residual seasonality on irregular / F-test on stable seasonality / OK
Model-based diagnostics (Seats) / Seas variance / Variance of seasonal component, its theoretical estimator and empirical estimate / OK
Irregular variance / Variance of irregular component, its theoretical estimator and empirical estimate / OK
Trend variance / Variance of trend component, its theoretical estimator and empirical estimate / UI
SA variance / Variance of sa component, its theoretical estimator and empirical estimate / UI
Trend/seasonal/irregular / Comparison of cross-correlation between theoretical estimators with cross-correlation between empirical estimates / OK/UI
Theoretical and empirical autocorrelations / Check of autocorrelation coefficients of theoretical estimator and empirical estimate at lags 1 and 4 (or 12 for monthly series) / UI
Stochastic components / Component, forecasts and their standard deviations / UI (v0.0.10)
Analysis of the period to period rate of growth of SA / Model-based analysis, including standard errors / TODO
Errors and revision analysis / Model-based diagnostics / UI
X11 diagnostics / M-Statistics / OK
... should be completed by the Users' group / TODO
Revisions / Descriptive statistics / Relative (or absolute) difference between the initial estimate and the latest estimate of the SA series / UI
Relative (or absolute) difference between the initial estimate and the latest estimate of the trend-cycle / UI
Sliding spans / Friedman test / Friedman test applied to each span / UI
Kruskal-Wallis / Kruskal-Wallis test applied to each span / UI
Evolutive seasonality test / Evolutive seasonality test applied to each span / UI
Identifiable seasonality test / Combined seasonality test applied to each span / UI
Abnormal values (%) / % of abnormal values in seasonal component, trading day component and seasonally adjusted series / UI

The items marked TODO represent additional work estimated at 10 working days.
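
To make the "Independence of the residuals" entry of the table more concrete, the following minimal Java sketch computes the Ljung-Box statistic Q = n(n+2) Σ r_k² / (n-k), where r_k is the sample autocorrelation at lag k and the sum runs either over all lags 1..h or over seasonal lags only. The class and method names are hypothetical and are not taken from the JDemetra+ libraries; the residuals are assumed to be a plain array without missing values.

```java
/** Hypothetical helper for illustration only; not part of the JDemetra+ API. */
final class LjungBoxSketch {

    /** Sample autocorrelation of x at lag k, computed around the sample mean. */
    static double autocorrelation(double[] x, int k) {
        int n = x.length;
        double mean = 0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;
        double num = 0, den = 0;
        for (int t = 0; t < n; t++) {
            double d = x[t] - mean;
            den += d * d;
            if (t >= k) {
                num += d * (x[t - k] - mean);
            }
        }
        return num / den;
    }

    /** Ljung-Box statistic over the given lags: n (n + 2) * sum of r_k^2 / (n - k). */
    static double statistic(double[] residuals, int[] lags) {
        int n = residuals.length;
        double sum = 0;
        for (int k : lags) {
            if (k < 1 || k >= n) {
                throw new IllegalArgumentException("lag out of range: " + k);
            }
            double r = autocorrelation(residuals, k);
            sum += r * r / (n - k);
        }
        return n * (n + 2.0) * sum;
    }

    /** Lags 1, 2, ..., h ("all lags" variant). */
    static int[] firstLags(int h) {
        int[] lags = new int[h];
        for (int i = 0; i < h; i++) {
            lags[i] = i + 1;
        }
        return lags;
    }

    /** Lags period, 2 * period, ..., count * period ("seasonal lags" variant). */
    static int[] seasonalLags(int period, int count) {
        int[] lags = new int[count];
        for (int i = 0; i < count; i++) {
            lags[i] = (i + 1) * period;
        }
        return lags;
    }
}
```

Under the null hypothesis of uncorrelated residuals, Q is approximately chi-square distributed; for residuals of an estimated ARMA model, the degrees of freedom are usually taken as the number of lags minus the number of estimated ARMA parameters. The computation of the p-value is left out of the sketch.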

1.3. Additional modules

The planned action mentions several additional modules:

  • Calendars
  • Other regression variables
  • Metadata report
  • Normalized input/output
  • Direct/indirect approach
  • Reduction of the number of calendar regressors

The Users' group also put forward the need for a univariate benchmarking module.

1.3.1. Calendars, other regression variables

The underlying routines for the generation of the calendar variables are available. However, the graphical interface to include them in the SA processing must still be developed (10 working days). The routines correspond to the .NET implementation, with some improvements for the Easter effect.

1.3.2. Metadata report

The metadata report is still to be developed (10 working days).

1.3.3. Normalized input/output

A complete framework for generating normalized input/output has been developed. However, about 30% of the current modules must still be adapted to use it (5 working days). The framework greatly simplifies the following tasks:

  • Generation of XML or CSV I/O
  • Communication with databases
  • Creation of generic web services

The framework can also be used for other algorithms.
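
The idea behind normalized input/output can be sketched as follows: information is stored as a hierarchical set of named items, which is flattened into dotted identifiers of the kind used in Annex 2 (for instance tramo.arima.d) and can then be written in a simple format such as CSV. The class below is a hypothetical illustration under those assumptions; it is not the API of the ec.tstoolkit.information package, and its CSV export does no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a hierarchical set of named items; not the JDemetra+ API. */
final class InformationSetSketch {

    // Insertion order is kept so that exports are reproducible.
    private final Map<String, Object> items = new LinkedHashMap<>();

    /** Stores a leaf value under the given name and returns this set for chaining. */
    InformationSetSketch set(String name, Object value) {
        items.put(name, value);
        return this;
    }

    /** Returns the nested set with the given name, creating it if needed. */
    InformationSetSketch subSet(String name) {
        Object current = items.get(name);
        if (current instanceof InformationSetSketch) {
            return (InformationSetSketch) current;
        }
        InformationSetSketch created = new InformationSetSketch();
        items.put(name, created); // replaces any leaf stored under the same name
        return created;
    }

    /** Flattens the hierarchy into dotted identifiers, e.g. "tramo.arima.d". */
    Map<String, String> flatten() {
        Map<String, String> out = new LinkedHashMap<>();
        flattenInto("", out);
        return out;
    }

    private void flattenInto(String prefix, Map<String, String> out) {
        for (Map.Entry<String, Object> e : items.entrySet()) {
            String id = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof InformationSetSketch) {
                ((InformationSetSketch) e.getValue()).flattenInto(id, out);
            } else {
                out.put(id, String.valueOf(e.getValue()));
            }
        }
    }

    /** Naive CSV export: one "identifier,value" pair per line, without escaping. */
    String toCsv() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : flatten().entrySet()) {
            sb.append(e.getKey()).append(',').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

With this sketch, a call such as spec.subSet("tramo").subSet("arima").set("d", 1) produces the CSV line tramo.arima.d,1. The same flattened map could equally feed an XML writer or a database table, which is the kind of reuse the framework aims at.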

1.3.4. Benchmarking, direct/indirect approach

An implementation of Cholette's method is provided (see the documentation of X13 for a further description of the model). It is integrated in the new graphical interface of the SA methods.

The model has been extended to the multivariate case (with temporal and/or contemporaneous constraints). An external console application (called JBench), which can use the output of Demetra+ (.NET), has been developed for testing purposes. It will be the basis for the future reconciliation method in the direct/indirect SA analysis. Further information on JBench is provided in Annex 4.
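
A temporal constraint requires the benchmarked series to add up, within each year, to a given annual total. The sketch below enforces that constraint by plain pro-rata scaling. This is deliberately not Cholette's method: pro-rata scaling can create steps between consecutive years, which movement-preserving methods are designed to avoid. It is shown only to illustrate what the constraint means. The class name is hypothetical, and the series is assumed to cover complete years with non-zero annual sums.

```java
/** Hypothetical illustration of a temporal constraint; not Cholette's method. */
final class ProRataSketch {

    /**
     * Scales each year of the series so that it sums to the corresponding target.
     *
     * @param series        sub-annual series covering complete years
     * @param annualTargets one target total per year
     * @param freq          number of observations per year (4 or 12, for instance)
     */
    static double[] benchmark(double[] series, double[] annualTargets, int freq) {
        if (series.length != annualTargets.length * freq) {
            throw new IllegalArgumentException("series must cover complete years");
        }
        double[] out = new double[series.length];
        for (int y = 0; y < annualTargets.length; y++) {
            int start = y * freq;
            double sum = 0;
            for (int i = start; i < start + freq; i++) {
                sum += series[i];
            }
            // Assumption: the annual sum is not zero.
            double ratio = annualTargets[y] / sum;
            for (int i = start; i < start + freq; i++) {
                out[i] = series[i] * ratio;
            }
        }
        return out;
    }
}
```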

1.4. Other missing features

Most of the missing features are related to I/O facilities, which are either not completely implemented or not available in the graphical interface. The main ones are listed below.

Feature / Status / Needed resources (days)
Compatibility with Demetra+ files (xml) / 80% / 5
Other xml serialization / 70% / 5
Output (Excel, csv, xml, databases) / 80% (missing UI) / 10

2. Technical issues

2.1. Simplification of the architecture

JDemetra+ uses a large set of packages developed by NBB or freely available on the Net (and compatible with the EUPL license). They are listed in Annexes 4 and 5. Though the architecture is still complex, it has been significantly simplified in comparison with the initial Java project.

As shown in the following table, the different facets of the tool have been strictly separated, which simplifies maintenance.

NBB Modules / Facet
jtstoolkit.jar / Main statistical concepts, algorithms
jtss.jar / Rich time series framework, formatting
jtsui.jar, jtstoolkitui.jar / Graphical components
SdmxProvider.jar... / Time series providers
jappdemetra, jsacruncher, jbench... / Final application

It implies, for instance, that IT teams that want to use SA routines in their applications only need jtstoolkit.jar (and its unique dependency, for logging).

Apart from the use of NetBeans (see below), no further changes to the current architecture are expected.

2.2. Improvement of the design of the libraries

A large part of the work has been devoted to cleaning and simplifying the libraries. Many class hierarchies have been streamlined to make them easier to understand and maintain. That is especially true in the following domains:

  • Time series providers and browsers
  • Representation of complex processing (work still in progress)
  • I/O facilities (xml, csv, compatibility with original spec files...)

The possibility for external teams to dynamically enrich some aspects of the software by providing their own implementations has been extended. Besides time series providers, output, diagnostics and formatting (features already available in the .NET version), they may now extend the main graphical components of the SA processing by adding their own graphical panels.
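
The following sketch illustrates what such an extension point could look like for diagnostics. An external team implements a small interface and registers its implementation, and the application runs all registered diagnostics. All names are hypothetical and the actual JDemetra+ interfaces may differ. The example plug-in is a simplified variant of the "annual totals" check of section 1.2. How plug-ins would be discovered (for example through java.util.ServiceLoader) is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical extension point; the real JDemetra+ interfaces may differ. */
interface SaDiagnosticSketch {
    String name();

    /** Returns a numeric score for the given original and SA series. */
    double compute(double[] original, double[] sa, int freq);
}

/** Example plug-in: largest relative gap between the annual totals of both series. */
final class AnnualTotalsSketch implements SaDiagnosticSketch {

    @Override
    public String name() {
        return "annual totals";
    }

    @Override
    public double compute(double[] original, double[] sa, int freq) {
        double worst = 0;
        // Only complete years are compared; sa is assumed to be as long as original.
        for (int start = 0; start + freq <= original.length; start += freq) {
            double o = 0, s = 0;
            for (int i = start; i < start + freq; i++) {
                o += original[i];
                s += sa[i];
            }
            worst = Math.max(worst, Math.abs(s - o) / Math.abs(o));
        }
        return worst;
    }
}

/** Minimal registry that runs every registered diagnostic. */
final class DiagnosticsRegistrySketch {

    private final List<SaDiagnosticSketch> plugins = new ArrayList<>();

    void register(SaDiagnosticSketch diagnostic) {
        plugins.add(diagnostic);
    }

    /** Runs all diagnostics and returns their scores by name, in registration order. */
    Map<String, Double> run(double[] original, double[] sa, int freq) {
        Map<String, Double> report = new LinkedHashMap<>();
        for (SaDiagnosticSketch d : plugins) {
            report.put(d.name(), d.compute(original, sa, freq));
        }
        return report;
    }
}
```

The point of the design is that the registry depends only on the interface, so new diagnostics can be added without modifying the core libraries.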

Future changes in the current design of the libraries should be marginal.

2.3. Use of the NetBeans platform

From a technical point of view, the most important decision is, by far, the use of the NetBeans platform for the main graphical application. That choice was mainly guided by the following considerations:

  • Large external support (for maintenance, documentation...)
  • Flexible architecture; possibility for external teams to add their own modules in the framework
  • Compatibility with previous developments (Swing components)
  • Deployment facilities; installation procedures automatically generated for different platforms; download of new plug-ins or of updates by Internet...

Using NetBeans does not require rewriting the current modules; it only requires the definition of a light layer that connects NetBeans to the NBB components. Moving later to other technical choices would therefore not be too expensive. It should also be mentioned that all the Java packages can still be used outside NetBeans.

See for further information on NetBeans.

The integration of the current SA components is in progress. The task should be finished by the end of May 2012 (30 working days).

3. Code and Documentation

Putting the code and all its related information in an Open Source environment is one of the key factors for the long-term maintenance of the project. Providing useful documentation is also a strategic aspect. We consider those points below.

3.1. Integration of the code in the JoinUp structure

The Java code of JDemetra+(version 0.0.9) is available on JoinUp (Seasonal Adjustment Toolkit or "sat", see The subversion repository with the different versions of the code is located at The maven repository, which will contain the modules and instructions for compiling them, is located at (soon available).

3.2. Documentation

Considering the size of the libraries (> 2000 classes, > 20000 properties/methods) and the available resources, a complete description of the API is unrealistic. It has been decided to focus, in a first stage, on the (computational) classes involved in SA and on the features that can be extended by external teams. The API of roughly 50% of those classes is documented (in the code, following the "JavaDoc" conventions).

More fundamentally, even if the documentation of the API is necessary, it is not sufficient to allow an efficient use of such a complex set of libraries. It is planned to provide examples that explain how to use the modules to carry out some (simple) statistical tasks. For the moment, no well-structured document is available.

Finally, it was suggested to organize, in collaboration with Eurostat, training sessions on the use of the Java libraries. Such sessions could take place in June or, preferably, in September.

A further 50 working days will be devoted to the different documentation and training tasks.

4. Final considerations

Considering the state of progress of the work, the remaining actions described in this report (totalling around 180 person-days) should be finalised within the time frame of the current grant. The final output will also include new features, such as the integration in NetBeans and an advanced module for benchmarking.

Annex 1. Mapping between the topics of the report and the current libraries (jdemetra-0.0.10)

For some topics, several packages/classes may be mentioned. In that case, they usually refer either to implementation routines (I) or to presentation routines (P).

Only the most important classes, directly linked to the topic, are mentioned.

Topic / Packages/classes
Tramo / ec.tstoolkit.modelling.arima.tramo.*
Seats / ec.satoolkit.seats.*
TramoSeats / ec.satoolkit.tramoseats.*
RegArima / ec.tstoolkit.modelling.arima.x13.*
X11 / ec.satoolkit.x11.*
X13 / ec.satoolkit.x13.*
Diagnostics. Basic controls / ec.tss.sa.diagnostics.CoherenceDiagnostics
Diagnostics. RegArima residuals / ec.tstoolkit.modelling.arima.PreprocessingModel (I)
ec.tstoolkit.stats.NiidTests (I)
...
ec.tstoolkit.data.Periodogram (I)
ec.tss.sa.diagnostics.ResidualsDiagnostics (P)
Diagnostics. RegArima modelling / ec.tstoolkit.modelling.arima.PreprocessingModel (I)
ec.tss.sa.diagnostics.OutliersDiagnostics (P)
Diagnostics. RegArimaForecasting / ec.tstoolkit.modelling.arima.diagnostics.OneStepAheadForecastingTest (I)
ec.tss.sa.diagnostics.OutOfSampleDiagnostics (P)
Diagnostics. Seasonality tests / ec.satoolkit.diagnostics.*
Diagnostics. Residual seasonality / ec.satoolkit.diagnostics.* (I)
ec.tss.sa.diagnostics.ResidualSeasonalityDiagnostics (P)
Diagnostics. Model-based decomposition (Seats) / ec.tstoolkit.ucarima.* (I)
ec.tstoolkit.ucarima.estimation.WienerKolmogorovDiagnostics (I)
ec.tss.sa.diagnostics.SeatsDiagnostics (P)
Diagnostics. M-Statistics / ec.satoolkit.x11.Mstatistics (I)
ec.tss.sa.diagnostics.MDiagnostics (P)
Revisions analysis / ec.tstoolkit.timeseries.analysis.RevisionHistory
Sliding spans / ec.tstoolkit.timeseries.analysis.SlidingSpans
Calendar variables / ec.tstoolkit.timeseries.regression.GregorianCalendarVariables
ec.tstoolkit.timeseries.calendars.*
Moving holidays variables / ec.tstoolkit.timeseries.regression.EasterVariable
Univariate benchmarking / ec.benchmarking.simplets.TsCholette2
Multivariate benchmarking / ec.benchmarking.simplets.TsMultiBenchmarking
Normalized input/output / ec.tstoolkit.information.* (mainly)

Annex 2. Tramo-Seats specification

Identifier (normalized I/O) / Description (Tramo-Seats code), remark / Default
tramo.transform.span / Series span / All
tramo.transform.function / LAM / Level (=1)
tramo.transform.fct / FCT / 1
tramo.arima.mean / IMEAN / 0
tramo.arima.theta / TH, JQR, Q / 1
tramo.arima.d / D / 1
tramo.arima.phi / PHI, JPR, P / 0
tramo.arima.btheta / BTH, JQS, BQ / 1
tramo.arima.bd / BD / 1
tramo.arima.bphi / BPHI, JPS, BP / 0
tramo.automdl.enabled / IDIF, INIC / false (=0, 0)
tramo.automdl.pcr / PCR / 0.95
tramo.automdl.ub1 / UB1 / 0.96
tramo.automdl.ub2 / UB2 / 0.88
tramo.automdl.tsig / TSIG / 1
tramo.automdl.pc / PC / 0.14286
tramo.automdl.cancel / CANCEL / 0.10
tramo.automdl.fal / FAL (obsolete) / false
tramo.automdl.compare / new parameter / true
tramo.regression.calendar.td.option / ITRAD (partim) / 0
tramo.regression.calendar.td.leapyear / ITRAD (partim) / 0
tramo.regression.calendar.td.holidays / IUSER=-2, HOLIDAYS
tramo.regression.calendar.td.user / IUSER=1
tramo.regression.calendar.td.stocktd / new parameter
tramo.regression.calendar.td.test / ITRAD (partim) / false
tramo.regression.calendar.easter.duration / IDUR / 6
tramo.regression.calendar.easter.type / IEAST (partim), new parameter / Default
tramo.regression.calendar.easter.test / IEAST (partim) / false
tramo.regression.outliers / ISER=2
tramo.regression.ramps / new parameter
tramo.regression.user*.name / IUSER=1
tramo.regression.user*.effect / REGEFFECT
tramo.regression.user*.firstlag / new parameter
tramo.regression.user*.lastlag / new parameter
tramo.regression.intervention*.name / new parameter
tramo.regression.intervention*.sequences / IUSER=0, ISEQ
tramo.regression.intervention*.delta / DELTA / 0
tramo.regression.intervention*.deltas / DELTAS / 0
tramo.outlier.span / INT1, INT2
tramo.outlier.types / IATIP, AIO / 0
tramo.outlier.va / VA / 0
tramo.outlier.eml / IMVX / false (=0)
tramo.outlier.deltatc / DELTATC / 0.7
tramo.esimate.span / new parameter / All
tramo.esimate.eml / INCON / true (=0)
tramo.esimate.tol / 1e-4 / 1e-5
tramo.esimate.ubp / UBP / 0.97
seats.xl / XL / 0.999
seats.rmod / RMOD / 0.4
seats.epsphi / EPSPHI / 3
seats.admiss / NOADMISS / true, (1)
seats.wk / new parameter (Wiener-Kolmogorov or Kalman filter) / true
benchmarking.enabled / new parameter / false
benchmarking.target / new parameter / y (original series)
benchmarking.lambda / new parameter (see JBench) / 1
benchmarking.rho / new parameter (see JBench) / 1
benchmarking.bias / new parameter (overall bias correction) / false

When default values in JDemetra+ are different from default values in the original programs, they are mentioned in bold.