Technical Annex I, Characteristics of the ECR Publications
I.Characteristics of the general publications
1.1.General presentation of publication
General publications differ largely from each other in type of publication (folder, booklet, brochure, book…), format (C5, B5, A5, A4…), language versions and number of pages. For your reference, in 2008, Unit B1 Cross-Media Publications finalised 8446 publications (electronic + paper) with a total of 698 050 pages produced.
For examples of general publications in PDF format, we kindly invite you to visit EU Bookshop:
Formex files will not be provided automatically for each and every general publication, but mainly on request. In a first phase, starting in 2010, only a very limited number of publications will also be published in Formex format. This is (and will be) the case for the Special Reports of the European Court of Auditors, which are systematically published in Formex.
These Special Reports have the following specifications:
- They are published 15-25 times a year
- They contain a minimum of 36 and a maximum of 128 pages and are available in all 23 official languages. An average issue contains 64 pages + cover.
Every issue of the special reports contains:
- a table of contents
- an introduction
- the main text of the Report, drawn up by the Court of Auditors, consisting of one or more chapters
- conclusions and recommendations
- one or more annexes
- the reply of the Commission
Furthermore, the report may contain a list of abbreviations, illustrations (photos) or maps, and more or less complex tables in the main text or the annexes.
Examples of ECA Special Reports can be found on EU Bookshop ( > 'Browse by author' > 'European Court of Auditors')
1.2.Frequency
With regard to the Special Reports: the European Court of Auditors publishes 15 to 25 Special Reports a year. The frequency is uneven and ranges from 0 (e.g. in July) to 5 reports (in September) produced at the same time. These figures are indicative.
As for the general publications, it is not possible to provide a detailed forecast. With the ECA Special Reports included, we could envisage a Formex production of a total of 50 publications in 2010. This number will most probably increase. We could expect between 50 and 200 general Formex publications in 2011 and between 100 and 500 in 2012.
These numbers depend on the demand of different services (including the author service) and will be subject to changes.
1.3.Electronic versions
General publications are made available in PDF and (in specific cases) Formex/XML:
- one web-optimised PDF document containing the entire publication (one file per language version)
- (if applicable) two print-optimized PDF-files, one for the cover and one for the inside pages, per language version
- A Formex document describing the bibliographical metadata and the content of one language version. For large documents, or general publications that need specific (legal) treatment, the Formex documents can be split up in a part describing the contents and reference structure of a publication, and a part containing the main text, the annexes, tables etc.
Annexes III to XI set out the working instructions.
Annexes II and IV set out the technical characteristics of Formex files.
1 / 22AO 10263 – Technical annexes – lot 2
Technical Annex II, Formex V4
II.Exchange format for data recorded on electronic media in XML (FORMEX V4)
- Validation of the tagging according to the semantic and structural rules defined in the Formex XML Schema and documentation
- Control of the coherence between tagging (including attributes) and the contents
- Validation of correct use of processing instructions
- Control of the coherence between instance naming and corresponding internal information
- Validation of the correct selection and use of elements and attributes according to the Formex schema and documentation
- Validation of completeness of structural and semantic tagging
- Validation of mandatory elements in a given context although they might be defined as optional in the schema
- Control of the messages in the reports resulting from processing and comparison between Formex and PDF instances
- Control of synoptism between the various language versions (only on request)
- Validation of the quality of non-XML elements of a delivery (e.g. TIFF files)
- Control of the completeness of deliveries
- Generation of a report on anomalies
- Creation of statistics on error type
See the website on this subject (
Technical Annex III, Working instructions
III.Working instructions
For each set of Formex files of general publications to be validated, the contractor will receive working instructions. These working instructions will be transmitted electronically in a structured format. The structure will be defined in a common agreement with the contractor.
The information on the working instructions might still undergo some modifications but will in principle contain:
1) Order identification (number, type and date of the order, indication of the sender – the Publications Office – and of the recipient).
2) Publication references: file number and/or catalogue number(s).
3) Task to perform (for example PUB-VAL).
4) Reference to the "Bon de commande" (BDC = order form).
5) History of the order (also called “treatment request” or “validation request”): it consists of a link to other messages related to this order. The aim is to help the contractor perform the requested tasks by indicating previous orders or particular tasks to perform, such as the languages concerned, the work to be done for a particular document, general remarks or anomalies found by the Publications Office. The Publications Office operator can send any supplementary information by e-mail or by generating an extra message parallel to the validation request, thus ensuring a history of the validation request.
6) Identification of the languages to be validated, with the indication of the files annexed.
The anomalies found during the pre-processing step can be inserted in the documents to be validated in the form of processing instructions, at the spot where the possible error has been found.
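The six items listed above could be held in a small record type on the contractor's side. The following is a minimal sketch only: the actual structure is still to be agreed, and all class and field names are hypothetical. The sample values are taken from the hypothetical validation order shown in this annex.

```python
from dataclasses import dataclass, field


@dataclass
class AttachedFile:
    """One file annexed to the order (illustrative names only)."""
    file_type: str   # e.g. "FMX_PUB" or "PDF_PUB"
    language: str    # e.g. "EN"
    filename: str


@dataclass
class WorkingInstruction:
    order_id: str                  # 1) order identification
    order_type: str
    sent_at: str
    sender: str
    recipient: str
    publication_refs: list[str]    # 2) file and/or catalogue numbers
    tasks: list[str]               # 3) task(s) to perform
    order_forms: list[str] = field(default_factory=list)       # 4) BDC
    related_messages: list[str] = field(default_factory=list)  # 5) history
    files: list[AttachedFile] = field(default_factory=list)    # 6) languages and files

    def languages(self) -> list[str]:
        """Languages to be validated, derived from the annexed files."""
        return sorted({f.language for f in self.files})


order = WorkingInstruction(
    order_id="OPOCXXX1000971",
    order_type="PUB_REQUEST_VM",
    sent_at="2010.03.01 09:54:48",
    sender="OP B1",
    recipient="Validation Partner",
    publication_refs=["2010/7531"],
    tasks=["TXT-VAL"],
    files=[
        AttachedFile("FMX_PUB", "EN", "QDAB07010EN.xml.zip"),
        AttachedFile("PDF_PUB", "EN", "QDAB07010EN.pdf.zip"),
    ],
)
print(order.languages())
```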
Examples of working instructions:
(Important remark: as the validation chains for general publications are not yet completely developed, the following hypothetical example (based on OJ validation) of a validation order is provided for information only).
1) PUB validation order (PUB_request_VM)
Message details > Message OPOCXXX1000971
Click here to download the message in XML format
Identification
Type: PUB_REQUEST_VM
From: OP B1
To: Validation Partner
Date sent: 2010.03.01 09:54:48
References: 2010/7531
Task(s): TXT-VAL
BDC(s):
Related message:
Remarks
Attached files
Publication type    Language  File                     Format  Compression
AUTO_VALID_REPORT   BG        AV_QDAB07010BGC.zip      XML     ZIP
AUTO_VALID_REPORT   DE        AV_QDAB07010DEC.zip      XML     ZIP
AUTO_VALID_REPORT   EL        AV_QDAB07010ELC.zip      XML     ZIP
AUTO_VALID_REPORT   EN        AV_QDAB07010ENC.zip      XML     ZIP
AUTO_VALID_REPORT   FR        AV_QDAB07010FRC.zip      XML     ZIP
AUTO_VALID_REPORT   PT        AV_QDAB07010PTC.zip      XML     ZIP
COMPARISON_REPORT   BG        COMP_QDAB07010BGC.zip    XML     ZIP
COMPARISON_REPORT   DE        COMP_QDAB07010DEC.zip    XML     ZIP
COMPARISON_REPORT   EL        COMP_QDAB07010ELC.zip    XML     ZIP
COMPARISON_REPORT   EN        COMP_QDAB07010ENC.zip    XML     ZIP
COMPARISON_REPORT   FR        COMP_QDAB07010FRC.zip    XML     ZIP
COMPARISON_REPORT   PT        COMP_QDAB07010PTC.zip    XML     ZIP
FMX_PUB             BG        QDAB07010BG.xml.zip      FMX4    ZIP
FMX_PUB             DE        QDAB07010DE.xml.zip      FMX4    ZIP
FMX_PUB             EL        QDAB07010EL.xml.zip      FMX4    ZIP
FMX_PUB             EN        QDAB07010EN.xml.zip      FMX4    ZIP
FMX_PUB             FR        QDAB07010FR.xml.zip      FMX4    ZIP
FMX_PUB             PT        QDAB07010PT.xml.zip      FMX4    ZIP
PDF_PUB             BG        QDAB07010BG.pdf.zip      PDF     ZIP
PDF_PUB             DE        QDAB07010DE.pdf.zip      PDF     ZIP
PDF_PUB             EL        QDAB07010EL.pdf.zip      PDF     ZIP
PDF_PUB             EN        QDAB07010EN.pdf.zip      PDF     ZIP
PDF_PUB             FR        QDAB07010FR.pdf.zip      PDF     ZIP
PDF_PUB             PT        QDAB07010PT.pdf.zip      PDF     ZIP
VM2/OPOCE Version web:[3.3.4] WOOD:[3.3.4] Database:[3.3.0] (12-octobre-2009) © OPOCE
The "Attached files" part shows the language versions and the different files sent for validation purposes, i.e. the processing report (XML file, if applicable), the PDF-FMX comparison report (XML file, if applicable), as well as the PDF and FMX files. It also shows the compression protocol (ZIP) used to send the request; the TARGZ compression protocol could also be used.
The processing reports may contain "warnings" highlighting the fact that the contractor has to verify and, if required, correct the documents. The following pages are examples of these validation reports.
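A first completeness check on such a delivery could be automated along the following lines. This is a sketch under stated assumptions: that every language version must come with both an FMX_PUB and a PDF_PUB file, and that the reports are optional. The function name and the shortened sample list are illustrative.

```python
from collections import defaultdict

# Rows as they appear in the attached-files list: (type, language, file).
ATTACHMENTS = [
    ("FMX_PUB", "BG", "QDAB07010BG.xml.zip"),
    ("FMX_PUB", "EN", "QDAB07010EN.xml.zip"),
    ("PDF_PUB", "BG", "QDAB07010BG.pdf.zip"),
    ("PDF_PUB", "EN", "QDAB07010EN.pdf.zip"),
    ("COMPARISON_REPORT", "EN", "COMP_QDAB07010ENC.zip"),
]

# Assumption: both file types are mandatory for each language version.
REQUIRED_TYPES = {"FMX_PUB", "PDF_PUB"}


def missing_files(attachments):
    """Return {language: sorted list of required types that are absent}."""
    seen = defaultdict(set)
    for file_type, language, _filename in attachments:
        seen[language].add(file_type)
    return {
        language: sorted(REQUIRED_TYPES - types)
        for language, types in seen.items()
        if REQUIRED_TYPES - types
    }


print(missing_files(ATTACHMENTS))       # complete delivery
print(missing_files(ATTACHMENTS[:3]))   # EN lacks its PDF file
```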
2) Example of comparison report
As the processing chain for general publications has not yet been developed, we kindly refer to the example of the comparison report of Lot 1 instead. It is to be taken into account only as an example.
Technical annex V, TXT-VAL
IV.PUB-VAL: Checking the general publication and its reference structure
The verification concerns the instances with the root element PUB. It contains references to all individual Formex and PDF documents building the complete Publication including bibliographic information. Checking consists of establishing whether the references are correct, the publication titles and subtitles are correct and have the correct formatting tags, whether any references or titles are missing or whether there are too many, whether (if required) the documents begin and end on the pages indicated, and whether the basic bibliographic data are correct. Any erroneous information contained in the document must be pointed out in the validation report, e.g.
_pi: VM/PUB-VAL/ERROR “referenced PDF document missing”
The errors reported by the processing or the PDF-FMX comparison have to be classified as follows:
_pi: VM/PUB-AUTOVAL/ERROR_CONFIRMED for ‘real’ errors,
_pi: VM/PUB-AUTOVAL/NO_ERROR for wrongly reported errors.
If the processing (if applicable) systematically produces incorrect error or warning messages, this has to be reported via an anomaly report as well (Technical annex XII).
The final structure and wording of these and the following processing instructions may be modified in agreement with the contractor and according to the production rules to be established at the beginning of the contractor’s service provision.
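The remarks shown above follow a regular pattern (task, status, optional message), so they could be generated by a small helper. The sketch below reproduces the notation used in this annex; the function names are hypothetical, and the final syntax may differ once the production rules are agreed.

```python
def make_pi(task: str, status: str, message: str = "") -> str:
    """Build a validation remark in the notation used in this annex."""
    text = f"_pi: VM/{task}/{status}"
    if message:
        text += f" “{message}”"
    return text


def classify(task: str, confirmed: bool) -> str:
    """Classify an automatically reported error as real or wrongly reported."""
    status = "ERROR_CONFIRMED" if confirmed else "NO_ERROR"
    return make_pi(f"{task}-AUTOVAL", status)


print(make_pi("PUB-VAL", "ERROR", "referenced PDF document missing"))
print(classify("PUB", confirmed=True))
print(classify("PUB", confirmed=False))
```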
Examples of some PUB instances are available in Formex format on the following site:
The tags and the possible values in the following description may undergo minor changes.
Major PUB Subelements
The PUB element describes the composition of publications. It therefore mainly consists of bibliographic data and a table of contents, i.e. the titles of the publications as well as links to chapters, annexes, images, graphs and tables and/or related publications. The main Formex elements are described below.
<PUBLICATION> The PUBLICATION element is used as root element for instances describing a publication. Not only the L and C collections of the OJ, or the special editions (secondary legislation), are covered by this element, but more general publications as well.
It is used to identify all the bibliographical elements and the documents contained in the publication.
The bibliographical description (BIB.GEN.PUB), the contents of the publication, and physical references to the documents belonging to the publication are provided there.
<BIB.GEN.PUB> contains meta-information on a publication. It is composed as follows:
- AUTHOR: describes the author of the publication;
- TITLE: encapsulates the title of the publication;
- PUBLISHER: describes the editor;
- SIZE: describes the physical size of the publication;
- NO.CAT: the catalogue number;
- NO.ISBN: the International Standard Book Number;
- NO.DOI: used as a container for the DOI in the context of general publications;
- NO.ISSN: the International Standard Serial Number;
- INFO.PUBLISHER: any information added by the editor;
- P: any supplementary information;
- FMX.GEN: references to Formex instances composing the publication;
- PAPER.GEN: information for the creation of a table of contents such as published on paper;
- PDF.GEN: references to PDF files composing the publication.
<TITLE> contains the title of the publication;
GR.SEQ contains any additional text published on the cover page(s);
TOC contains the structure of the contents of the publication and the references to the cases. The TOC element may contain the following elements: an optional title (TITLE) and contents consisting of optional headers (TOC.HD) and entries logically grouped together (TOC.BLK).
<TOC.HD> is used to mark up the header of a table of contents. This header may appear at the beginning of the table of contents;
<TOC.BLK> is used to mark up entries in a table of contents which are logically grouped together. The TOC.BLK element can contain the following elements: a title (TITLE) or a table of contents header (TOC.HD), one to several entries (TOC.ITEM), one to several nested groups (TOC.BLK).
<TOC.ITEM> Basically, a table of contents is a series of entries. Each entry consists of an optional number, an entry title, and an optional reference to the full contents (ITEM.REF).
<ITEM.REF> Entries in a table of contents associate titles or references with page numbers. These page numbers are marked up using the ITEM.REF element. The contents of this element generally consist of a page number. The REF.XML attribute makes it possible to add the URI of an XML instance, if the target of the reference is external to the document.
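To illustrate how the nested TOC structure described above can be traversed, here is a minimal sketch using Python's standard XML parser. The fragment is hypothetical and not a valid Formex instance: TOC, TOC.BLK, TOC.ITEM, ITEM.REF and REF.XML come from the description above, whereas the element name ITEM.CONT for the entry title and the file name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only (see the assumptions above).
SAMPLE = """
<TOC>
  <TITLE>Contents</TITLE>
  <TOC.BLK>
    <TITLE>Main text</TITLE>
    <TOC.ITEM>
      <ITEM.CONT>Introduction</ITEM.CONT>
      <ITEM.REF>5</ITEM.REF>
    </TOC.ITEM>
    <TOC.BLK>
      <TITLE>Annexes</TITLE>
      <TOC.ITEM>
        <ITEM.CONT>Annex I</ITEM.CONT>
        <ITEM.REF REF.XML="annex1.xml">40</ITEM.REF>
      </TOC.ITEM>
    </TOC.BLK>
  </TOC.BLK>
</TOC>
"""


def list_entries(node, depth=0):
    """Yield (depth, title, page, ref_xml) for every TOC.ITEM, recursively."""
    for child in node:
        if child.tag == "TOC.ITEM":
            ref = child.find("ITEM.REF")
            yield (
                depth,
                child.findtext("ITEM.CONT", default="").strip(),
                (ref.text or "").strip() if ref is not None else None,
                ref.get("REF.XML") if ref is not None else None,
            )
        elif child.tag == "TOC.BLK":
            # Nested groups are handled by recursion, one level deeper.
            yield from list_entries(child, depth + 1)


entries = list(list_entries(ET.fromstring(SAMPLE)))
for entry in entries:
    print(entry)
```

Each tuple can then be compared with the page on which the referenced title actually appears, or with the existence of the file named in REF.XML.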
DOC.MAIN.PUB: within the bibliographical description of a document, this element is used to mark up information concerning the main document part. It can contain information such as:
- the language version of the document (LG.DOC),
- the title of the document, if it is given in the table of contents (TITLE),
- the legal value or type of document (LEGAL.VALUE),
- the mandatory date of the document (DATE), generally provided within its title; if it is not, the date of the publication has to be used,
- the volume identifier (VOLUME.ID),
- the number of the first page of the document (PAGE.FIRST),
- the number of the last page of the document (PAGE.LAST),
- the total number of pages in the document (PAGE.TOTAL),
- the sequence number of the document on the first page (PAGE.SEQ),
- the name of the file containing the text of the document (REF.PHYS).
DOC.SUB.PUB: this element is used to mark up information concerning secondary documents like annexes or other related documents.
It can contain similar information and subelements as the DOC.MAIN.PUB element.
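One of the PUB-VAL checks, whether a document begins and ends on the pages indicated, can be partly automated by testing the page elements for internal consistency. The sketch below assumes that pages are numbered contiguously, so that PAGE.TOTAL should equal PAGE.LAST - PAGE.FIRST + 1; this rule is an assumption for illustration, not something stated in the Formex documentation quoted here, and the fragment is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment containing only the page elements.
SAMPLE = """
<DOC.MAIN.PUB>
  <PAGE.FIRST>5</PAGE.FIRST>
  <PAGE.LAST>68</PAGE.LAST>
  <PAGE.TOTAL>64</PAGE.TOTAL>
</DOC.MAIN.PUB>
"""


def page_errors(doc):
    """Return a list of inconsistencies between the three page elements."""
    pages = {
        child.tag: int(child.text)
        for child in doc
        if child.tag in ("PAGE.FIRST", "PAGE.LAST", "PAGE.TOTAL")
    }
    first, last, total = (
        pages[tag] for tag in ("PAGE.FIRST", "PAGE.LAST", "PAGE.TOTAL")
    )
    errors = []
    if last < first:
        errors.append("PAGE.LAST precedes PAGE.FIRST")
    elif total != last - first + 1:
        # Assumption: contiguous page numbering.
        errors.append(f"PAGE.TOTAL should be {last - first + 1}")
    return errors


print(page_errors(ET.fromstring(SAMPLE)))
```

A consistent instance yields an empty list; the comparison with the actual PDF pages still has to be done separately.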
V.TXT-VAL: Checking the textual correctness
The guiding principle of the text verification is to establish whether the text can be reproduced from the Formex/XML format.
The checks include:
all printed characters, such as words, figures, dates and spaces
character formatting such as superscript and subscript (e.g. the "2" in H2O, the "nd" in 2nd)
text formatting like words in italic or bold
presence and contents of footnotes.
The following page layout elements are excluded from the checks:
column layout,
line spacing,
centred text,
all formatting that could be derived from the tags; example: Titles in bold do not necessarily need separate formatting tags,
position of footnote text.
It is possible that the reports generated during the processing, including the PDF-FMX comparison step, and annexed to the treatment order, show possible errors that have to be checked and, if necessary, changed to ‘warnings’.
The errors reported by the processing, including the PDF-FMX comparison, have to be classified as follows:
_pi: VM/TXT-AUTOVAL/ERROR_CONFIRMED for ‘real’ errors,
_pi: VM/TXT-AUTOVAL/NO_ERROR for wrongly reported errors.
If the processing systematically produces incorrect error or warning messages, this has to be reported via an anomaly report as well (Technical annex XII).
If other errors are found during the checking, this has to be indicated as follows (example):
_pi: VM/TXT-VAL/ERROR “extra space should be removed”
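The character-level part of this check can be approximated by comparing the text contained in the Formex/XML instance with a reference text (for instance text extracted from the PDF). The following is a minimal sketch, not the comparison tool used in the processing chain: it treats line breaks as layout but keeps spaces within a line, and it ignores formatting tags and footnote handling, which would need separate checks. The element names in the sample are illustrative.

```python
import difflib
import re
import xml.etree.ElementTree as ET


def normalise(text: str) -> str:
    """Treat line breaks as layout; keep spaces inside a line untouched."""
    return re.sub(r"\s*\n\s*", " ", text).strip()


def visible_text(xml_source: str) -> str:
    """Concatenate all character data of the instance, ignoring the tags."""
    root = ET.fromstring(xml_source)
    return normalise("".join(root.itertext()))


def text_differences(xml_source: str, reference_text: str):
    """Return (operation, formex_part, reference_part) for each mismatch."""
    formex = visible_text(xml_source)
    reference = normalise(reference_text)
    matcher = difflib.SequenceMatcher(None, formex, reference, autojunk=False)
    return [
        (op, formex[i1:i2], reference[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


sample = '<P>Special <HT TYPE="ITALIC">Report</HT>\nNo 1</P>'
print(text_differences(sample, "Special Report No 1"))
```

An empty list means the two texts agree; every other entry is a candidate for a TXT-VAL remark such as the one shown above.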
Technical annex VI, STRUCT-VAL
VI.STRUCT-VAL: Checking the structure
The verification of the structure checks whether the XML tagging conforms to the Formex specifications (Technical annex II) and to the logical structure of the document. The position and contents of XML tags and attributes must be checked. There is a fixed price per page, independent of the number of tags on the page.
Examples of verifications are:
the use of the correct root element,
the correctness of bibliographic data,
the use of markup in conformance with the specifications (whether text parts such as the title, cover page, subject, etc. are tagged correctly),
the correct markup of tables (whether tables have the correct structure, not only with the right number of columns and rows but also with the correct horizontal and vertical associations),
synoptical markup as far as possible,
presence of textual contents of images,
correct numbering in attribute values,
correct markup of dates,
markup of quotations,
etc.
It is possible that the reports generated during the processing, including the PDF-FMX comparison, and annexed to the working order, show possible errors that have to be checked and, if necessary, changed to ‘warnings’.
If, for example, a wrong attribute value is found, this has to be indicated as follows (example):
_pi: VM/STRUCT-VAL/ERROR “Attribute NO.SEQ should be 0016”
Errors reported from processing have to be classified as follows:
_pi: VM/STRUCT-AUTOVAL/ERROR_CONFIRMED for ‘real’ errors,
_pi: VM/STRUCT-AUTOVAL/NO_ERROR for wrongly reported errors.
If the processing systematically produces incorrect error or warning messages, this has to be reported via an anomaly report as well (Technical annex XII).
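As an illustration of the "correct numbering in attribute values" check, the sketch below walks through an instance and verifies that NO.SEQ values are consecutive. It assumes that the values are four-digit, zero-padded and increase by one, which is inferred from the example remark above rather than from the schema; the fragment and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the attribute name NO.SEQ is taken from the text.
SAMPLE = """
<ROOT>
  <ITEM NO.SEQ="0014"/>
  <ITEM NO.SEQ="0015"/>
  <ITEM NO.SEQ="0017"/>
</ROOT>
"""


def check_sequence(root, start=1):
    """Return one remark per NO.SEQ value that breaks the expected sequence."""
    remarks = []
    expected = start
    for element in root.iter():
        value = element.get("NO.SEQ")
        if value is None:
            continue
        wanted = f"{expected:04d}"  # assumption: four digits, zero-padded
        if value != wanted:
            remarks.append(
                f"_pi: VM/STRUCT-VAL/ERROR “Attribute NO.SEQ should be {wanted}”"
            )
        expected += 1
    return remarks


for remark in check_sequence(ET.fromstring(SAMPLE), start=14):
    print(remark)
```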
Technical annex VII, BM-VAL
VII.BM-VAL: Checking the Bookmark structure
The validation of the PDF bookmark structure consists of the following subtasks:
checking that the bookmark structure is complete, i.e. that all the titles and headings are listed in the PDF bookmark pane,
checking that the bookmark links refer to the proper page and relative page position, i.e. that after clicking a bookmark the referred title or heading is visible in the main text pane.
If, for example, a bookmark link points to the wrong page, this has to be indicated as follows (example):
_pi: VM/BM-VAL/ERROR “bookmark link points to wrong page”
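Once the headings and the bookmarks have been extracted, the two subtasks reduce to a comparison of two lists. The sketch below assumes that both have already been obtained as mappings from title to page number (how they are extracted from the PDF is outside its scope); it does not cover the relative position on the page. The wording of the "bookmark missing" remark is hypothetical.

```python
def check_bookmarks(headings, bookmarks):
    """Compare headings and bookmarks, both given as {title: page number}."""
    remarks = []
    for title, page in headings.items():
        if title not in bookmarks:
            # Completeness: every title or heading needs a bookmark.
            remarks.append(f"_pi: VM/BM-VAL/ERROR “bookmark missing for {title}”")
        elif bookmarks[title] != page:
            # Correctness: the bookmark must lead to the heading's page.
            remarks.append("_pi: VM/BM-VAL/ERROR “bookmark link points to wrong page”")
    return remarks


headings = {"Introduction": 5, "Conclusions": 40, "Annex I": 52}
bookmarks = {"Introduction": 5, "Conclusions": 41}
for remark in check_bookmarks(headings, bookmarks):
    print(remark)
```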
Technical annex VIII, META-VAL
VIII.META-VAL: Checking the Meta Data Fields
The validation of the PDF metadata set consists of the following subtasks:
- checking the completeness of the metadata, i.e. the following metadata fields must be filled (this is a sample list, which might be extended if necessary):
  - TITRE
  - NUMERO_CATALOGUE
  - AUTEUR
- checking the correctness of the metadata fields, i.e. checking whether they are in sync with the bibliographic information given in the Formex/XML files.
If, for example, metadata is missing or not in sync with the Formex fields, this has to be indicated as follows (examples):
_pi: VM/META-VAL/ERROR “metadata field NUMERO_DOI missing”
_pi: VM/META-VAL/ERROR “metadata field NUMERO_DOI not equal to Formex element NO.DOI”
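Both subtasks can be expressed as a field-by-field comparison. In the sketch below, the mapping between PDF metadata fields and Formex elements is an assumption (only the pair NUMERO_DOI / NO.DOI appears in the examples above), both inputs are assumed to have been extracted into dictionaries beforehand, and the sample values are purely illustrative.

```python
# Assumed mapping between PDF metadata fields and Formex elements.
FIELD_TO_ELEMENT = {
    "TITRE": "TITLE",
    "NUMERO_CATALOGUE": "NO.CAT",
    "AUTEUR": "AUTHOR",
    "NUMERO_DOI": "NO.DOI",
}


def check_metadata(pdf_meta, formex_bib):
    """Report missing PDF fields and fields that differ from the Formex data."""
    remarks = []
    for field_name, element in FIELD_TO_ELEMENT.items():
        value = pdf_meta.get(field_name, "").strip()
        if not value:
            remarks.append(
                f"_pi: VM/META-VAL/ERROR “metadata field {field_name} missing”"
            )
        elif value != formex_bib.get(element, "").strip():
            remarks.append(
                f"_pi: VM/META-VAL/ERROR “metadata field {field_name} "
                f"not equal to Formex element {element}”"
            )
    return remarks


# Illustrative values only.
pdf_meta = {
    "TITRE": "Special Report No 1",
    "NUMERO_CATALOGUE": "QD-AB-07-010-EN-C",
    "AUTEUR": "European Court of Auditors",
}
formex_bib = {
    "TITLE": "Special Report No 1",
    "NO.CAT": "QD-AB-07-010-EN-C",
    "AUTHOR": "Court of Auditors",
    "NO.DOI": "10.0000/example",
}
for remark in check_metadata(pdf_meta, formex_bib):
    print(remark)
```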
Technical annex IX, LG-SYNC
IX.LG-SYNC: Checking the Synoptic Pagination of PDF Documents
On request, for specific publications, the contractor should check whether the various language versions of the same document in PDF format are in sync, i.e. synoptic. This means that every title, heading and numbered paragraph must appear on the same page in all language versions.
This working instruction always refers to two documents: one document in the reference language version, the other in the language version to be checked. The Publications Office will make the reference language version available together with the other language version.
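The synoptic check can be pictured as a comparison of two page maps. The sketch below assumes that each structural unit has been given a language-independent key (for example a heading or paragraph number) and that the page on which it appears has already been determined for both versions; the function name and the sample data are illustrative.

```python
def out_of_sync(reference_pages, checked_pages):
    """Return the keys that are missing from the checked language version
    or that appear on a different page than in the reference version.

    Both arguments map a language-independent key (e.g. a heading or
    paragraph number) to a page number.
    """
    return sorted(
        key
        for key, page in reference_pages.items()
        if checked_pages.get(key) != page
    )


reference = {"1": 5, "2": 5, "3": 6}   # reference language version
checked = {"1": 5, "2": 6}             # language version to be checked
print(out_of_sync(reference, checked))
```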