Meeting Minutes XLIFF OMOS TC Feb 9, 2016

Rollcall

dF: Loïc, do we have quorum? We received regrets from Yves.

Loic: Yes, 60% of voting members.

Approve meeting minutes from Jan 26 meeting

dF: I am sharing my screen now to show the agenda. After roll call, we can proceed to approving the minutes from 26th January, which were posted today. Does anyone second approving the minutes from the previous meeting?

Loic: I second.

dF: If there are no objections, the minutes are approved. We can proceed to part B, material.

Technical deliverables

dF: Chet, the OASIS admin, resolved our requests. Most importantly, we now have a TC wiki, and Yves has started using it already. It’s located here: also available from the TC Home Page. So you can use this URL to work on the wiki. A wiki is a collaborative tool, but from past experience it works best if we choose a “wiki gardener”: someone who takes care of the structure. It’s good to have links to important topics or pages on the wiki front page. We agreed to table the call for spec editors, but now that we have the wiki, we are in need of a wiki gardener. The wiki gardener is in charge of a clear layout and the accessibility of the wiki. Yves created a sample webpage. It is publicly available; editing is possible for OASIS members. The wiki should become our main spec development tool until we have something more structured that can be transferred to specifications. The wiki ticket is closed, as we have one now.

Spec GIT request

We requested a GIT for our spec authoring. There’s no admin form for an official request, and Chet told me to put it into the note, but Chet actually created an SVN. So we have version control, but not GIT as requested. So my question to the TC is whether SVN is enough or we should insist on getting git.

Felix: We discussed it at first meeting and the majority supported GitHub, which is easier for people from outside to contribute by creating a new branch. Has this idea been dropped now?

dF: No, but OASIS does not provide a form for it. Chet said he will look into it but resolved our request by creating an SVN, meaning it was either impossible or very difficult to get us a git. In general, I think OASIS would have an issue with third party contributors to the version control because it would not be under the IPR policy.

Chase: Do they host GIT for other groups?

dF: They replied that we’re the first ones to request GIT for spec authoring, therefore there’s no form for it. Should we continue looking for GIT, or is SVN enough? Actually we could not take advantage of third party contributions, as they are not allowed by the IPR policy; third parties should contribute only through the commenting facility, as they must agree with the IPR.

Chase: I do prefer GIT, but SVN is fine too. Is it only for spec authoring or for supplemental materials?

dF: Purely for spec authoring. If we have any affiliated open source project with code development, we will go for GIT. They support git for that.

Chase: A lot of the benefit of GIT is mitigated when you’re working with formats like Word or OpenOffice. In SVN the merge has to be resolved by hand in case of any conflicts. Unless we’re going to make the spec in plain text.

dF: The plan is that I would migrate a DocBook template here for each of the specs.

Chase: In that case I would rather go for git if possible at all.

dF: As people prefer GIT, I will follow up with Chet to see if we can get git for spec editing as well. Felix, Chase, do you support this decision?

Felix and Chase: yes.

dF: The admin created 3 document models for us, for the 3 deliverables we have currently planned: JLIFF (for the JSON Serialization of XLIFF), the XLIFF 2 to TBX mapping, and the Abstract OM. Nothing has been started on SVN yet, so switching to GIT soon would be harmless.

Action Item- dF: Ask Chet why we didn’t get a GIT and if it is possible to get one soon.

dF: We have the document model for JSON serialization as docx models, but we will most probably be authoring in DocBook, unless there are other opinions. We have a good precedent in the XLIFF spec, which is authored in DocBook. Any other suggestions?

dF: So we have all 3 document models. We can now discuss JSON serialization over the example by Yves. Are there any other proposals than this?

JSON Serialization

dF: We have the examples webpage with the proposal created by Yves. It is a simple XLIFF file. The proposal suggests a way to represent the only unit in JSON, going back to his original idea on the mailing list; the type of inline code would be indicated as “kind” with a number code.

Simple Unit

A simple unit with one translated segment and one ignorable element. The unit includes also an extended attribute.

<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.1" srcLang="en" trgLang="fr">
 <file id="f1">
  <unit id="u1" xmlns:my="myNS" my:xattr="extValue">
   <originalData>
    <data id="d1">[C1/]</data>
    <data id="d2">[C2]</data>
    <data id="d3">[/C2]</data>
   </originalData>
   <segment canResegment="no" state="translated">
    <source><ph id="c1" dataRef="d1"/> aaa <pc id="c2" dataRefEnd="d3" dataRefStart="d2">text</pc></source>
    <target><ph id="c1" dataRef="d1"/> AAA <pc id="c2" dataRefEnd="d3" dataRefStart="d2">TEXT</pc></target>
   </segment>
   <ignorable>
    <source>. </source>
   </ignorable>
  </unit>
 </file>
</xliff>

Possible JSON representations of the unit element (Yves)

{
 "id": "u1",
 "myNS:xattr": "extValue",
 "parts": [
  {
   "seg": true,
   "state": "translated",
   "canResegment": false,
   "source": [
    { "kind": 2, "id": "c1", "data": "[C1\/]" },
    " aaa ",
    { "kind": 0, "id": "c2", "data": "[C2]" },
    "text",
    { "kind": 1, "id": "c2", "data": "[\/C2]" }
   ],
   "target": [
    { "kind": 2, "id": "c1", "data": "[C1\/]" },
    " AAA ",
    { "kind": 0, "id": "c2", "data": "[C2]" },
    "TEXT",
    { "kind": 1, "id": "c2", "data": "[\/C2]" }
   ]
  },
  {
   "seg": false,
   "source": [". "]
  }
 ]
}
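
To make the proposal easier to discuss, here is a minimal Python sketch that reads the source content of the JSON above. The meaning of the numeric "kind" codes (0 = opening code, 1 = closing code, 2 = standalone code) is only inferred from the example and is an assumption, as are the helper names render_content, plain_text and spans_balanced.

```python
import json

# Assumed meaning of the "kind" codes, inferred from the example only.
KIND_OPEN, KIND_CLOSE, KIND_STANDALONE = 0, 1, 2

# Abridged copy of the unit above (source content only).
UNIT_JSON = r'''
{ "id": "u1",
  "parts": [
    { "seg": true,
      "source": [ {"kind": 2, "id": "c1", "data": "[C1\/]"}, " aaa ",
                  {"kind": 0, "id": "c2", "data": "[C2]"}, "text",
                  {"kind": 1, "id": "c2", "data": "[\/C2]"} ] },
    { "seg": false, "source": [". "] } ] }
'''

def render_content(items):
    """Join plain strings with the original data of each inline code."""
    return "".join(i if isinstance(i, str) else i["data"] for i in items)

def plain_text(items):
    """Keep only the text, dropping every inline code object."""
    return "".join(i for i in items if isinstance(i, str))

def spans_balanced(items):
    """Check that every opening code is closed by a code with the same id."""
    stack = []
    for item in items:
        if isinstance(item, str) or item["kind"] == KIND_STANDALONE:
            continue
        if item["kind"] == KIND_OPEN:
            stack.append(item["id"])
        elif item["kind"] == KIND_CLOSE:
            if not stack or stack.pop() != item["id"]:
                return False
    return not stack

unit = json.loads(UNIT_JSON)
segment = unit["parts"][0]
rendered = render_content(segment["source"])
text_only = plain_text(segment["source"])
balanced = spans_balanced(segment["source"])
```

The sketch shows one consequence of the numeric codes: a reader of the JSON has to know the code table before the content can be interpreted, which is the readability concern raised in the discussion that follows.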

Ryan: I personally don’t like that proposal, but didn’t have time to provide my own. I don’t like using “kind” but actually the names of the inline tags. I will need to put an example there.

dF: OK. So we have the wiki now, and it would be good if you could upload your counterproposals. If we have something more explicit, that’s fine too. The only issue is not to be too verbose, but it's definitely more informative.

Patrik: About the structure: we have parts and then “seg:true”. I think it’s a different structure. I think certain types in XLIFF should be defined by the Object Model, meaning that we have “segment” without saying that it’s “true”.

dF: it’s probably trying to address what is the XLIFF fragment that is represented.

Patrik: But then you have “seg:false” for “ignorable”. So let’s take this XLIFF as the smallest possible unit and try to come up with different representations.

dF: If you and Ryan come up with different representations of this example, that would be good and we could discuss it next time.

Patrik: For example, I would like to avoid using strange names like “kind”; it makes it hard to understand without background information.

dF: Yves tried to avoid using the same terminology as the XML spec; “type”.

Patrik: So you don’t want to use any attribute names, if possible, which might be confused with XML representation?

dF: Yes, not if they are something completely different.

Ryan: I agree we should not use “pc”, “ph”, etc. But those objects have names: “spanning codes”, “standalone codes”. So the actual object names could be used in JLIFF.

dF: That’s right. It would be more verbose, but more human-readable.

Patrik: As we’re trying to create the smallest possible unit, some kind of streaming, it’s not going to be verbose or contain too much text.

dF: We still need to go back and define some viable fragments based on the XLIFF spec. You can do a “unit” representation, but the problem is that in XML XLIFF you can have things inherited from the higher structure levels, so you’d need an understanding that everything needs to be instantiated at the given fragment level when you decide to go JLIFF. That would be translatability, segmentation behaviour, and so on.

Patrik: In this example the languages are not defined at all.

dF: Yes, that’s another thing. This obviously must have source and target languages and source and target both are present. So we need to start from the notion of a fragment that has all the data categories instantiated.
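
The idea of a fragment with everything instantiated can be sketched as follows. This is a minimal Python illustration, not an agreed design: it assumes that translate and canResegment are inherited down the file, group, unit hierarchy with a default of "yes", and that the languages come from the xliff root. The sample document and the function name standalone_units are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:2.0"

# Hypothetical sample: values are set at different structural levels.
XLIFF = f'''<xliff xmlns="{NS}" version="2.0" srcLang="en" trgLang="fr">
  <file id="f1" translate="no">
    <group id="g1" canResegment="no">
      <unit id="u1" translate="yes">
        <segment><source>aaa</source></segment>
      </unit>
      <unit id="u2">
        <segment><source>bbb</source></segment>
      </unit>
    </group>
  </file>
</xliff>'''

# Assumed defaults at the top of the inheritance chain.
INHERITED_DEFAULTS = {"translate": "yes", "canResegment": "yes"}
STRUCTURAL = {f"{{{NS}}}file", f"{{{NS}}}group", f"{{{NS}}}unit"}

def standalone_units(root):
    """Yield one dict per unit with all inherited values made explicit."""
    langs = {"srcLang": root.get("srcLang"), "trgLang": root.get("trgLang")}

    def walk(elem, inherited):
        # The nearest explicit value wins; otherwise keep the parent's value.
        current = {k: elem.get(k, v) for k, v in inherited.items()}
        if elem.tag == f"{{{NS}}}unit":
            yield {"id": elem.get("id"), **langs, **current}
            return
        for child in elem:
            if child.tag in STRUCTURAL:
                yield from walk(child, current)

    yield from walk(root, dict(INHERITED_DEFAULTS))

units = list(standalone_units(ET.fromstring(XLIFF)))
```

Each resulting dict carries its own languages and resolved flags, so it could be serialized or streamed without access to the enclosing file or group.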

Patrik: I also think that default values should be omitted in JLIFF.

dF: I think that’s something lost, because when Yves first posted his proposal, he said there’s a number of assumptions and omitting default values was one of them. This assumption obviously needs to be explicit in the spec.
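
As an illustration of why this assumption has to be written down, here is a minimal Python sketch of omitting and restoring defaults. The default table is an assumption for illustration only (state "initial", canResegment true once inheritance is resolved); the function names are hypothetical.

```python
# Assumed defaults, for illustration only; the XLIFF 2 spec is authoritative.
ASSUMED_DEFAULTS = {"state": "initial", "canResegment": True}

def strip_defaults(obj, defaults=ASSUMED_DEFAULTS):
    """Drop every key whose value equals the assumed default."""
    return {k: v for k, v in obj.items()
            if k not in defaults or defaults[k] != v}

def restore_defaults(obj, defaults=ASSUMED_DEFAULTS):
    """Reinstate omitted defaults so a consumer sees explicit values."""
    return {**defaults, **obj}

full = {"state": "initial", "canResegment": False, "source": ["aaa"]}
compact = strip_defaults(full)
roundtrip = restore_defaults(compact)
```

The round trip only works when producer and consumer share the same default table, which is why the assumption would need to be explicit in the spec.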

Patrik: So before talking about examples we should define what fragment structure is.

dF: Yep! Probably we just need a wiki-gardener. Patrik, do you think you could do it?

Patrik: I could try, but I don’t have much experience.

dF: It’s pretty much about making the front page and making things clickable there.

Patrik: Yes, I can do that.

dF: You could also make the initial structure. We need to agree on the definition of a fragment, and we need a structure for that. Record the assumptions, as we cannot discuss the serialization examples without their assumptions.

Action Item- Patrik: Make the initial front page and wiki structure.

dF: To advance JLIFF we need some minimum of the abstract object model defined, including fragments, assumptions, etc. We don’t need to separate these things on the wiki; we will only do so in the specs when we get there.

Patrik: Would you agree that we should try to build the model, as was discussed earlier, which is separated from the XML specifics?

dF: Build in what sense?

Patrik: For example name the objects which can be serialized in JSON. Like for the “file”, should it be still “file” or differently in JLIFF. Or “unit” would be called “unit” or “parts”.

dF: Yves started from the most logical place, assuming that “unit” would be “parts” to avoid confusion with XML. That’s an issue worth talking about: do we want to go absolutely different?

Patrik: I don’t think so. That’s why we should come up with the model, because if we start calling things differently, we’ll soon lose track of the mappings.

dF: Even if we don't go for completely separate terminology, we need to maintain the mapping.

Patrik: I’m not sure if there is anything XML-specific on calling units “units” and so on.

dF: I don’t think it is. I don’t see any issue with calling a unit “unit”. The issue comes when things need to be expressed differently in JSON and XML: the JSON type and the XML XLIFF type are completely different, therefore "kind" is better in this case.

Patrik: As Ryan said, we don’t even need to call it “type”, but to create an object of a specific name.

dF: Based on the initial discussion we had, JSON would have issues with duplicate object names.

Patrik: Why should it?

dF: What I understood from the discussion is that most JSON parsers would not be able to resolve issues with the duplication. Ryan, could you talk about it please?

Ryan: That info came from Yves; I am not 100% sure about that. It was mentioned by someone that JSON libraries do not allow duplicated keys. Not sure myself. But it is not part of the JSON standard.

dF: Yves said it was in the JSON spec, but as a “SHOULD”. It was not absolute ("MUST"), the reason is that a lot of processing in JSON relied on keys not being duplicates. Even if it is not a “MUST”, we’d better not go against the “SHOULD”. Basically it’s Yves’ argument.
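
The practical risk can be seen with a small Python sketch. Python's standard json module accepts duplicate keys and silently keeps the last value; a stricter behaviour has to be added by the caller. The input string and the helper name reject_duplicates are hypothetical, and other libraries may behave differently.

```python
import json

# Hypothetical input with the same key used twice in one object.
DUPLICATED = '{"text": "Hello ", "text": "world"}'

# Default behaviour of Python's json module: the last value wins silently.
lenient = json.loads(DUPLICATED)

def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of dropping earlier values."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys: {dupes}")
    return dict(pairs)

try:
    json.loads(DUPLICATED, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

With the lenient parse the first value is lost without any warning, which suggests that content relying on repeated keys would not survive a round trip through such a parser.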

Patrik: I’ll look it up myself.

dF: I think the issue was with the top level object name.

Ryan: Yes, only the top level.

dF: Yeah, not inside..

Ryan: Yeah, in the last example I sent, the internal tags were embedded within JSON, you can see that only the parts that have plain text, on the top level, you can have a duplicated key for that.

dF: We have 19 minutes left. Do you want to continue with the JSON-specific discussion, or maybe something about defining the fragments? We would probably need some wiki proposals beforehand to have more specifics to go on.

Alan: I had an assignment, which was to continue looking for a syntax for the abstract representation. Since the last meeting, I met with Prof. Embley, a data modelling expert and author of a book on conceptual data modelling. The good news is that he is the right person to talk to. The bad news is that he is not aware of any conceptual modelling tool that would be exactly suitable for us. He said that one of his PhD students wrote a relevant dissertation about going between XML and JSON. He sent me a copy. I’ll need a few weeks to go through the dissertation.

dF: Thanks, Alan. Seems that this should be a close match if it is about XML to JSON representation. I would be happy to read through it if you could share the dissertation, is it public?

Alan: Sure, I’ll send it to the list.

dF: Nice, I’ll definitely have a look myself. If it's too big you could also upload it to the TC document library.

Alan: He suggested another option, although it’s not ideal: using JAXB. It’s a Java library that converts XML to Java internal structures. Has anybody worked with it?

Chase: I have. It’s pretty widely used in the Java world for web service implementation and things like that. I think it’s likely that a Java reference implementation might use JAXB, because it’s very easy to use. But it would not solve the problem we’re trying to solve.

Alan: He also said there might be a tool for JAXB to create JSON, do you know about that too?

Chase: There should be, definitely.

Alan: OK, so that was my report.

Action Item- Alan: Send the dissertation on XML to JSON mapping to the mailing list.

dF: Still 10 minutes left. We covered the JSON serialization and the OM. About the TBX mapping, I’ll not start working on the DocBook templates until I hear from Chet WRT git.

Please feel free to contact me or volunteer on the list if you want to be an editor of the specs, but I will not raise it formally until there’s an urgent need.

We have no updates on the TMX transfer from ETSI.

Action Item- dF: Ping Jamie about TMX.

Action Item- dF: Add current liaisons to the list.

dF: Also, the call for papers for FEISGILTT 2016 is still open. We will also organize the first TMX symposium. All submissions go via EasyChair.

Please consider coming to Dublin, as F2F meetings usually produce great progress on building a consensus on technical issues.

dF: Any other topics to add to the agenda for the next meeting? Or breaking news for now? I don’t hear anything, so that should be all for today.

Action Items summary:

Action Item- dF: Ask Chet why we didn’t get a GIT and if it is possible to get one soon.

Action Item- Patrik: Make the initial front page and wiki structure.

Action Item- Alan: Send the dissertation on XML to JSON mapping to the mailing list.

Action Item- dF: Ping Jamie about TMX.

Action Item- dF: Add current liaisons to the list.

dF: Meeting adjourned.