Pootle is a web-based tool designed to make translation and translation management easier and to increase the quality of translation.
First of all, Pootle is a file server for translatable files in PO or XLIFF format. The files are loaded onto the server and classified by Project (for us, a Project is a specific version of a piece of software that needs to be translated into a given language). Translators, reviewers and translation managers can either edit the files on-line (using Pootle’s On-line Editor) or download them to their computer, translate them with an Off-line Translation Editor and then upload them to Pootle again, replacing the old untranslated or partially translated files.
Translatable files for different FOSS projects are converted to PO or XLIFF format (if needed) using Conversion Filters (Inbound Conversion Filters in this case). Files are then uploaded to Pootle. Once they are translated and reviewed, they are converted back to the original format in which the FOSS project needs the data (using again the Conversion Filters, but now the Outbound Conversion Filters) so they can be integrated in the source, producing localised applications.
In addition, Pootle can use a Glossary and a Translation Memory (a database of previously translated messages) to improve the contents of the PO or XLIFF files that are handed to translators and reviewers, facilitating the translator’s job and ensuring the quality of translations. Quality is ensured through a number of Tests that can be run either by Pootle, when files are uploaded back by the translator, or by an Off-line Translation Editor that integrates the tests.
All the components of the WordForge project are modular. They can be integrated either with Pootle or with other applications (such as Off-line Translation Editors). Glossary Management Tools integrate with Pootle by generating (and accepting back) glossaries stored in Open Standard formats, such as TBX, a standard specifically developed for sharing glossaries. Conversion Filters and Translation Memory Tools process PO or XLIFF files, and can be called by Pootle as external applications. Translation Memory Tools also use Open Standard file formats designed for sharing translation data (TMX).
Pootle also manages information that is used to follow the translation process, such as when and by whom a translation was done, what its deadline is, whether any errors were detected in the translations, and whether there is any non-compliance with the glossary. Process information strongly simplifies the job of the Reviewer and the Translation Manager, who need to check the quality of the translations that have already been finished by the Translator.
And all this brings us to what Pootle really is, and what it calls from outside applications. It is probably better to start with what Pootle is not.
But, besides not doing many things by itself, Pootle does have a number of in-house functionalities:
Pootle stores four major types of files:
Nevertheless, there are other forms of information that probably need to be stored:
Pootle takes a file-based approach for the core data (PO and XLIFF files). Supporting data (TBX and TMX) does not need to reside in its base format and could be placed in some data store. Meta files (statistics, etc.) could be placed in a database if needed. Pootle uses technologies such as PyLucene to provide database-level performance. With a file-based approach and some thought put into pregenerating files, it is possible to keep good performance with flat files.
Priorities could also be an interesting concept to manage. A volunteer translator reaches Pootle and wants to translate something. Can we answer the question: what should I start with? Priorities for volunteers could quite well be very different from priorities for professional translators. The answer to this question must be a combination of the information in the Project information file, statistics on already translated materials and goals established inside the files.
A Project is a set of files for a particular version of a given piece of software, translated or being translated into a specific language (such as OpenOffice 2.0.2 for Spanish).
Template files are uploaded to the system by the Project liaison, only once for each version of a given piece of software (the liaison never uploads the same set of files twice). The template files are grouped into a Project template set (the set of XLIFF or PO template files that belong to the same version of the same software). These template files can be used to either:
Sometimes two different versions of the same software are maintained (different branches). They are considered as different Projects, such as OpenOffice 1.1.5 and OpenOffice 2.0, when both were being developed separately (1.1.5 for fixes of the stable version and 2.0 including big changes for future advanced development).
{JS Based on a prior comment from CS, I think that creating a concept of Group could be a good idea. At the top level we would have a group, which could be OpenOffice or Debian, and inside that, the real projects. Please see the structure below.
Pootle
|
|- OpenOffice
|  |
|  |- OpenOffice 2.0.3
|  |  |
|  |  |- French
|  |  |- Khmer
|  |
|  |- OpenOffice HEAD
|  |  |
|  |  |- French
|  |  |- Xhosa
|
|- Debian
|  |
|  |- Debian Installer HEAD
|  |  |
|  |  |- Vietnamese
|  |  |- Khmer
|  |
|  |- Another Debian appli 1.0
|  |  |
|  |  |- English ZA
|  |  |- Lao
If we consider a language based view, it would be:
Pootle
|
|- Vietnamese
|  |
|  |- OpenOffice
|  |  |
|  |  |- OpenOffice 2.0.3
|  |  |- OpenOffice HEAD
|  |
|  |- Debian
|  |  |
|  |  |- Debian Installer HEAD
|  |  |- Another Debian appli 1.0
|
|- Khmer
|  |
|  |- OpenOffice
{CS In terms of practical management, we need capacity for super-projects. Large, independent projects like OpenOffice, Debian, Gnome and KDE will be easier to manage if there is a super-project, with all its related projects listed and organized under it. Language-team leaders and translation co-ordinators will need to be able to cross project boundaries within that super-project. The distinction between OpenOffice 1.1.5 and OpenOffice 2.0 should not be the same as that between OpenOffice and Debian. We need an effective hierarchical structure that can be expressed in terms of projects [from super-project] or languages. If the meta-data structure below is capable of actual manipulation by such categories, that is what we need. A Gnome translation manager doesn’t want to have to deal with Gnome HEAD, Gnome 2.12 and Gnome 2.14 always separately.}
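The two views discussed above (by super-project and by language) can both be derived from one flat list of projects. The following is only a minimal sketch to illustrate the idea: the `Project` record, its field names and the two helper functions are hypothetical, not part of Pootle.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    group: str     # super-project, e.g. "OpenOffice" (hypothetical field)
    name: str      # versioned project, e.g. "OpenOffice 2.0.3"
    language: str  # target language, e.g. "Khmer"


# Sample data taken from the project-based tree above.
PROJECTS = [
    Project("OpenOffice", "OpenOffice 2.0.3", "French"),
    Project("OpenOffice", "OpenOffice 2.0.3", "Khmer"),
    Project("OpenOffice", "OpenOffice HEAD", "French"),
    Project("OpenOffice", "OpenOffice HEAD", "Xhosa"),
    Project("Debian", "Debian Installer HEAD", "Vietnamese"),
    Project("Debian", "Debian Installer HEAD", "Khmer"),
    Project("Debian", "Another Debian appli 1.0", "English ZA"),
    Project("Debian", "Another Debian appli 1.0", "Lao"),
]


def project_view(projects):
    """Return group -> project name -> sorted list of languages."""
    tree = defaultdict(lambda: defaultdict(list))
    for p in projects:
        tree[p.group][p.name].append(p.language)
    return {g: {n: sorted(langs) for n, langs in names.items()}
            for g, names in tree.items()}


def language_view(projects):
    """Return language -> group -> sorted list of project names."""
    tree = defaultdict(lambda: defaultdict(list))
    for p in projects:
        tree[p.language][p.group].append(p.name)
    return {lang: {g: sorted(names) for g, names in groups.items()}
            for lang, groups in tree.items()}
```

Because both views are computed from the same records, a translation manager could switch between them without the data being stored twice.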
Projects can be static (such as OpenOffice 2.0.2) or dynamic (such as OpenOffice HEAD). In the first case, the project is only upgraded to a new version when the Translation Manager for that language presses the Upgrade button for that Project. In the second case, when the Project liaison uploads a new version of the template files for that software, the upgrade to that version takes place automatically. Pootle uses external version control to protect against data loss during upgrades.
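The difference between static and dynamic projects could be modelled roughly as follows. This is a sketch under our own assumptions: the `ProjectState` class, its fields and the two functions are hypothetical names chosen for illustration, and version numbers are plain integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectState:
    name: str
    dynamic: bool                           # True for projects such as OpenOffice HEAD
    template_version: int = 1               # templates currently in use
    pending_version: Optional[int] = None   # uploaded but not yet applied


def on_templates_uploaded(project, new_version):
    """Called when the liaison provides new template files."""
    if project.dynamic:
        # Dynamic projects follow the templates automatically.
        project.template_version = new_version
        project.pending_version = None
    else:
        # Static projects wait for the Upgrade button.
        project.pending_version = new_version


def press_upgrade(project):
    """Called when the manager for that language presses Upgrade."""
    if project.pending_version is not None:
        project.template_version = project.pending_version
        project.pending_version = None
```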
Projects are linked together through common metadata, so it is possible to list all Projects that relate to a given language (Thai), or to a given upstream FOSS piece of software (Mozilla/GNOME).
The page on Analysis of Process and WorkFlow studies what phases a translation and localisation process can go through, analysing the possible players who might participate in the process, the rights that each player must have in order to participate in the process and, finally, the creation of workflows for localisation composed of different phases. The page on Process information on XLIFF files proposes a way of encoding the different phases inside this format, and the changes in the information about the translations that must take place when a new phase begins.
A file in Pootle will undergo several of the following processes:
For all this it is necessary to define which Process Information will be coded in the XLIFF, TMX and TBX files, and how this coding will take place. {CS We need to get input and feedback on this from both translators and team/project leaders. Can we display user-specific sets of this information? For example, a translator may not want to see most of that info., if focussing only on translating a part of a file during her/his lunch-break.}
See also version control for specifics about how to implement version control and sharing.
In order to protect against data loss (vandalism, bad translations) we will use a version control system. The version control system will be used internally by Pootle and will remain transparent to a Pootle user, unless of course a system problem arises. Pootle being file based will work from the latest version. Files held under version control will allow users to see changes that have been made and allow users to revert to old versions.
NOTE The part of version control that relates to sharing data with upstream projects is a separate issue.
{CS SVN is an excellent choice, and extends to the key issue of currency of files. Different projects have different procedures for getting/committing files. Some use SVN, some use CVS, Debian (for example) uses CVS, SVN and email. We need to be able to interact with their procedures, to get and commit current files. How are we going to do that? Until we have currency, Pootle is not really part of the main flow of translation work, but is simply somewhere you can use if you have first, got the current file, then uploaded it, and are willing to download it and commit it manually afterwards. It’s more of a bottleneck right now than something that facilitates the translation process. I want this to change.}
Pootle includes a translation editor allowing translators to translate applications. Using AJAX we can make Pootle behave more like a desktop application than a web-based tool. Thus you could use Pootle as a replacement for a desktop translation tool, and it saves us having to work on functionality integration.
{CS This has real potential for Pootle. Otherwise, translators need to swap between translation interfaces. There’s less mind-share involved in using only one interface. Extending Pootle to the desktop will make Pootle less work to use, and will improve opportunities for users with bad or unaffordable Net access.}
The translation interface needs to maximise space for the translation while giving useful feedback. {CS Yes! Don’t clutter.} The following are included:
Where you actually input the translations. The default would be:
The source text is uneditable unless you wish to make a change that will be reported as an error to the programmer (you will need to give a comment explaining why it should be so). If Pootle cannot deliver your error report (unknown contact info), then it will store that information so that others might benefit, until such time as it can be delivered to the programmers.
Source text is usually in English; however, many translators are not English speakers, so you can also display the source text in another language. If no translation is available in the other language, then you can make use of some of the online translation tools, allowing you to see a rough translation of the English. You can also view multiple translations in multiple languages (see the parallel translation section below).
The widget for your language is of course always editable, allowing you to enter your text as needed. If you cannot type your script using your current keyboard, then a character selector is available. However, character selectors are tedious, and a pointer to good instructions on how to get your language’s input methods working is preferable.
{CS The introductory page, also available as or with the general information page for that language team, should include information on how to access the most effective input methods, keyboard layouts etc. for that language, for as many OS/distros. as possible. A translator spending what available time s/he has, inputting translations one character at a time, is incredibly frustrating and inefficient and plain unnecessary. It does happen, and we can avoid it by including this information, which would also maintain contact information, mailing lists etc., for discussing any input-related or other problems for that language and for the Pootle interface in general. Translators must be made to feel welcome, accepted and encouraged to ask questions and contribute.}
If you have created a translation that you know looks wrong to the untrained eye, for instance because you decided to adjust it so that it is more effective in the running application, then the comment widget can be opened and you can give feedback to future translators. The same comment window can be opened to allow you to make a comment that is available to all translators. To avoid confusion, we will separate these functions or create a clear visual cue, as you want global comments to be in English while a comment specific to your language can be in your own language. For instance, if you saw a piece of text such as “Select DOMAIN” and on investigation discovered that DOMAIN is a variable and should not be translated, then you can make a comment that will be shared with others. If possible, Pootle will return general comments to the programmer for inclusion, while language comments will be embedded in the translation file. If these are upstreamed, they will be shared with non-Pootle translators.
AJAX is used so that when you select save you see the next translatable string quickly. Thus moving forward and backwards is quick and you do not wait for the page to refresh before you can translate the next item.
When you are translating, and Pootle detects a word that appears in the glossary, it will highlight that word in the source text, and place it and its equivalent in your language in the glossary lookup. So every word that has a glossary entry defined that appears in the source text will be there.
Pootle can reference multiple glossaries; the user or the language team can select which ones to enable and give them a priority order. If glossaries exist for your language, these may be uploaded to the glossary server.
Using your arrow keys you can select a glossary word and have it copied to your current cursor position in your translation. The user may override which keys are used. By using AJAX we hope to make this seamless.
{CS This could be the same as the Mark Place feature we need. We must be able to see a list of live links of these different kinds of marks [Mark Place, Mark Unfinished, Mark Query, Mark Needs Review ...], and to search by them. They could be categories in the main Mark list.}
{DB Not sure exactly what CS wants here. Perhaps a method to push items into another process. Not sure if this requirement would go beyond the workflow ideas. Good idea though}
If no translation exists in your language, then these entries will also be marked, but in such a way as to show that you still need to add a word for this. Such entries will also be placed in the glossary lookup but as blank entries. In a similar way as you copied the glossary entry to your translation, you can highlight and copy a word from the translation file to the glossary. You may need to add more information for the glossary entry in which case a screen will appear allowing you to add this information. Your contribution to the glossary will still need to be reviewed by the glossary team.
If words or phrases appear in the source text that you feel should appear in the glossary you can highlight them and nominate them for inclusion.
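The glossary lookup described above could work along these lines. This is a minimal sketch, assuming the glossary is available as a simple mapping from source terms to translations; the function name and the use of an empty string for terms that still need a translation are our own choices for illustration.

```python
import re


def glossary_lookup(source_text, glossary):
    """Return (term, translation) pairs for glossary terms found in the
    source text, in order of first appearance.

    `glossary` maps a source term to its translation, or to None when
    no translation exists yet; such terms come back with a blank entry.
    """
    hits = []
    for term, translation in glossary.items():
        # Whole-word, case-insensitive match of the glossary term.
        pattern = r"\b" + re.escape(term) + r"\b"
        match = re.search(pattern, source_text, re.IGNORECASE)
        if match:
            hits.append((match.start(), term, translation or ""))
    return [(term, translation) for _, term, translation in sorted(hits)]
```

For example, with a glossary of `{"file": "fichier", "save": "enregistrer", "toolbar": None}`, the source text "Save the file from the toolbar" would yield the three terms in the order they appear, with a blank entry for "toolbar".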
“How did I translate this before?” When you are presented with a piece of text, Pootle will also populate the TM lookup with potential translations. These will be your own, those of your team or those from other applications. You can classify the order in which these appear. For instance, if you trust another translator over your own translations, then you can make those appear first. The list will also give an indication of how well the text matched (less than 100%, since otherwise the match would have been automatically populated from TM/glossaries) {DB Should the translator not be required to sign off on a 100% match? Never trust a machine}, and you will probably need to make some editing changes.
By scrolling through the available translations in the lookup widget, you can view them and choose to copy one to your editing widget. You can at any time copy from the TM lookup widget, but this will overwrite your current editing.
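A rough sketch of how the TM lookup could rank candidates is shown below. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever matching the Translation Memory Tools actually provide; the tuple layout of the memory, the `min_score` threshold and the `trusted_first` parameter are assumptions made for this example. Unlike the description above, the sketch does not filter out 100% matches.

```python
from difflib import SequenceMatcher


def tm_lookup(source, memory, min_score=0.6, trusted_first=()):
    """Return (target, origin, percent) candidates for `source`.

    `memory` is a list of (tm_source, tm_target, origin) tuples, where
    origin says whose translation it is (e.g. "me", "team").
    Origins listed in `trusted_first` are shown before all others;
    within the same trust level, better matches come first.
    """
    candidates = []
    for tm_source, tm_target, origin in memory:
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score >= min_score:
            if origin in trusted_first:
                rank = trusted_first.index(origin)
            else:
                rank = len(trusted_first)
            candidates.append((rank, -score, tm_target, origin))
    candidates.sort()
    return [(target, origin, round(-neg_score * 100))
            for _, neg_score, target, origin in candidates]
```

Calling `tm_lookup("Open file", memory, trusted_first=("team", "me"))` would list the team's suggestions before your own, which is the kind of user-defined ordering described above.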
You need to see the translations before and after the active translation, in the file, so that you can add (if possible) to the context of your current translation. You can have a user specified amount of context (you might want to adjust this if you have a small screen).
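The context display amounts to taking a window around the active unit. A minimal sketch, assuming the file is available as a list of units and that `size` is the user-specified amount of context (both names are hypothetical):

```python
def context_window(units, index, size=2):
    """Return the units shown before and after the active one.

    `units` is the list of translation units in file order, `index`
    the position of the active unit, and `size` how many units of
    context the user wants on each side.
    """
    before = units[max(0, index - size):index]
    after = units[index + 1:index + 1 + size]
    return before, after
```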
{DB This is related to context comments and queries}
Ideally, this will be supplied by the developer comments and translator comments displayed for that string. If not, a query widget could allow you to contact the developers and query the meaning of that string. {CS I think my favourite, among countless examples, concerns the huge GIMP PO file, which is completely bare of context. I queried the verbose string “H”. (That’s it: the entire string.) The developer told me I should have enough experience of the application to know what that meant! (Hot? High? Horrible lack of context?) Getting context is not always straight-forward, so we need to be able to ask for it.}
Across the top of the screen you can see the visual progress indicator. This looks like a ruler and shows you where you are located in the current translation file. It also highlights the blocks that are currently reserved for your translation, those that have been completed by you and those that have been reviewed and accepted. Your current translation speed, together with those of your team members, is shown. This could be your project team (e.g. OpenOffice.org) or your language team (e.g. Zulu). Your translation speed ranking and other stats are displayed. If your file is one of many in a larger project, then you will see it in the context of the larger body. If you are working on an abstract goal, then your progress indicator will also indicate that progress.
Other statistics shown will be the ratio of the number of words in the glossary and those defined by your team.
Imagine a multicolour ruler that looks a bit like a slide rule. The slide part indicates your current place in the workload and the breadth of your reserved translations. Your reserved translations will appear in a greyed version of your colour. Once you translate them they appear in a stronger colour, and once they have been reviewed you will see them in the full colour. The other people on your team will appear in other colours, so you can see a patchwork of completed work.
So at a glance you can see how fast you are, how well your team is doing and how the global effort is progressing.
{CS Is speed of translation really something we should be emphasizing? Surely quality is what we want. Especially for new translators, and for those who have little available time, comparisons of this kind will be intimidating, and discouraging. I strongly suggest that the display be configurable, so the translator can choose how much comparison, what kinds of information, s/he wants displayed. Completing any string as well as possible should be encouraged.}
{DB I think I now agree with CS on this. We need to look at this and try to create something that doesn’t emphasise speed over quality}
Very often you are not sure of the exact meaning of a phrase even with context information. However, you might be able to read other languages, in which case it might be helpful to see how other people translated the text. For instance, in OpenOffice.org all strings are translated into English and German by a full-time translation coordinator. Thus the German translation could provide a very good alternate translation, since it would have been validated by both the programmer and the coordinator.
The parallel translations appear above the source text but after the context information.
Translators often make errors by translating variables or leaving out XML tags. The tracker provides a list of these items and removes them once they are added to the translation. Thus the list should be empty before the translation can be completed.
Live alerts are needed for the above, also for variables not matching, missing spaces at the beginning or end of the string, missing or extra \n (PO), missing end tags or broken tags (XML, XLIFF) [if any of these aren’t covered by the above paragraph]. These are common errors. It saves time to spot them while translating.}
These are also tracked live to ensure that they are correctly inserted. They are able to track different forms of punctuation for different languages, dialects and scripts.
Some checks will not be implemented live. In this case they are checked when the translation is submitted or on request.
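The tracker and the live alerts described above could be sketched as follows. The regular expressions here only cover a few common variable styles (`%s`, `%d`, `%(name)s`, `$NAME`, `{name}`) and simple XML-like tags; they are illustrative assumptions, not the checks Pootle actually ships.

```python
import re

# Illustrative patterns only; real files use many more variable styles.
VARIABLE = re.compile(r"%\(\w+\)[sd]|%[sd]|\$\w+|\{\w+\}")
TAG = re.compile(r"</?\w[^<>]*>")


def pending_items(source, target):
    """Variables and tags from the source still missing in the target.

    The list should be empty before the translation is completed.
    """
    remaining = []
    target_left = target
    for item in VARIABLE.findall(source) + TAG.findall(source):
        if item in target_left:
            # Consume one occurrence so repeated items are counted.
            target_left = target_left.replace(item, "", 1)
        else:
            remaining.append(item)
    return remaining


def whitespace_alerts(source, target):
    """Alerts for mismatched leading/trailing whitespace and newlines."""
    alerts = []
    if source[:1].isspace() != target[:1].isspace():
        alerts.append("leading whitespace differs")
    if source[-1:].isspace() != target[-1:].isspace():
        alerts.append("trailing whitespace differs")
    if source.count("\n") != target.count("\n"):
        alerts.append("newline count differs")
    return alerts
```

Both functions are cheap enough to run on every keystroke, which is what a live check would need; heavier checks can be left for submission time.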
{CS It’s important to realize that spelling-checking is not useful for some languages. In Vietnamese, which is a mono-syllabic language based on accented vowels which represent the tones that differentiate meaning in words, there are very few possible combinations of letters and accents which are not valid words. Only a grammar checker, or a configurable checker simply looking for phrases showing common errors, would be of any use.}
Some tests don’t work in certain languages (e.g. spell checking in Vietnamese). The user can make configuration changes to switch off certain tests or to configure the operation of the test for that language.
A user can add custom checks (really just simple regex searches) which fail or pass depending on relationships between items found in the source or target string.
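One possible shape for such a custom check is a pair of regular expressions: if the source matches the first, the target must match the second. This is only a sketch; the `CustomCheck` class and its field names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CustomCheck:
    name: str
    source_pattern: str  # the check applies when this matches the source
    target_pattern: str  # ...and then this must match the target

    def passes(self, source, target):
        """Pass when the check does not apply, or the target matches."""
        if not re.search(self.source_pattern, source):
            return True
        return re.search(self.target_pattern, target) is not None


# Example: a label ending in a colon should still end in a colon.
ends_with_colon = CustomCheck("colon", r":\s*$", r":\s*$")
```

A language team could keep a list of such checks and switch individual ones off where they make no sense for their language.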
Without leaving the interface, a translator can request feedback from another translator. The feedback is only sent if Pootle has the information required to request the feedback. If it does not, then the query is stored until such time as it has that information, and the translator is made aware of this status. The translator’s message is sent via any number of media. The translator moves on, and can choose to mark the message as translated or untranslated. S/he will be flagged and taken to the message once the response is obtained.
{CS This will work very well where teams are concerned, but for a single translator it only reinforces the loneliness and lack of help. If no other translator is available via Pootle, perhaps the translator can input an external contact or mailing list to query instead.}
Although themeability is a general requirement for Pootle, so that it can feel at home in anyone’s project, in the editor there is a specific need for theming beyond looking good. We need to be able to theme the editor so that it is usable by people with disabilities, although most people using the interface will want some kind of customisability.
Specific themes with high contrast and large font size could cover disabilities.
The ability to customise the following is needed:
The console is the first place the translator goes to, and it acts as a hub for all his/her work. From here s/he can:
{CS
Thus the translator can oversee the status of all his/her files, distill a group of files needing current attention, classify them, and work on them one-by-one, while keeping an eye on the goal group.}
All translators are in some way part of a team. Usually they would be in a team translating OpenOffice.org into Zulu or a team translating Fedora into Khmer. Some translators work alone but their output goes towards helping their language.
In Pootle we will see functionality that makes it much easier to work as a team and much easier to expand the concept of team.
Each user is required to log in to Pootle, and no one is allowed to simply edit text (cf. bug feedback, in which it is allowed). Users specify which language they wish to translate. The language coordinator then authorises them and allows them to work on certain projects. Only once authorised can they contribute to their language on this instance of Pootle.
If they are the first user in their language then by default they are the language coordinator and are either approved by the project coordinator or the Pootle administrator.
Within the group of translators we also have reviewers. A reviewer reviews :). You also need to be authorised into this position. Some projects might set a project bar before you are eligible for reviewership (e.g. 1000 strings translated). Mutual reviewing could be authorised, or enabled as an optional process. A simple “Would you please cast your eye over those strings for me?” is very useful and catches typos and simple errors. This would be between translators of similar experience or translators who work together.
A team consists of more than one person. A team has a name and can have certain goals. There are one or more team leaders, and they can change the goals, invite people onto a team or accept membership applications to the team. The team can set minimum entry requirements; this is useful in the case where there are many translators and the team leaders simply want to see a list of who would be eligible to join the team. Really it’s not that complicated :) just simple mechanisms.
Each team can have an introductory page. Each existing team has an overall goal, as does each Translation Project. New translators coming to Pootle will read this page. New projects and languages will be asked for their goal. Some general goals could be shown as suggestions if required.
Teams without goals work on... nothing. So we need to create goals.
A goal is some target that you aim for. A goal can have the following criteria:
Goals can be set by:
Goals are by default shared if set by the upstream maintainer. They may be shared if they are set by a language coordinator. They are not shared if they are set by an individual.
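The sharing rule above is small enough to write down directly. A sketch, assuming three kinds of goal setter and a flag for the language coordinator's choice (the enum and parameter names are hypothetical):

```python
from enum import Enum


class GoalSetter(Enum):
    UPSTREAM = "upstream maintainer"
    COORDINATOR = "language coordinator"
    INDIVIDUAL = "individual"


def is_shared(setter, coordinator_shares=False):
    """Decide whether a goal is visible to others.

    Upstream goals are shared by default, coordinator goals only if
    the coordinator chooses to share them, individual goals never.
    """
    if setter is GoalSetter.UPSTREAM:
        return True
    if setter is GoalSetter.COORDINATOR:
        return coordinator_shares
    return False
```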
You have a team and you have a goal or set of goals. You need to associate your team with a goal. Once that is in place then the associated priorities within the goal will be used to determine what work is to be done by your team.
We define a workspace as a work area with no associated goal. So a workspace could be KDE, GNOME, etc., or a custom selection of projects and files. There may be some priorities, but generally a translator or Pootle is free to choose what to translate. You could think of a translator who is not part of a team as having a workspace of anything that is on the Pootle server (if they have the correct rights). So a workspace is a goalless set of files. A workspace can thus be associated with a team, i.e. the KDE workspace could belong to the KDE team. Anyone who is part of the KDE team can translate in the KDE workspace. The KDE team itself might set some other goals that need first priority. Once these goals are met, the team can work on anything within the workspace. This also has the side effect of protecting KDE workspace files from arbitrary translation. It would also allow meshing with the concept of teams as defined by the KDE process itself.
{DB needs thought on mutually exclusive idea} Goals or workspaces need not be mutually exclusive. But rights dictate what a person can do to certain files.
Goals can be measured. We would like to give feedback to participants so that they are motivated, and we’d like to give feedback to the language manager as to whether the goal is attainable; if not, they can adjust their goals.
Typical feedback would be:
Similar to Project Gutenberg’s Distributed Proofreaders, it is possible to place new translators into a beginners section. This would be a standard named workspace which the language coordinator can move files into. Typically these would be less used files such as games, incidental toys and non-standard apps. The idea is that any translation performed in this area is marked as beginner work. Reviewers are encouraged to give positive feedback and to explain corrections that they have made to the translator’s work.
{CS The more you can follow the example of DP, the better. They have an excellent compound process, a really enthusiastic and supportive community, and produce a huge amount of work from people contributing their bits and pieces of available time. I was with DP before my brain damage got to the point that I couldn’t reliably pick errors in English, and I can’t think of a better example for Pootle. Encouraging messages, plenty of support information and personal contact is essential to their success. Much of this is based around their forums, but they also have some really ingenious and effective procedures. If we can combine experienced translators and managers with new people as well as they do, Pootle will flourish and not just incidentally, provide an opportunity for people with possibly little experience and available time, to contribute something worthwhile and form an enjoyable community. Contact via Jabber IM, including groupchats, some used as tutorials for specific techniques, was a very successful step for DP.}
Teams can discuss work, place calls for people to join and post news of targets reached. This is not a direct component of Pootle, but the login should be seamless and shared between Pootle and the discussion boards.
{CS As above. We can use the forums also to provide information, FAQs etc. Tutorials or specific discussions via Jabber or IRC could be scheduled. Forums for specific uses will be useful. Certainly own-language forums should be possible. A good deal of the team communication and translation queries could take place in the forums. Links to the forums, and to specific information, could be on each Team Page, and introductory pages. Encourage people to compete with each other if they want to (this works extremely well for DP), and to form teams which don’t necessarily mirror language or project groups. Number of strings translated can be used for fun competitions, and to encourage pride in one’s achievements and in one’s team. See DP for more details. These processes break down barriers, and encourage a lot of contribution.}
By tracking the number of translations performed and by monitoring the quality of translation, it is possible to automatically create a status measure. This can be used to automatically move people into different classes of translator. The team can choose to make this automatic, semi-automatic (requires human review) or non-automatic (ideal for existing teams with established roles).
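To make the three modes concrete, here is a small sketch. The class names, the thresholds and the use of an "accepted ratio" as the quality measure are all hypothetical values picked for illustration; a real team would define its own.

```python
def proposed_class(strings_translated, accepted_ratio):
    """Suggest a translator class from volume and quality.

    Thresholds are illustrative assumptions, not Pootle settings.
    """
    if strings_translated >= 1000 and accepted_ratio >= 0.9:
        return "reviewer"
    if strings_translated >= 100 and accepted_ratio >= 0.7:
        return "translator"
    return "beginner"


def next_class(current, strings_translated, accepted_ratio, mode,
               approved=False):
    """Apply the proposal according to the team's chosen mode."""
    proposal = proposed_class(strings_translated, accepted_ratio)
    if mode == "automatic":
        return proposal
    if mode == "semi-automatic":
        # A human must approve the proposed change.
        return proposal if approved else current
    if mode == "non-automatic":
        # Roles are assigned by hand; the measure is informational only.
        return current
    raise ValueError("unknown mode: %r" % mode)
```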
{CS This could be part of the voluntary competition/encouragement mentioned above. Celebrate achievement: build fun and interest around it. Encourage people to try for certain goals, and reinforce their progress.}
The translation process is fixed within reason. There are a few that are defined:
By keeping the flows fixed we can cover the needs of most teams without having to implement a large workflow system.
However, it should be relatively easy to implement any variation on the theme.
The language coordinator or team leader will choose which process to follow and place people into the different roles.
It is critical that it is easy for upstream projects to add their work to Pootle and to get translations from Pootle.
They need to be able to:
Upstream projects usually have these concerns over tools like Pootle:
The first will be integrated into the Pootle workflow, and the second is shown in our registration and supervision procedures.
Projects have names, descriptions, websites, etc. This data needs to be made available to translators so that they understand the project and can thus decide if they want to translate the software.
Other data would include:
The project can be defined in Pootle using a web interface or a flat file that is stored in the project’s CVS and is submitted to Pootle using a simple updating tool. The second option makes it possible to update descriptions as part of a build process.
These files would also contain descriptions of branches, thus allowing stable and unstable branches for translation, or allowing legacy branches to remain and be updated.
At any time the project coordinator can upload new translatable files. These are merged with existing files on the named branch. When merges are performed, the teams or translators associated with these files are informed of the status change via email or an IM protocol.
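The merge step could look roughly like the sketch below. It assumes a very simplified data model (a dictionary from source string to translation) and exact matching only; real merge tools such as gettext's msgmerge also do fuzzy matching. The function name and the report fields are hypothetical.

```python
def merge_branch(existing, new_sources):
    """Merge freshly uploaded source strings with translations already held.

    existing: dict mapping source string -> translation ("" if untranslated)
    new_sources: iterable of source strings from the uploaded file
    Returns (merged, report); the report is what a notification sent to
    the affected team could summarise.
    """
    merged = {}
    report = {"kept": 0, "new": 0, "obsolete": 0}
    # dict.fromkeys removes duplicate sources while keeping file order.
    for source in dict.fromkeys(new_sources):
        if source in existing:
            merged[source] = existing[source]  # carry the translation over
            report["kept"] += 1
        else:
            merged[source] = ""  # new string, still to be translated
            report["new"] += 1
    # Strings that existed before but are absent from the upload.
    report["obsolete"] = len(set(existing) - set(merged))
    return merged, report
```

The report gives the numbers that the email or IM notification would need: how many strings were carried over, how many are new, and how many disappeared.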
Similarly to the information update, a script can be used to upload the latest translations. This would allow, for instance, a nightly build to upload translations to Pootle.
The important idea is that Pootle works with their current strategies, e.g. CVS scripts, to make it easier for the project to adopt Pootle. In some ways they would be using Pootle to help them manage their files, i.e. to enhance what they already do.
Pootle is themeable. Thus, even though the Pootle server is hosted elsewhere, it should be possible to theme and template Pootle so that it looks like the upstream project's own server.
When a team meets a target, an email or IM message is sent to the project liaison to inform them that a certain language has met the target requirements. They can then take action or write scripts to take automatic action and import translations.
This is also automatically announced on the appropriate forum, and shown as a newsflash on an entry page, as any goal met is an achievement and the credit creates community.
Similar to the upload process, it is possible for a project liaison to download all translations for languages that meet a certain target. This could be the publicly stated target or could be a hidden target. It is also possible for them to override this consideration if a certain language specifically requests inclusion of their translations for beta testing.
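The selection of languages to download can be sketched as follows. This assumes the target is expressed as a fraction of translated strings and that per-language counts are available; the function and parameter names are hypothetical.

```python
def languages_to_download(completion, target, overrides=()):
    """Return the language codes whose translations should be downloaded.

    completion: dict mapping language code -> (translated, total) counts
    target: required fraction of translated strings, e.g. 0.8; this may be
            the publicly stated target or a hidden one
    overrides: languages to include regardless of the target, for example
               a team that asked for inclusion in beta testing
    """
    selected = []
    for lang, (translated, total) in sorted(completion.items()):
        share = translated / total if total else 0.0
        if share >= target or lang in overrides:
            selected.append(lang)
    return selected
```

A liaison script could call this with the hidden target for release builds and pass `overrides` for languages that requested early inclusion.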
In order not to be another cog in the works, Pootle must be able to perform the roles that certain projects require of translators, that is, send files via email or submit files through SVN/CVS. (To achieve this, we might sync to the machine of the person with CVS access and allow them to then perform the correct submission of completed work.) The main aim of this is to make integration of Pootle seem seamless to an upstream project, or to appear as if there is no connection at all, making adoption easier.
The long-term aim is to have features in place that allow Pootle to be integrated into the workings of current projects with few or no changes.
Programmers can write cryptic messages and also forget to define what data will be contained in a variable. The feedback mechanism, which operates from the editor, allows Pootle to send requests for clarification to programmers. The programmer either adds this to the code, replies with clarification, or does both. This occurs via email or IM. The clarification is attached to the message and forwarded to the requester. This reduces the number of queries and focuses the programmer on providing comments where needed. It also plays a useful role in educating programmers about localisation needs.
The widget can optionally supply a template string, to save time, and also suggests appropriate ways to make or phrase queries.
Pootle will manage unsuccessful queries: address not known, auto-responders.
Pootle needs to work with existing systems, but above all it needs to work with other Pootle instances. Pootle needs to share TM and TBX data between instances. When Pootle is run in standalone mode, it needs to be able to reserve and download files for offline localisation and resubmit those entries.
See also: version control
Things that Pootle(s) are doing during this are:
Things that Pootle and upstream are doing are:
Translation quality relates to continuous improvement of existing translations, as well as improvement and quality assurance of TM and TBX data. By a process of managing changes that increase quality, we can filter improvement back into the existing work.
This is another type of achievement we will emphasize: the number of strings added to glossaries/TMX and assessed as quality translations. Improving the shared resources should be seen as equally or even more important than simply completing translation strings in a file. It’s not easy to compare these things, but the more we emphasize and appreciate effort, the more will be contributed.
Users will be able to see graphically how much of the global standard glossary is translated, reviewed and approved. The same would apply to the global language TM.
If an improvement or correction is made to the TM or to the glossary, it must be possible to filter it back into the original texts. Changes to TM would require comments so that translators can understand what has been changed and why. The original text would be marked for review with a note on the change in the TM or glossary. These are then sent straight to the review step in the translation process. Some teams are small or operate very independently, so this step of jumping straight into review should be optional. As is normal in the translation process, marking the item for review will not take the current translation out of service. Optionally, the person improving the TM or glossary could review all the changes, if they have sufficient rights.
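As an illustration of filtering a glossary change back into existing work, the sketch below marks every translated unit that uses a changed term for review and attaches a note, while leaving the current translation in place. The unit structure (plain dictionaries with `source`, `target`, `needs_review` and `notes` keys) is an assumption for this example, not Pootle's data model.

```python
import re


def flag_for_review(units, term, note):
    """Mark translated units whose source text uses `term` for review.

    units: list of dicts with at least "source" and "target" keys
    term: the glossary term that was changed
    note: explanation of the change, shown to the reviewer
    Returns the number of units flagged. The existing translation is not
    touched, so it stays in service while awaiting review.
    """
    # Whole-word, case-insensitive match so "file" does not hit "Profile".
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    flagged = 0
    for unit in units:
        if unit["target"] and pattern.search(unit["source"]):
            unit["needs_review"] = True
            unit.setdefault("notes", []).append(note)
            flagged += 1
    return flagged
```

Untranslated units are skipped because they will pick up the new glossary entry when they are first translated.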
In order for these resources to be useful, they require continual weeding. The glossaries need to be reviewed to ensure that words are consistent. Changes to a glossary term should flag all instances where that word is used. The glossary review process also allows glossary managers to review terms that translators have suggested be added to the language glossary.
The glossary creation should be able to suggest potential glossary words based on frequency of words within a project/domain. The glossary words that need to be supplied should be only those that occur within the project that is being translated. This process should happen before translation commences.
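A frequency-based suggestion step might be sketched like this. The stopword list, the minimum word length and the thresholds are illustrative assumptions, and the tokenisation only handles unaccented Latin letters; a real implementation would need language-aware handling.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "in", "is", "for",
             "this", "that", "on"}


def suggest_glossary_terms(messages, min_count=2, limit=20):
    """Suggest candidate glossary words from a project's source messages.

    Counts how often each word occurs across the messages and returns
    (word, count) pairs, most frequent first, for words seen at least
    `min_count` times. Only words that occur in this project are returned.
    """
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[A-Za-z]+", message.lower()):
            if word not in STOPWORDS and len(word) > 2:
                counts[word] += 1
    return [(w, n) for w, n in counts.most_common(limit) if n >= min_count]
```

The resulting list is only a set of candidates; a glossary reviewer would still remove words that should not be in the glossary.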
The top-level glossary reviewers are also in a position to accept new words into a language or standard glossary. They would, for instance, create or examine the suggested glossary words. These are extracted automatically when a new project is uploaded. The reviewer will eliminate words that should not be in the glossary. They also review words that translators suggest should be in the glossary. These reviewers are often separate from the translators or language-specific glossary maintainers.
As a glossary is very useful, and we have mechanisms to flag glossary changes within the translations, we make the glossary available at all stages. The user might see that an item is or is not reviewed, but they always see the complete glossary.
TM management has many similarities with glossary management. Every time a translator translates something they create new TM data. The TM management allows translators to search for the occurrence of certain words or phrases and mark these for review or correction. Some of this is automated from the glossary management role.
The TM manager can also rate contributions, placing a professional translator's work on a higher footing or downgrading a translator for consistently incorrect translations.
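One way such ratings could influence TM suggestions is to weight each match by the rating of the translator who contributed it. The sketch below assumes a TM held as simple tuples and ratings held as numeric weights; both are hypothetical, and `difflib.SequenceMatcher` stands in for whatever matching the TM tools actually use.

```python
from difflib import SequenceMatcher


def rank_suggestions(source, tm_entries, ratings, min_similarity=0.7):
    """Rank TM suggestions for `source`, weighted by translator rating.

    tm_entries: list of (tm_source, target, translator) tuples
    ratings: dict mapping translator -> weight (1.0 when unrated)
    Returns a list of (score, target, translator), best first.
    """
    ranked = []
    for tm_source, target, translator in tm_entries:
        similarity = SequenceMatcher(None, source, tm_source).ratio()
        if similarity >= min_similarity:
            score = similarity * ratings.get(translator, 1.0)
            ranked.append((score, target, translator))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

With this weighting, an identical match from a downgraded translator ranks below the same match from a highly rated one.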
The TM management also allows a translator to monitor contributions as they are made during a translate@thon. They can quickly correct spelling or grammar and provide feedback as needed.
Incorrect translations should be corrected at source. Translations that are sourced from other Pootle servers should be marked as incorrect and that information fed back to the source Pootle server.
The TM manager should be able to track the quality of translations from a certain translator and observe their skill level. This allows differing levels of feedback depending on how green the translator is. TM feedback should go through the normal feedback mechanism and be private, to encourage and not embarrass contributors, and to offer it as a learning experience in the same vein as PGDP.net.
The Translate Toolkit is a set of command-line tools used by Pootle but also used on its own by many developers. The toolkit will continue in its own right but offer more functionality.
Currently the Gettext PO format is the basis of all localisation in FOSS. However, the XLIFF standard from OASIS is an emerging standard. Therefore the toolkit will be abstracted so that either PO or XLIFF can be used. This allows the tools to continue being useful for localisers while they remain on PO, but also allows migration to the richer XLIFF standard.
There are no plans in the WordForge project to add new formats. However, consideration of what new formats to add would be based on:
The toolkit has specific converters to process files from one domain to another, e.g. Mozilla to PO. The converters are being abstracted in such a way that the formats should be pluggable, allowing any format to be converted without having to create a converter helper application. This should make it possible to create a converter by simply creating a class that understands the format, e.g. FrameMaker, with a layer that can supply the data required by the converter. This yields a converter for FrameMaker to PO or XLIFF.
There are other converters that exist for some formats not covered by the toolkit (e.g. po4a, which can transform a number of FOSS-specific file types to PO). One option is to reuse the code or wrap such tools so that the toolkit can use those formats. One problem is that they might be PO-specific and not be able to do XLIFF. The same list of determinants applies here as it does to adding new formats.
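The pluggable idea can be illustrated with a small sketch: a format class only has to know how to parse its own files into (location, source) units, and a generic converter turns those units into PO. Everything here (the `FORMATS` registry, `register`, `PropertiesFormat`, `to_po`) is hypothetical and is not the Translate Toolkit's actual API.

```python
FORMATS = {}  # hypothetical registry: format name -> format class


def register(name):
    """Class decorator that plugs a format class into the registry."""
    def decorator(cls):
        FORMATS[name] = cls
        return cls
    return decorator


@register("properties")
class PropertiesFormat:
    """Example plug-in that understands simple key=value files."""

    def parse(self, text):
        """Return a list of (location, source string) units."""
        units = []
        for line in text.splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                units.append((key.strip(), value.strip()))
        return units


def escape(text):
    """Escape backslashes and double quotes for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_po(format_name, text):
    """Generic converter: works for any registered format class."""
    units = FORMATS[format_name]().parse(text)
    entries = []
    for location, source in units:
        entries.append('#: %s\nmsgid "%s"\nmsgstr ""\n'
                       % (location, escape(source)))
    return "\n".join(entries)
```

Supporting another format would then mean registering one more class with a `parse` method, with no change to `to_po`. An XLIFF writer could consume the same units.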
Some comments show that filters can be improved with language-specific information. An example is in this Language Specific Comments page.