Monday, April 25, 2016

Cross-Border Intelligence Sharing

Every European country treats the activities of its intelligence services as dirty laundry: not to be discussed and definitely not to be shared, especially with the neighbors. The legacy of the past, such as the Stasi in Germany and Franco in Spain, combined with current mass-surveillance scandals and our reluctance to see a problem for what it is, hinders the evolution of effectively targeted investigations.

This behavior has cost lives in multiple terrorist attacks in Europe. Regrettably, besides this national fragmentation of intelligence, there is the hairy problem of cross-language content. First, communication inside each country needs to be secured, but right after that it must also work across borders. The technical solution will need to be backed by national legislation that makes reporting into a shared repository mandatory. It is in the national interest of all European nations to safeguard their populations, and thus such laws should be relatively easy to pass.

Cross-border Intelligence - the Missing Link


Intelligence sharing across borders is hampered by language barriers: reports are written in the native languages of police officers across the continent. Even the spelling of suspects' names can vary between languages. Reports use all kinds of wording, and without an intelligent link there is no way to find matches between synonyms or related words, such as "terrorist attack", "explosion", "bomb", "explosive materials". That is hard enough in one language; dealing with many languages multiplies the problem.
The technical problems can be solved. But the political problem needs to be tackled internally in each country, while the EU supports the process by providing legislation for sharing across borders. And our intelligence services need to stop wasting their time surveilling ordinary citizens and instead target those who have a criminal background as well as those who have been active militants abroad.
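
To make the name-spelling problem concrete, here is a minimal sketch in Python. It assumes a very simple approach: strip accents, lowercase, and compare the results with the standard library's `difflib`. The function names, the threshold, and the sample names are invented for illustration; a real system would need proper transliteration rules per language and script.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name, strip accents, and keep only letters and spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    kept = (ch for ch in without_accents.lower() if ch.isalpha() or ch == " ")
    return "".join(kept).strip()


def is_possible_match(a: str, b: str, threshold: float = 0.75) -> bool:
    """Flag two spellings as candidates for the same person.

    The threshold is an assumption chosen for this example, not a tuned value.
    """
    similarity = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return similarity >= threshold


# Hypothetical names: the same person as spelled in two different reports.
print(is_possible_match("Aleksandr Petrov", "Alexander Petrow"))
print(is_possible_match("José García", "Jose Garcia"))
print(is_possible_match("Anna Schmidt", "Pierre Dupont"))
```

A match here is only a hint for a human analyst to follow up on. Plain string similarity will both miss true variants and flag unrelated people, which is one reason the structured approach described further down is needed.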

Terrorist Threat Repository Visualized as a Concept Map


You surely remember the film "Minority Report" and its software that foresaw crimes because it was linked to the brains of psychics. We don't need the psychics - we need smart intelligence officers sharing their data across borders with access to multilingual data processing, so that they are able to predict the next terrorist attack before it hits us. We need a Terrorist Threat Repository visualized as a concept map. The concept map displays potential terrorists and their relations to each other, with names and pictures combined with police reports - in any language available.


Holding this together is a multilingual knowledge system (MKS) that can cater to related terms, including terms related by geographical proximity. In order to cross language borders, the MKS must be available in our official languages and in those used by the offenders. This allows police reports to be written in any language, since the MKS is there to find relevant content in any language. Once found, the texts can be passed to machine translation to gist the information. If the alarm bells ring but the threat is not immediate, human translation will provide more insight. But in most cases the dots will already be connected well enough for security agencies to see a pattern to act on.
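
As a rough illustration of the lookup described above, the following Python sketch maps terms in several languages onto shared concepts and uses those concepts to find reports regardless of the language they were written in. Everything in it is an assumption made for the example: the data model, the concept identifiers, the tiny vocabulary, and the sample reports. Term spotting is done by naive substring matching, which a real MKS would replace with proper linguistic processing.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    labels: dict                      # language code -> set of terms
    related: set = field(default_factory=set)


# Hypothetical two-concept knowledge system in English, French and German.
MKS = {
    "explosive_device": Concept(
        "explosive_device",
        {"en": {"bomb", "explosive device"},
         "fr": {"bombe", "engin explosif"},
         "de": {"bombe", "sprengsatz"}},
        related={"explosion"},
    ),
    "explosion": Concept(
        "explosion",
        {"en": {"explosion", "blast"},
         "fr": {"explosion"},
         "de": {"explosion", "detonation"}},
        related={"explosive_device"},
    ),
}

# Invented sample reports: (report id, language, text).
REPORTS = [
    ("R1", "en", "Witness heard a blast near the station."),
    ("R2", "fr", "Un engin explosif a été trouvé dans le véhicule."),
    ("R3", "de", "Der Zeuge meldete einen gestohlenen Wagen."),
    ("R4", "de", "Ein Sprengsatz wurde im Keller gefunden."),
]


def concepts_in(text: str, lang: str) -> set:
    """Return the ids of all concepts with a label in `lang` found in the text."""
    lowered = text.lower()
    return {
        concept.concept_id
        for concept in MKS.values()
        if any(term in lowered for term in concept.labels.get(lang, ()))
    }


def search(term: str, lang: str) -> list:
    """Find reports in any language that mention the term's concept or a related one."""
    wanted = set()
    for concept in MKS.values():
        if term.lower() in concept.labels.get(lang, ()):
            wanted |= {concept.concept_id} | concept.related
    return [rid for rid, rlang, text in REPORTS if concepts_in(text, rlang) & wanted]


print(search("bomb", "en"))
```

The English query "bomb" retrieves the English, French and German reports about explosive devices and explosions while skipping the unrelated German one. The point of the design is that the query and the reports never have to share a language, only a concept.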

Friday, April 1, 2016

Shortcuts to Cross-border Interoperability?

Almost everybody thinks: let's translate, and we can operate in another country. Many even believe: let a machine do it and we are good. Among these are high officials in the European Union, who argue this more to avoid a problem than to address it. Indeed, the recent events in Brussels show that we are not very successful at extracting information from unstructured data, not even monolingually.

Structuring Unstructured Information


A text, such as this one, is an unstructured information resource. However, the text must first be found, and it is more valuable if combined and understood together with more such texts. Then conclusions can be drawn and actions taken based on the information inside the texts. This has been done for centuries, for example by lawyers and courts in countries that work with case-based law. For some time now, the multilingual angle has been growing in importance due to international trade, globalization, social media, and, in Europe, the central role of the European Commission in governing so many diverse countries.

Everyone knows from painful searches that a smart organization of filed documents is key to finding anything again. The best strategy for retrieving information was the library hierarchy and the terms it used to organize its resources. This proven approach has been neglected in the digital age, because users were told that search, folders, and titles would do the job. Today, most organizations barely know or care whether terms are used correctly in authoring or in translation. As a result, their texts are no longer useful when queried to support decision making. This is even more evident across borders, where imperfect translation, especially automated translation, not only loses terms but also introduces errors.


Interoperability by Machine Learning or Linked Open Data?

Eager researchers and software companies will say: never mind, Big Data methods using Machine Learning or Linked Open Data will come to the rescue. Indeed, both are great for gisting. But they fail to function reliably beyond that point.

Imagine asking them what 1 + 1 is. Machine Learning will tell you the answer is in the range of 1.5 to 2.5. Linked Open Data will tell you that the answer might be in the following 42 documents.


Ensuring Cross-border Interoperability


We seem to have forgotten that the tools of library science, such as classification systems, taxonomy hierarchies, and thesauri, are the core for reuse of textual data. When these knowledge resources are multilingual, they become a Multilingual Knowledge System. An MKS can extract insights from texts in and across multiple languages.

I am not saying terminology is the answer - terms are mostly flat, unrelated, and compiled mainly for translation support. Instead, we need a structure that gives us the context and lets us drill through a concept map to find relationships. Term resources are rather an asset that can be elevated to become a knowledge structure.
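
The difference between a flat term list and a structure that can be drilled through can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the relation names and the handful of edges are invented, and the drill-through is a plain breadth-first search that ignores edge direction.

```python
from collections import defaultdict, deque

# Hypothetical concept map: (concept, relation, concept).
EDGES = [
    ("bomb", "is_a", "explosive device"),
    ("explosive device", "contains", "explosive materials"),
    ("explosive device", "causes", "explosion"),
    ("terrorist attack", "may_involve", "explosive device"),
]


def build_graph(edges):
    """Build an undirected adjacency map; relation labels are ignored here."""
    graph = defaultdict(set)
    for source, _relation, target in edges:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def find_path(graph, start, goal):
    """Return the shortest chain of concepts linking start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
print(find_path(graph, "explosive materials", "terrorist attack"))
```

In a flat glossary, "explosive materials" and "terrorist attack" are just two unrelated entries. In the concept map they are two steps apart, and that connection is exactly what an analyst needs to see.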

Multilingual Knowledge Systems make cross-border data processing possible, a capability often called semantic or information interoperability. In fact, MKSs are the only possible path to cross-border interoperability. They retrieve the needle in the haystack - in all languages. They make it possible to pinpoint the units sought while linking to all information related to them. And if an MKS supports Big Data and Linked Open Data, these technologies will also efficiently support cross-border interoperability!

Authors CEO Jochen Hummel @JochenHummel
and CSO Gudrun Magnusdottir @GMagnusdottir