Patents4Science

Analyzing patent data and extracting technical knowledge and trends from patent documents poses major challenges for research. At the same time, it opens up considerable opportunities within the scientific value creation process, in particular by linking essential information from patents with scientific literature and other (domain-specific) sources. Enriched and linked patent knowledge not only allows new and efficient indicators for future innovations and developments to be derived, but also gives scientists access to new approaches and solutions, experiments, technical specifications, and detailed information such as chemical structures that they can use in their everyday research work.

In the area of Patents & Scientific Information, we therefore research and develop new approaches and methods for indexing, analyzing, and linking patent content. These build on machine learning (ML) and semantic technologies such as Natural Language Processing (NLP), Deep Learning (DL), and Knowledge Graphs (KG), and address different target user groups in the sciences.

Contact

Dr. Hidir Aras

Head Department Patents4Science

Phone:

+49 7247 808 306

hidir.aras [at] fiz-karlsruhe.de


Information Mining and Semantic Technologies

Growing volumes of data and the increased information needs of users, e.g. in the context of search and analytics, pose major challenges for scientists and information specialists. New concepts and efficient methods for data processing and analysis, for example Text and Data Mining (TDM), are therefore required to combat information overload and data complexity. In our applied research, we work on enhanced ML techniques, on distributed data processing for querying and analyzing large amounts of patent data (Big Data Patent Analytics), and on semantic mining and enrichment of patent full texts using knowledge from ontologies and Linked Open Data (LOD) as well as from pre-trained language and embedding models such as BERT and word2vec.

Patent Text Mining and Semantic Enrichment
To obtain high-quality analysis results from patent data, advanced methods for structuring and enriching patent document content, i.e. the full text, are essential. One of our main areas of work is therefore the deeper structuring and mining of the patent claims and the detailed description section using enhanced information extraction (IE) and ML techniques. Semantic entities mined with the help of domain-specific vocabularies, such as biomedical or chemical entities, can be used to build more precise search queries and help to improve methods for the automatic analysis of complex knowledge.
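The idea of mining entities with a domain-specific vocabulary and reusing them in a search query can be illustrated with a minimal sketch. It is not our production pipeline: the vocabulary, the entity types, the example claim, and the function names `extract_entities` and `build_query` are all hypothetical, and simple dictionary matching stands in for the IE and ML techniques mentioned above.

```python
import re

# Hypothetical mini-vocabulary: surface form -> (canonical label, entity type).
VOCABULARY = {
    "aspirin": ("acetylsalicylic acid", "CHEMICAL"),
    "acetylsalicylic acid": ("acetylsalicylic acid", "CHEMICAL"),
    "ibuprofen": ("ibuprofen", "CHEMICAL"),
    "inflammation": ("inflammation", "DISEASE"),
}


def extract_entities(text, vocabulary=VOCABULARY):
    """Return (surface form, canonical label, type, offset) for each vocabulary match."""
    # Put longer terms first so multi-word terms win over shorter alternatives.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    entities = []
    for match in pattern.finditer(text):
        label, entity_type = vocabulary[match.group(1).lower()]
        entities.append((match.group(1), label, entity_type, match.start()))
    return entities


def build_query(entities):
    """Combine the distinct canonical labels into a Boolean AND query string."""
    labels = sorted({label for _, label, _, _ in entities})
    return " AND ".join(f'"{label}"' for label in labels)


# Hypothetical claim text, invented for illustration.
claim = "A composition comprising Aspirin and ibuprofen for treating inflammation."
found = extract_entities(claim)
query = build_query(found)
print(query)
```

Mapping each surface form to a canonical label means that a query built from the claim also covers documents that use a synonym of the same entity.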

Knowledge Graphs and Linked Open Data
To enrich patent text documents systematically, a semantic representation of the essential components of the document content (e.g. in RDF/OWL) is required. This concerns both the representation of metadata and the mapping of relevant, automatically extracted entities and relations of a domain. Semantic representation and indexing of essential information in patent data enables the linking of relevant knowledge to external sources and the integration and use of specialized (domain-specific) knowledge bases, e.g. for biochemical data.
Furthermore, in addition to established approaches to information retrieval used in patent search (e.g. the Boolean model), enhanced options employing semantic or knowledge-based methods can be developed. The goal is to explore and implement different forms of semantic search using knowledge graphs and pre-trained language models. For this purpose, domain-specific ontologies can be utilized, and relevant entities and their relations can be identified, extracted, and disambiguated, e.g. by means of state-of-the-art DL models or other ML approaches.
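The difference between an exact-match lookup and a knowledge-based search can be sketched with a toy graph of subject-predicate-object triples. Everything in this example is an assumption made for illustration: the triples, the predicates `is_a` and `mentions`, the patent identifiers, and the function names. Plain Python tuples stand in for an RDF store.

```python
# Hypothetical knowledge graph as (subject, predicate, object) triples.
TRIPLES = {
    ("ibuprofen", "is_a", "nsaid"),
    ("naproxen", "is_a", "nsaid"),
    ("nsaid", "is_a", "anti-inflammatory agent"),
    ("patent:P1", "mentions", "ibuprofen"),
    ("patent:P2", "mentions", "naproxen"),
    ("patent:P3", "mentions", "nsaid"),
    ("patent:P4", "mentions", "paracetamol"),
}


def narrower_concepts(concept, triples=TRIPLES):
    """Return the concept plus every concept below it in the 'is_a' hierarchy."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subj, pred, obj in triples:
            if pred == "is_a" and obj == current and subj not in seen:
                seen.add(subj)
                frontier.append(subj)
    return seen


def exact_search(term, triples=TRIPLES):
    """Exact-match lookup: only patents that mention the term itself."""
    return sorted(s for s, p, o in triples if p == "mentions" and o == term)


def semantic_search(term, triples=TRIPLES):
    """Knowledge-based lookup: also patents that mention a narrower concept."""
    concepts = narrower_concepts(term, triples)
    return sorted(s for s, p, o in triples if p == "mentions" and o in concepts)


print(exact_search("nsaid"))     # matches only the literal term
print(semantic_search("nsaid"))  # also matches patents mentioning subclasses
```

In this sketch the hierarchy is traversed at query time. A real system would more likely query a triple store or precompute the expansion, but the retrieval effect is the same: a search for a general concept also finds patents that name only a specific instance of it.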

Big Data Analytics
The rapidly growing volumes of scientific and technical information also require novel and scalable methods for querying, analysis, and visualization. We research and develop enhanced solutions that enable the efficient analysis of large patent corpora in order to answer complex research questions and to detect relevant technological trends.
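As a small illustration of trend detection, the following sketch counts publications per classification code and year and derives a relative growth rate. The records, the choice of classification codes, and the function names are hypothetical. On a large corpus the same aggregation would be run in a distributed processing framework rather than in memory.

```python
from collections import Counter

# Hypothetical records: (publication year, classification code).
RECORDS = [
    (2019, "G06N"), (2019, "A61K"), (2019, "A61K"),
    (2020, "G06N"), (2020, "G06N"), (2020, "A61K"),
    (2021, "G06N"), (2021, "G06N"), (2021, "G06N"), (2021, "A61K"),
]


def counts_by_code_and_year(records):
    """Count publications per (classification code, year) pair."""
    return Counter((code, year) for year, code in records)


def growth(records, code, first_year, last_year):
    """Relative change in publication count between two years, or None if undefined."""
    counts = counts_by_code_and_year(records)
    start = counts[(code, first_year)]
    end = counts[(code, last_year)]
    if start == 0:
        return None  # growth is undefined without a baseline
    return (end - start) / start


for code in ("G06N", "A61K"):
    print(code, growth(RECORDS, code, 2019, 2021))
```

A positive value indicates a field with rising publication activity, a negative value a declining one. More robust indicators would smooth over several years and normalize by the overall publication volume.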

News

Strong FIZ Involvement in SemTech4STLD’25 Workshop at ESWC 2025

The 3rd International Workshop on Semantic Technologies for Scientific, Technical and Legal Data (SemTech4STLD’25) was held on June 1, 2025, in Portorož, Slovenia, as part of the Extended Semantic Web Conference (ESWC 2025).

SIGIR 2024 in Washington with strong FIZ participation and fifth PatentSemTech workshop

This year's SIGIR (Special Interest Group on Information Retrieval) conference, one of the leading events in the field of Information Retrieval organized by the Association for Computing Machinery (ACM), once again took place with a strong FIZ presence.

DFG project develops first knowledge graph for patent information

In the "Patents4Science" project, four Leibniz institutes have joined forces to build an information infrastructure for the easy use of patent knowledge in science.

Patent knowledge for research: start of DFG project “Patents4Science”

Together with its partners in research, FIZ Karlsruhe develops an innovative information infrastructure for patent information to be used by scientists. Entirely new: a patent-centered knowledge graph.

Second PatentSemTech workshop at SIGIR'21: "In 20 years’ time, AI methods in patent analysis will be standard"

On 15 July 2021, the PatentSemTech'21 workshop took place as a one-day online event in conjunction with the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'21).

Program for PatentSemTech workshop now available

Registration deadline is July 10, 2021

2nd Workshop on Patent Text Mining and Semantic Technologies at SIGIR’21

On July 15, 2021, the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) will be held in conjunction with SIGIR’21, the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.

Staff

Dr. Hidir Aras

Head Department Patents4Science

Phone:

+49 7247 808 306

hidir.aras [at] fiz-karlsruhe.de

Ahmad Alrifai

Phone:

+49 7247 808 545

ahmad.alrifai [at] fiz-karlsruhe.de

Jeenu Joy

Phone:

+49 7247 808 595

jeenu.joy [at] fiz-karlsruhe.de

Jehona Kryeziu

Phone:

+49 7247 808 197

jehona.kryeziu [at] fiz-karlsruhe.de

Dr. Farag Saad

Phone:

+49 7247 808 548

farag.saad [at] fiz-karlsruhe.de

Dr. Markus Sitzmann

Phone:

+49 7247 808 299

markus.sitzmann [at] fiz-karlsruhe.de

Dr. Mustafa Sofean

Phone:

+49 7247 808 167

mustafa.sofean [at] fiz-karlsruhe.de

Dr. Lei Zhang

Phone:

+49 7247 808 364

lei.zhang [at] fiz-karlsruhe.de

Current projects