New Derwent Markush Resource Now Available on STN®

The completely redeveloped database "Derwent Markush Resource" has now been implemented on the new STN platform for the search and analysis of chemical Markush structures in patents.

With the Derwent Markush Resource (DWPIM), produced by Thomson Reuters, FIZ Karlsruhe has responded to an urgent demand from its customers in the chemistry and pharma sectors. The new DWPIM database comprises more than 1.9 million generic chemical structures and thus adds an essential body of chemistry information to STN’s patent portfolio. Together with the company InfoChem, Munich, we have developed a Markush search engine that is specifically designed to handle the data of the Derwent Markush Resource database and conforms to the STN standards.


Markush Structures are Very Complex

Markush structures are often used in patents, mainly in the field of chemistry and pharmacy, in order to cover and claim as many similar chemical structures as possible with one application. Markush structures are very complicated and can cover a wide range of chemical structures, including valuable compounds that may be used in drugs.

 

Providing a platform for Markush structure searches is a highly ambitious endeavor. On the one hand, Markush searches are very complex and difficult to process; on the other hand, customers have high expectations of such searches. Coping with the often enormous complexity of Markush structures requires powerful, sophisticated search processes in comprehensive databases.


Markush Structures in Patents

A Markush patent describes an invention consisting of a chemical core structure and its variable chemical substituents. By combining the core structure with varying substituents many new variants of the core structure are generated. A Markush patent is intended to protect the core structure and its potential variants.

 

A Markush database is a chemical database whose records do not each consist of one individual structure but instead represent a large number of chemical variants.


Fig. 1: Molecular structure

Markush structures are generic chemical structures, i.e., a combination of a core structure and its potential variants that the applicant wants to protect. They contain placeholders for certain variants (substituents), e.g., R for organic moieties or X for halogens (fluorine, chlorine, bromine, iodine).

 

Thus, a large variety of individual structures can be represented by one Markush structure. The placeholders R and X in the Markush structure below have two different substituents each, i.e., four potential structures can be represented by one formula.
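The arithmetic behind "four potential structures" can be illustrated with a minimal sketch. The core string and the substituent names below are hypothetical placeholders chosen for illustration; they are not taken from any patent or from the DWPIM database.

```python
from itertools import product

# Hypothetical core with two placeholders, R and X (illustrative only).
core = "core(R)(X)"

# Assumed substituent lists: two options per placeholder.
substituents = {
    "R": ["methyl", "ethyl"],
    "X": ["chlorine", "bromine"],
}

def enumerate_structures(core, substituents):
    """Return every specific structure obtained by filling each placeholder."""
    names = list(substituents)
    structures = []
    # product() yields one tuple per combination of substituent choices.
    for combo in product(*(substituents[name] for name in names)):
        structure = core
        for name, value in zip(names, combo):
            structure = structure.replace(f"({name})", f"({value})")
        structures.append(structure)
    return structures

structures = enumerate_structures(core, substituents)
for structure in structures:
    print(structure)
```

With two options for R and two for X, the loop produces 2 × 2 = 4 structures. Each additional placeholder multiplies the count, which is why real Markush structures quickly become too large to list by hand.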

Fig. 2: Patented Markush structure
Fig. 3: Derived structure for Atorvastatin

In reality, Markush structures are much more complicated than the simplified example above and can cover a wide range of chemical structures, including valuable compounds that may be used in drugs.

 

A good example is the chemical structure of Atorvastatin (known by its trade name Sortis® in Europe and Lipitor® in the USA), one of the most successful drugs of recent years. The chemical structure of Atorvastatin was first patented by means of a Markush structure (basic protection in Europe: EP0247633B1). Finding such specific compounds within a Markush formula is often like finding a needle in a haystack. The specialized search engine enables information professionals to conduct accurate and targeted chemical structure searches in the DWPIM database.

 

This example illustrates how important it is to include detailed Markush structure searches in patent searches at an early stage.

Fig. 4: Markush search structure, Ak: alkyl chain, e.g. propane; Cb: carbocycle, e.g. benzene
Fig. 5: Display of assembled hit structure

STN offers its users three different display options that vary with regard to their level of detail. While the “full display” shows the complete information on the Markush structure with all potential variants, the “assembled display” allows for a very quick evaluation of the results.

 

A software-generated assembled hit structure shows the search result with the relevant parts of the structure highlighted in color (see Fig. 5). Thus, the retrieved structure can be quickly compared with the query structure to see whether it is relevant.


How to Make Markush Structures Searchable

Due to their inherent complexity, Markush structures can hardly be resolved completely into specific individual compounds. On the one hand, the number of individual compounds covered may be very high; on the other hand, certain placeholders within a Markush structure do not describe a clearly defined number of individual compounds, so new ones can be added at any time. Therefore, a new search concept for Markush structures is required. FIZ Karlsruhe has realized this together with InfoChem by implementing the DWPIM database on STN.

 

However, there is a challenge to be met:
The result of a Markush search should contain all relevant compounds, including (or especially) those that are hidden in any of the Markush variants.
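To illustrate why such a search cannot rely on listing every covered compound, the following minimal sketch counts the compounds covered by a toy Markush record and tests whether one specific compound is covered without enumerating anything. The record layout, the placeholder names and the substituent sets are assumptions made for this illustration; the sketch does not describe how the DWPIM search engine works internally.

```python
from math import prod

# Generic group X as described in the text: the halogens.
HALOGENS = {"fluorine", "chlorine", "bromine", "iodine"}

# Hypothetical Markush record: a core plus allowed substituents per placeholder.
markush = {
    "core": "hypothetical-core",
    "variables": {
        "R1": {"methyl", "ethyl", "propyl"},
        "R2": {"phenyl", "cyclohexyl"},
        "X": HALOGENS,
    },
}

def count_covered(markush):
    """Number of specific compounds: product of the option counts."""
    return prod(len(options) for options in markush["variables"].values())

def is_covered(markush, core, assignment):
    """Check one compound against the record without enumerating all variants."""
    if core != markush["core"]:
        return False
    variables = markush["variables"]
    # The compound must assign exactly the placeholders the record defines.
    if set(assignment) != set(variables):
        return False
    return all(assignment[name] in variables[name] for name in variables)

candidate = {"R1": "ethyl", "R2": "phenyl", "X": "bromine"}
print(count_covered(markush))
print(is_covered(markush, "hypothetical-core", candidate))
```

Even this toy record covers 3 × 2 × 4 = 24 compounds, while the membership test inspects only one entry per placeholder. Open-ended placeholders, which have no fixed list of members, cannot be enumerated at all and can only be handled by a test of this kind.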

 

The solution: a special Markush search engine meeting this challenge.


System Requirements

The basic functionality of STN imposes special requirements. STN’s search system is designed so that the same structure query can be used to search all structure databases wherever possible. This unique selling point had, of course, to be maintained with the new functionality. Therefore, FIZ Karlsruhe has combined and integrated Derwent’s indexing concept with STN’s search concept.

 

The new system not only offers the functionality of the existing DWPIM database (up to now known as “Merged Markush Service”) but also supports new features and solves problems of the existing system.

 

Checking the retrieved Markush structures for relevance is essential for the user, but it is very time-consuming. For better guidance, the searched structures are highlighted in color, and the core structure and its relevant variants are displayed clearly.

To this end, FIZ Karlsruhe has developed clear and efficient display options enabling the user to quickly and reliably browse and evaluate the results.


An Important Milestone

The current release of the DWPIM database marks an important milestone in the further development of chemical structure searches to include Markush structures.

 

But this is far from the final stage of the project! The Markush search engine will be further developed step by step in close collaboration with our partner InfoChem. To this end, we continuously assess our customers’ requirements in close dialog with important customers from industry. These requirements will then be implemented in future releases. Thus, we support our customers’ complex and ambitious patent searches in the chemistry sector and significantly contribute to “Advancing Science”.

 



Editors: HAU, BAB
Translation: RKA