T54-2024 Protect temporary data in a BIOVIA Direct domain index from general access

For easier administration, BIOVIA Direct domain index data are partially visible to all Oracle users. The domain index data allows you to draw conclusions about the indexed original data, albeit limited in terms of content and time. This technote explains the technical background and describes how you can block access to the index data and what administrative restrictions are associated with this.

Program

BIOVIA Direct (all versions)

Operating System

All supported operating systems

Summary

For easier administration, BIOVIA Direct domain index data are partially visible to all Oracle users. The domain index data allows you to draw conclusions about the indexed original data, albeit limited in terms of content and time. This technote explains the technical background and describes how you can block access to the index data and what administrative restrictions are associated with this.

Background

BIOVIA Direct is an Oracle Data Cartridge that extends Oracle's ability to store and search for chemical molecules and reactions. The search for chemical structures and reactions is considerably accelerated by the use of a specific chemical index (domain index). The data of a BIOVIA Direct domain index is technically stored as Oracle Secondary Tables of the indexed table.

The calculation and storage of the chemical index data is relatively complex compared to the storage of the actual data to be indexed. To ensure that new registrations or changes to chemical data are not slowed down by the index calculation, the index data of the domain index is adapted in two separate steps: in the first step, index data that can be used without further processing (such as the molar formula) is calculated and ultimately written to the index tables - these are thus immediately available for an indexed search. Other index data, e.g. molecular keys or fingerprints for the substructure or similarity search, are also calculated immediately, but must then be entered into the index tree by means of various transformations. For this reason, this index data is initially only saved as raw data in certain secondary tables of the domain index and is only available for an indexed search after a manually or automatically triggered final index update. With the complete update, the raw data is then deleted from the secondary tables of the domain index.

As can be seen, careful planning of the final index update is essential for a permanently high performance of the indexed chemical search. BIOVIA Direct therefore provides functions with which a database administrator can determine how much index data is due for the final update. Based on this information, the administrator can then decide whether and when the final index update should be triggered.

In larger installations, there are usually multiple Oracle accounts with chemical data and corresponding BIOVIA Direct domain indexes, and the volume and frequency of chemical data changes is often not consistent or predictable in these accounts. The monitoring and triggering of pending index updates is therefore usually not carried out manually, but by means of PL/SQL scripts as part of a scheduled job.

To facilitate the automation of this administrative task, the current implementation provides that those secondary tables of the domain index that contain the temporary raw index data receive SELECT access for PUBLIC at index creation time. This enables pending index updates to be recorded from a central technical Oracle account. As the raw index data contains information that allows conclusions to be drawn about the indexed original data, albeit only to a limited extent and, as explained, only temporarily, this implementation poses a certain risk with regard to the confidentiality of the original data. The following solution describes how you can eliminate this risk and which functional restrictions you must observe.

At the same time, we would also be pleased to receive feedback and comments on the extent to which this simplified administration is actually desired and applied, or whether you would prefer to implement these functions more restrictively in the interests of increased data security.

Solution

A typical SQL command for creating a BIOVIA Direct domain index looks like this:

CREATE INDEX MOLECULES_MDLIX 

   ON MOLECULES(ctab) 

       INDEXTYPE IS C$DIRECT2024.MXIXMDL 

           PARAMETERS ('fingerprint=ecfp_6');

This command creates a set of 15+ secondary tables that conform to the following naming scheme:

<index_name>_<XXXX>

<index_name> represents the index name used in the CREATE INDEX command, and <XXXX> is a combination of three or four letters indicating the type of index data in the respective table. Please note that in the case of long index names, the index name may only be used in abbreviated form in the table name for Oracle backward compatibility reasons.

For the index name 'MOLECULES_MDLIX' used in the example above, the following tables are generated with the SELECT grant for PUBLIC:

MOLECULES_MDLIX_PROP

MOLECULES_MDLIX_SKY2

MOLECULES_MDLIX_SFPK

MOLECULES_MDLIX_FSUP

MOLECULES_MDLIX_FSDL

Pay particular attention to the last four letters of the table name, as these are always used in the same way regardless of the index name.

If you can accept the disadvantages mentioned below in the Limitations section, you can now remove the SELECT grant on these index tables that was issued when the index was created. To do this, execute the following commands:

REVOKE SELECT ON MOLECULES_MDLIX_PROP FROM PUBLIC;

REVOKE SELECT ON MOLECULES_MDLIX_SKY2 FROM PUBLIC;

REVOKE SELECT ON MOLECULES_MDLIX_SFPK FROM PUBLIC;

REVOKE SELECT ON MOLECULES_MDLIX_FSUP FROM PUBLIC;

REVOKE SELECT ON MOLECULES_MDLIX_FSDL FROM PUBLIC;

You can also extend the access rights again by subsequently assigning the SELECT grants to individual users again:

GRANT SELECT ON MOLECULES_MDLIX_PROP TO DIRECT_ADMIN;

GRANT SELECT ON MOLECULES_MDLIX_SKY2 TO DIRECT_ADMIN;

GRANT SELECT ON MOLECULES_MDLIX_SFPK TO DIRECT_ADMIN;

GRANT SELECT ON MOLECULES_MDLIX_FSUP TO DIRECT_ADMIN;

GRANT SELECT ON MOLECULES_MDLIX_FSDL TO DIRECT_ADMIN;

Limitations

Due to the lack of general access to these index tables, the following administrative BIOVIA Direct functions are now restricted in their use.

SELECT MDLAUX.PENDINGINVERSIONS

SELECT MDLAUX.PENDINGFASTSEARCH

SELECT MDLAUX.INDEXVERSION

These functions can now only be executed by the Oracle account that owns the domain index and, if you have reapplied the grants to specific users, by those users. If you execute these functions from another Oracle account, you will either receive an ORA-00942 error 'table or view does not exist' (for MDLAUX.PENDINGINVERSIONS and MDLAUX.INDEXVERSION), or the function will return the possibly incorrect value 0 (for MDLAUX.PENDINGFASTSEARCH).

More detailed information on the structure of the domain index tables and the administrative functions mentioned can be found in the BIOVIA Direct Administration Guide. The BIOVIA Direct Administration Guide is part of the BIOVIA Direct documentation package which you can download as a licensed customer from the Dassault Download Portal.

We are happy to answer your questions or receive your feedback on this topic, which you can post in the public BIOVIA Community forums or send as a support request to BIOVIA Support.

Need Assistance?

Our support team is here to help you make the most of our software. Whether you have a question, encounter an issue, or need guidance, we've got your back.