User-defined file readers

OVITO already comes with a collection of built-in file readers. If you need to import a custom format or a file format not yet supported by the software, OVITO’s programming interface lets you write your own file reader in Python.

To implement a custom file reader you need to define a new Python class, similar to the advanced programming interface for Python modifiers:

from ovito.data import DataCollection
from ovito.io import FileReaderInterface, import_file
from typing import Callable, Any

class MyFileReader(FileReaderInterface):

    @staticmethod
    def detect(filename: str):
        ...

    def scan(self, filename: str, register_frame: Callable[..., None]):
        ...

    def parse(self, data: DataCollection, filename: str, frame_info: Any, **kwargs: Any):
        ...

You can freely choose the class name (we use MyFileReader as an example here), but the class must derive from the base class ovito.io.FileReaderInterface.

Once your class has been registered, OVITO Pro and the import_file() function will try to open files with the help of all installed file readers by calling their detect() methods. The first one that returns True from its detect() method will be used by the system to actually import the requested file.

The `detect` method

The detect() method is a static method of the file reader class and takes the path of the file to be inspected as the only input parameter (no self parameter).

This method is called by OVITO whenever the user tries to import a new file to determine whether that file can be parsed by your file reader. That means your implementation should return True if your reader class can process a given file and False otherwise. For efficiency, the decision should be made as quickly as possible, i.e. by reading and inspecting just the first few lines of the file, in order to not slow down the import of files that will be handled by other file readers.

Let’s consider the following example, where our file reader looks for text files containing the string “Header of MyFileFormat” on the first line:

from ovito.data import DataCollection
from ovito.io import FileReaderInterface, import_file
from typing import Callable, Any

class MyFileReader(FileReaderInterface):

    @staticmethod
    def detect(filename: str):
        try:
            with open(filename, "r") as f:
                line = f.readline()
                return line.strip() == "Header of MyFileFormat"
        except OSError:
            return False

    def scan(self, filename: str, register_frame: Callable[..., None]):
        ...

    def parse(self, data: DataCollection, filename: str, frame_info: Any, **kwargs: Any):
        ...

Our implementation of the detect() method opens the file, reads one line, and returns True in case it matches the key string we are looking for.
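
Because detect() may be called for any file the user imports, it can also be handed binary files, for which reading a line in text mode may raise a UnicodeDecodeError rather than an OSError. The following is a minimal sketch of a more defensive check, written as a standalone function so it can be tried without OVITO. The name looks_like_my_format and the 256-byte limit are assumptions made for illustration; in a real reader the logic would live in the detect() method.

```python
import os
import tempfile

HEADER = b"Header of MyFileFormat"  # key string of the example format

def looks_like_my_format(filename: str) -> bool:
    """Return True if the first line of the file equals the key string."""
    try:
        # Binary mode avoids decoding errors when the file is not text.
        with open(filename, "rb") as f:
            # Bounded read: a large file without line breaks is not loaded entirely.
            first_line = f.readline(256)
    except OSError:
        return False
    return first_line.strip() == HEADER

# Quick check with a matching file, a non-text file, and a missing file.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.dat")
    bad = os.path.join(tmp, "bad.bin")
    with open(good, "wb") as f:
        f.write(b"Header of MyFileFormat\nTimestep 0: 0 particles\n")
    with open(bad, "wb") as f:
        f.write(b"\xff\xfe\x00\x01 not a text file")
    good_result = looks_like_my_format(good)
    bad_result = looks_like_my_format(bad)
    missing_result = looks_like_my_format(os.path.join(tmp, "missing.dat"))
```

Comparing raw bytes keeps the check cheap and independent of the platform's default text encoding.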

The `scan` method

The scan() method is an optional method that should be implemented only if the files to be read by your reader can store multiple frames of a trajectory. It will be called by the system to index all frames in the imported file, populate the timeline in OVITO, and enable quick random access to individual trajectory frames.

An implementation of the scan() method usually reads the whole file, discovers all frames, and communicates each frame’s metadata to the OVITO system. This happens via invocation of the register_frame callback function for each discovered frame. The callback is provided by the system and has the following signature:

register_frame(frame_info: Any = None, label: Optional[str] = None)

frame_info can be (almost) any type of Python value and is used by your file reader to describe the storage location of each frame in the file. One might, for example, use the line number or the byte offset where each frame begins in the file as frame_info. Or, for a database format, one might use the unique record key of a frame as its frame_info, which can later help to access the data of the frame efficiently. The frame_info values will be stored by the OVITO system as part of the trajectory index and will be made available later again to your file reader’s parse() method when loading specific frames from the file.

The label parameter is optional and specifies a human-readable text to be used as a descriptive label for the trajectory frame in the OVITO timeline. It is purely informational; a typical choice is the simulation timestep.

In our example file format, each frame begins on a new line with the format “Timestep <T>: <N> particles”. Here, T denotes the simulation timestep, and N the number of particles in the simulation snapshot. We might write the following scan() method, which specifically searches for these frame headers using a regular expression:

from ovito.data import DataCollection
from ovito.io import FileReaderInterface, import_file
from typing import Callable, Any
import re

class MyFileReader(FileReaderInterface):

    @staticmethod
    def detect(filename: str):
        try:
            with open(filename, "r") as f:
                line = f.readline()
                return line.strip() == "Header of MyFileFormat"
        except OSError:
            return False

    def scan(self, filename: str, register_frame: Callable[..., None]):
        expr = r"(Timestep \d+): (\d+) particles"
        with open(filename, "r") as f:
            for line_number, line in enumerate(f):
                match = re.match(expr, line.strip())
                if match:
                    num_particles = int(match.group(2))
                    label = match.group(1)
                    register_frame(frame_info=(line_number, num_particles), label=label)

    def parse(self, data: DataCollection, filename: str, frame_info: Any, **kwargs: Any):
        ...

Here, both the line number at which a frame starts and the number of particles it contains are stored as a tuple in the frame_info for later use. The string “Timestep …” is specified as a label when registering trajectory frames with OVITO.
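
As mentioned earlier, a byte offset can serve as frame_info instead of a line number, which later allows jumping directly to a frame. The sketch below shows one possible way to do this, written as a standalone function (the names scan_byte_offsets and record_frame are hypothetical). The file is opened in binary mode because tell() cannot be used while iterating over a text file line by line; record_frame merely stands in for the callback that OVITO would provide.

```python
import os
import re
import tempfile
from typing import Any, Callable, Optional

# Same frame header pattern as above, as a bytes pattern for binary mode.
FRAME_HEADER = re.compile(rb"(Timestep \d+): (\d+) particles")

def scan_byte_offsets(filename: str, register_frame: Callable[..., None]) -> None:
    """Register each frame as (byte offset of its first coordinate line, particle count)."""
    with open(filename, "rb") as f:
        while True:
            line = f.readline()
            if not line:
                break  # end of file reached
            match = FRAME_HEADER.match(line.strip())
            if match:
                # f.tell() now points just past the frame header line.
                frame_info = (f.tell(), int(match.group(2)))
                register_frame(frame_info=frame_info,
                               label=match.group(1).decode("ascii"))

# Stand-in for the callback supplied by the system; it only records the calls.
registered = []

def record_frame(frame_info: Any = None, label: Optional[str] = None) -> None:
    registered.append((frame_info, label))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "traj.dat")
    with open(path, "wb") as f:
        f.write(b"Header of MyFileFormat\n"
                b"Timestep 0: 1 particles\n"
                b"0.0 0.0 0.0\n"
                b"Timestep 10: 1 particles\n"
                b"1.0 0.0 0.0\n")
    scan_byte_offsets(path, record_frame)
```

Whether line numbers or byte offsets are the better choice depends on the format; byte offsets pay off mainly for large files with many frames.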

The `parse` method

The parse() method is the main function you need to implement for a file reader. It will be called by OVITO to load actual data from the file, one trajectory frame at a time, and has the following basic signature:

def parse(self, data: DataCollection, filename: str, **kwargs):

The first time your parse() implementation gets called by the system, it receives an empty DataCollection object, which should be populated with the information loaded from the input file. This typically involves creating one or more data objects, e.g. Particles, SimulationCell, SurfaceMesh, TriangleMesh, within the DataCollection, or populating the DataCollection.attributes dictionary with auxiliary metadata parsed from the file.

On subsequent invocations of parse(), the DataCollection provided by the system may already contain objects from a previous trajectory frame, and your implementation should update or add only information that has changed in the current frame. That means, for example, that particle types shouldn’t be recreated by the file reader every time. Rather, existing data in the collection should be touched only selectively by the file reader to preserve any changes the user has made in the GUI in the meantime. This applies, for instance, to parameters of particle types such as color, radius, and name but also settings of visual elements, which can be concurrently edited by the user in the GUI.

Tip

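The Python API of OVITO provides special functions that create new data objects only if needed and otherwise preserve existing information and visualization settings associated with these objects. Examples are the create_particles() and create_property() methods used in the parse() implementation further below.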

In case you are developing a file reader for a trajectory file format, you can use the following extended signature of the parse() method:

def parse(self, data: DataCollection, filename: str, frame_index: int, frame_info: Any, **kwargs):

The system passes two additional values to your parse() method: the zero-based index of the trajectory frame to be loaded (frame_index) and the frame_info value that was generated for this frame by the file reader’s scan() method introduced above. Any further keyword arguments from the system go into the kwargs dictionary.

Important

The trailing **kwargs parameter must always be part of the method’s parameter list to accept all further arguments, which may be provided by future versions of OVITO. It’s there for forward compatibility reasons and to receive all unused arguments your parse method is not interested in.
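
The effect of the trailing **kwargs parameter can be illustrated with plain Python functions. In this sketch, parse_strict and parse_tolerant are hypothetical stand-ins for a reader's parse() method, and some_future_option is a made-up argument name representing whatever a later version might pass; it is not an actual OVITO parameter.

```python
from typing import Any

def parse_strict(data: Any, filename: str, frame_info: Any) -> str:
    # No **kwargs: any additional keyword argument raises a TypeError.
    return f"loading {filename}"

def parse_tolerant(data: Any, filename: str, frame_info: Any, **kwargs: Any) -> str:
    # Unused or unknown keyword arguments are collected in kwargs and ignored.
    return f"loading {filename}"

# Arguments as a caller might supply them, including ones the reader does not use.
call_args = dict(data=None, filename="myfile.dat", frame_info=0,
                 frame_index=0, some_future_option=True)

tolerant_result = parse_tolerant(**call_args)

try:
    parse_strict(**call_args)
    strict_failed = False
except TypeError:
    strict_failed = True
```

The variant without **kwargs fails as soon as the caller passes an argument it does not declare, which is the situation the trailing parameter guards against.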

Please have another look at our example file format defined above, for which we will now implement a parsing method. The following parse() implementation skips through the initial lines of the input file until it reaches the one where the requested frame begins. Then a Particles object with a data array for the Position property is created before the xyz coordinates of the particles are parsed from the file line by line:

from ovito.data import DataCollection
from ovito.io import FileReaderInterface, import_file
from typing import Callable, Any
import re

class MyFileReader(FileReaderInterface):

    @staticmethod
    def detect(filename: str):
        try:
            with open(filename, "r") as f:
                line = f.readline()
                return line.strip() == "Header of MyFileFormat"
        except OSError:
            return False

    def scan(self, filename: str, register_frame: Callable[..., None]):
        expr = r"(Timestep \d+): (\d+) particles"
        with open(filename, "r") as f:
            for line_number, line in enumerate(f):
                match = re.match(expr, line.strip())
                if match:
                    num_particles = int(match.group(2))
                    label = match.group(1)
                    register_frame(frame_info=(line_number, num_particles), label=label)

    def parse(self, data: DataCollection, filename: str, frame_info: tuple[int, int], **kwargs: Any):
        starting_line_number, num_particles = frame_info

        with open(filename, "r") as f:
            for _ in range(starting_line_number + 1):
                f.readline()

            particles = data.create_particles(count=num_particles)
            positions = particles.create_property("Position")

            for i in range(num_particles):
                positions[i] = [float(coord) for coord in f.readline().strip().split()]

While this first example introduced the basic principles of user-defined file readers in OVITO, code example FR1 will present a more thorough implementation of a custom file reader, focusing on how to load more particle properties and the simulation cell geometry into OVITO.
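
Skipping through all preceding lines becomes slow for late frames of a large trajectory. If frame_info held a byte offset instead of a line number, the reader could seek directly to the requested frame. The following standalone sketch assumes frame_info is a tuple (byte offset of the first coordinate line, number of particles); the function name read_positions is hypothetical, and it returns plain tuples so that it can be tried without OVITO. In a real parse() method the values would be written to the Position property as shown above.

```python
import os
import tempfile

def read_positions(filename: str, byte_offset: int,
                   num_particles: int) -> list[tuple[float, float, float]]:
    """Read num_particles xyz lines starting at byte_offset."""
    positions = []
    with open(filename, "rb") as f:
        f.seek(byte_offset)  # jump directly to the first coordinate line
        for _ in range(num_particles):
            fields = f.readline().decode("ascii").split()
            if len(fields) != 3:
                raise ValueError("expected three coordinates per particle line")
            x, y, z = (float(value) for value in fields)
            positions.append((x, y, z))
    return positions

# Small demonstration with a single-frame file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "traj.dat")
    preamble = b"Header of MyFileFormat\nTimestep 0: 2 particles\n"
    with open(path, "wb") as f:
        f.write(preamble + b"0.0 0.5 1.0\n2.0 2.5 3.0\n")
    loaded = read_positions(path, byte_offset=len(preamble), num_particles=2)
```

The explicit check on the number of fields turns a truncated or malformed frame into a clear error instead of a silently wrong result.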

Testing your file reader

To test your file reader outside of OVITO Pro, you can add a Python main program to the .py file in which you define the file reader class and invoke the general import_file() function:

class MyFileReader(FileReaderInterface):
    ...

if __name__ == "__main__":
    pipeline = import_file("myfile.dat", input_format=MyFileReader)
    for data in pipeline.frames:
        ...

Note that we pass the custom file reader class to the import_file() function. This circumvents the automatic detection of the file format (your detect() method won’t be called by the system!) and the scan() and parse() methods will be invoked immediately. The for-loop iterates over all trajectory frames registered by the file reader and loads them one by one.
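
Such a test needs an input file. Assuming the example format used throughout this section (a header line, then for each frame a “Timestep <T>: <N> particles” line followed by N lines of xyz coordinates), a small helper along these lines could generate one. The function name write_sample_file, the timestep spacing, and the random coordinates are made up for illustration.

```python
import os
import random
import tempfile

def write_sample_file(filename: str, num_frames: int = 3, num_particles: int = 4) -> None:
    """Write a small trajectory file in the example format."""
    rng = random.Random(0)  # fixed seed keeps the generated file reproducible
    with open(filename, "w") as f:
        f.write("Header of MyFileFormat\n")
        for frame in range(num_frames):
            f.write(f"Timestep {frame * 100}: {num_particles} particles\n")
            for _ in range(num_particles):
                x, y, z = (rng.uniform(0.0, 10.0) for _ in range(3))
                f.write(f"{x:.4f} {y:.4f} {z:.4f}\n")

# Write a two-frame file and read it back to inspect its layout.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myfile.dat")
    write_sample_file(path, num_frames=2, num_particles=3)
    with open(path) as f:
        sample_lines = f.read().splitlines()
```

Written to a permanent location, such a file can then be passed to import_file() together with input_format=MyFileReader as shown above.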

The approach described above is good for testing or if you just want to use your custom file reader from a standalone Python program. For full integration into OVITO Pro and to make your file reader participate in the automatic file detection system, it needs to be installed and registered as a discoverable extension. This process is outlined in the section Packaging and installation of user extensions for OVITO.

User parameters

Your Python file reader can optionally expose adjustable user parameters based on the Traits framework, which are displayed automatically in the user interface of OVITO Pro. The system works analogously to Python modifier classes that expose user-defined parameters.

The import_file() function will forward any additional keyword arguments to your file reader class and initialize its parameter traits with matching names.