Both VMD and PyMOL take ages to load a large trajectory, even when only every Nth frame is requested.

I have a 100GB trajectory covering 2 microseconds of simulated time, recorded every 10 picoseconds, i.e. 200,000 frames. Taking every 1000th frame should give me 200 frames with a 10ns timestep.
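
As a quick sanity check of those numbers, here is a small sketch of the arithmetic in Python. The variable names are mine, and times are kept as integer picoseconds to avoid floating-point rounding.

```python
# Sanity check of the frame arithmetic (names are illustrative only).
PS_PER_NS = 1_000
PS_PER_US = 1_000_000

total_time_ps = 2 * PS_PER_US   # 2 microseconds of simulated time
save_interval_ps = 10           # one frame written every 10 ps
stride = 1000                   # keep every 1000th frame

n_frames = total_time_ps // save_interval_ps
kept_indices = range(0, n_frames, stride)   # same selection as a [::1000] slice
n_kept = len(kept_indices)
new_step_ns = save_interval_ps * stride // PS_PER_NS

print(n_frames, n_kept, new_step_ns)
```

If the trajectory also stores the starting frame at t = 0, both counts come out one higher, so do not be surprised by 201 frames.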

PyMOL’s load_traj tells you which frame is being scanned, but it freezes the user interface completely. After 30 minutes the frame counter on the command line finally reached the end of the trajectory. That’s when my computer started slowing down. I opened ‘htop’ and could see PyMOL taking more and more memory, finishing at slightly over 8GB of RAM. Apparently, PyMOL only started loading the frames into memory after iterating over all of them, which is difficult to justify. Furthermore, it is difficult to imagine how loading 200 frames can make PyMOL eat 8GB of RAM. Even the usual trick for large trajectories, “set defer_builds_mode, 3”, does not help.

How does VMD perform? The interface is almost frozen during the process, but you can see small redraws each time a new frame is loaded. Loading every 1000th frame of the trajectory is equally slow. Memory usage during loading is very low and a single CPU core is working flat out, which I guess means that the trajectory is being decompressed. This time there were no unpleasant surprises with RAM usage, but loading the trajectory also took around 30 minutes.

What if we could extract the frames with some other software? The GROMACS tools are equally slow. However, MDAnalysis does something magical:

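import MDAnalysis

# topology (.gro) plus the full 100GB trajectory (.xtc)
u = MDAnalysis.Universe('npt2us_protCent_pbcMol.gro', 'npt2us_protCent_pbcMol.xtc')
with MDAnalysis.Writer("npt2us_protCent_pbcMol_10nsStep.xtc", u.atoms.n_atoms) as W:
    for ts in u.trajectory[::1000]:  # every 1000th frame, i.e. every 10ns
        W.write(u)  # write the frame the universe currently points at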

And this takes only about 20 seconds to extract every 1000th frame and save it into another compressed .xtc file. How is it possible that the other, more widely accepted packages do this so inefficiently?

Now that I have the 200 frames extracted, I can check whether PyMOL really needs 8GB of RAM to load them. And yes, it does. For the record, the 200 frames occupy 93MB on disk, and equally little memory when VMD loads them. In addition, PyMOL still takes a long time to load the 200 frames. Whatever data structures it uses underneath, the user’s comfort is not the priority here.
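
To put these figures side by side, a rough per-frame estimate can be sketched as below. It assumes decimal units (1GB = 1000MB) and treats the “slightly over 8GB” as exactly 8GB, so the ratio is a lower bound. The variable names are mine.

```python
# Rough per-frame memory comparison from the figures quoted above.
# Assumptions: 1 GB = 1000 MB, and "slightly over 8GB" is taken as exactly 8 GB.
n_frames = 200
xtc_size_mb = 93            # size of the extracted .xtc on disk
pymol_ram_mb = 8 * 1000     # PyMOL's memory use after loading

disk_per_frame_mb = xtc_size_mb / n_frames
pymol_per_frame_mb = pymol_ram_mb / n_frames
ratio = pymol_ram_mb / xtc_size_mb

print(f"on disk:  {disk_per_frame_mb:.3f} MB per frame")
print(f"in PyMOL: {pymol_per_frame_mb:.1f} MB per frame")
print(f"ratio:    about {ratio:.0f}x")
```

Some of that gap is expected, because .xtc stores coordinates in a compressed, reduced-precision form, but that alone seems unlikely to explain a factor of this size.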

MDAnalysis: 0.17.0
PyMOL: 2.1.0
VMD: 1.9.4a9