## Polymake & OpenMP

General discussion on polymake goes here.

Moderator: Moderators

nkriha
Posts: 5
Joined: 26 Nov 2011, 21:59

### Polymake & OpenMP

I've been working on getting OpenMP to work inside polymake with mixed results. So far there are two major issues with OpenMP and polymake.
The first problem looks to me like a bug in the OpenMP implementation of gcc. If polymake is compiled with `CFLAGS=-fopenmp CXXFLAGS=-fopenmp LDFLAGS=-fopenmp` to enable OpenMP, then something as simple as

```cpp
double d;
#pragma omp parallel for
for (unsigned long i = 0; i < 1000000000; ++i) {
   d = sqrt((double)i);
}
```
pasted into some function (for example lp_projection in the polytope app) indeed uses multiple threads when executed. The problem is that the load of each core stays below 80% and the runtime increases to serial_runtime * number_of_threads. I tried to build a simple test case to simulate what is going on in polymake, with a perl script that calls a function inside a shared library that uses OpenMP, but I wasn't able to reproduce the problem. I suppose the procedure isn't that simple in polymake.
I have only tested this with gcc version 4.6.1 so far. I considered giving icc a shot, but I already failed at the ./configure stage of building polymake with it, so I gave up on it. I found that the problem does not occur when I use polymake as a callable library, so for me this isn't much of an issue right now.

The other issue is the smart objects used in polymake. As far as I understand it, almost every access to those shared objects can trigger a change to the alias set, even read-only access. This creates problems if multiple threads access one of those objects, because it will most likely leave the alias set in an inconsistent state after a parallel section. Poking around in polymake's internals, I didn't see an obvious way to get a 'dumb' reference to the data encapsulated in a smart object, so I created very crude wrappers for the Vector and Matrix classes. They just forward access to the memory location of the shared object's data. It's an ugly hack, and I wouldn't be surprised if it breaks things in other places, because my understanding of the polymake internals is very limited, but it seems to work, at least for the specific case where I tested it. Obviously things will fall apart if the memory allocated by the shared object/array gets relocated.

I've attached the wrapper and a version of the lp_projection function that uses OpenMP. It gives a reasonable parallel speedup if used via the callable library mechanism.
Attachments
polymakemp.zip
'dumb' Matrix & Vector wrapper and parallel lp_projection

gawrilow
Main Author
Posts: 307
Joined: 25 Dec 2010, 17:40

### Re: Polymake & OpenMP

Actually, alias handlers are not active in read-only access mode; they are only used for tracking mutable aliases like matrix minors. What indeed may change at any time are the reference counters of the data instances. Surely we could modify them with atomic operations, but I'm afraid it would incur quite some overhead. Generally, I would refrain from making the basic objects of the template library, like matrices or sets, thread-safe. This kind of fine-grained parallelization wouldn't bring much, but would inevitably lead to yet more errors than we have now and to some performance degradation on average.

Another thing to be considered in an MT setup is memory allocation. All container classes in the polymake template library make extensive use of the good old pool allocator inherited from the early days of gcc. It is a singleton, so using it from several threads in parallel, although possible and safely implemented, leads to a dramatic loss in performance as soon as you start some calculations with Rationals. Surely we could consider migrating to another allocator using thread-local storage, but again we would have to pay a price for it: either every object would have to contain a private reference to the allocator object, which would unnecessarily blow up the size of nested containers like Array<Set> and such, or we would have to retrieve a thread-local key for each and every allocation of a little bit of memory, such as an element of a Set. For Rational and Integer numbers the first approach is completely infeasible, because GMP requires a static allocation function.

Finally, the thread model of the perl interpreter is completely unsuitable for polymake's needs. Thus all function calls from C++ code back into the perl world, like give() or take(), have to be synchronized. Again, if done on a general basis, this would incur unnecessary costs.

I think, first of all we have to discuss the most probable and promising use cases of parallelization, and then carefully introduce the necessary support for it step by step. Probably it would be an interesting topic in the agenda of the upcoming workshop in March.

joswig
Main Author
Posts: 203
Joined: 24 Dec 2010, 11:10

### Re: Polymake & OpenMP

> I think, first of all we have to discuss the most probable and promising use cases of parallelization, and then carefully introduce the necessary support for it step by step. Probably it would be an interesting topic in the agenda of the upcoming workshop in March.
I am quite sure that starting with lp_projection, as suggested above, is perfect; and the preliminary results shown indicate that parallelization of this kind actually pays off.