pynari 1.1 coming up: Support for system-level ANARI installs, and CPU fallback

Good news: I'm getting closer to the next version of pynari. Feedback on 1.0 was overwhelmingly positive, but two things stood out. First, there seems to be some demand for a CPU fallback for the kind of hardware that doesn't have any CUDA-capable GPUs (MacBooks in particular seem to be en vogue here :-/). I mean, you definitely should be using hardware with NVIDIA GPUs when doing graphics [disclaimer: yes, I do work for NVIDIA in my day job], but still, I get it. Second, there was some interest in having pynari also support other – system-installed – ANARI backends, not only the barney backend it currently ships with. This is of course partially related to the first point – the C++ ANARI SDK comes with some CPU devices as well – but is also partly based on the fact that the baked barney I'm currently including doesn't have all the bells and whistles that a 'full' build of barney has. In particular, full barney can do data-parallel rendering over MPI, while the baked version can't – and since even one of the samples on the pynari web pages uses MPI, this is of course a bummer.

Based on that feedback, version 1.1 is currently coming along as follows:

a) pynari will still ship with barney as a ‘baked’ backend that you can always use even if you have not manually installed any system-wide/non-Python ANARI libraries. Creating a “default” device will still use that baked barney backend, just as in 1.0 – but if you specify any other backend name during device creation, pynari will also look up system-installed ANARI devices. I.e., you don’t have to install any external ANARI SDK (there’s still the baked ‘default’), but you can if you want to (and yes, that also means you can use a system-installed barney with MPI support).
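To make that concrete, here's a minimal sketch of what the device selection could look like from Python. The `'default'` name matches the 1.0 usage; the `'barney'` backend name below is just a hypothetical illustration of a system-installed device, and `newDevice` is the entry point pynari's own samples use – check the pynari docs for the exact names your install supports:

```python
# A minimal sketch of the 1.1 device-selection behavior described above.
# The backend name 'barney' is an assumed example, not a guaranteed name.
try:
    import pynari as anari      # pynari's samples import it under this alias
except ImportError:
    anari = None                # pynari not installed in this environment

def make_device(backend: str = 'default'):
    """Create an ANARI device.

    'default' uses the baked barney backend shipped inside the wheel;
    any other name additionally searches system-installed ANARI libraries
    (e.g. a full barney build with MPI support).
    """
    if anari is None:
        return None
    return anari.newDevice(backend)

# 1.0-style usage, still works in 1.1:
device = make_device('default')
```

The point is that the 1.0 code path is unchanged; only non-default names trigger the new system-wide lookup.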

b) In addition to the baked CUDA/OptiX version of barney, I’ll also include a CPU fallback device that will work on non-NVIDIA hardware. This fallback uses Embree (obviously), but given that your CPU will likely have quite a bit less “oomph” than your GPU, that fallback will be “quite a bit slower” than the CUDA/OptiX-accelerated version (in particular when using any textures or volume data – you really want to have texture units for that). I mean, it’s still better than nothing at all, and certainly faster than writing a native Python ray tracer … but … ugh, it is going to be slower than barney_cuda, believe me.

In theory that version should already be done – I already have 1.1-to-be building and running on both Linux and Windows, all the way to locally built wheels – but I’m still hitting some hiccups in the GitHub Actions builds I need for uploadable wheels, and since each build attempt now takes well over an hour, ironing these hiccups out just takes a while. Anyway, with the holidays coming up I should have some time to finish this up, so hopefully 1.1 will come out before the year’s end.

PS: At least the CPU fallback version I should, in theory, also be able to build on Macs … we’ll see. I got myself a shiny new MacBook for just that reason (a Christmas present to myself, kind of!), and it arrived this morning – but it still needs some setup, and I’m sure it won’t just build all that software out of the box, so even with the holidays coming up I don’t have any ETA on that just yet.
