How to “redirect” filesystem read/write calls without root and without performance degradation?

July 16, 2018, at 00:50 AM

I have non-root access to a server that is shared by many users. I first develop and run some code locally, and then I want to rsync my data to a temporary location on a remote server and run my code on a remote server without changing any file paths.

I want to transparently hijack filesystem reads and writes and redirect them to different folders. For example, if I run

redirect /home/a /home/b/remote-home/a python code.py

and the code then tries to read from /home/a/a.txt, it should get the contents of /home/b/remote-home/a/a.txt, and the same should hold for writes.
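To make the intended semantics concrete, here is a minimal Python sketch of just the path mapping, using the arguments of the redirect command above. It does not intercept any IO; it only shows which path each access should end up at. The function name remap_path and its parameters are hypothetical, and the sketch assumes absolute POSIX paths and does not resolve symlinks.

```python
import os


def remap_path(path, src_root, dst_root):
    """Map a path under src_root to the same relative location under dst_root.

    Hypothetical helper for illustration only. Assumes absolute paths;
    paths outside src_root are returned unchanged.
    """
    # Relative location of `path` as seen from src_root, e.g. "a.txt" or "../ab/x".
    rel = os.path.relpath(os.path.normpath(path), os.path.normpath(src_root))
    if rel == os.curdir:
        # The path is src_root itself.
        return os.path.normpath(dst_root)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        # The path lies outside src_root, so it is not redirected.
        return path
    return os.path.join(os.path.normpath(dst_root), rel)


if __name__ == "__main__":
    print(remap_path("/home/a/a.txt", "/home/a", "/home/b/remote-home/a"))
```

Comparing on whole path components (rather than on a raw string prefix) keeps a sibling such as /home/ab from being redirected by mistake.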

I am particularly interested in doing this for a Python process, if restricting it that way is necessary. I use a lot of third-party libraries that do file IO, so just mocking builtins.open is not an option. The IO is pretty intensive (reading and writing gigabytes of data), so a performance degradation of more than something like 200-300% is an issue.

Options that I am aware of are:

  • redefining read, read64, write, etc. with an LD_PRELOAD library that calls the real functions with different paths under the hood
  • the same with ptrace
  • unshare and remount parts of the filesystem, but user namespaces are disabled in my particular case for some security reason

The first two options do not seem very reliable (and ptrace must be slow), unless there is some fairly stable piece of code that does exactly that, so I could be sure that I did not make any obvious buffer overflow errors there. Containers like Docker are not an option because they are not installed on the remote server, unless, of course, there are some userspace containers that do not rely on Linux namespaces under the hood.
