Best practice for using GIT on a shared lab computer?

In summary, the conversation discusses the use of GIT in a lab environment where multiple people share the same computer and Windows account. The main issue is that it becomes difficult to track changes and determine who made them. Various workarounds have been suggested, such as using scripts or the --author parameter when committing, but there is a need for a Windows Git client that can handle this scenario. The conversation also touches on the importance of tracking code changes and the use of GUIs for less formal programming tasks.
  • #1
f95toli
Science Advisor
Gold Member
TL;DR Summary
How to best use GIT on a shared lab computer with one account?
Firstly, I don't expect there to be a single good answer to this question. I've done some googling and it seems there are multiple options, but I am still interested in suggestions and/or what your experiences are.

We started using GIT (GitLab) a couple of years ago and it works well when we are developing SW (mainly Python for analyzing data, simulations etc.) on our office computers. However, much of the software we work on is used to control experiments in the lab. Our lab computers are (obviously) shared machines, and everyone uses a single account on all of them (you can't really log on or off in the middle of a measurement run, and several people work on the same machines).

My question is if someone has experience of using GIT in this scenario? If so, what is "best practice"?
We haven't really settled on a way of working, and right now GIT is essentially only being used as a backup system.
Obviously, this is not ideal and means that we are not really keeping track of changes.

I know we are not the only ones with this problem and after some searching I've found a number of partial workarounds. However, they all seem to be very inconvenient and/or rely on e.g. various Linux Bash scripts, which is not really an option since we are using Win10 machines.

I suspect a partial answer is to force the use of the --author parameter when committing,

e.g.
git commit --author="Someone Unknown <unknown@example.com>"

But ideally, I would like to be able to find a GIT Windows client that can handle this...
 
  • Like
Likes Twigg
  • #2
There are no best practices that I know of. You should know how to clone/fork a git project, how to check in your changes and how to push your changes to the master.

Sometimes things get out of sync and you’ll need to know how to get things resynced.

Sometimes your repo may get too large to check in and you'll have to rethink what you want saved by making a new repo. I had this issue saving large, changing binary images which made a really large repo.
 
  • #3
Is your problem that your uses conflict with others? Perhaps you need version N of some software, while others need version M.
 
  • #4
anorlunda said:
Is your problem that your uses conflict with others? Perhaps you need version N of some software, while others need version M.
Well, the problem is that we have multiple people using and modifying the same code (which continuously evolves during an experiment) on the same computer and the same Windows account while in the lab.
When we then go back to the office and continue working on the same code there is no way to see who has done what, or even find your own changes. Forking (when needed) also gets very messy for the same reason (who created the fork?).

For most of what we do in the lab, everyone should (ideally) be using the same version of the software; and this is also used on multiple computers.

A large part of the problem is that much of the GIT configuration (including the name of the author) is tied to the computer account; a method that asked the user to identify themselves before committing would probably solve most problems. I have seen such functionality implemented under Linux but nothing that would work under Windows.
 
  • #5
f95toli said:
My question is if someone has experience of using GIT in this scenario? If so, what is "best practice"?
We haven't really settled on a way of working, and right now GIT is essentially only being used as a backup system.
Obviously, this is not ideal and means that we are not really keeping track of changes.
This confuses me. Doesn't Git always track changes in source code? In a test lab situation, I would check in all code used for any significant set of test runs. It was useful when looking for a change that might have caused a problem and also during audits to prove that changes were being adequately managed.
f95toli said:
I know we are not the only ones with this problem and after some searching I've found a number of partial workarounds. However, they all seem to be very inconvenient and/or rely on e.g. various Linux Bash scripts, which is not really an option since we are using Win10 machines.
You can use scripts on Win10 machines. Python, Perl, and BAT scripts should work in a Command Prompt window.
f95toli said:
I suspect a partial answer is to force the use of the --author parameter when committing,
It might be more useful to be able to track code changes to specific change requests so that you can find the paperwork that documents the change.
f95toli said:
e.g.
git commit --author="Someone Unknown <unknown@example.com>"
This might be a good idea. We would track changes to the appropriate change request paperwork which would tell us who made the change. In reality, there were not as many people making changes at the same time as one might expect.
f95toli said:
But ideally, I would like to be able to find a GIT Windows client that can handle this...
It might have been ignorance on my part, but I never felt confident that the GUIs would give enough utility. But they were good enough for less formal use as a programmer's tool.
 
  • #6
f95toli said:
Well, the problem is that we have multiple people using and modifying the same code (which continuously evolves during an experiment) on the same computer and the same Windows account while in the lab.
When we then go back to the office and continue working on the same code there is no way to see who has done what, or even find your own changes. Forking (when needed) also gets very messy for the same reason (who created the fork?).
This sounds like a very challenging configuration management problem. If you can force all the programmers to check-in code using a Python, Perl, or .BAT script, then you can have the script prompt for the programmer name. But there might still be a lot of problems with uncoordinated code changes.
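If you go that route, a rough, untested sketch of such a wrapper in Python might look like the following (the prompts and commit options are only illustrative, not a finished tool):

# commit_as.py - rough sketch of a check-in wrapper that asks who is committing.
import subprocess

name = input("Who is committing? (e.g. Jane Doe): ").strip()
email = input("Email: ").strip()
message = input("Commit message: ").strip()

# --author overrides the author name/email stored in the shared account's git config;
# the committer identity still comes from that config.
subprocess.run(
    ["git", "commit", "-a", "-m", message, f"--author={name} <{email}>"],
    check=True,
)

You could go further and have the script refuse empty names, but the main point is that nobody can check in without answering.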
 
  • #7
  • #8
You could try creating different users under Windows Subsystem for Linux (WSL) and enforcing commits from there? Not sure how this would work with authentication to GitLab though; I can't see it working through HTTPS, so you'd need SSH keys for each user on each machine.

Or you could install (probably Linux) virtual machines with individual users set up.

I don't think you will find an easy way around this because sharing accounts simply isn't best practice, or even close. It's a bit like saying 'to make it easier for everyone to get into the office we all use the same ID key card which we currently keep under the doormat; what is best practice for doing this?' :-p
 
  • #9
@pbuk, even if it isn't 'best practice', wouldn't using Mercurial suffice to serialize updates?
 
  • #10
sysprog said:
@pbuk, even if it isn't 'best practice', wouldn't using Mercurial suffice to serialize updates?
How would that be different? Hg is not going to have any more information about the user making the commit than git would have.
 
  • #11
pbuk said:
How would that be different? Hg is not going to have any more information about the user making the commit than git would have.
They're sharing a single ID on git. They're not required to use the --author parameter, and they're not consistently using it, so they don't always know who did what. They could serialize and track using multiple IDs locally with Mercurial, and upload the consensus once a day.
 
  • #12
Actually I've just thought of the obvious solution: prefix all commit messages with the user name, e.g. `@pbuk: Add unit conversion`.

Advantages:
  • easy to do
  • safe fallback if it is not done (unlike e.g. manually updating git user when you switch machines)
  • easily visible in the commit history
Disadvantages:
  • `git blame` and similar tools are not going to be helpful (although you could write a commit hook that changes the git user according to the commit prefix; a rough sketch of a simpler, prefix-enforcing hook is below).
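For the enforcement side, here's a rough, untested sketch of such a commit-msg hook in Python (the list of names is made up, and it assumes a Python interpreter is on the PATH that Git for Windows uses to run hooks). Saved as .git/hooks/commit-msg, it simply rejects any commit whose message doesn't start with a known @name: prefix, rather than rewriting the author:

#!/usr/bin/env python
# .git/hooks/commit-msg - reject commits whose message lacks an "@name:" prefix.
import re
import sys

ALLOWED = {"alice", "bob", "carol"}  # made-up list of lab members

with open(sys.argv[1], encoding="utf-8") as f:  # Git passes the path of the message file
    message = f.read()

match = re.match(r"@(\w+):", message)
if not match or match.group(1) not in ALLOWED:
    sys.stderr.write("Commit message must start with '@yourname: ' "
                     "(one of: " + ", ".join(sorted(ALLOWED)) + ")\n")
    sys.exit(1)  # a non-zero exit aborts the commit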
 
  • Haha
Likes sysprog
  • #13
sysprog said:
They could serialize and track using multiple IDs locally with Mercurial, and upload the consensus once a day.
But won't they still have to remember to switch ID in Mercurial, with the added complication of now having 2 different VCSs to manage?
 
  • #14
Thanks for all the comments/suggestions.

pbuk said:
Actually I've just thought of the obvious solution: prefix all commit messages with the user name, e.g. `@pbuk: Add unit conversion`.

Advantages:
  • easy to do
  • safe fallback if it is not done (unlike e.g. manually updating git user when you switch machines)
  • easily visible in the commit history
Disadvantages:
  • `git blame` and similar tools are not going to be helpful (although you could write a commit hook that changes the git user according to the commit prefix).

Yes, that might be the easiest solution.
It would be nice if there was a more "automated" solution, but for now that might have to do.

pbuk said:
I don't think you will find an easy way around this because sharing accounts simply isn't best practice, or even close. It's a bit like saying 'to make it easier for everyone to get into the office we all use the same ID key card which we currently keep under the doormat; what is best practice for doing this?' :-p
Indeed, but shared accounts are unavoidable in a lab setting (and in many other settings as well).
We can't have a piece of software that is controlling a large experimental setup that is used by multiple people and is running 24/7 be associated with a single user.

It would be nice if there were a "switch user" functionality in Windows which allowed multiple people to share the same "instance" of the Windows desktop but where Windows was still "aware" of who the current user was and could therefore control permissions etc.
But since this is not possible we have to find workarounds.
 
  • #15
f95toli said:
It would be nice if there was a more "automated" solution, but for now that might have to do
As I say, you could automate it with a commit hook, or even a script that hacks the commit history after the event.
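For the "after the event" route, here's a rough, untested sketch that fixes up only the most recent commit, assuming the @name: prefix convention from #12 (the name-to-identity mapping is made up; rewriting anything older than the last commit would need git filter-repo or similar, and rewriting already-pushed history is its own can of worms):

# fix_last_author.py - rough sketch: amend the last commit's author from its "@name:" prefix.
import re
import subprocess

AUTHORS = {  # made-up mapping of prefixes to identities
    "alice": "Alice Example <alice@example.com>",
    "bob": "Bob Example <bob@example.com>",
}

message = subprocess.check_output(["git", "log", "-1", "--pretty=%B"], text=True)
match = re.match(r"@(\w+):", message.strip())
if match and match.group(1) in AUTHORS:
    subprocess.run(
        ["git", "commit", "--amend", "--no-edit", f"--author={AUTHORS[match.group(1)]}"],
        check=True,
    )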

f95toli said:
Indeed, but shared accounts are unavoidable in a lab setting (and in many other settings as well).
We can't have a piece of software that is controlling a large experimental setup that is used by multiple people and is running 24/7 be associated with a single user.
But that's exactly what you do have - it's just that you have different people pretending to be that 'single user'. You shouldn't really have this control software running in userspace at all - the right way to deal with this would be for the control software to run as a service in the background. Users would log in individually using their own Windows accounts.
 
  • #16
pbuk said:
But that's exactly what you do have - it's just that you have different people pretending to be that 'single user'. You shouldn't really have this control software running in userspace at all - the right way to deal with this would be for the control software to run as a service in the background. Users would log in individually using their own Windows accounts.
"Control software" in this case means scripts (usually a Jupyter Notebook or a Matlab script) either controlling and acquiring data from a bunch of instruments (waverform generators, digitizers, oscilloscopes, spectrum analyzers et) directly via Ethernet/USB or controlling a "virtual" instrument frontpanel provided by the instrument manufacturer (which in turn controls the electronics in the instrument, this is becoming more common). This is the suite of SW we often need to develop/modify while in the lab (because you need to be able physically monitor what happens when you run your code by e.g. looking at an oscilloscope).

While the measurement is running we usually also need to plot a lot of diagnostic information in various graphs. Everyone who is in the lab needs to be able to see this information.
Obviously, a single measurement instrument can only do one thing at a time; meaning we can't have multiple users trying to use the same setup at the same time.

Hence, while I agree that a "service" model would in theory be better, it is usually not realistic.

That said, some of our newer instruments can in fact be set up so that one computer runs a server which acts as a "master instrument", and then you can have multiple clients connecting via TCP/IP.
However, this is still a "blocking" arrangement since we cannot, as mentioned above, have several people using the same measurement setup at the same time (there are also very good reasons why you don't want instruments connected to very sensitive samples suddenly jumping between settings).

It is possible to set up a "cloud access" model for this with a queue system etc. (something similar to IBM's Qiskit platform). However, such platforms are all proprietary and there will still only be one instance of the software handling the low-level control; the latter is the type of SW we are working on in my lab.
 
  • #17
f95toli said:
However, this is still a "blocking" arrangement since we cannot, as mentioned above, have several people using the same measurement setup at the same time (there are also very good reasons why you don't want instruments connected to very sensitive samples suddenly jumping between settings).
Sorry, I don't understand the problem. You say that multiple users cannot control the "master" computer at once. Isn't that a feature, not a bug (for the reasons you list)? What's wrong with that model?
 
  • #18
pbuk said:
But won't they still have to remember to switch ID in Mercurial, with the added complication of now having 2 different VCSs to manage?
No, each participant would use his own Mercurial ID, and only the faculty supervisor would have the password to the git account. That's simple enough. I imagine that requiring every participant to have his own git account might be exceeding the academic authority.
 
  • #19
Twigg said:
Sorry, I don't understand the problem. You say that multiple users cannot control the "master" computer at once. Isn't that a feature, not a bug (for the reasons you list)? What's wrong with that model?
I did not mean that it was a problem; I was just trying to explain why there usually isn't much point in running control software as a service as pbuk suggested.
I should have explained it better.
 
  • Like
Likes Twigg
  • #20
f95toli said:
Twigg said:
Sorry, I don't understand the problem. You say that multiple users cannot control the "master" computer at once. Isn't that a feature, not a bug (for the reasons you list)? What's wrong with that model?
I did not mean that it was a problem; I was just trying to explain why there usually isn't much point in running control software as a service as pbuk suggested.
I should have explained it better.
The single-threaded access to the 'master' computer provides adequate serialization; however, it does nothing to track who did what, which is what the participants need in order to communicate and collaborate.
 
  • #21
Do you intend to have users remote connect to the master computer (e.g., each user AnyDesk/TeamViewer's into the master computer and runs the script from there) or do you intend to have users run the script locally on their own machines, using the resources/peripherals of the master computer shared over TCP/IP?

In the latter case, I believe you do not need everyone to share one account. No?
 

FAQ: Best practice for using GIT on a shared lab computer?

How can I safely share my code on a lab computer using GIT?

The best practice for using GIT on a shared lab computer is to create a separate branch for each user and to regularly commit and push changes to the remote repository. This way, each user's code will be kept separate and can be easily accessed by others.

Can I use a single remote repository for multiple lab computers?

Yes, you can use a single remote repository for multiple lab computers. Just make sure to regularly pull changes from the remote repository to keep all computers up to date with the latest code.

How can I avoid conflicts when multiple users are working on the same file?

To avoid conflicts, it is important to communicate with other users and coordinate your changes. It is also helpful to break down large files into smaller ones, so that multiple users can work on different sections without causing conflicts.

Is it necessary to create a README file for each project on the lab computer?

While it is not necessary, it is highly recommended to create a README file for each project on the lab computer. This file should include important information about the project, such as how to run it, any dependencies, and the purpose of the project.

How do I handle sensitive information in a shared lab computer environment?

If you need to include sensitive information in your code, such as API keys or passwords, it is best to use environment variables or a configuration file that is not tracked by GIT. This way, the sensitive information will not be shared with others through the remote repository.
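For example, a minimal sketch in Python (INSTRUMENT_API_KEY is a made-up variable name): read the secret from the environment at run time instead of hard-coding it in tracked code.

# rough sketch: load a secret from the environment instead of the repository
import os

api_key = os.environ.get("INSTRUMENT_API_KEY")  # hypothetical variable name
if api_key is None:
    raise RuntimeError("Set INSTRUMENT_API_KEY in the environment or in an untracked "
                       "local config file (listed in .gitignore) instead of committing it.")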
