Web lists-archives.com

git, monorepos, and access control




Hi,

I'm an engineer with AMD. I'm looking at whether we could switch our 
internal version control to a monorepo, possibly one based on git and 
VFSForGit.

One obstacle to moving AMD to git/VFSForGit is the lack of access 
control support in git. AMD has a lot of data whose distribution must be 
limited. Sometimes it's a legal requirement, eg. CPU core designs are 
covered by US export control laws and not all employees may see them. 
Sometimes it's a contractual obligation, as when a third party shares 
data with us and we agree only to share this data with certain 
employees. Any hypothetical AMD monorepo should be able to securely deny 
read access in certain subtrees to users without required permissions.

Has anyone looked at adding access control to git, at a per-directory 
granularity? Is this a feature that the git community would possibly 
welcome?

Here's my rough thinking about how it might work:
  - an administrator can designate that a tree object requires zero or 
more named privileges to read
  - when a mortal user attempts to retrieve the tree object, a hook 
allows the server to check if the user has a given privilege. The hook 
can query an arbitrary user/group data base, LDAP or whatever. The 
details of this check are mostly in the hook; git only knows about 
abstract named privileges.
  - if the user has permission, everything goes as normal.
  - if the user lacks permission, they get a DeniedTree object which 
might carry some metadata about what permissions would be needed to see 
more. The DeniedTree lacks the real tree's entries. (TBD, how do we 
render a denied tree in the workspace? An un-writable directory 
containing only a GITDENIED file with some user friendly error message?)
  - hashes are secret. If the hashes from a protected tree leak, the 
data also leaks. No check on the server prevents it from handing out 
contents for correctly-guessed hashes.
  - mortal users shouldn't be able to alter permissions. Of course, 
mortal users will often modify tree objects that carry permissions. So 
the server should enforce that a user isn't pushing updates that alter 
permissions on the same logical directory.

I would welcome your feedback on whether this idea makes technical 
sense, and whether the feature could ever be a fit for git.

You might ask what alternatives we are looking at. At our scale, we'd 
really want a version control system that implements a virtual 
filesystem. That already limits us to ClearCase, VFSForGit, and maybe 
Vesta among public ones.  Am I missing any? We would also want one that 
permits branching enormous numbers of files without creating enormous 
amounts of data in the repo -- git gets that right, and perforce (our 
status quo) does not. That's how I got onto the idea of adding read 
authorization to git.

Thanks,

John