Discovering repos in a fleet

A design spec for a fleet discovery phase

Having discussed the design constraints, I'll briefly cover how it will work:

The first step is a discovery phase, which would take the user's GitHub username either from a CLI flag or from the gh tool (gh is optional: not having it installed is not fatal).

On first use, if no username is provided, we would pick it up from gh status; if that also fails, we fail fast, since we need to know which user to discover repos for. Once we have a username, we would store it in the user-level config so that later runs do not need to consult gh status again.
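A minimal sketch of that lookup order, under some assumptions of mine: the config path, the github_user key, and the function names are hypothetical, and the gh lookup uses `gh api user --jq .login` as one way to ask gh for the logged-in account (the spec itself only says gh status).

```python
import json
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Optional

# Hypothetical location and key: the spec only says "user-level config".
CONFIG_PATH = Path.home() / ".config" / "fleet-discovery" / "config.json"
CONFIG_KEY = "github_user"


def username_from_gh() -> Optional[str]:
    """Ask gh for the logged-in user; None if gh is missing or not logged in."""
    if shutil.which("gh") is None:
        return None  # gh not installed: not fatal on its own
    result = subprocess.run(
        ["gh", "api", "user", "--jq", ".login"],  # assumption, see lead-in
        capture_output=True, text=True, check=False,
    )
    login = result.stdout.strip()
    return login if result.returncode == 0 and login else None


def resolve_username(
    flag: Optional[str],
    config_path: Path = CONFIG_PATH,
    gh_lookup: Callable[[], Optional[str]] = username_from_gh,
) -> str:
    """Prefer the CLI flag, then the stored config, then gh; otherwise fail fast."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())
    stored = config.get(CONFIG_KEY)

    username = flag or stored or gh_lookup()
    if not username:
        raise SystemExit("No GitHub username: pass one as a flag or log in with gh.")

    if username != stored:
        # Remember the username so later runs need not consult gh again.
        config[CONFIG_KEY] = username
        config_path.parent.mkdir(parents=True, exist_ok=True)
        config_path.write_text(json.dumps(config))
    return username
```

Passing the gh lookup in as a parameter keeps the function testable on machines without gh installed.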

On subsequent runs of the discovery tool, we can then search only for repos whose pushed_at is more recent than the latest pushed_at value seen in the previous discovery run.
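As a rough illustration, the incremental search could be expressed with GitHub's repository search qualifiers (user: and pushed:>). The helper names below are hypothetical, and the sketch assumes pushed_at values arrive as UTC timestamps of the form 2024-05-01T12:00:00Z, as in GitHub's REST responses.

```python
from datetime import datetime
from typing import Iterable, Mapping, Optional

# Assumed timestamp shape for pushed_at (UTC, second precision, trailing Z).
PUSHED_AT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_query(user: str, last_pushed_at: Optional[str]) -> str:
    """Build a repo search query; with no previous run, match all of the user's repos."""
    query = f"user:{user}"
    if last_pushed_at:
        query += f" pushed:>{last_pushed_at}"
    return query


def newest_pushed_at(
    repos: Iterable[Mapping[str, str]], previous: Optional[str]
) -> Optional[str]:
    """Return the high-water mark to store for the next discovery run."""
    stamps = [repo["pushed_at"] for repo in repos if repo.get("pushed_at")]
    if previous:
        stamps.append(previous)  # never move the mark backwards
    if not stamps:
        return None
    return max(stamps, key=lambda s: datetime.strptime(s, PUSHED_AT_FORMAT))
```

For example, build_query("alice", "2024-05-01T12:00:00Z") yields the string user:alice pushed:>2024-05-01T12:00:00Z, which could be passed to the search endpoint or to gh.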

We then do a shallow, sparse-checkout clone of the git repo, with only those particular files checked out. If any of the files have been added, edited, or deleted, we blow away the local git repo and recreate it. This is not strictly incremental, but it is less fiddly than the alternative, and we do not expect the cloning to take substantial time, so repeating the download is acceptable initially.
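The blow-away-and-recreate step might look something like the sketch below. It assumes a reasonably recent git (one that supports clone --sparse and sparse-checkout set --no-cone); the function names and the paths argument are hypothetical, since the list of files of interest is defined elsewhere in the design.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence


def clone_commands(url: str, dest: Path, paths: Sequence[str]) -> List[List[str]]:
    """Build the git commands for a shallow clone limited to the given files."""
    clone = [
        "git", "clone",
        "--depth", "1",        # shallow: latest commit only
        "--filter=blob:none",  # fetch file contents lazily
        "--sparse",            # start with a minimal working tree
        url, str(dest),
    ]
    # Non-cone mode takes gitignore-style patterns; the leading "/" anchors
    # each pattern at the repo root so it matches one specific file.
    patterns = ["/" + p.lstrip("/") for p in paths]
    sparse = ["git", "-C", str(dest), "sparse-checkout", "set", "--no-cone", *patterns]
    return [clone, sparse]


def recreate_checkout(
    url: str,
    dest: Path,
    paths: Sequence[str],
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Delete any existing local clone and build a fresh one from scratch."""
    shutil.rmtree(dest, ignore_errors=True)  # not incremental, by design
    for cmd in clone_commands(url, dest, paths):
        run(cmd, check=True)
```

Keeping the command construction separate from execution makes it easy to inspect or log exactly what would be run before touching the network.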